Hacker News | random3's comments

I built a Flash crawler to index all Flash content while at Adobe. I think it started with the Alexa top 1M and crawled outward from there. This was around 2008-2010, I think, so we had to do a lot of custom stuff, but basically we crawled, then ran a headless Firefox with a custom headless Flash player that dumped a ton of data, so we also analyzed every Flash file at runtime and indexed all of that.

We built a dedicated cluster in a colocation center in Bucharest to handle all of this. Had issues with max floor weights and whatnot. Then had to upgrade the RAM on the cluster. No remote hands. Every operation was a trip to a really cold place.

Used a lot of early stage stuff like Nutch, Hadoop, HBase etc. Everything was then processed and dumped to an SQL database with a nice UI on top. It took a few weeks to set it up, then we passed it to a team of interns that built the SQL database and UI on top. They learned a ton of stuff. Some are now in the Bay Area.

The tool uncovered a ton of security issues.

It was fun building it. I wonder if Adobe kept the data. It could be useful and/or good donation for the Computer History Museum.


Thanks for sharing. It's stories like these I've read since childhood that got me into this. Those little adventures into remote places to work on some computers. This was my version of Indiana Jones.

But everyone's in an AWS world right now.


It looks like there's a bit of a reversal in some areas (e.g. ML), and it may make sense to have more geographically distributed (edge) compute, so maybe we'll get more diversity in the currently cloud-dominated space.

This said, it was always cool when we could control the entire stack, but the reality was that once we scaled things up, we had to throw things over the fence to IT, DevOps, SRE and whatever name evolutions there were and the reality is AWS/GCE/Azure made things easier than dealing with these teams internally.


>we had to throw things over the fence to IT, DevOps, SRE and whatever name evolutions there were and the reality is AWS/GCE/Azure made things easier than dealing with these teams internally

Anyone who was a dev for a while during the "everyone is devops" fad knows the pain of building something with these kinds of dependencies. Being able to claw back my time from operations on my company's dime is enticing.


Very interesting. What was the objective?


This was around when we were trying to get Flash to work on the first iPhone, so we had a hackathon for a week. Since I was a distributed systems "hacker", I ended up doing what was needed :) and there were lots of questions related to the sizing of Flash on web pages and whatnot. That's what started it: a simple Python script that I refined during the hackathon to get the embed parameters etc.
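For illustration only, here is a minimal sketch of what a script for pulling Flash embed parameters out of a page could look like. It is not the original script. The class name `FlashEmbedParser`, the helper `extract_flash_embeds`, and the rule used to decide what counts as Flash (MIME type or a `.swf` source) are all assumptions made for this example.

```python
from html.parser import HTMLParser


class FlashEmbedParser(HTMLParser):
    """Hypothetical helper: collects attributes of <embed>/<object> tags that look like Flash."""

    FLASH_MIME = "application/x-shockwave-flash"

    def __init__(self):
        super().__init__()
        self.embeds = []  # one dict of attributes per Flash-looking tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag not in ("embed", "object"):
            return
        params = dict(attrs)
        # <embed> uses src, <object> commonly uses data; either may be missing.
        src = params.get("src") or params.get("data") or ""
        # Assumed heuristic: Flash MIME type or a .swf source URL.
        is_flash = (
            params.get("type") == self.FLASH_MIME
            or src.lower().endswith(".swf")
        )
        if is_flash:
            self.embeds.append(params)


def extract_flash_embeds(html):
    """Return a list of attribute dicts (width, height, src, ...) for Flash embeds."""
    parser = FlashEmbedParser()
    parser.feed(html)
    parser.close()
    return parser.embeds
```

Feeding it a page containing `<embed src="movie.swf" width="550" height="400">` would yield one dict holding the `src`, `width`, and `height` values as strings, which is the kind of sizing data the questions above were about.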

But once I started processing the data, it became a thing and we formed a small cross-team group to get this going. We eventually expanded the effort in a few different directions and wanted to do Flash analytics, but ended up with the internal tool only, due to privacy concerns.


I remember using that tool internally! Personally I think I only used it to get stats of which features/APIs were popular. But I think other teams used it for QA/conformance, like finding content that occurred in the wild but wasn't covered by test cases.


Hahaha. Always cool to find users of the tools/products you build, including the obscure ones, and on HN no less :))


This has been a thought theme throughout my career, and I have a good set of scenarios I never ended up publishing.

It's not just the most "elaborate system". The same thing happens in so many other ways. For example, a good/simple solution is one and done, whereas a complex one will be an interminable cause of indirect issues down the road, with the second engineer being the one fixing them.

Then there's another pattern of the 10x (not the case with all 10x-ers) seeding or being asked to "fix" other projects, then moving on to the next, leaving all the debt to the team.

It's really an amazing dynamic that can be studied from a game theoretical perspective. It's perhaps one of the adjacent behaviors that support the Gervais principle.

It's also likely going to be over soon, now that AI is normalizing a lot of this work.


Maybe you're overthinking it a little. You could base it on a default setup like the one you use for the sandbox, or some curated fast-loading one.


It's called “agreeing in principle” - useful for negotiations :)


He'll bank the goodwill and take his time then quietly sign the deal the administration wants.


Can you elaborate on what you mean?



He means that he agrees just to bait a new, better deal.


Gotta love PR embracing the many definitions of "made in"


It’s worked for the automotive industry for decades.


When the system works against you, why not


Surely, someone high up asked, "What is the least amount of work we have to do in order to not pay tariffs?"


you say that as if they were supposed to do something else.


and everything ended in "this is the way!"


Likely not. Being able to read and understand it is a matter of skill, though. There are many technical terms there that may make it unreadable for a non-technical audience. But you can solve that by having an AI explain it to you.


It's not my skills. I could decipher it if I spent enough time (and had plain text).

the presentation is bad.

verbosity.

it takes many words for the writer to make a point.

that darn cat.


I didn't find this to be the case at all. It's quite concise and clear. There's just a lot of information presented.


Are you going to ignore the whole operating system emulation which plays audio when you enter it? I think the article itself is fine too, but if this guy wanted to reach more people this should have been plain text.


What irked me was the main text was grey on grey, low contrast. While the code boxes were high contrast. And on the phone screen that stupid cat.

Hey, I got my first downvotes ever for my nasty comment!


"There is no moat" usually means "we have no moat" or "we want you to believe we have no moat". There are always moats, like being directly in front of eyes and thumbs (Apple) or having extensive data (Google), along with hardware production capabilities, datacenters, and tons of money.


Code is cheaper. Simple code is cheap. More complex code may not be cheaper.

The reason you pay attention to details is because complexity compounds and the cheapest cleanup is when you write something, not when it breaks.

This last part is still not fully fleshed out.

For now. Is there any reason to not expect things to improve further?

Regardless, a lot of code is cheap now and building products is fun regardless, but I doubt this will translate into more than very short-term benefits. When you lower the bar you get 10x more stuff, 10x more noise, etc. You lower it more you get 100x and so on.


That is a strange (dumb) framing. It does read as business as usual when people either get overloaded or get the opportunity to be lazy with things they don’t want to do in the first place but have to in order to earn a wage.


