
I wonder if I will be able to read this post in 3 years. I'm sure Facebook will have deleted it by then, and the link will probably be dead within 6 months (first break it, maybe fix it later).

Is archive.org even allowed (by Facebook's robots.txt) to archive this?

It's sad when technical people post important information on Facebook.

Edit: Facebook's robots.txt has this: User-agent: *, Disallow: /. :-(
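For anyone wondering what those two lines mean for a crawler, here is a minimal sketch using Python's standard urllib.robotparser. It only evaluates the two rules quoted above against the post's path; it does not fetch Facebook's real robots.txt, and the "ExampleBot" user-agent name is a made-up placeholder.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)

# The two rules quoted in the comment above.
RULES = ["User-agent: *", "Disallow: /"]

# Path of the post as it appears in this thread.
POST_PATH = "/bram.cohen/posts/10152387480820183"

# "ExampleBot" is a hypothetical crawler name used only for illustration.
print(is_allowed(RULES, "ExampleBot", POST_PATH))
```

Under a blanket "Disallow: /" for every user agent, any crawler that honours robots.txt is told to stay away from every path on the site, which is why an archive that respects the file would skip the post.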



Now that Facebook uses clearly designed URLs (/bram.cohen/posts/10152387480820183 in this case), I have a lot more faith in them keeping this kind of thing working. It's extremely valuable content for them, especially with their recent emphasis on the timeline and capturing the story of people's entire lives.


One possibility is that Facebook stops existing. I won't speculate on how probable that is, but nonetheless it's a possibility.

On the other hand, there are also possibilities such as Facebook losing or deleting the content, or Bram Cohen deleting the content or his account.


If you try to manually crawl the post with Internet Archive's "liveweb" project (which inserts stuff into Wayback Machine for live lookups) you'll get:

"Page cannot be crawled or displayed due to robots.txt."

So no, the Internet Archive will most likely not archive public Facebook posts, because of Facebook's robots.txt. The IA respects and honours each domain's robots.txt.

I've taken a snapshot of the post with my own script, which puts it into a format the IA Wayback Machine can read (WARC).
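As a rough illustration of what a WARC file contains (this is not the script mentioned above, just a minimal sketch under my own assumptions), the following builds a single WARC/1.0 "resource" record by hand with the standard library. The function name build_warc_record, the example.com URI, and the placeholder HTML are all hypothetical; a real capture would normally store the full HTTP response and use a dedicated WARC library.

```python
import uuid
from datetime import datetime, timezone

def build_warc_record(target_uri, payload, content_type="text/html"):
    """Build one WARC/1.0 'resource' record as bytes (illustrative only)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",                                   # version line
        "WARC-Type: resource",                        # raw resource, no HTTP headers
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>", # unique ID for this record
        f"WARC-Date: {now}",                          # capture time in UTC
        f"WARC-Target-URI: {target_uri}",             # where the content came from
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",            # size of the block in bytes
    ]
    # Header lines end with CRLF; a blank line separates header and block.
    header = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    # Every record is terminated by two CRLFs.
    return header + payload + b"\r\n\r\n"

# Placeholder URI and content, standing in for a page that was already downloaded.
record = build_warc_record("https://example.com/some-post", b"<html></html>")
with open("snapshot.warc", "wb") as fh:
    fh.write(record)
```

The point is only that a WARC file is a sequence of such self-describing records, so a snapshot taken today can still be replayed by archive tooling later, even if the original URL stops working.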



