One of the things a lot of LLM scrapers fetch is git repositories. They could just use git clone to fetch everything at once, but instead they fetch them commit by commit. That's about as static as content gets, so it should be a non-issue.
No... Basically all git servers have to generate the file contents, diffs, etc. on demand, because they don't store static pages for every possible combination of view parameters. Git repositories also typically don't store full copies of every version of a file that has ever existed; they store deltas. You could pre-render everything statically, but that could take up gigabytes or more for any repo of non-trivial size.
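To make that on-demand generation concrete, here's a throwaway local sketch (assuming only that git is installed; the repo and file names are made up):

```shell
# git stores delta-compressed objects, not pre-rendered pages, so old
# file versions and diffs are reconstructed each time they're requested.
cd "$(mktemp -d)"
git init -q demo && cd demo
echo "v1" > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -qm "one"
echo "v2" > file.txt
git -c user.name=demo -c user.email=demo@example.com commit -qam "two"
# Both of these are computed at request time, not served from static files:
git show HEAD~1:file.txt   # reconstructs the old blob ("v1")
git diff HEAD~1 HEAD       # generates the diff on the fly
```

A forge's web UI is doing essentially this for every "view file at commit X" or "show diff" request a scraper makes.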
I did not imply that it does. I meant having a budget allocated for 'unauthenticated deep history queries': when it's exhausted, it's over, and you only handle dynamic fetching for authorized users until the cooldown expires.
Is it pretty? No, but it's also a pretty niche thing overall (git repo storage).
You don't lower the cost of killing by improved targeting, you lower it by thugs shooting people in broad daylight with no consequences.
I understand the argument that moving the decision-making power to a black box would clear the operator's conscience, yadda yadda yadda, but newsflash: the price of a human life is falling so quickly that I think we're far beyond the point where it matters.
maybe because they are trying to act ethically toward a murderous neighbor that is conducting asymmetric warfare and those are the best tools to accomplish that.
or, maybe because they came to the conclusion that the repercussions on the world stage of even more horrific media coming out of Gaza is too steep of a price to pay.
i don't know which, but i do know it is naive to conclude that because they COULD end the war in a day and did not, they are driven by morality and ethical concerns rather than pragmatic ones.
I didn’t say they were driven by morality, though I’m sure they are more so than Hamas. I just think what they’re doing is ethnic cleansing (which is not a compliment) rather than genocide. I’m actually pretty sure that most of the people who call it “genocide” don’t know the difference between the two.
because it would be admitting to the world that it has said weapons.
Israel has always said it doesn't have nuclear weapons. They would have absolutely zero sympathy going forward from any major nation if they decided to drop a nuclear bomb on Gaza, and they want that land, so rendering it uninhabitable might not be a good idea.
by dumb munitions I mean older bombs, as opposed to JDAMs and the like.
Anyone who seriously uses the words 'nuclear weapon' and 'Gaza' together is basically admitting they have zero clue about the situation and are an uninformed larper for one side or the other.
I populated my Instagram/FB accounts with my interests (I mainly have the accounts to follow local racing leagues and marketplaces), and my feeds are mostly cars and tech stuff; I seldom see any thirst traps (including in Reels).
>As a contrast, in the early web, plenty of people were hosting their own website, and messing around with all the basic tools available to see what novel thing they could create
I'm hoping that the centralized platforms, already full of slop and now imploding under LLM-fueled content, will overflow and lead to a renaissance of sorts for the small and open web, niche communities, and decoupling from big tech.
It's already gaining traction among the young, as far as I can see.
Ask HN: How does one archive websites like this without being a d-ck?
I want to save this for offline use, but I think recursive wget is a bit poor manners. Is there an established way one should approach it? Should I get it from an archive somehow?
As long as you don't mirror it daily and you use a rate limit, there's no reason doing it would make you a dick.
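For reference, a polite one-off mirror with wget might look something like this (the URL is a placeholder and the numbers are just reasonable guesses; the flags are standard wget options):

```shell
wget --mirror --convert-links --page-requisites \
     --wait=2 --random-wait \
     --limit-rate=100k \
     --user-agent="personal-archive (contact: you@example.com)" \
     https://example.com/
```

`--wait` plus `--random-wait` spaces requests out, `--limit-rate` caps bandwidth, and putting contact info in the user agent gives the site owner someone to reach if your crawl is a problem.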
FWIW I have a local copy of Sheldon Brown's website that I mirrored a few years back, when they announced the shop would close, as I expected they would eventually shut down the website too. I don't know if his wife (she had her own space on the site) is still alive, nor whether someone has taken over the maintenance.
A single user's one-off recursive wget seems fine? Browsers also support saving pages, IIRC, individual pages at the very least (and if they're saved to the same place, the links will work).
No doubt it's already in many archive sites though, you could just fetch from them instead of the original?
In the old-web days, I just used wget with slow pacing (and by "pacing" I mean: I don't need it to be done today or even this week, so if it takes a rather long time then that's fine. Slow helped keep me from mucking up the latency on my dial-up connection, too.)
I don't think that's being a dick for old-web sites that still exist today. Most of the information is text, the photos tend to be small, it's all generally static (i.e., lightweight to serve), and the implicit intent is for people to use it.
But it's pretty slow-moving, so getting it from archive.org would probably suffice if being zero-impact is the goal.
(Or, you know: Just email the dude that runs it like it's 1998 again, say hi, and ask. In this particular instance, it's still being maintained.)
>Blaming DPRK's "economic mismanagement" while making no mention of the Western sanctions on DPRK which are the cause of its catastrophic economic and humanitarian situation
The catastrophic humanitarian situation IS the cause for the sanctions.
Depends where you at