I believe D. A. Jimenez and C. Lin, "Dynamic branch prediction with perceptrons" is the paper that introduced the idea. It's been significantly refined since, and I'm not too familiar with the modern improvements, but B. Grayson et al., "Evolution of the Samsung Exynos CPU Microarchitecture" has a section on branch predictor design that discusses and references some of them.
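For readers who haven't seen the idea, here is a minimal Python sketch of a basic perceptron predictor: a table of weight vectors indexed by branch address, a dot product with the global history (+1 taken, -1 not taken), and training on a misprediction or a low-confidence output. The class name, table size, history length, and weight clamp are illustrative assumptions, and the default threshold uses the formula reported in the original paper. It reflects none of the later refinements.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor (illustrative, not a hardware model)."""

    def __init__(self, table_size=64, history_len=8, theta=None, weight_limit=127):
        self.table_size = table_size
        # Training threshold; default is the floor(1.93 * h + 14) rule from the paper.
        self.theta = int(1.93 * history_len + 14) if theta is None else theta
        self.weight_limit = weight_limit  # assumed saturation bound for weights
        # One weight vector per table entry: bias weight plus one weight per history bit.
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        # Global history, oldest first: +1 means taken, -1 means not taken.
        self.history = [-1] * history_len

    def _clamp(self, w):
        return max(-self.weight_limit, min(self.weight_limit, w))

    def _output(self, pc):
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        """Predict taken when the perceptron output is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        """Train on the real outcome, then shift it into the global history."""
        y = self._output(pc)
        t = 1 if taken else -1
        # Train only on a misprediction or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self.weights[pc % self.table_size]
            w[0] = self._clamp(w[0] + t)
            for i, xi in enumerate(self.history, start=1):
                w[i] = self._clamp(w[i] + t * xi)
        self.history = self.history[1:] + [t]
```

Usage is a `predict(pc)` call followed by `update(pc, taken)` once the branch resolves. A strictly alternating branch is a quick sanity check, since its outcome is a linear function of the most recent history bit.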
> Several commenters suggested the original essay was written by an LLM. They were half right. Both that essay and this one were written with Claude as a drafting partner. I directed the argument; the LLM helped with prose. I mention this not as confession but as demonstration: the human brought the utility function, the machine brought the compute. If that division of labour bothers you, I’d suggest the discomfort says more about the Bitter Lesson than about my writing process.
Dammit. “Helping with prose” sounds like “getting a better grade from my English teacher”.
The quality of your prose is important because it increases the effective bandwidth between your thoughts and the reader.
Either the coherent thoughts are there or they’re not. Using an LLM to tune your prose is very much akin to those awful AI-assisted conversions of standard-def television to 4K: inventing details and nonsense structure to fill space.
Very cool idea. Interested to see how this progresses.
One question: how worried are you about overfitting to this particular dataset, i.e. drifting toward memorization instead of generalization? Obviously you hold out a validation set, but since you're meta-optimizing the model itself on its validation performance, you're still at risk of overfitting to that set.
Yes, good point. Right now it's somewhat hard to overfit because the meta-optimization extracts only tiny bits of information. But over time, we will switch the validation set to some other random subset of FineWeb, or even to entirely OOD datasets!
The question is not if but when. I hope the project authors acknowledge the problem directly: it is not merely a risk; it is a statistical certainty given enough time. So, what's the plan?
At the very least, track it. How will the project maintainers instrument this?
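One way to track it, sketched in Python under stated assumptions: keep a sealed holdout split that is never used for selection, log the holdout loss next to the validation loss for every accepted change, and watch whether the gap between them widens. The class name, window size, and tolerance below are hypothetical and do not describe the project's actual instrumentation.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class OverfitTracker:
    """Tracks the gap between a sealed holdout loss and the validation loss."""

    window: int = 5          # assumed number of runs per comparison window
    tolerance: float = 0.02  # assumed allowed growth in the gap before flagging
    gaps: list = field(default_factory=list)

    def record(self, val_loss: float, holdout_loss: float) -> float:
        """Store and return the holdout-minus-validation gap for one run."""
        gap = holdout_loss - val_loss
        self.gaps.append(gap)
        return gap

    def drifting(self) -> bool:
        """True when the recent average gap exceeds the earliest average gap by more than tolerance."""
        if len(self.gaps) < 2 * self.window:
            return False  # not enough runs to compare two full windows
        baseline = mean(self.gaps[: self.window])
        recent = mean(self.gaps[-self.window:])
        return recent - baseline > self.tolerance
```

A widening gap is only a signal, not proof, but it is cheap to log, and it turns "when" into something the maintainers can see on a chart.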
I got a cheap Chinese one (no camera, wifi) in 2024 and it's been a game changer. Yeah it's kind of dumb but it runs every day and picks up an unholy amount of dust, cat hair, and the like. Maybe if you were already vacuuming every day they're pretty useless but for me it's been night and day. As another commenter said, they're also surprisingly repairable, and I bought a ton of spare parts before the tariffs went in.
While I don't think you're wrong about the orientalist elements in Western cyberpunk, consider that Japan also produced two of the seminal and genre-defining works of cyberpunk (Akira and Ghost in the Shell).