
Fundamentally, I’m more optimistic about how far current approaches can scale. I see no reason why RL couldn’t be used to train models to use memory, and fine-tuning already works; it’s just expensive.

The continual learning we get may be a bit ham-fisted, and not fit into a neat architecture, but I think we could actually see it work at scale in the next few years. Whereas new techniques like what Yann LeCun has demonstrated still live heavily in the realm of research. Cool, but not useful yet.

Fine-tuning is also not as limited as you suggest. For one, we don’t need to fine-tune the same model over and over; we can just start from a frontier model each time. And two, modern models are much better at generating synthetic data or environments for RL. This could definitely work, but it might require a lot of work in data collection and curation, and the ROI is not clear. But if large companies continue to allocate more and more resources to AI in the next few years, I could see this happening.

OpenAI already has a custom model service, and labs have stated they already have custom models built for the military (although how custom those models are is unclear). It doesn’t seem like a huge leap to also fine-tune models on a company’s internal codebases and tooling, especially for large companies like Google, Amazon, or Stripe that employ tens of thousands of software engineers.
