That's how I feel about clones in general. Ok, I owned a real Commodore 64, but all my PCs during my formative years were clones.
Actually, this wasn't such a good example since I believe PC clones were legal. Let me change it to something more controversial:
I feel the same way about software piracy. All my games and software growing up were pirated. I didn't even realize it at the time: as far as I knew, you got software by going to a store and buying it, e.g. C64 games... but it was all warez. Same with DOS or Windows (which one usually got from someone else). All of my early programming languages were pirated too: QuickBasic, GW-BASIC, Turbo C, Turbo Pascal, etc.
And this is how people got acquainted with computers, and then got into programming (games, systems, business software) as a job. So piracy was a net win.
I do recall the assistant at the store, when I first showed up, said to wait for the upcoming Commodore 64: more stuff for much less money. But as a 14 year old I wasn't ready to wait after being exposed to Apple the summer before. That professor really advocated for the Atari 800 and I seriously considered it, but the Apple's easier-to-copy floppies, along with its much larger user base, won me over.
As an Unreal game dev, what I've wanted to remake in Qt is the Epic Games Launcher.
I think Epic may be underway on this now, but if you did a good enough job, I feel like there may still be a window to pitch them on acquiring your work.
You will still get hallucinations. With RAG you use the vectors to aid in finding things that are relevant, and then you typically also have the raw text data stored as well. This allows you to theoretically have LLM outputs grounded in the truth of the documents. Depending on implementation, you can also make the LLM cite the sources (filename, chunk, etc).
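The retrieval-plus-citation pattern described above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: the embeddings are toy vectors (a real system would call an embedding model), and all names, filenames, and chunk numbers are made up.

```python
# Minimal RAG retrieval sketch: store BOTH the vector and the raw
# text + source metadata, so the LLM's answer can be grounded in
# the documents and cite its sources (filename, chunk).
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy index; real vectors would come from an embedding model.
index = [
    {"vec": [0.9, 0.1], "text": "Returns are accepted within 30 days.",
     "source": "policy.pdf", "chunk": 4},
    {"vec": [0.1, 0.9], "text": "Shipping takes 3-5 business days.",
     "source": "shipping.pdf", "chunk": 1},
]

def retrieve(query_vec, k=1):
    # Rank chunks by similarity to the query vector, keep top k.
    ranked = sorted(index, key=lambda d: cosine(query_vec, d["vec"]),
                    reverse=True)
    return ranked[:k]

hits = retrieve([0.85, 0.15])
# The prompt sent to the LLM includes the raw text plus a citation
# tag, e.g. "[policy.pdf#4] Returns are accepted within 30 days."
context = "\n".join(f"[{h['source']}#{h['chunk']}] {h['text']}"
                    for h in hits)
```

The key point is the last step: because the raw text and its provenance travel together, the model can be instructed to quote and cite rather than invent.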
The approach that has worked for us in production is correction during generation, not after.
The model verifies its output against the rules in the prompt as it generates and corrects itself within the same API call — no retries, no external validator. If there are still failures the model cannot fix at runtime, those are explicitly flagged instead of silently producing wrong output.
This does not mean hallucinations are completely solved. It turns them into a measurable engineering problem. You know your error rate, you know which outputs failed, and you can drive that rate down over time with better rules. The system can also learn from its flagged failures over time, delivering better accuracy.
I think generally, SFT is like giving the LLM increased intuition in specific areas. If you combine this with RAG, it should improve the performance or accuracy. Sort of like being a lawyer and knowing something is against the law by intuition, but needing the library to cite a specific case or statute as to why.
Absolutely not. Sounds like a be-careful-what-you-wish-for Black Mirror episode where you wake up trapped in some simulation you can't break free from, but it's ok because you signed on the dotted line to donate your mind and body to science.