
I encourage everyone to RTFA and not just respond to the headline. This really is a glimpse into where the future is going.

I've been saying "the last job to be automated will be QA" and it feels more true every day. It's one thing to be a product engineer in this era. It's another to be working at the level the author is, where code needs to be verifiable. However, once people stop vibing apps and start vibing kernels, it really does fundamentally change the game.

I also have another saying: "any sufficiently advanced agent is indistinguishable from a DSL." I hadn't considered Lean in this equation, but I put these two ideas together and I feel like we're approaching some world where Lean eats the entire agentic framework stack and the entire operating system disappears.

If you're thinking about building something today that will still be relevant in 10 years, this is insightful.



There are still no successful, useful vibe-coded apps. Kernels are pretty far away, I think.


This is a very strange statement. People don't always announce when they use AI for writing their software, since it's a controversial topic. And it's a sliding scale. I'm pretty sure a large fraction of new software has some AI involved in its development.


> new software has some AI involved in its development.

A large part of it is probably just using it as a better search, like "How do I define a new data type in Go?".
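To illustrate, that example question is exactly the kind of thing a search (or an LLM standing in for one) answers in a few lines. A minimal sketch of defining new data types in Go (the type names here are invented for illustration):

```go
package main

import "fmt"

// Celsius is a new named type with its own method set,
// defined on top of an existing underlying type.
type Celsius float64

// Point is a new struct type grouping related fields.
type Point struct {
	X, Y int
}

// String gives Celsius a custom textual representation.
func (c Celsius) String() string {
	return fmt.Sprintf("%.1f°C", float64(c))
}

func main() {
	boiling := Celsius(100)
	p := Point{X: 3, Y: 4}
	fmt.Println(boiling, p) // prints: 100.0°C {3 4}
}
```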


I strongly agree with this. The only place where AI is uncontroversial is web search summaries.

The real blockers and time sinks were always bad/missing docs and examples. LLMs bridge that gap pretty well, and of course they do. That's what they're designed to be (language models), not an AGI!

I find it baffling how many workplaces are chasing perceived productivity gains that their customers will never notice instead of building out their next-gen apps. Anyone who fails to modernize their UI/UX for the massive shift in accessibility about to happen with WebMCP will become irrelevant. Content presentation is so much higher value to the user. People expect things to be reliable and simple. New users especially don't want your annoying onboarding flow and complicated menus and controls. They'll just find another app that gives them what they want faster.


Apps are a strange measure because there aren't really any new, groundbreaking ones. PCs and smartphones have mostly done what people have wanted them to do for a while.


There are plenty of groundbreaking apps, but they aren't making billions in advertising revenue, nor do they have large user numbers. I honestly think torrent applications (and most peer-to-peer type of stuff) are very cool and very useful for small-to-medium groups, but that'll never scale to a billion-user thing.

I do agree it's a weird metric to have, but I can't think of a better one outside of "business", and that still seems like a poor rubric because the vast majority of people care about things that aren't businesses. And if this "life altering" technology basically amounts to creating digital slaves, then maybe we as a species shouldn't explore the stars.


I think this might miss the point. We put off upgrading to a new RMM at work because I was able to hack together some dashboards in a couple of days. It's not novel and does exactly what we need it to do, no more. We don't need to pay thousands of dollars a month for the bloated SolarWinds stack. We aren't saving lives, we're saving PDFs, so any arguments about five nines and maintainability are irrelevant.

LLMs are going to give us on-demand, one-off software. I think the SaaS market is terrified right now because for decades they've gouged customers for continual bloat and lock-in that we can now escape from. In a single day I was able to build an RMM that fits our needs exactly. We don't need to hire anyone to maintain it because it's simple, like most business applications should be, but SV needs to keep complicating its offerings with bloat to justify crazy monthly costs for what should have been a one-time purchase from the start. SV shot itself in the face with AI.


Define "successful"?

Does it need to be HN-popular or a household name? Be in the news?

Or something that saves 50% of time by automating inane manual work from a team?


Name 3 apps that are

1. widely considered successful
2. made by humans from scratch in 2025

It looks like humans and AI are on par in this realm.


To be fair, Claude Code is vibe-coded. It's a terrible piece of software from an engineering (and often usability) standpoint, and the problems run deeper than just the choice of JavaScript. But it is good enough for people to get what they want out of it.


But also, based on what I have heard of their headcount, they are not necessarily saving any money by vibecoding it - it seems like their productivity per programmer is still well within the historical range.

That isn’t necessarily a hit against them - they make an LLM coding tool and they should absolutely be dogfooding it as hard as they can. They need to be the ones to figure out how to achieve this sought-after productivity boost. But so far it seems to me like AI coding is more similar to past trends in industry practice (OOP, Scrum, TDD, whatever) than it is different in the only way that’s ever been particularly noteworthy to me: it massively changes where people spend their time, without necessarily living up to the hype about how much gets done in that time.


> But it is good enough for people to get what they want out of it.

This is the ONLY point of software unless you’re doing it for fun.


> I encourage everyone to RTFA and not just respond to the headline.

This is an example of an article which 'buries the lede'†.

It should have started with the announcement of the new zlib autoformalization (!) https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-... to get you excited.

Then it should have talked about the rest - instead of starting with rather graceless and ugly LLM-written generic prose about AI topics that to many readers is already tiresomely familiar and doubtless was tldr for even the readers who aren't repelled automatically by that.

† or in my terms, fails to 'make you care': https://gwern.net/blog/2026/make-me-care


I am as enthusiastic about formal methods as the next guy, but I very much doubt any LLM-based technique will make it economical to write a substantial fraction of application software in Lean. The LLM can play a powerful heuristic role in searching for proof-bearing code in areas where there is good training data. Unfortunately those areas are few and far between.

Moreover, humans will still need to read even rigorously proved code if only to suss out performance issues. And training people to read Lean will continue to be costly.

Though, as the OP says, this is a very exciting time for developing provably correct systems programming.


LLMs are writing non-trivial math proofs in Lean, and software proofs tend to be individually easier than proofs in math, just more tedious because there are so many more of them in any non-trivial development.

Some performance issues (asymptotics) can be addressed via proof, others are routinely verified by benchmarking.
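For a sense of scale, the "individually easy but numerous" proof obligations in question look something like this. A toy sketch in Lean 4 syntax (theorem names are invented; `simp`, `omega`, and the simp lemmas assumed here are standard tactics/lemmas):

```lean
-- Two proofs of the flavor an LLM is asked to discharge:
-- short, mechanical, and checked by the kernel.
theorem add_comm' (m n : Nat) : m + n = n + m := by
  omega

-- Appending lists adds their lengths; routine structural induction.
theorem length_append' {α : Type} (xs ys : List α) :
    (xs ++ ys).length = xs.length + ys.length := by
  induction xs with
  | nil => simp
  | cons x xs ih => simp [ih, Nat.succ_add]
```

A verified systems codebase accumulates thousands of lemmas of roughly this weight, which is exactly where a heuristic proof-searcher earns its keep.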


This assumes everything about current capabilities stays static, and it wasn't long ago that LLMs couldn't do math at all. Many were predicting the genAI hype had peaked this time last year.

If you want it to be a question of economics, I think the answer is in whether this approach is more economical than the alternative, which is having people run this substrate. There's a lot of enthusiasm here and you can't deny there has been progress.

I wouldn't be so quick to doubt. It costs nothing to be optimistic.


> and it wasn't long ago before LLMs couldn't do math

They still can't do math.


Pro models won gold at the International Mathematical Olympiad?


[*] According to cloud LLM provider benchmarks.


They have trouble adding two numbers accurately, though.


Why are they expected to?


If you believe the "AGI is just around the corner" hype...


It's going to be hard to convince normies it can do harder things than that if it can't do that.


> "any sufficiently advanced agent is indistinguishable from a DSL."

I don't quite follow but I'd love to hear more about that.


If you give an agent a task, the typical agentic pattern is that it calls tools in some non-deterministic loop, feeding the tool output back into the LLM, until it deems the task complete. The LLM internalizes an algorithm.

Another way of doing it is the agent just writes an algorithm to perform the task and runs it. In this world, tools are just APIs and the agent has to think through its entire process end to end before it even begins and account for all cases.

Only the latter is Turing-complete, but the former approaches the latter as it improves.



No, I get the Clarke reference. But how is an agent a DSL?


Maybe not an agent exactly, but I can see how an agentic application is kind of like a DSL: the users have a set of queries and commands they want to direct the computer to carry out, but they describe those queries and commands in English rather than with normal programming function calls.


My read was roughly that agents require constraining scaffolding (CLAUDE.md) and careful phrasing (prompt engineering), which together are vaguely like working in a DSL?


If the LLM is able to code it, there is enough training data that you might be better off in a different language that removes the boilerplate.


> RTFA

Sigh. Is there any LLM solution for an HN reader to filter out all top-level commenters who haven't RTFA? I don't need the (micro-)shitstorms these people spawn, even if the general HN algo scores them as "interesting".



