I encourage everyone to RTFA and not just respond to the headline. This really i...

fmbb · 2026-03-03T22:51:51 1772578311

There are still no successful useful vibe codes apps. Kernels are pretty far away I think.

bonoboTP · 2026-03-03T23:37:39 1772581059

This is a very strange statement. People don't always announce when they use AI for writing their software since it's a controversial topic. And it's a sliding scale. I'm pretty sure a large fraction of new software has some AI involved in its development.

qsera · 2026-03-04T01:46:19 1772588779

> new software has some AI involved in its development.

A large part of it is probably just using it as a better search. Like "How do I define a new data type in go?".

sublinear · 2026-03-04T03:14:46 1772594086

I strongly agree with this. The only place where AI is uncontroversial is web search summaries.

The real blockers and time sinks were always bad/missing docs and examples. LLMs bridge that gap pretty well, and of course they do. That's what they're designed to be (language models), not an AGI!

I find it baffling how many workplaces are chasing perceived productivity gains that their customers will never notice instead of building out their next gen apps. Anyone who fails to modernize their UI/UX for the massive shift in accessibility about to happen with WebMCP will become irrelevant. Content presentation is so much higher value to the user. People expect things to be reliable and simple. Especially new users don't want your annoying onboarding flow and complicated menus and controls. They'll just find another app that gives them what they want faster.

dehrmann · 2026-03-04T01:38:02 1772588282

Apps are a strange measure because there aren't really any new, groundbreaking ones. PCs and smartphones have mostly done what people have wanted them to do for a while.

shimman · 2026-03-04T01:54:59 1772589299

There are plenty of ground breaking apps but they aren't making billions of advertising revenue, nor do they have large numbers. I honestly think torrent applications (and most peer to peer type of stuff) are very cool and very useful for small-medium groups but it'll never scale to a billion user thing.

Do agree it's a weird metric to have, but can't think of a better one outside of "business" but that still seems like a poor rubric because the vast majority of people care about things that aren't businesses and if this "life altering" technology basically amounts to creating digital slaves then maybe we as a species shouldn't explore the stars.

tempaccount5050 · 2026-03-03T23:41:56 1772581316

I think this might miss the point. We put off upgrading to an new RMM at work because I was able to hack together some dashboards in a couple days. It's not novel and does exactly what we need it to do, no more. We don't need to pay 1000's of dollars a month for the bloated Solarwinds stack. We aren't saving lives, we're saving PDFs so any arguments about 5 9s and maintainability are irrelevant. LLMs are going to give us on demand, one off software. I think the SaaS market is terrified right now because for decades they've gouged customers for continual bloat and lock in that now we can escape from. In a single day I was able to build an RMM that fits our needs exactly. We don't need to hire anyone to maintain it because it's simple, like most business applications should be, but SV needs to keep complicating their offerings with bloat to justify crazy monthly costs that should have been a one time purchase from the start. SV shot itself in the face with AI.

theshrike79 · 2026-03-04T13:31:43 1772631103

Define "successful"?

Does it need to be HN-popular or a household name? Be in the news?

Or something that saves 50% of time by automating inane manual work from a team?

raincole · 2026-03-04T10:18:43 1772619523

Name 3 apps that are

1. widely considered successful 2. made by humans from scratch in 2025

It looks like humans and AI are on par in this realm.

GoatInGrey · 2026-03-03T22:55:08 1772578508

To be fair, Claude Code is vibe-coded. It's a terrible piece of software from an engineering (and often usability) standpoint, and the problems run deeper than just the choice of JavaScript. But it is good enough for people to get what they want out of it.

bunderbunder · 2026-03-03T23:08:27 1772579307

But also, based on what I have heard of their headcount, they are not necessarily saving any money by vibecoding it - it seems like their productivity per programmer is still well within the historical range.

That isn’t necessarily a hit against them - they make an LLM coding tool and they should absolutely be dogfooding it as hard as they can. They need to be the ones to figure out how to achieve this sought-after productivity boost. But so far it seems to me like AI coding is more similar to past trends in industry practice (OOP, Scrum, TDD, whatever) than it is different in the only way that’s ever been particularly noteworthy to me: it massively changes where people spend their time, without necessarily living up to the hype about how much gets done in that time.

jmathai · 2026-03-04T03:01:10 1772593270

> But it is good enough for people to get what they want out of it.

This is the ONLY point of software unless you’re doing it for fun.

gwern · 2026-03-04T04:29:47 1772598587

> I encourage everyone to RTFA and not just respond to the headline.

This is an example of an article which 'buries the lede'†.

It should have started with the announcement of the new zlib autoformalization (!) https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-... to get you excited.

Then it should have talked about the rest - instead of starting with rather graceless and ugly LLM-written generic prose about AI topics that to many readers is already tiresomely familiar and doubtless was tldr for even the readers who aren't repelled automatically by that.

† or in my terms, fails to 'make you care': https://gwern.net/blog/2026/make-me-care

bwestergard · 2026-03-03T22:36:10 1772577370

I am as enthusiastic about formal methods as the next guy, but I very much doubt any LLM-based technique will make it economical to write a substantial fraction of application software in Lean. The LLM can play a powerful heuristic role in searching for proof-bearing code in areas where there is good training data. Unfortunately those areas are few and far between.

Moreover, humans will still need to read even rigorously proved code if only to suss out performance issues. And training people to read Lean will continue to be costly.

Though, as the OP says, this is a very exciting time for developing provably correct systems programming.

zozbot234 · 2026-03-03T23:06:55 1772579215

LLMs are writing non-trivial math proofs in Lean, and software proofs tend to be individually easier than proofs in math, just more tedious because there's so much more of them in any non-trivial development.

Some performance issues (asymptotics) can be addressed via proof, others are routinely verified by benchmarking.

madrox · 2026-03-03T23:53:08 1772581988

This assumes everything about current capabilities stay static, and it wasn't long ago before LLMs couldn't do math. Many were predicting the genAI hype had peaked this time last year.

If you want it to be a question of economics, I think the answer is in whether this approach is more economical than the alternative, which is having people run this substrate. There's a lot of enthusiasm here and you can't deny there has been progress.

I wouldn't be so quick to doubt. It costs nothing to be optimistic.

candiddevmike · 2026-03-04T00:45:54 1772585154

> and it wasn't long ago before LLMs couldn't do math

They still can't do math.

Hammershaft · 2026-03-04T01:04:45 1772586285

Pro models won gold at the international math olympiads?

otabdeveloper4 · 2026-03-04T10:00:10 1772618410

[*] According to cloud LLM provider benchmarks.

bandrami · 2026-03-04T02:53:54 1772592834

They have trouble adding two numbers accurately though

evrimoztamur · 2026-03-04T07:16:46 1772608606

Why are they expected to?

haspok · 2026-03-04T13:08:28 1772629708

If you believe the "AGI is just around the corner" hype...

bandrami · 2026-03-05T00:23:28 1772670208

Going to be hard to convince normies it can do harder things than that if it can't do that

charlieflowers · 2026-03-03T21:08:32 1772572112

> "any sufficiently advanced agent is indistinguishable from a DSL."

I don't quite follow but I'd love to hear more about that.

madrox · 2026-03-03T23:41:58 1772581318

If you give an agent a task, the typical agentic pattern is that it calls tools in some non-deterministic loop, feeding the tool output back into the LLM, until it deems the task complete. The LLM internalizes an algorithm.

Another way of doing it is the agent just writes an algorithm to perform the task and runs it. In this world, tools are just APIs and the agent has to think through its entire process end to end before it even begins and account for all cases.

Only latter is turing complete, but the former approaches the latter as it improves.

esafak · 2026-03-03T22:15:14 1772576114

https://en.wikipedia.org/wiki/Clarke's_three_laws

charlieflowers · 2026-03-03T23:31:43 1772580703

No i get the clarke reference. But how is an agent a dsl?

whattheheckheck · 2026-03-03T23:55:24 1772582124

Maybe not an agent exactly but I can see an agentic application is kind of like a dsl because the user space has a set of queries and commands they want to direct the computer to take action but they will describe those queries and commands in English and not with normal programming function calls

thinkling · 2026-03-03T23:49:22 1772581762

My read was roughly that agents require constraining scaffolding (CLAUDE.md) and careful phrasing (prompt engineering) which together is vaguely like working in a DSL?

jpollock · 2026-03-03T22:51:41 1772578301

If the llm is able to code it, there is enough training data that youight be better off in a different language that removes the boilerplate.

kubanczyk · 2026-03-04T12:25:46 1772627146

> RTFA

Sigh. Is there any LLM solution for HN reader to filter out all top-level commenters that hadn't RTFA? I don't need the (micro-)shitstorms that these people spawn, even if the general HN algo scores these as "interesting".