Hacker News | lancebeet's comments

I think what you're describing is what people working with recommender systems call serendipity. Maximizing serendipity, while maintaining relatively high relevance/recommendation success rate, is supposedly a pretty difficult problem to solve. I'm not sure if LLMs have changed that.
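For concreteness, one common way the recommender-systems literature operationalizes serendipity is as the fraction of recommendations that are relevant but would not have been produced by an obvious baseline recommender. A minimal sketch (the function name and the set-based formulation are illustrative assumptions, not a standard API):

```python
def serendipity(recommended, baseline, relevant):
    """Fraction of recommended items that are relevant but would NOT
    have been suggested by an obvious baseline recommender."""
    if not recommended:
        return 0.0
    hits = [item for item in recommended
            if item in relevant and item not in baseline]
    return len(hits) / len(recommended)

# The tension described above: pushing this metric up means recommending
# away from the baseline, which tends to lower plain relevance/precision.
```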

This will sound snarky, so forgive me, but I honestly don't know the answer. Is this actually true? Is there a reliable source containing statistics on LLM compute usage that includes training vs inference for the whole market?

I don’t understand why people don’t just use Gemini or some other AI web search to get an answer to these kinds of questions quickly (I excluded the sources, you can get them if you ask the same question).

> While AI training is often the most intense and expensive process for a single model, the majority of total AI compute usage (approximately 90%) is used for inference.

> Here is the breakdown of why this is the case:

> Inference as a High-Volume Activity: Inference occurs every time a user interacts with an AI model (e.g., asking ChatGPT a question, using image recognition, or generating code). While a model is trained once (or updated infrequently), it runs millions or billions of inferences continuously.

> Cost Scaling: Training is a massive, one-time upfront cost, while inference is an ongoing, daily operational cost. As the number of AI users grows, the demand for inference compute scales faster than the need for training new, large models.

> The Shift to Efficiency: While early AI hype focused on the immense compute needed for training, the industry has shifted toward making inference cheaper and faster through specialized hardware and techniques like optimization, quantization, and small language models (SLMs).
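As a toy illustration of the quantization point in the quote above (a sketch of the general idea, not how production inference stacks are implemented): storing weights as 8-bit integers plus a single scale factor quarters the memory versus float32, at the cost of a small rounding error per weight.

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: map the largest |weight| to 127.
    # The `or 1.0` guards against an all-zero weight list.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.31, -1.2, 0.044, 0.9]
q, s = quantize_int8(weights)
restored = dequantize(q, s)
# Each restored weight is within half a quantization step of the original.
```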


Gemini is not a reliable source. You posted the only part of the AI response that isn't useful in verifying whether it is true.

Sure, I guess. I asked Gemini to give me some markdown of citations and the claims made that address the question:

https://share.google/aimode/v3Y9P3rYIx1oj9VI2

And I finally figured out how to get links to answers instead of just inlining the content as before. Anyways, there it is. We live in a time where questions like "Does inference or training use more compute?" can be answered quickly by just pasting it into a search box.


The revenue numbers are public for the major AI companies. That's probably the best estimate for "inference for the whole market" we have, since most of that inference is billed in either API usage or subscriptions, and it won't include any in-house usage such as training.

You obviously don't believe that AGI is coming in two release cycles, and you also don't seem to have much faith in the new models containing massive improvements over the last ones. So the answer to who is going to pay for these custom chips seems to be you.


Why would I buy chips to run handicapped models when the 10+ LLM players all offer free-tier access to their 1T+ parameter models?


Do you think the free gravy train will run forever?


Not all applications are chatbots. Many potential uses for LLMs/VLAMs are latency-constrained.


If benchmarks are fishy, it seems their bias would be to produce better scores than expected for proprietary models, since they have more incentives to game the benchmarks.


Is "the end of the exponential" an established expression? There's no singularity in an exponential so the expression doesn't make sense to me. To me, it sounds like "the end of the exponential part", meaning it's a sigmoid, but that's obviously not what he means.
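For the curve-shape distinction being drawn here, a toy sketch (parameter names are illustrative): a pure exponential grows without bound, while a logistic (sigmoid) curve tracks the exponential closely at first and only later bends toward a ceiling, which is why "the end of the exponential" is ambiguous without more context.

```python
import math

def exponential(t, rate=1.0):
    # Unbounded growth: no "end" anywhere on the curve.
    return math.exp(rate * t)

def logistic(t, rate=1.0, ceiling=1000.0):
    # Starts at 1, tracks the exponential early, saturates at `ceiling`.
    return ceiling / (1 + (ceiling - 1) * math.exp(-rate * t))
```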


I’m guessing that Amodei meant it as a humorous inside joke.

It’s also shorthand for “the end of massive R&D capex” and “the transition to market capture”. The final stage, what McKinsey types call “harvesting”, is probably not on Amodei’s radar. Based on what I’ve seen of his public personality, he would see it as too philistine and will hand it off to another custodial exec.


Cool insight!


Why should it be obvious that this is not what he means? I struggle to think how he could mean anything else.


Well, he says

>To me, it is absolutely wild that you have people — within the bubble and outside the bubble — talking about the same tired, old hot-button political issues, when we are near the end of the exponential.

My interpretation is "It's pointless to discuss the old political issues, because they're not going to be relevant once AGI is achieved". So if he does believe in a plateau, it either contradicts his other prediction (that AGI will be reached in a year or two), or he believes it will plateau after AGI is already reached, which makes it a kind of pointless statement. The important thing w.r.t. all our problems being solved would be the advent of AGI, not the plateau.


I think he believes in a plateau on the y axis instead of the x axis… which is AGI.


I took the “end” to mean the part of the exponential where it quickly trends towards infinity. So let’s say the x axis is time (by which you get more training data and more compute) and the y axis is model ability. So far, if we think we are in the beginning of the exponential, adding data/compute looks almost linear to the untrained eye in terms of model capability. But once you hit a threshold, where he thinks the model will start to generalize, a small amount of data/compute will result in a massive increase in model ability.


Exactly. If you “plateau” on the y axis you increase model capability to infinity in no time.


>Maybe what surprised me most is that the mistakes NanoBananna made are simple enough that I'm absolutely positive Karpathy could have caught them. Even if his physics is very rusty. I'm often left wondering if people really are true believers and becoming blind to the mistakes or if they don't care.

I've seen this interesting phenomenon many times. I think it's a kind of subconscious bias. I call it "GeLLMann amnesia".


That naming works better than it should lol. Crichton would be proud.


Given the abysmal market share of Firefox today I think a large percentage of the remaining users do actually care.


This is really striking, isn't it? We've all certainly seen demos of things on this list or very similar things, and there are startups that have spent years and billions of dollars attempting to exploit existing LLMs to develop useful products. Yet most of the products don't seem to exist. The ones that you see in everyday life never seem to work nearly as well as the demos suggest.

So what's going on here? Do the products exist but nobody (or very few) uses them? Is it too expensive to use the models that work sufficiently well to produce a useful product? Is it much easier to create a convincing demo than it is to develop a useful product?


It is too expensive to reach the right audience. I remember talking to agencies about ads for a fintech app, and all of them said the same thing:

You need to burn around 20k a month on ads for 3 months, so we can learn what works, then the CAC will start decreasing, and you can get more targeted users.

Once you turn ads off, there is no awareness, no new users, and people will not be aware of the product's existence.


I'm not entirely convinced by the artists' argument, but this argument is also unconvincing to me. If someone steals from you, but it's a negligible amount, or you don't even notice it, does that make it not stealing? If the thief then starts selling the things they stole from you, directly competing with you, are your grievances less valid now since you didn't complain about the theft before?


Nothing was stolen from the artists but instead used without their permission. The thing being used is an idea, not anything the artist loses access to when someone else has it. What is there to complain about? Why should others listen to the complaints (disregarding copyright law because that is circular reasoning)?


> Nothing was stolen from the artists but instead used without their permission.

Which is equally illegal.

> disregarding copyright law because that is circular reasoning

This is not circular, copyright is non-negotiable.


So many problems with your reasoning.

"Nothing was stolen from the artists but instead used without their permission"

Yes and no. Sure, the artist didn't lose anything physical, but neither did music or movie producers when people downloaded and shared MP3s and videos. They still won in court based on the profits they determined the "theft" cost them, and the settlements were absurdly high. How is this different? An artist's work is essentially their resume. AI companies use their work without permission to create programs specifically intended to generate similar work in seconds, which substantially impacts an artist's ability to profit from their work. You seem to be suggesting that artists have no right to control the profits their work can generate - an argument I can't imagine you would extend to corporations.

"The thing being used is an idea"

This is profoundly absurd. AI companies aren't taking ideas directly from artists' heads... yet. They're not training their models on ideas. They're training them on the actual images artists create with skills honed over decades of work.

"not anything the artist loses access to when someone else has it"

Again, see point #1. The courts have long established that what's lost in IP theft is the potential for future profits, not something directly physical. By your reasoning here, there should be no such things as patents. I should be able to take anyone or any corporation's "ideas" and use them to produce my own products to sell. And this is a perfect analogy - why would any corporation invest millions or billions of dollars developing a product if anyone could just take the "ideas" they came up with and immediately undercut the corporation with clones or variants of their products? Exactly similar, why would an artist invest years or decades of time honing the skills needed to create imagery if massive corporations can just take that work, feed it into their programs and generate similar work in seconds for pennies?

"What is there to complain about"

The loss of income potential, which is precisely what courts have agreed with when corporations are on the receiving end of IP theft.

"Why should others listen to the complaints"

Because what's happening is objectively wrong. You are exactly the kind of person the corporatocracy wants - someone who just says "Ehhh, I wasn't personally impacted, so I don't care". And not only don't you care, you actively argue in favor of the corporations. Is it any wonder society is what it is today?


It's piracy, not theft. Those aren't the same thing but they are both against the law and the court will assess damages for both.

The person you replied to derailed the conversation by misconstruing an analogy.

> what's happening is objectively wrong.

Doesn't seem like a defensible claim to me. Clearly plenty of people don't feel that way, myself included.

Aside, you appear to be banned. Just in case you aren't aware.


> The person you replied to derailed the conversation by misconstruing an analogy.

Curious why you say this. They seem to have made the copyright infringement analogous to theft and I addressed that directly in the comment.


It was an analogy, ie a comparison of the differences between pairs. The relevant bit then is the damages suffered by the party stolen from. If you fail to pursue when the damages are small or nonexistent (image classifiers, employee stealing a single apple, individual reproduction for personal use) why should that undermine a case you bring when the damages become noticeable (generative models, employee stealing 500 lbs of apples, bulk reproduction for commercial sale)?


This is precisely where the analogy breaks down. The victim suffers damages in any theft, independent of any value the perpetrator gains. Damages due to copyright infringement don't work this way. Copyright exists to motivate the creation of valuable works; damages for copyright are an invented thing meant to support this.


That would only be a relevant distinction if the discussion were specifically about realized damages. It is not.

The discussion is about whether or not ignoring something that is of little consequence to you diminishes a later case you might bring when something substantially similar causes you noticeable problems. The question at hand had nothing to do with damages due to piracy (direct, perceived, hypothetical, legal fiction, or otherwise).

It's confusing because the basis for the legal claim is damages due to piracy and the size of that claim probably hasn't shifted all that much. But the motivating interest is not the damages. It is the impact of the thing on their employment. That impact was not present before so no one was inclined to pursue a protracted uphill battle.


Oh, I agree with all that, I had sort of ignored the middle post in this chain.


I dunno, man. Re-read your comment but change one assumption:

> They still won in court based on the profits they determined the "theft" cost them, and the settlements were absurdly high.

Such court determinations are wrong. At least hopefully you can see how perhaps there is not so much wrong with the reasoning, even if you ultimately disagree.

> They're training them on the actual images artists create with skills honed over decades of work.

This is very similar to a human studying different artists and practicing; it’s pretty inarguable that art generated by such humans is not the product of copyright infringement, unless the image copies an artist’s style. Studio Ghibli-style AI images come to mind, to be fair, which should be a liability to whoever is running the AI because they’re distributing the image after producing it.

If one doesn’t think that it’s wrong for, e.g., Meta to torrent everything they can, as I do not, then it is not inconsistent to think their ML training and LLM deployment is simply something that happened and changed market conditions.


> This is very similar to a human...

A machine, software, hardware, whatever, as much as a corporation, _is not a human person_.


I hear this type of statement often, but people rarely mention the scope or who the brain drainees are. In my experience, it's exceptionally rare that American talent comes to Europe compared to the opposite, and I see little reason why that would change in the near future. When it comes to Chinese individuals returning to China from the US, this isn't exactly traditional brain drain, and it's also something China has actively, sometimes aggressively, been pursuing the past decade or so.


> and I see little reason why that would change in the near future

Really? There have been 2 months of reasons accumulating by now. One of which being that the government is made of fascists who make nazi salutes and oligarchs.


Imagine 30 years ago if someone said the country of Wittgenstein would not have a large language model of its own, let alone the EU.

It is insane.

Many people just talk nonsense on this topic with China vs the US. Anyone who hasn't read America Against America by Wang Huning basically has no idea what they are talking about on this subject. Of course, total ignorance on a topic has never been something to slow down the opinion of a westerner.

