I never use an LLM to paraphrase my own voice as a matter of principle, but I’ve still been repeatedly accused of doing so because I happen to have always written structured posts, used “smart quotes,” and done that negative comparison thing (it’s genuinely not just fluff, it’s a genuinely useful way to— ah god damn it). Sigh.
I feel ya. I've never been accused of using an LLM, fortunately, but depending on the context I do use “smart quotes” (even in „Dutch” or »German«) and the em-dash obviously… (And that ellipsis fella there. It's just so simple to type with a compose key set up.)
Same here, I've always used em dashes and have been called out on negative comparisons – I didn't even know they were an LLM thing. Should I read more LLM to know what phraseology to avoid, or will doing that nudge me towards sounding more LLM? :-(
> Yes, I’m going to refrain from airing out my dirty laundry. I made a bad decision, now I’m living with it, and more context doesn’t actually change the intent behind my message
That’s not entirely true, as it’s currently impossible to actually gauge the severity of what the LLM seemingly enabled you to do. There’s a difference between “I uncritically accepted everything it told me because it lined up with what I was hoping to hear” and “it subtly nudged me towards a course of action that would have been obviously unwise after some consideration, but managed to convince me to skip that consideration”; and also between that and “I took a risk, which I knew to be a risk that could go bad, and the LLM convinced me to take it where I otherwise wouldn’t have”; and ALSO between that and “I took a risk, which I knew to be a risk that could go bad, and if I’m perfectly honest, I might’ve taken it anyway without the LLM”.
Without any indication as to how your situation maps to any of these (or more), the warning is, functionally, not particularly useful.
> Someone approves a PR they didn’t really read. We’ve all done it (don’t look at me like that). It merges. CI takes 45 minutes, fails on a flaky test, gets re-run, passes on the second attempt (the flaky test is fine, it’s always fine, until it isn’t and you’re debugging production at 2am on a Saturday in your underwear wondering where your life went wrong. Ask me how I know… actually, don’t). The deploy pipeline requires a manual approval from someone who’s in a meeting about meetings. The feature sits in staging for three days because nobody owns the “get it to production” step with any urgency.
This is the company I (soon no longer) work at (anyone hiring?).
The thing is that they don’t even allow the use of AI. I’ve been assured that the vast majority of the code was human-written. I have my doubts but the timeline does check out.
Apart from that, this article uses a lot of words to completely miss a few things:

(A) “use agents to generate code” and “optimize your processes” are not mutually exclusive things;

(B) sometimes, for some tickets - particularly ones stakeholders like to slide in unrefined a week before the sprint ends - the code IS the bottleneck, and the sooner you can get the hell off of that trivial but code-heavy ticket, the sooner you can get back to spending time on the actual problems;

(C) doing all of this is a good idea completely regardless of whether you use LLMs or not; anyone who doesn’t do any of it and thinks the solution is to just hire more devs will run into the exact same roadblocks.
That would be a lot easier to believe if the law in question actually, you know, helped society. Or did anything to affect how it runs, let alone “effectively.”
As it stands, it reads more like “I’ve used my free will to decide to suspend all critical thinking and accept, unquestioningly, anything that anyone with authority decides should be a rule.”
> so that when the subsidies end and subscription costs shoot up
Subscription prices have API rates as their ceiling (and, realistically, sit way lower than that - why would you even subscribe if you could just pay per use instead?), and those rates already carry a big margin for Anthropic. What still costs them a fuckton of money by comparison is training, but that is only going to get more efficient with more purpose-built hardware on the way.
Basically, I don’t see much of a reason to hike subscription prices dramatically. I don’t think they’ll stay at $100/$200, but anyone who’s paying that already knows how much value they’re getting out of it and probably wouldn’t mind paying more.
I'm not sure what you mean - if you max out your subscription, perhaps? If you pay $100 and don't use it, you don't get refunded the $100 because it's 'capped to API rates', which would've been $0.
He means that Anthropic cannot increase the price of the sub, because users can just switch to regular API pricing - which consequently puts a ceiling on what the sub can cost.
Nobody would use a $1k sub if the same usage through API pricing would only cost $500 for comparable service.
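To put rough numbers on that ceiling logic, here's a toy sketch in Python - every figure in it is made up for illustration, not an actual Anthropic rate:

    # Illustrative only: these numbers are invented, not Anthropic's real pricing.
    api_price_per_mtok = 15.00   # hypothetical blended API price, $ per million tokens
    monthly_usage_mtok = 40      # hypothetical heavy user's monthly usage, in millions

    api_equivalent = api_price_per_mtok * monthly_usage_mtok   # $600 via pay-as-you-go

    # A sub priced above the API-equivalent cost loses that user to the API:
    for sub_price in (200, 600, 1000):
        verdict = "keep the sub" if sub_price <= api_equivalent else "switch to the API"
        print(f"${sub_price}/mo sub vs ${api_equivalent:,.0f} API-equivalent -> {verdict}")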
For the record, I'm only explaining what he put forward.
I don't agree with the opinion, mainly for two reasons:
1. The API cost can be raised in tandem, so the ceiling is just as variable.
2. The harness is even more important than the model IME, and Claude Code is getting better every month. Even though the alternatives are getting better too, they're currently still significantly worse - I'd say at least 3-6 months behind (compounded by the model, ofc).
And as a third point, unrelated to the original argument: there is no way Anthropic is actually treating the sub as a loss leader. It is not cheap. It's only cheap compared to their API pricing, which they can freely set however they want. Compare their pricing to open models like Kimi k2.5 etc. I sincerely doubt Anthropic's models cost more to run than those, and the providers hosting them are profitable at 30% of the price Anthropic charges.
> He means that anthropic cannot increase the price of the sub because the users can just switch to the regular API pricing
Not that they cannot increase the price, just that there's a cap on how high they realistically can go. Sure, they can always hike API prices to compensate, but I think people are seriously sleeping on open models these days, because…
> *The harness is even more important than the model IME*, and Claude Code is getting better every month.
…I fully agree with this, and that’s actually the other reason why I don’t think we’ll approach predatory pricing. Right now, the moat is still mostly the model, but as open models improve and become more capable, this is quickly going to shift.
And the truth is that Claude Code just isn’t that great of a harness. Anyone who uses an open-source harness and optimizes it for their personal, individual workflow will quickly realize this. And I’m not even blaming Anthropic or the CC team or calling them incompetent; they are in the unenviable position of having been trailblazers. There weren’t any comparable tools before CC that they could’ve learned from.
The future lies in harnesses that are multi-model, extensible, and have full access to and control over the model’s API, context, and system prompt. Claude Code has none of those things. You can only ever bend it into a shape that approximates your workflow; you can never use it as a tool that natively supports it.
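For illustration, a hypothetical skeleton of what such a harness could look like - none of these names come from any real tool:

    from typing import Callable, Protocol

    class Model(Protocol):
        """Any backend that turns (system prompt, messages) into a reply."""
        def complete(self, system: str, messages: list[dict]) -> str: ...

    Hook = Callable[[list[dict]], list[dict]]

    class Harness:
        def __init__(self, model: Model, system_prompt: str):
            self.model = model                  # multi-model: swap in any provider or open model
            self.system_prompt = system_prompt  # fully user-owned, not vendor-fixed
            self.context: list[dict] = []
            self.hooks: list[Hook] = []

        def add_hook(self, fn: Hook) -> None:
            """Extensibility point: user code that rewrites/compacts context before each call."""
            self.hooks.append(fn)

        def step(self, user_msg: str) -> str:
            self.context.append({"role": "user", "content": user_msg})
            ctx = self.context
            for hook in self.hooks:             # full control over what the model actually sees
                ctx = hook(ctx)
            reply = self.model.complete(self.system_prompt, ctx)
            self.context.append({"role": "assistant", "content": reply})
            return reply

The point being: the system prompt, the context pipeline, and the model itself are all yours to replace, rather than baked into a vendor binary.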
Oh, on that we can agree! I was using opencode for the last few months; the main reasons I went back to CC were Opus, and preferring the sub over regular API pricing since I'm not using it professionally, only as a hobby. (At work I'm constrained to Copilot. Which is fine at this point - not great, but definitely improving, esp. when run as a CLI.)
I am still hoping for a local-first model approach, with a voice command generating the main prompt that kicks off plan mode.
Like interactively going through the project - pointing at files in the UI (and possibly the browser) with the mouse, and explaining things while "talking" to a dumber but super-quick model that acts as a questioner - then wrapping things up, at higher latency, over the wire with the highly capable models.
I suspect that approach is still a few months to years away from viability for latency reasons, but I'm definitely looking forward to that UX
Right now, a huge amount of investment pays for training. That investment expects returns; to both turn a profit and keep funding training, rates must be much, much higher.
The point is that if the harness’s workflow gives contradictory and confusing instructions to the model, it’s a harness issue, not necessarily a model issue.
First it was a model issue, then it was a prompting issue, then it was a context issue, then it was an agent issue, now it's a harness issue. AI advocates keep accusing AI skeptics of moving goalposts. But it seems like every 3-6 months another goalpost is added.
Your comment doesn’t make as strong a point as you think it does; it might make the opposite one.
Because, yes, first, it was a model issue, and then more advanced models started appearing and prompting them correctly became more important. Then models learned through RLHF to deal with vague prompting better, and context management became more important. Then models became better (though not great) at inherent context recollection and attention distribution, so now, you need to be careful what instructions a model receives and at what points because it’s literally better at following them. It’s not so much that the goalposts are being moved, it’s that they’re literally being, like, *cleared*.
This isn’t a tech that’s already fully explored and we just need to make it good now, it’s effectively an entirely new field of computing. When ChatGPT came out years ago no one would have DREAMT of an LLM ever autonomously using CLI tools to write entire projects’ worth of code off of a single text prompt. We’d only just figured out how to turn them into proper chatbots. The point is that we have no idea where the ceiling is right now, so demanding well-defined goalposts is like saying we need a full geological map of Mars before we can set foot on it, when drawing that map is part of the point of going.
As a side point, the agent is the harness; or, rather, an agent is a model called in a loop, and the harness is where that loop lives (and where it can be influenced/stopped - rough sketch at the end of this comment). So what I can say about most - not all, but most, including you, seemingly - AI skeptics is that they tend to not actually be particularly up-to-date and/or engaged with how these systems actually work and how capable they actually are at this point. Which is not supposed to be a dig or shade, because I’m pretty sure we’ve never had any tech move this fast before. But the general public is so woefully underinformed about this. I’ve recently had someone tell me in awe about how ChatGPT was able to read their handwritten note and solve a few math equations.
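To make “a model called in a loop” concrete, here’s a minimal toy version - call_model and run_tool are made-up stubs standing in for a real LLM API and tool executor:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Reply:
        text: str
        tool_call: Optional[str] = None   # e.g. a shell command the model wants run

    def call_model(context: list) -> Reply:
        """Hypothetical stub; a real harness would call a provider's API here."""
        return Reply(text="done")

    def run_tool(command: str) -> str:
        """Hypothetical stub; a real harness would actually execute the tool."""
        return f"(output of {command!r})"

    def agent(task: str, max_steps: int = 20) -> str:
        """An agent is just a model called in a loop; the harness is that loop."""
        context = [{"role": "user", "content": task}]
        for _ in range(max_steps):         # the loop is where the harness can log,
            reply = call_model(context)    # intervene, or stop the run
            if reply.tool_call is None:
                return reply.text          # no tool requested: the model is done
            result = run_tool(reply.tool_call)
            context.append({"role": "tool", "content": result})
        return "stopped: step budget exhausted"   # harness-level guardrail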
> the headline deliberately tries to blow this up into a big deal
I do not understand how “company that runs half the internet has had major recent outages and now explicitly names lax/non-existent LLM usage guidelines as a major reason” can possibly not be a big deal in the midst of an industry-wide hype wave over how the world’s biggest companies now run agent teams shipping 150 pull requests an hour.
The chain of events is “AWS has been having a pretty awful time as far as outages go”, and now “result of an operational meeting is that the company will cut down on the use of autonomous AI.” You don’t need CoT-level reasoning to come to the natural conclusion here.
If we could, as a species, collectively, stop measuring the relevance of a piece of news in proportion to how much we like hearing it, please?
And too many people have their egos tied to its failure, too.
I’m a massive AI skeptic. If anyone were to be jumping up and down on the corpse of AI and this incessant drive to use it everywhere, it’d be me. But I also work at Amazon. I got the email. I attended the meeting. I can personally attest that there are no new requirements for AI-generated code. The articles about this meeting are extremely misleading, if not outright wrong. But instead of believing the person who was actually there in the room, this thread is full of people dismissing my first-hand account of the situation because it doesn’t align with the “haha AI failed” viewpoint.
Not just their egos, but their paychecks. This place is either going to get very quiet or really weird when the hype train derails and the AI bubble bursts.
The subject of the media coverage is not AWS, it is a peer organization to AWS that runs using significant amounts of non-AWS infrastructure. They are both part of an umbrella called Amazon but are not at all the same thing.
It's hard to take this objection seriously. The publication is literally called the Financial Times. It's not exactly crazy for them to think that their readers might care about the entity that shows up on the stock ticker rather than how the company happens to divide things up internally.
Even if it weren't a finance publication, I have trouble imagining you making this argument - that it's misleading to refer to it as anything other than GCP - if a headline said something like "Google deals with outages in the cloud". I think you're fundamentally not understanding how people communicate about this sort of thing if you actually think that someone saying "Amazon" is misleading in any meaningful way.
You’re describing reasonable misunderstandings, but they are still misunderstandings.
The cause and effect statements just don’t correspond to reality.
I guess I’m stuck on the idea that the actual facts are relevant. If the question instead is how the dance of optics and PR is going in the minds of people who don’t know enough to doubt what they read, I don’t know what to say about that.
The message and meeting being discussed here have nothing to do with AWS or any outages AWS has faced recently. I think you’re missing the point of the discussion.
I don’t blame you, because this is just bad reporting (and possibly intentionally framed to make you think it’s about AWS). But the meeting and discussion were with the Amazon retail teams, talking about Amazon retail processes and Amazon retail services. The teams and processes that handle this are entirely separate from any AWS outages you are thinking of.
The outages that Amazon retail has faced also have nothing to do with AI, and there was no “explicit call out” about AI causing anything.
> while taking the joyful bits of software development away from you
Quick question: by "joyful bits of software development," do you mean the bit where you design robust architectures, services, and their communication/data concepts to solve specific problems, or the part where you have to assault a keyboard for extended periods of time _after_ all that interesting work so that it all actually does anything?
Because I sure know which of these has been "taken from me," and it's certainly not the joyful one.
I guess I enjoy solving problems, and recognize that the devil is always in the details, so I don't get much satisfaction until I see the whole stack working in concert. I never had much esteem for "architects" who sketch some blobs on the whiteboard and then disappear. I certainly wouldn't want to be "that guy" for anyone else, and I'm not even sure I could do it to an LLM.
It’s perplexing; it’s like the majority of people who insist that using AI coding assistance is guaranteed to rob you of application understanding and business context aren’t considering that not every prompt has to be an instruction to write code. You can, like, ask the agent questions. “What auth stack is in use? Where does the event bus live? Does the project follow SoC or are we dealing with pasta here? Can you trace these call chains and let me know where they’re initiated?”
If anything, I know more about the code I work on than ever before, and at a fraction of the effort, lol.
The project managers and CEOs who are vibe-coding apps on the weekend don't know what an "auth stack" is, much less that they should consider which auth stack is in use. Then when it breaks, they hand their vibe-coded black box to their engineers and say "fix this, no mistakes"