More

bearjaws · 2026-03-31T12:49:06 1774961346

reasoning is just more tokens that come out first wrapped in <thinking></thinking>

bearjaws · 2026-03-31T12:43:30 1774961010

My super uninformed theory is that local LLM will trail foundation models by about 2 years for practical use.

For example right now a lot of work is being done on improving tool calling and agentic workflows, which tool calling was first popping up around end of 2023 for local LLMs.

This is putting aside the standard benchmarks which get "benchmaxxed" by local LLMs and show impressive numbers, but when used with OpenCode rarely meet expectations. In theory Qwen3.5-397B-A17B should be nearly a Sonnet 4.6 model but it is not.

bearjaws · 2026-03-24T12:27:28 1774355248

Why use JS at all for SSR?

It's not a great language for it.

azangru · 2026-03-24T12:47:53 1774356473

Where does the article say anything about js for ssr?

bearjaws · 2026-03-23T15:00:50 1774278050

We need a rule that if your LLM benchmark is running under 20t/s it's simply unusable in any real workflow.

6t/s is unbearable, if you used it with OpenCode you would be waiting 20+ minutes per turn.

zozbot234 · 2026-03-23T15:30:30 1774279830

This is not an ordinary LLM benchmark, it's streaming experts' weights from storage. It opens up running very large (near-SOTA, potentially SOTA) MoE models on very limited hardware, since you no longer need enough RAM for the entirety of the model's parameters. The comparison to 20 t/s local AI models is simply not fair.

bearjaws · 2026-03-23T17:09:51 1774285791

I understand that. I am saying there is a clear cliff where the value of an LLM reaches 0.

At 1t/s you are never going to get anything done. At 6t/s, it's significantly degrades the experience, one mistake setting you back 20-30 minutes.

At ~20t/s it's much more usable.

bearjaws · 2026-03-22T21:10:09 1774213809

SOC2 has been in trouble for a while now. Completely gamified. I was managing an acquisition of a healthtech company and asked if they did an internal risk assessment as part of their audit. Nope.

SOC2 certified, has never actually put to paper "here's what we know we're doing wrong, here is how we plan to remediate it."

bearjaws · 2026-03-20T14:34:22 1774017262

JavaScript can be fast too, it's just the ecosystem and decisions devs make that slow it down.

Same for Java, I have yet to in my entire career see enterprise Java be performant and not memory intensive.

At the end of the day, if you care about performance at the app layer, you will use a language better suited to that.

maccard · 2026-03-20T14:46:52 1774018012

My experience with the defaults in JavaScript is that they’re pretty slow. It’s really, really easy to hit the limits of an express app and for those limits to be in your app code. I’ve worked on JVM backed apps and they’re memory hungry (well, they require a reallocation for the JVM) and they’re slow to boot but once they’re going they are absolutely ripping fast and your far more likely to be bottlenecked by your DB long before you need to start doing any horizontal scaling.

wiradikusuma · 2026-03-20T14:59:23 1774018763

Compile it to native (GraalVM) and you can get it fast while consuming less memory. But now your build is slow :)

maccard · 2026-03-20T15:13:01 1774019581

The minute a project has maven in it the build is slow. Don’t even get me started on Gradle…

j-vogel · 2026-03-20T14:39:55 1774017595

Fair point on ecosystem decisions, that's basically the thesis of the post. These patterns aren't Java being slow, they're developers (myself included) writing code that looks fine but works against the JVM. Enterprise Java gets a bad rap partly because these patterns compound silently across large codebases and nobody profiles until something breaks.

FatherOfCurses · 2026-03-20T15:12:32 1774019552

"Enterprise Java"

Factories! Factories everywhere!

whattheheckheck · 2026-03-20T15:42:42 1774021362

Why do you think this plays out over and over again? What's the causal mechanisms of this strange attractor

wood_spirit · 2026-03-20T15:24:06 1774020246

Yes! Obligatory link to the seminal work on the subject:

https://gwern.net/doc/cs/2005-09-30-smith-whyihateframeworks...

pron · 2026-03-20T20:18:00 1774037880

Well, JS is fast and Go is faster, but Java is C++-fast.

Mawr · 2026-03-21T01:51:47 1774057907

What a ridiculous claim. You're either deluded or outright lying.

pron · 2026-03-21T14:21:22 1774102882

No, just a 20+ year C++ and Java developer, while you clearly haven't used modern Java. Now, I admit that because I have a lot of experience in low-level programming, I am often able to beat Java's performance in C++, but not without a lot of effort. I can do better in Zig when arenas fit, but I wouldn't use it (or C++ for that matter) for a huge program that needs to be maintained by a large team over many years.

bearjaws · 2026-03-20T14:08:08 1774015688

Mostly good advice other than "Run Ad-Hoc Claude Commands Inside Scripts"

I barely trust Claude Code as it is, and neither should anyone to run arbitrary commands unless you run it in a strong sandbox.

gibuloto · 2026-03-21T01:33:28 1774056808

Yes, so I run it with `claude --tools ""` with no tool use.

bearjaws · 2026-03-20T14:05:16 1774015516

As a command line junkie, what is the main thing Claude Code needs to catch up with cursor?

I haven't dove into using a LLM in my editor, so I am less familiar with workflows there.

lubujackson · 2026-03-20T14:21:37 1774016497

I use both pretty heavily. Cursor has an "Ask" mode that is useful when I don't want it to touch files or ask a non-sequitur. Claude may have an easy way to do this, but I haven't seeked it.

Cursor also has an interesting Debug mode that actively adds specific debug logging logic to your code, runs through several hypotheses in a loop to narrow down the cause, then cleans up the logging. It can be super useful.

Finally, when making peecise changes I can select a function, hit cmd-L and add certain ljnes of code to the context. Hard to do that in Claude. Cursor tends to be much faster for quicker, more precise work in general, and rarely goes "searching through the codebase" for things.

Most importantly, I'm cheap. a If I leave Cursor on Auto I can use it full time, 8 hours a day, and never go past the $20 monthly charge. Yes, it is probably just using free models but they are quite decent now, quick and great for inline work.

nsingh2 · 2026-03-20T15:25:08 1774020308

The majority of Ask/Debug mode can be reproduced using skills. For copying code references, if you're using VS Code, you can look at plugins like [1], or even make your own.

Cursor's auto mode is flaky because you don't know which model they're routing you to, and it could be a smaller, worse model.

It's hard to see why paying a middleman for access to models would be cheaper than going directly to the model providers. I was a heavy Cursor user, and I've completely switched to Codex CLI or Claude Code. I don't have to deal with an older, potentially buggier version of VS Code, and I also have the option of not using VS Code at all.

One nice thing about Cursor is its code and documentation embedding. I don't know how much code embedding really helps, but documentation embedding is useful.

[1] https://marketplace.visualstudio.com/items?itemName=ezforo.c...

dmix · 2026-03-20T22:16:57 1774045017

Mostly saying "include this line from x file and this block from y file" which keyboard shortcuts. Claude's VSCode plugin only does one selection. Claude Code requires explicitly telling it what to reference.

That plus Cursor's integration into VSCode feels very deep and part of the IDE, including how it indexes file efficiently and links to changed files, opens plans. Using Claude Code's VScode extension loads into a panel like a file which feels like a hack, not a dedicated sidebar. The output doesn't always properly link to files you can click on. Lots of small stuff like that which significantly improves the DX without swapping tabs or loading a terminal.

I also use Code from terminal sometimes but it feels very isolated unless you're vibecoding something new. I also tried others: Zed is only like 50% of the way there (or less). I also tried to use (Neo)Vim again and it's also nowhere close, probably 25% of the UX of Cursor even with experimental plugins/terminal setups.

physicles · 2026-03-21T02:06:39 1774058799

You’re not missing much.

I used Cursor for the second half of last year. If you’re hand-editing code, its autocomplete is super nice, basically like reading your mind.

But it turns out the people who say we’re moving to a world where programming is automated are pretty much right.

I switched to Claude Code about three weeks ago and haven’t looked back. Being CLI-first is just so much more powerful than IDE-first, because tons of work that isn’t just coding happens there. I use the VSCode extension in maybe 10% of my sessions when I want targeted edits.

So having a good autocomplete story like Cursor is either not useful, or anti-useful because it keeps you from getting your hands off the code.

MintPaw · 2026-03-21T06:22:43 1774074163

In cursor:

You can copy/paste or drag code snippets the chat window and they automatically become context like. (@myFile.cpp:300-310)

You can click any of the generated diffs in the assistant chat window to instantly jump to the code.

Generated code just appears as diffs till you manually approve each snippet or file. (which is fairly easy to do with "jump to next snippet/file" buttons)

These are all features I use constantly as someone who doesn't vibe but wants to just say "pack/unpack this struct into json", "add this new property to the struct, add it to the serialization, and the UI", and other true busywork tasks.

satvikpendem · 2026-03-21T06:38:29 1774075109

This all happens in VSCode now too and it's half the price for way more usage compared to Cursor. That Microsoft money sure does subsidize things.

bearjaws · 2026-03-20T02:29:27 1773973767

What a throw back damn.

I didn't have a Ti-83 so had to ask my friend for his once he got bored with the game.

There was a moment in 2011 I started writing it in "pure" SQL (MySQL) as a joke, but gave up, I'll have to find my DrugQL repo.

bearjaws · 2026-03-16T18:58:22 1773687502

This is a very dumb reason.

100% of them fail because they don't have product market fit, and the coding is the easy part, selling a product is actually really difficult.

If I built an internal chat app for a company of 1,000 people, I never need to worry about scaling to 50k users.

Most companies don't have any facet that needs to scale to the level of Slack, Google etc.