
We already speak in a "technical dialect of English". All we need is some jargon to talk about technical things. (Lawyers have their own jargon too, as do chemists, etc.)

Some languages don't have this kind of vocabulary, because there aren't enough speakers who deal with technical things in a given area (and those who do use another language to communicate)


Did you consent to this? https://hn.algolia.com/

> input size  normal  hardened  speedup w/ hardened
> 1,000       0.7ms   28us      25x
> 5,000       18ms    146us     123x
> 10,000      73ms    303us     241x
> 50,000      1.8s    1.6ms     1,125x

Why is there a normal mode if hardened mode is faster for all input sizes?


Sorry, finished the post just now with more comparisons on other inputs

The reason is just that the normal mode is faster on average in non-pathological cases


Could you have a heuristic based on the input size and the pattern to decide which mode to use?

Yes, this is entirely possible. You can even explore the automaton eagerly and detect whether it's possible to loop from an accepting state to a non-accepting one.

Exciting stuff for future work
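The eager-exploration idea above can be sketched as a reachability check over the DFA's transition graph. This is an illustrative toy, not any real engine's code; the representation (a dict of dicts plus a set of accepting states) is made up for the example.

```python
# Hypothetical sketch: given a DFA as {state: {symbol: next_state}} plus a set
# of accepting states, check whether any accepting state can reach a
# non-accepting one. If it can't, acceptance is "sticky" (once you accept,
# you stay accepting), which is the kind of structural property a heuristic
# could use when picking a matching strategy.
from collections import deque

def accepting_can_escape(transitions, accepting):
    """Return True if some accepting state reaches a non-accepting state."""
    seen = set(accepting)
    queue = deque(accepting)
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, {}).values():
            if nxt not in accepting:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# DFA over {a, b}: state 1 accepts and only loops to itself -> sticky.
sticky = {0: {"a": 1}, 1: {"a": 1, "b": 1}}
print(accepting_can_escape(sticky, {1}))    # False

# State 1 accepts, but "b" leads back to non-accepting state 0.
escaping = {0: {"a": 1}, 1: {"a": 1, "b": 0}}
print(accepting_can_escape(escaping, {1}))  # True
```

Since the check is a single BFS over the transition graph, it is cheap enough to run once when the pattern is compiled.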


Ripgrep does something like this. It has a meta regex engine that switches engines when it finds what looks like pathological cases (or rather, the regex-automata crate does, which is used by the regex crate, which powers ripgrep).

https://docs.rs/regex-automata/latest/regex_automata/meta/st...

Ripgrep in turn exposes some knobs to tweak the heuristics

https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#how...
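The switching strategy can be sketched as "try the fast engine under a budget, fall back to the robust one". This is a heavily simplified illustration of the general idea, not regex-automata's actual API; both engines and the cost model here are stand-ins.

```python
# Toy "meta engine": attempt a fast engine under a step budget and fall back
# to a slower engine with worst-case guarantees when the budget is exceeded.
# The engines are stand-ins (plain substring search) used only to show the
# dispatch structure.

STEP_BUDGET = 10_000

class BudgetExceeded(Exception):
    pass

def fast_engine(pattern, text, budget=STEP_BUDGET):
    # Stand-in for an engine that is fast on typical inputs but can blow up.
    steps = len(pattern) * len(text)  # pretend cost model for illustration
    if steps > budget:
        raise BudgetExceeded
    return pattern in text

def robust_engine(pattern, text):
    # Stand-in for a slower engine with worst-case linear guarantees.
    return pattern in text

def meta_find(pattern, text):
    try:
        return fast_engine(pattern, text)
    except BudgetExceeded:
        return robust_engine(pattern, text)

print(meta_find("needle", "hay needle hay"))  # True
```

The real heuristics are of course far more involved (they look at the pattern structure, literal prefixes, and so on), but the fallback shape is the same.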


I would like to know if you plan to open source anything, and how much. https://github.com/orgs/moment-eng/ looks a bit empty


One thing that seems important is to have the agent write down its plan and any useful memory in markdown files, so that further invocations can just read from them
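A minimal sketch of that loop, assuming a single plan file in the working directory (the file name and format here are made up for illustration):

```python
# Each invocation reads PLAN.md if it exists, appends what it learned, and
# writes it back, so the next run starts with that context instead of a
# blank slate.
from pathlib import Path

PLAN_FILE = Path("PLAN.md")

def load_plan():
    return PLAN_FILE.read_text() if PLAN_FILE.exists() else "# Plan\n"

def append_note(note):
    plan = load_plan()
    PLAN_FILE.write_text(plan + f"- {note}\n")

append_note("Investigated the failing test; root cause is in parser.py")
print(load_plan())
```

Because the state lives in a plain file, it also doubles as a human-readable log of what the agent has done so far.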

> https://news.ycombinator.com/newsguidelines.html#generated

As written, the guidelines talk about AI-generated comments, not AI-generated submitted articles

In any case, just flag the submission and move on


> If you added a simple addition to the problem, such as "Note that in this context, 'if' only means that...", most people would almost certainly answer it correctly.

Agreed. More broadly, classical logic isn't the only logic out there. Many logics differ on the meaning of the implication "if x then y". There are multiple ways for x to imply y, and those additional meanings do show up in natural language all the time; we actually do have logical systems to describe them, they're just less well known.

Mapping natural language into logic often requires context that lies outside the words that were written or spoken. We need our formulas to represent what people actually meant, rather than just what they wrote. Indeed, the same sentence can sometimes be ambiguous, while a logical formula never is.

As an aside, I wanna say that material implication (that is, the "if x then y" of classical logic) deeply sucks, or rather, an implication in natural language very rarely maps cleanly onto material implication. Having an implication "if x then y" be vacuously true when x is false is something usually associated with people who smirk at clever wordplay, rather than something people actually mean when they say "if x then y"
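The vacuous-truth behavior is easy to see by spelling out the definition: material implication "x implies y" is just "(not x) or y".

```python
# Truth table for material implication: "if x then y" is defined as
# (not x) or y, so it is vacuously true whenever x is false, which is
# exactly the behavior that rarely matches natural-language "if".
def implies(x, y):
    return (not x) or y

for x in (False, True):
    for y in (False, True):
        print(f"{x!s:5} -> {y!s:5} : {implies(x, y)}")
# Both rows with x=False come out True, regardless of y.
```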


> Kahneman’s whole framework points the same direction. Most of what people call “reasoning” is fast, associative, pattern-based. The slow, deliberate, step-by-step stuff is effortful and error-prone, and people avoid it when they can. And even when they do engage it, they’re often confabulating a logical-sounding justification for a conclusion they already reached by other means.

Some references on that

https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow

https://thedecisionlab.com/reference-guide/philosophy/system...

System 1 really looks like an LLM (indeed, completing a phrase is an example of what it can do, like "you either die a hero, or you live long enough to become the _"). It's largely unconscious and runs all the time, pattern matching on random stuff

System 2 is something else and looks like a supervisor system, a higher-level process that can be consciously directed through your own will

But the two systems run at the same time and reinforce each other


In my naive understanding, neither requires any will or consciousness.

S1 is “bare” language production, picking words or concepts to say or think by fancy pattern prediction. There’s no reasoning at this level, just blabbering. However, language by itself weeds out the most obvious nonsense purely statistically (some concepts are rarely in the same room), though it does so “mindlessly”; that’s why even early LLMs produced semi-meaningful texts.

S2 is a set of patterns inside the language (“logic”) that biases S1 to produce reasoning-like phrases. It doesn’t require any consciousness or will, just concepts pushing S1 toward a special structure; simply invoking one keeps them “in mind” and throws them into the mix.

I suspect S2 has a spectrum of rigorousness, because one can just throw in some rules (like “if X then Y, not Y, therefore not X”) or may do fancier stuff (imposing a larger structure on it all, like formulating and testing a null hypothesis). Either way it all falls back onto S1 for the ultimate decision-making, a sense of what sounds right (allowing us our favorite logical fallacies); thus the fancier the rules (the patterns of “thought”), the more likely the reasoning will be sound.

S2 doesn’t just rely on S1-as-language but is a part of it, though, because it’s a phenomenon born out of (and inside) the language.

Whether it’s willfully, “consciously” engaged, or whether it works just because S1 predicts the logical-thinking concept as appropriate for certain lines of thinking and starts to invoke it, probably doesn’t even matter; it mainly depends on whichever definition of “will” we would like to pick (there are many).

LLMs and humans can hypothetically do both just fine, but when it comes to checking, humans currently excel because (I suspect) they have a “wider” language in S1, one that doesn’t only include word-concepts but also sensory concepts (like visuospatial thinking). Thus, as I get it, the world-models idea.


> The rumors we hear have to do with projects inundated with more pull requests than they can review, the pull requests are obviously low quality, and the contributors' motives are selfish.

There's a way to handle this: put an automatic AI review of every PR from new contributors. Fight fire with fire.

(Actually, this was the solution for spam even before LLMs. See "A Plan for Spam" by Paul Graham. Basically, if you have a cheap but accurate filter (especially a filter you can train on your own patterns), it should be enabled as a first line of defense. Anything the filter doesn't catch and the user has to manually mark as spam should become data to improve the filter.)
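A toy version of that filter-plus-retraining loop might look like the following. This is a deliberate oversimplification: Graham's actual scheme combines the most extreme per-token probabilities rather than averaging them, and the training data here is invented.

```python
# Toy Bayesian-style filter: score a message by per-word spam probabilities
# learned from messages the user marked, and keep retraining on whatever
# slips through.
from collections import Counter

spam_counts, ham_counts = Counter(), Counter()

def train(text, is_spam):
    (spam_counts if is_spam else ham_counts).update(text.lower().split())

def spam_score(text):
    # Average per-word probability of spam, with add-one smoothing so that
    # unseen words score a neutral 0.5.
    words = text.lower().split()
    probs = []
    for w in words:
        s, h = spam_counts[w] + 1, ham_counts[w] + 1
        probs.append(s / (s + h))
    return sum(probs) / len(probs)

train("buy cheap pills now", is_spam=True)
train("meeting notes attached", is_spam=False)
print(spam_score("cheap pills"))       # closer to 1
print(spam_score("meeting attached"))  # closer to 0
```

The same shape carries over to PR triage: score incoming PRs with a cheap model, and feed the maintainers' manual verdicts back in as training data.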

Moreover, if the review detects LLM-generated content but the user didn't disclose it, maybe there should be consequences

