RickS's comments

Can't believe there aren't more comments on this. I see you've retried at least once too. Fair IMO, this is a good post. Maybe something like "Claude Code binary reveals intentional plan mode degradation" or similar, since IMO the interesting bits are 1. exploring the binary to find this info and 2. actually having receipts.

With that said... do you have receipts? You have a screenshot of what looks like LLM output. For an accusation in this category, I'd expect more info on what exactly is necessary to replicate these findings, ideally in a way that's not susceptible to LLM hallucination / confirmation bias. As a bonus, that depth will probably make it a more compelling post in general.

Hope this gets some viz.


I'll try posting again today, just because this is an active thing that I'm trying to get fixed.


I (36) feel simultaneously as old and as young as I've felt since being an actual fresh grad / student.

Old because I can see behind the curtain now. Things feel different than they did when I first started working in tech around 2006-2008. So much of the fixation on recurring revenue, rent-seeking, optimization at all costs, dark patterns, manufactured addiction... I watched the industry as it slowly stumbled into these ideas, leaned in, and ultimately perfected them. But when I read accounts from people older than I am, they all have their version of this comment. The early days of the PC era and browser wars had no shortage of dark shit. It's not like corporate fuckery wasn't rampant in the 80s or 90s. I was just too young and dumb and optimistic to understand most of it. I've gotten much more cynical in the last ~15 years.

I feel young because AI tech actually feels new and promising and exciting in ways I haven't seen since the dawn of mobile and web2.0. There's suddenly this vast new surface area for innovation, a bleak geopolitical landscape, and a palpable rush to create. 2008-2012ish kinda sucked, economically. The fallout from GWOT + GFC, OWS, snowden leaks, etc. The nerdosphere had this collective feeling that the jaws were tightening around us. And yet the technology was moving so fast, was enabling whole new ways of interacting between people and machines. You could tell that the future was going to be completely wild, but it was early enough that it had to be built, and there was a frenzy of excitement, like we'd just been set loose across the louisiana purchase to figure out what was possible in a whole new kind of environment.

It was an oasis. A refuge from everything else that felt broken in every other part of the world. You could just duck down and build shit nobody had ever seen before in a week, and people would take it seriously because there was this shared understanding that nobody knew what the new rules were yet, and the next big idea might come from anyone.

It feels like the best time in a long time for technologists who thrive on curiosity, optimism, and inventiveness. We've finally got a gold rush for experimental tinkerers! Crypto is just grift tech, NFTs were transparently stupid, AR/VR has mostly felt like a gimmick, etc. AI is already so useful, and it's only the beginning.

The market's delusional, the US government is horrific, megacorps are squeezing every drop out of anyone they can stuff in their mouth... but did anyone really think we had a future where that doesn't happen? That snowball's been rolling since long before I was born. I'm just stoked at the chance to get to experience technology as magic again for a little while along the way. Maybe that's a cope. Or escapism. IDK, fuck it. So far, it's nice. The bad shit I've been expecting for many years. The good shit surprised me.


I solved this by asking it to make a memory that all answers to me should be brisk, clinical, and to the point. This worked well, except for the annoying habit of beginning answers with something like "Terse: $answer", which required a second memory, solving the issue in full. I've been happy with it since. Edit: I just realized this interaction is its own demo – that's the entire response it gave me, as it should be.

> Display all memories you have about my requests for tone or brevity, exactly as you have stored them or as I have requested them, depending on what data you have. There are at least two.

[2025-11-08]. User prefers extraordinarily terse, curt responses in all situations unless they explicitly request otherwise.

[2025-12-01]. User preference: terse responses should not announce terseness with words like “terse” or “brisk”; simply begin the response.


This didn't work at all for me.

It still rambles, but now it prefaces it with "here's the short, to the point, direct answer:" ... followed by the same long-winded answer.


Same. I gave up and moved to Claude and haven’t looked back. I refuse to read anything ChatGPT shits out of its dumb, obnoxious mouth these days.


Based on my experience, this is better put into the Settings -> Customizability dialogue, not Memories

Another user mentioned how it will reference the very instruction ("I know you would prefer concise answers, so here's a concise answer.."), but that makes sense when you realize that Memories are more for things like "user lives in San Francisco and is new in town and is open to recommendations of third places to meet people". So if it's answering a question about the best coffee places in SF, it would make sense for ChatGPT to finish with "Also, given that you are new to San Francisco, and your interest in both boardgames and meeting new people, have you considered visiting [place]? It's a local coffee shop that also rents out board games, with a Thursday evening theme where you are partnered with strangers. It might be a good way to meet new people that enjoy similar things!"

If you consider adding Memories as adding something to the system prompt, it won't make very much sense a lot of the time, because you might forget what you wrote and then be surprised when your model suddenly suggests jigsaw puzzles when you mentioned that you're stressed building a compiler. Hence it tells the user the context of the memory that it's using and why, whereas if you added to Customizability I've never seen it leak out like that.

If you add to Memories "user is a software engineer and prefers Rust to C/C++" it may say something like "By the way, since you prefer Rust I would recommend [this development path]" but if you put it into Customizability as "do not suggest C/C++ for software projects unless it's the only way, use Rust or Go instead" it will likely start down the path of suggesting and researching Rust from the very beginning without explaining to you your own instructions.

Basically, what I'm trying to say is that Customizability holds instructions (mine say "be concise, do not be afraid to correct the user or use occasional dry humor. Speak frankly and tell the user if they may be making a mistake and suggest other courses of action"), whereas Memories contain simple facts about me, i.e. "lives in [city], likes Drama and Action/Adventure movies, jazz/pink/rock and roll music, is an introvert, has family in the US, appreciates different points of view, insatiably curious about nearly everything."

Note how I haven't told it what to do in the Memory section (I see it as just additional context it can access if necessary), but I have in Customizability, because I see that as more of an @AGENTS.MD extension. While I don't care if it inserts the fact that I'm an introvert into every system prompt, I do care that it inserts the instructions in Customizability into its system prompt.

Basically, if you want it to yell at you for being an idiot instead of telling you that you are a beautiful snowflake, just tell it to do that in Customizability. If you want it to keep in mind that you live in Kansas and have a large extended family nearby, put that into Memories.

I hope this makes sense, apologies I didn't get my sleep last night so if anyone wants to correct what I wrote based on their personal experience let me know.

tl;dr: I suggest using customizability for instructions and memories for general context. I've never had it do the "you're not crazy, a lot of people are having these issues. Let's work through them together.." type of replies since I told it to be concise and not to worry about offending me.


Do you mean "Personalization -> Custom Instructions"? I don't see Settings -> Customizability as a path

(I assume so and you were just going by memory, but there are so many paths to get to similar places that I wanted to check :)


https://www.elkandelk.com/washington/seattle-car-accident-st...

Since it started in 2015, accidents are down 50%, but deaths up 90%. This analysis leaves a lot to be desired. I didn't see per-capita stats (Seattle had massive growth during a lot of those years), and we don't really enforce traffic laws at all anyway, so IDK what to think without digging in further.


How have average car sizes and weight changed in this period of time?


You're asking the wrong question. The answer is 10%

The interesting question is power-to-weight, which was (apparently) a direct result of EPA regulations that were enacted in 1975. The below article, which I found from a search engine copying your question and looking at a few results, is an interesting read.

Ignoring all that, the actual question would be: how have car sizes and weights changed _in this region_ during this period of time. Sizes and weights of cars in Brazil have little bearing on accidents in the PNW, for example.

https://carbuzz.com/new-vehicles-bigger-heavier-more-powerfu...


> Ignoring all that, the actual question would be: how have car sizes and weights changed _in this region_ during this period of time. Sizes and weights of cars in brasil have little bearing to accidents in the PNW, for example.

Sorry that I wasn't clear, that's exactly what I meant. I'm curious because it makes absolutely no sense that a safer urban design with separation of grades for cyclists, lowering speeds through design and engineering rather than just updating speed limit signs, would see an increase in deaths. Nowhere else in the world where those were implemented has had that effect, the Netherlands being the prime benchmark for it.

So there's something else at play. Average car sizes in the USA are much larger than in Europe (and most of the rest of the world), and the urban road design has not changed that much: perhaps stroads just got new speed limits and were left at that, instead of being narrowed and given trees and other obstacles that naturally make driving slower and more cautious, and so on.

There's also the added issue that American driving standards for a licence are incredibly low since it's kinda required for you to have a driver's licence to exist and have a life in the majority of the country.


> There's also the added issue that American driving standards for a licence are incredibly low since it's kinda required for you to have a driver's licence to exist and have a life in the majority of the country.

Relative to what?


Relative to developed countries like Germany, Norway, Sweden, Finland, Denmark, the Netherlands, so on and so forth.


First, each state has their own drivers test, lumping “the US driving test” into a single unit displays a clear lack of knowledge on the subject matter.

Second, I actually tried to verify whether you were right, and you're not. Germany, for example, has driving tests similar to those of the state of Maryland in the US.

You are, unfortunately, incorrect and ignoring research/critical thinking skills.


> You are, unfortunately, incorrect and ignoring research/critical thinking skills.

Critical thinking skills? Gimme a break, lol, driving standards, on average, in the USA are abysmal in comparison to these countries. Traffic data don't lie, no matter if tiny Maryland has better standards, it doesn't improve the average.

Also, how exactly have you verified this, I'd like to see the sources, thank you.


this is a great thread

it’s hard to isolate the effect size of policy, covid happened, car weights changed, policing may have decreased, US drivers may have driven differently, population size, etc.


The numbers seem a bit alarmist on the fatality front, seems like it would make more sense to account for fatalities as a proportion of accidents overall. In absolute numbers, we're talking tens of deaths and thousands of accidents.
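To make the "fatalities as a proportion of accidents" framing concrete, here is a rough sketch that uses only the two percentages quoted upthread (accidents down 50%, deaths up 90%). The baseline counts are not given, so it works purely in ratios relative to the starting year:

```python
# Relative changes quoted upthread; absolute counts are unknown, so work in ratios.
accident_change = -0.50   # accidents down 50%
death_change = 0.90       # deaths up 90%

accident_ratio = 1 + accident_change   # accidents relative to baseline
death_ratio = 1 + death_change         # deaths relative to baseline

# Deaths per accident, relative to the baseline year.
deaths_per_accident_ratio = death_ratio / accident_ratio

print(f"deaths per accident: {deaths_per_accident_ratio:.1f}x baseline")
```

With tens of deaths set against thousands of accidents, that ratio rests on a small numerator, so year-to-year noise could move it a lot.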

As a visitor (periodically throughout the whole timespan) it's seemed to me like there's massive growth in population in the metro area and more densification inside the Seattle downtown area. Tough to tell what geography this exactly captures. Assuming the numbers are valid, I do wonder if there's a significant demographic or exurb shift, where older drivers became a higher proportion of all drivers where they already lived, and a bunch of others either stopped entirely or moved outside the city boundaries.

If memory serves, I feel like there's also a tendency to accidentally end up committed to a toll bridge crossing by getting stuck on an exit/on ramp off one of the highways, which might make people panic and bail at the last second erratically, but that idea seems a bit farfetched


this reference does talk about those stats, but doesn't link in any way to adverse effects of attempting to bring down deaths.


I live in Seattle and anecdotally I have seen the number of people running red lights absolutely explode in the last two years. Literally from seeing once or twice pre-COVID to at least one a day. This is not an exaggeration, there's a particular light on my commute that I see at least one driver run per day. My theory is that in an effort to make the intersection safer they adjusted the lights so now there's a period where cars all have a red light while pedestrians are crossing. Meanwhile a certain segment of the population sees all cars in the intersection stopped and decides to slam it. It's a recipe for disaster given there's a middle school down the road from that light...


Traffic behavior, in general in the PNW, has gotten way worse. When I say worse I mean selfish. I think since COVID people are just more selfish.

I don't just mean assholes who do what they want. People just don't give a crap about others on the road at all anymore. A lot of folks who probably think they're driving "safe" are just driving selfishly slow and not following the law (super late blinkers, failing to move predictably with traffic, braking in traffic long before entering a turn lane).

It's definitely worse nowadays. I can think of plenty of reasons why. But really I think our society, generally, has started to reward selfish behavior. Or at least not punish it nearly enough to deter its spread.


This is the way. It's maddening that we use the term "speed limit" for what is better understood as a "speed request".


Same answers you'd use beyond "we don't want to pay an engineer". 100x shorter iteration speed, and the associated workflow (stream of microrevisions and spaghetti throwing), top quartile outputs in many langs/styles/contexts without having to source, hire, and maintain a fleet of separate specialists who can quit when they feel like it.

I'm torn on the scale thing. It definitely seems net negative. But I think we collectively underestimate just how deeply sick the existing thing already is. We're repulsed by image gen at scale because it breaks our expectation that images are at least somewhat based on reality, that they reflect the natural world or what we can really expect from a product, from a company, from the future. But that was already a bad expectation: when's the last time you saw a McDonald's meal that looked like the advert? Or a sub-$30 amazon product that wasn't a complete piece of shit? Advertisements were already actively malicious fantasies to exploit the way our brains react to pictures. They're just fantasies that required whole teams of humans doing weird bullshit with lighting and photoshop, and I'm not sure that's much better. It was already slop. All the grieving we do about the loss of truth, or the extent to which corps will gleefully spray us with mind-breaking waterfalls of outright lies, I think those ships sailed a long time ago. The disgust, deceit, the rage we feel about genAI slop is the way we should have felt about all commercials since at least the 80s IMO.


> Advertisements were already actively malicious fantasies to exploit the way our brains react to pictures. They're just fantasies that required whole teams of humans doing weird bullshit with lighting and photoshop, and I'm not sure that's much better.

This is a good point. My gut reaction is “well at least someone was paid to do it and can continue to keep society/the economy going”.

I can see the other side where that’s a soulless job. Not sure what’s worse. Soulless job where your skills apply or even fewer jobs in a competitive industry.


Same way it does with nukes. It's Mutually Assured Destruction. If there's a credible promise that attack will result in a total boardwipe, there's strong incentive not to attack, because then China's fucked too. It's crude but it mostly works.

What's interesting is that I don't hear much about China spinning up chip fabs. I haven't gone looking, and I imagine they're doing it, the way we are with the CHIPS Act etc. If China could get within a few notches of SOTA (in both nm and throughput), their attack position would be much stronger, but it'd still be a generationally brutal experience for most of humanity.


The first third of this opens with so many delightful, quotable pieces of writing.

And the last third sets up something interesting.

And then it just stops. It lays groundwork for an interesting idea and then immediately abandons it.

Choose your fictions well...

Okay, but how?


An alternative client for Bambu 3D printers that plays nicely with network sandboxing and multiple printers. It's great.

Bambu's printers are functionally best-in-class, but intrusive and proprietary in their approach to software. Their first-time setup "requires" linking to a cloud account or using a bambu app via QR code, and they've been known to disable functionality in updates, making a device-managed "LAN-only" mode unsafe to trust. Their apps also just suck. Camera feed is janky and LAN-only sync often requires knowing an access code, serial, IP, and then it fails most of the time anyway, silently, without saving values to retry. And that's before you start doing things like a custom VLAN/SSID to properly wall them off, at which point you can ping them from terminal but the apps break completely.

Anyway, turns out that at least on A1 and P1S, there's enough functionality available through traditional means to skip the apps entirely. The handshake works fine across VLANs and utils like print status, file upload, and auto-start are available. Even the camera is reliable when pulled as a series of still images.

I had opus vibe out a replacement front end that gives me a simple upload and monitor UI for my A1, and it just kept hitting stretch goals. I added support for multiple printers so you can see them stacked on a single page and manage all of them from one place. And it even works on just-unboxed models that have never been through the official setup. SSID info on the SD card, it joins the network, immediately accessible via IP. Zero association/contact with any cloud or app, fully sandboxed/offline. Wrapped in a lil python launcher so I can run it from the dock instead of in the browser (just my preference).
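For illustration only, and not the actual app: a minimal sketch of how a multi-printer list like the one described could be modeled, assuming each printer is identified by the IP, serial, and access code mentioned above. The `Printer` class, `dashboard_rows`, the field names, and the sample values are all hypothetical, and nothing here talks to a real printer or reflects Bambu's protocol.

```python
from dataclasses import dataclass


@dataclass
class Printer:
    """Hypothetical record for one LAN printer; field names are illustrative."""
    name: str
    ip: str
    serial: str
    access_code: str
    status: str = "unknown"  # would be filled in by whatever polls the printer


def dashboard_rows(printers):
    """Return one summary line per printer, sorted by name, for a stacked view."""
    ordered = sorted(printers, key=lambda p: p.name)
    return [f"{p.name} ({p.ip}): {p.status}" for p in ordered]


# Placeholder values, not real devices.
printers = [
    Printer("p1s-garage", "10.0.20.12", "SERIAL-B", "00000000", status="printing"),
    Printer("a1-office", "10.0.20.11", "SERIAL-A", "00000000"),
]

for row in dashboard_rows(printers):
    print(row)
```

Keeping the per-printer connection details in one record is what lets a single page loop over every printer instead of re-entering values per device.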

Will probably open source it soon.

IMO this kind of thing is the answer to "what do you have to show for your LLM use". Cost was about $65 because I was using opus 4.6 with no regard for efficiency, and because there were multiple total refactors of two apps. An annoying problem I deal with almost every day now has a permanent, personalized solution that took me ~3 hours and would never have otherwise happened.

The network itself is also such a project. I previously cobbled together a working unifi setup, but it was primitive and brittle. With LLM guidance, I was able to build something much more robust. TrueNAS scale for file backup that also runs Frigate for POE cam mgmt (similarly sandboxed), raspi running the unifi controller, another for homeassistant, etc. Absolutely miserable few days getting that dialed, but now that we're out the other side, it's very nice. Reminds me of building the house. You suffer more upfront in exchange for something that fits you like a glove. Very rewarding.


Would love to see this. Though I wonder if Bambu will try and shut it down


I'm probably in your target audience.

Capture: notion and twitter have been best, obsidian and regular markdown have been worst.

Notion is good because of how they support a calendar view where you can put documents in a day's cell, and then see a list view that's just a stack of those notes. I keep a daily diary or youarehere type doc, where I'll have checklists and notes on small things that don't merit changes to a dedicated page. There's arguably a "retrieval" breakdown in that I don't really go back through these to update them or collate them into bigger pages.

Twitter is good because it's low friction and I can just go off, which is fun, and because they have decent search, so I can quote-tweet a related thing and sort of thread the graph together. If you're talking about BASB you're probably familiar with this corner of twitter. visakanv etc. This method works well if you use it enough to be able to recall your other notes. I think there's something special about the twitter format here too: it discourages whole-page thoughts in favor of sequential pithy bits, which i think are easier to both link and recall.

Execution: I would like a chat frontend (signal/SMS/etc) where I can just talk to my projects, ask the status of things, get suggestions, etc. Push based, rather than pull based, execution.

Active project context: I've dropped todoist-like things since they're limited in what they can express, and notion/markdown can do todolists etc. I tend to have lists in markdown style that live in two places: my daily diary/todo docs, and the actual projects themselves. This is messy and it would be lovely if notion or similar had the concept of a "todo block" and could collate all of them into a single view where I could understand association, prune and dedupe, etc. Even better if there's an agent that does or suggests cleanup whenever a new block enters.
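A minimal sketch of the "collate every todo block into a single view" idea, assuming the todos are standard markdown checklist items (`- [ ]` / `- [x]`) spread across several documents. The function name, the normalization rule, and the sample file names are hypothetical, and this is not a feature of Notion or any other tool.

```python
import re

# Matches markdown checklist items such as "- [ ] buy filament" or "* [x] done".
TODO_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*\S)\s*$")


def collate_todos(docs):
    """docs maps a doc name to its markdown text.

    Returns open todos, deduped by case- and whitespace-normalized text,
    each mapped to the sorted list of docs that contain it.
    """
    collated = {}
    for name, text in docs.items():
        for line in text.splitlines():
            match = TODO_RE.match(line)
            if not match or match.group(1) != " ":
                continue  # skip non-todo lines and completed items
            key = " ".join(match.group(2).lower().split())
            collated.setdefault(key, set()).add(name)
    return {todo: sorted(names) for todo, names in collated.items()}


docs = {
    "diary/2025-01-06.md": "- [ ] Order filament\n- [x] Email landlord\nsome note",
    "projects/printer-ui.md": "* [ ] order  filament\n- [ ] Write README",
}

for todo, sources in collate_todos(docs).items():
    print(todo, "<-", sources)
```

Deduping on normalized text is the crudest possible rule; an agent-assisted version would presumably match near-duplicates rather than exact strings.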

Larger projects will get docs of their own, lots of sprawl and notes etc, and then some formalization around a spec or something. I move these to an archive folder when I'm done with the notes and the final document is fleshed out, but I'd love an agent review that makes sure I'm not leaving things on the cutting board, and that I've handled all the todos etc in my notes pages.

I don't use bidirectional linking/tagging enough, but I really should, since I want to be able to coin keywords for particular concepts inline, and then be able to access their overview and see everything that mentions them in a graphlike way.
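A minimal sketch of the backlink side of this, assuming keywords are coined inline with wiki-style `[[keyword]]` links. That syntax is a common convention, not something specified here, and the function and sample note names are hypothetical.

```python
import re
from collections import defaultdict

# Assumes wiki-style [[keyword]] links.
LINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def build_backlinks(notes):
    """notes maps a note name to its text.

    Returns {keyword: sorted list of note names that mention it},
    with keywords lowercased so casing differences land on the same entry.
    """
    index = defaultdict(set)
    for name, text in notes.items():
        for keyword in LINK_RE.findall(text):
            index[keyword.strip().lower()].add(name)
    return {keyword: sorted(names) for keyword, names in index.items()}


notes = {
    "daily/mon.md": "Talked about [[push vs pull]] with the team.",
    "projects/second-brain.md": "Core idea: [[Push vs Pull]] and [[insight quality]].",
}

backlinks = build_backlinks(notes)
print(backlinks["push vs pull"])
```

The same index, read keyword by keyword, is enough to render a "everything that mentions X" overview; a graph view is the same data drawn with edges.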

Calendar is definitely a much used component day to day. For planning, etc. But it's not a source of truth. Everything on a calendar should just be a proxy/link to a more robust doc.

Hard nos: My take on privacy policies for things like this is "show me your incentives and I'll show you your outcomes". That is to say, any company that can survive an attempt to profit from data fuckery will do so. Your data retention policy should include technically unambiguous red lines that are not to be crossed, and define specific per-user monetary payout in the event that a breach occurs, to include clauses that cause user payout to occur before eg preferred stockholders get liquidation preference and drain the possible payout pool. Routine third party audits of how user data is handled/retained/distributed etc. I recognize that this is a bit unhinged, but that's what signaling credibility looks like. A company says "we won't sell your data" and I say "or what" and there's hemming and hawing because nothing will happen to them. If the answer is "this company dies on the spot and our investors get completely fucked", now we can talk.

I think AI service pricing applies here: generally, if it seems neat I could be in for $20 easy, and if it's genuinely game changing, $200/mo is completely reasonable to ask.

re Migration cost: I expect to be able to get 100% of my data in a reasonable non-proprietary format. If that's some blend of markdown, json, sqlite, whatever, fine.

But the bottom line for me, where does my second brain break down the most? It doesn't talk back to me. I want it to understand what I've got going on, and my idiosyncrasies. I want to present it with new information and have it be like "oh, this relates to X" or, periodically, to pop up with something like "I'm noticing this correlation / related idea in areas X, Y, Z... does that resonate? Is there something here?" Again, push vs pull. My second brain should be a proactive chatbot. "Noise" is so often thought about in terms of frequency, but it's really about insight quality. If my response to 80% of push notis is "damn, good call" then you can send one every 5 minutes.

I also hear no mention of one's personal life. I don't really make the distinction. It's all in there. I should be able to bitch to this chatbot about my manager, have it know about that background, and riff with me to navigate hard convos. I should be able to talk to it about side projects I have going on, and let it thread those into my calendar. Etc. Notion is already an adequate second brain for work. Nobody has yet built an adequate second brain for the home. My house, my relationship(s), my side projects, my own diarying and self reflection... these are the contents of my brain that matter.

Email in bio if you want to talk. I'm a design technologist and happy to riff / give feedback.


This is truly a golden line—thank you so much!

This “push and pull” framework and the perspective that “noise equals insight quality” are precisely the core constraints I wanted to center my design around.

Two follow-up questions:

1. If the connector adopts a chat-first mode (similar to Signal/SMS), could this generate excessive noise? Since human input often carries emotion and subjective bias, my original intent was for the AI to serve as an emotionless, relatively neutral bridge.

2. Regarding trust mechanisms: Before implementing stricter governance measures (auditing/penalties), should we establish foundational safeguards through local-first storage + explicit export (md/json/sqlite)?

If you're open to deeper discussion and would love to explore this further, I put additional information and an optional feedback form in my HN profile.

