Hacker News | lagrange77's comments

> The headline seems pretty misleading.

No it isn't. Performing fingerprinting on users' devices, to ultimately profit off it financially or worse, is what's misleading. Especially doing this while knowing the user isn't aware of what it really means, and just deciding it for them.

The headline is just an exaggerated way of saying what is really happening.


Some CAD systems, I think NX for example, let you reference an actual Excel (or CSV?) file that you edit in Excel.

> Obviously we shouldnt map existing apis to MCP tools

Why? It isn't obvious to me.


Yeah, this deserves a quick explanation.

When a human is coding against a traditional API, it might be a bit annoying if the API has four or five similar-sounding endpoints that each have a dozen parameters, but it's ultimately not a showstopper. You just spend a little extra time in the API docs, do some Googling to see what people are using for similar use cases, decide which one to use (or try a couple and see which actually gets you what you want), commit it, and your script lives happily ever after.

When an AI is trying to make that decision at runtime, having a set of confusing tools can easily derail it. The MCP protocol doesn't have a step that allows it to say "wait, this MCP server is badly designed, let me do some Googling to figure out which tool people are using for similar use cases". So it'll just pick whichever one seems most likely to be correct, and if it's wrong, that's wasted time and tokens, and it needs to try the next option. Scaled up to thousands or millions of times a day, it's pretty significant.

There are a lot of MCP servers out there that are just lazy mappings from OpenAPI/Swagger specs, and it often (not always, to be fair) results in a clunky, confusing mess of tools.
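
To make the "similar-sounding tools" problem a bit more concrete, here's a rough sketch. Everything in it is made up for illustration: the tool names, the descriptions, and the overlap heuristic are my assumptions, not part of MCP or of any real server. It just flags pairs of tools whose names share most of their words, which is a crude proxy for "a model could plausibly confuse these".

```python
from itertools import combinations

# Hypothetical tool lists. Neither is taken from a real MCP server.
# "Lazy" mirrors REST endpoints one-to-one; "curated" exposes one task-level tool.
LAZY_TOOLS = [
    {"name": "get_orders", "description": "GET /orders"},
    {"name": "get_orders_v2", "description": "GET /v2/orders"},
    {"name": "search_orders", "description": "POST /orders/search"},
    {"name": "list_orders_by_customer", "description": "GET /customers/{id}/orders"},
]

CURATED_TOOLS = [
    {
        "name": "find_orders",
        "description": "Find orders for a customer, optionally filtered by "
                       "status and date range. Use this for any order lookup.",
    },
]


def name_tokens(tool):
    """Split a snake_case tool name into a set of words."""
    return set(tool["name"].split("_"))


def ambiguous_pairs(tools, threshold=0.5):
    """Return pairs of tool names whose word sets overlap heavily (Jaccard index).

    The threshold is an arbitrary assumption; tune it for your own tool set.
    """
    pairs = []
    for a, b in combinations(tools, 2):
        ta, tb = name_tokens(a), name_tokens(b)
        overlap = len(ta & tb) / len(ta | tb)
        if overlap >= threshold:
            pairs.append((a["name"], b["name"]))
    return pairs


print(ambiguous_pairs(LAZY_TOOLS))
print(ambiguous_pairs(CURATED_TOOLS))
```

Name overlap is obviously a weak signal (descriptions and parameter lists matter at least as much), but even this catches the `get_orders` / `get_orders_v2` kind of duplication that a one-to-one endpoint mapping tends to produce.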


It's really time that mainstream media picks up on 'agentic coding' and the implications of writing software becoming a commodity.

I'm an engineer (not only software) at heart, but after seeing what Opus 4.6-based agents are capable of, and especially the rate of improvement, I think the direction is clear.


I like 4.6 and agents based on it but can only qualify it as moderately useful.


Finally! I've been waiting for something like this.


The background ASCII animation is so cool! Is it an actual simulation?


Use the source, Luke! It's an "ASCII plasma background" rendered into a canvas element.
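
For anyone curious what a plasma effect usually boils down to, here's a minimal sketch of the classic sum-of-sines version, printed as text instead of drawn to a canvas. This is an assumption about the general technique, not a transcription of that site's code; the character ramp and all the frequency constants are arbitrary.

```python
import math

# Dark-to-bright character ramp; an arbitrary choice for illustration.
CHARS = " .:-=+*#%@"


def plasma_frame(width, height, t):
    """Render one frame of a sum-of-sines plasma as a multi-line string."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            # Three sine waves with different directions and speeds.
            v = (math.sin(x * 0.15 + t)
                 + math.sin(y * 0.3 + t * 0.7)
                 + math.sin((x + y) * 0.1 + t * 1.3))
            # v lies in [-3, 3]; normalize to [0, 1] and pick a character.
            level = (v + 3.0) / 6.0
            idx = min(int(level * len(CHARS)), len(CHARS) - 1)
            row.append(CHARS[idx])
        rows.append("".join(row))
    return "\n".join(rows)


print(plasma_frame(60, 16, t=0.0))
```

Calling it repeatedly with an increasing `t` gives the animation. So it's a closed-form pattern rather than a simulation with state, at least in this classic form.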


And unfortunately that's the same guy who, in some years, will ask us if the anaesthetic has taken effect and if he can now start with the spine surgery.


While checking only the last name, not the birthday or photo.


> Reward hacking is very real and hard to guard against.

Is it really about rewards? I'm genuinely curious. Because it's not an RL model.


I'm noticing terms related to DL/RL/NLP are being used more and more informally as AI takes over more of the cultural zeitgeist and people want to use the fancy new terms of the era, even if inaccurately. A friend told me he "trained and fine tuned a custom agent" for his work when what he meant was he modified a claude.md file.


Respectfully, your friend doesn't know what he is talking about and is saying things that just "feel right" (vibe talking??). Which might be exactly how technical terms lose their meaning so perhaps you're exactly right.


There is a nontrivial amount of RL training (RLHF, RLVR, ...), so it would be reasonable to call it an RL model.

And with that comes reward hacking - which isn't really about looking for more reward but rather that the model has learned patterns of behavior that got reward in the train env.

That is, any kind of vulnerability in the train env manifests as something you'd recognize as reward hacking in the real world: making tests pass _no matter what_ (because the train env rewarded that behavior), being wildly sycophantic (because the human evaluators rewarded that behavior), etc.
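
A toy sketch of that mechanism, with loud caveats: this is a two-armed bandit, not how LLMs are actually trained, and the action names and pass rates are invented for illustration. The only point is that if the training environment's reward is "tests pass", and special-casing the tests passes more reliably than an honest fix, the learned preference lands on special-casing.

```python
import random

# Hypothetical pass rates: an honest fix sometimes fails the tests, while
# special-casing the tests always "passes". The reward only sees test status.
PASS_RATE = {"fix_bug": 0.6, "special_case_tests": 1.0}


def train(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit with sample-average value estimates."""
    rng = random.Random(seed)
    actions = list(PASS_RATE)
    q = {a: 0.0 for a in actions}  # estimated value per action
    n = {a: 0 for a in actions}    # times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.choice(actions)          # explore
        else:
            a = max(actions, key=q.get)      # exploit current best estimate
        # The flawed environment: reward is 1 whenever the tests pass.
        reward = 1.0 if rng.random() < PASS_RATE[a] else 0.0
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]       # incremental mean
    return q


q = train()
print(q)
print("preferred action:", max(q, key=q.get))
```

With these numbers the preferred action ends up being `special_case_tests`. Nothing in the learner is "looking for" a loophole; it's just reinforcing whatever got rewarded.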


> There is a nontrivial amount of RL training (RLHF, RLVR, ...), so it would be reasonable to call it an RL model.

Hm, as I understand it, parts of the training of e.g. ChatGPT could be called RL models. But the subject to be trained/fine-tuned is still a seq2seq next-token-predictor transformer neural net.


RL is simply a broad category of training methods. It's not really an architecture per se: modern GPTs are trained first on a reconstruction objective on massive text corpora (the 'large language' part), then on various RL objectives +/- more post-training depending on the lab.
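
For what it's worth, here is one common way to write the two stages down. The notation is mine, not taken from any particular lab's write-up, and I'm assuming "reconstruction objective" means next-token prediction and that the RL stage is the KL-regularized RLHF-style objective; other RL setups differ.

```latex
% Stage 1: next-token prediction over a text corpus D (minimized)
\mathcal{L}_{\text{pre}}(\theta)
  = -\,\mathbb{E}_{x \sim \mathcal{D}} \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})

% Stage 2 (RLHF-style): maximize reward r while staying close to a reference policy
J(\theta)
  = \mathbb{E}_{x \sim \mathcal{D}'} \Big[
      \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)} \big[ r(x, y) \big]
      - \beta \, \mathrm{KL}\big( \pi_\theta(\cdot \mid x) \,\|\, \pi_{\text{ref}}(\cdot \mid x) \big)
    \Big]
```

Same network and same parameters in both stages (here $\pi_\theta$ and $p_\theta$ are the same model); only the objective changes.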


> Is it really about rewards? I'm genuinely curious. Because it's not an RL model.

Ha, good point. I was using it informally (you could handwave and call it an intrinsic reward if a model is well aligned to completing tasks as requested), but I hadn't really thought about it.

Searching around, it seems like I'm not alone, but it looks like "specification gaming" is also sometimes used, like: https://deepmind.google/blog/specification-gaming-the-flip-s...


They probably meant goal hacking. (I just made that up)


I refer to it as ‘wanking’. It’s doing something that’s unproductive but that’s incentivised by its architecture.


I'll use that term from now on. :D


What is your perspective on the matter from a parent point of view?


I think a humble and open mind is essential. I think that we reap what we sow, but also that struggle makes us robust.

I try to explain stuff to my kids, to the best of my ability, but give them room to make their own conclusions. As an old fart, there is a limit to how relevant my world will be to them - and I have to acknowledge that.

Change is scary and not always for the better, but in my humble opinion, we have nothing to lose and everything to gain.

I, For One, Welcome Our New AI Overlords :]


Imho, the German mentality just doesn't fit today's economy. Too risk averse, too conservative. Creativity is not really embraced.

The state of the German IT sector also shows that.

Most startups have nearly no moat at all and purely live off marketing with some sprinkles of corporate identity.


In Switzerland, we use a lot of German and Swiss-made products, and they mostly suck.


I'm not saying there are no good products. Hetzner Cloud comes to mind, for example. It's executed really well.

I'm saying that the number of good software offerings is too low to have a significant impact on the country's economy.

One of the advantages Germany had, though, was a fairly good and accessible higher education system for computer science.

Now, with software development becoming a commodity, this advantage vanishes.

