Hacker News | maxrmk's comments

I don't think this is a correct explanation of how things work these days. RL has really changed things.

Models based on RL are still just remixers as defined above, but their distribution can cover things unknown to humans, because those things are present in the synthetic training data even though they're absent from the corpus of human awareness. AlphaGo's move 37 is an example. It appears creative and new to outside observers, and it is creative and new, but not because the model figured out something new on the spot: similar new things appeared in the synthetic training data used to train the model, and the model is summoning those patterns at inference time.

> the model is summoning those patterns at inference time.

You can make that claim about anything: "The human isn't being creative when they write a novel, they're just summoning patterns at typing time".

AlphaGo taught itself that move, then recalled it later. That's the bar for human creativity and you're holding AlphaGo to a higher standard without realizing it.


I can't really make that claim about human cognition, because I don't have enough understanding of how human cognition works. But even if I could, why is that relevant? It's still helpful, from both a pedagogical and scientific perspective, to specify precisely why there is seeming novelty in AI outputs. If we understand why, then we can maximize the amount of novelty that AI can produce.

AlphaGo didn't teach itself that move. The verifier taught AlphaGo that move. AlphaGo then recalled the same features during inference when faced with similar inputs.


>AlphaGo didn't teach itself that move. The verifier taught AlphaGo that move.

No. AlphaGo developed a heuristic by playing itself repeatedly; the heuristic then noticed the quality of that move in the moment.

Heuristics are the core of intelligence in terms of discovering novelty, and this is accessible to LLMs in principle.


> The verifier taught AlphaGo that move

Ok so it sounds like you want to give the rules of Go credit for that move, lol.


It feels like you're purposefully ignoring the logical points OP gives and you just really really want to anthropomorphize AlphaGo and make us appreciate how smart it (should I say he/she?) is ... while no one is even criticising the model's capabilities; we're just analyzing them.

Can you back that up with some logic for me?

I don't really play Go but I play chess, and it seems to me that most of what humans consider creativity in GM level play comes not in prep (studying opening lines/training) but in novel lines in real games (at inference time?). But that creativity absolutely comes from recalling patterns, which is exactly what OP criticizes as not creative(?!)

I guess I'm just having trouble finding a way to move the goalpost away from artificial creativity that doesn't also move it away from human creativity?


How a model is trained is different from how a model is constructed. A model’s construction defines its fundamental limitations, e.g. a linear regressor will never be able to provide meaningful inference on exponential data. Depending on how you train it, though, you can get such a model to provide acceptable results in some scenarios.

Mixing the two (training and construction) is rhetorically convenient (anthropomorphization), but holds us back in critically assessing a model’s capabilities.
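
To make the linear-regressor point concrete, here's a minimal sketch (Python with numpy; the data is synthetic and purely for illustration):

  import numpy as np

  # Exponential ground truth the model will try to fit
  x = np.linspace(0, 5, 50)
  y = np.exp(x)

  # The best linear fit least squares can find
  slope, intercept = np.polyfit(x, y, 1)
  pred = slope * x + intercept

  # No training regime fixes this: the line overshoots in the
  # middle, goes negative near x = 0, and badly underestimates
  # the tail.
  print(np.abs(y - pred).max())  # large irreducible error

The failure is baked into the model's construction; the only remedy is changing the model class (e.g. fitting log y instead), not training harder.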


Linear regression has well characterized mathematical properties. But we don't know the computational limits of stacked transformers. And so declaring what LLMs can't do is wildly premature.

> And so declaring what LLMs can't do is wildly premature.

The opposite is true as well. Emergent complexity isn’t limitless. Just like early physicists tried to explain the emergent complexity of the universe through experimentation and theory, so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Specifically not pseudoscience, though.


>so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Physicists had the real world to verify theories and explanations against.

So far anyone 'explaining the emergent complexity of LLMs through experimentation and theory' is essentially just making stuff up nobody can verify.


Well that’s why I provided the caveat “specifically not pseudoscience”, which is, as you described, “just making stuff up nobody can verify”.

If you say "specifically not pseudoscience" and then make up pseudoscience anyway, then what's the point? The field has not advanced anywhere near enough in understanding for convoluted explanations about how LLMs can never do X to be anything but pseudoscience.

Sure, that's true as well. But I don't see this as a substantive response given that the only people making unsupported claims in this thread are those trying to deflate LLM capabilities.

So, to review this thread

  - OP asked for someone to make a logical argument for the separation of “training” from “model”
  - I made the argument
  - You cherry picked an argument against my specific example and made an appeal to emergent complexity
  - I pointed out that emergent complexity isn’t limitless
  - “the only people making unsupported claims in this thread are those trying to deflate LLM capabilities”

You made a pretty nonsensical argument, which pretty much seems like the bog standard for these arguments.

What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here. You don't know shit and just make up whatever. You can see people doing the same thing in GPT-1, 2, 3, 4 threads, all telling us why LLMs will never be able to do things they manage to do later.


> You don’t know shit

lol. Why so emotionally charged? Are you perhaps worried that you’ve invested too much time and effort into a technology that may not deliver what influencers have been promising for years? Like a proverbial bagholder?

> What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here.

We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.

> You can see people doing the same thing in GPT-1, 2, 3, 4 threads, all telling us why LLMs will never be able to do things they manage to do later.

I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).

Now, care to provide a counterargument that shows you know a little more than “shit”?


>We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.

Okay, but the brain is also “just a model” of the world in any meaningful sense, so that framing does not really get you anywhere. Calling something a model does not, by itself, establish a useful limit on what it can or cannot do. Invoking Plato here just sounds like pseudo-profundity rather than an actual argument.

>I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).

If a capability appears once training improves, scale increases, or better inference-time scaffolding is added, then it was not demonstrated to be a 'fundamental impossibility'.

That is the core issue with your argument: you keep presenting provisional limits as permanent ones, and then dressing that up as theory. A lot of people have done that before, and they have repeatedly been wrong.


To be clear, you are confusing me with other commenters in this thread. All I want is for those that liken LLMs to stochastic parrots and other deflationary claims to offer an argument that engages with the actual structure of LLMs and what we know about them. No one seems to be up to that challenge. But then I can't help but wonder where people's confident claims come from. I'm just tired of the half-baked claims and generic handwavy allusions that do nothing but short-circuit the potential for genuine insight.

No. AlphaGo does search, and does so imperfectly. It does come up with creative new patterns not seen before.

How do you know that? We don't have access to the logs to know anything about its training, and it's impossible for it to have trained on every potential position in Go.

Ironic coming from the Guardian. One of their journalists consistently publishes AI slop and the paper is in denial about it.

https://x.com/maxwelltani/status/2023089526445371777?s=46


It doesn't seem AI generated to me. Are we at the point where you have to write in a particularly outrageous style in order to not be accused of using AI?

This is either ChatGPT or the one journalist who influenced all of ChatGPT's writing style.

If you look at the replies[1] to that tweet, many commenters point out his style was entirely different prior to ChatGPT.

[1] https://xcancel.com/maxwelltani/status/2023089526445371777?


I was giving this the benefit of the doubt as well and was just looking at his older writings that have a little "This article is more than 5 years old" banner above it. Looks totally different indeed.

>Are we at the point where you have to write in a particularly outrageous style in order to not be accused of using AI?

I don't think we've gotten to the point where all popular writing styles (e.g. hamburger paragraphs) are considered suspect, but the "it's not just X, it's Y" construction[1] attracts particular scrutiny.

[1] https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing#...


Fair enough. It reads as extremely AI generated to me. But that isn’t completely reliable.

If a company relies on self reported ages, they don't "know" it well enough to satisfy COPPA. Probably. I'm not a lawyer but I do keep up with the latest in privacy enforcement and I think this is the way things are headed.

For the record, I'm against age verification laws. But I think companies are pushing for them because of liabilities they face under other laws, not because they would actually like to have the data.


The bill bans making access to a service contingent on consent. This would kill Gmail, Google Maps, Facebook, Instagram and basically every other ad supported service. Making subscriptions the only consumer business model would be bad imo.

The impacts of the model that BigTech currently follows closely resemble those of product dumping. Effectively banning that model would mean alternative subscription-based platforms would stand a much better chance of succeeding than they currently do.

a) It wouldn't kill them. They would have to change their business model though.

b) Shouldn't our laws prioritize natural-persons over corporate desires?

Companies don't have a right to a specific revenue model. Humans should have a right to their own identity.


My desire is to be able to use those products without paying for them. And to use them with friends and family members that can't pay for them.

> My desire is to be able to use those products without paying for them.

You can't actually do that. None of those companies are charities.

You pay one way or another.


If you desired to inject heroin into your veins, that wouldn't mean we should decriminalize it.

Sure, and while we're at it, how about buses that you don't have to pay for? And food? And housing, why not?

In Europe, more and more public transportation is free, or at least very heavily subsidized.

The costs are covered by local taxes, to curb individual vehicle use and reduce congestion. After some hiccups, some cities manage good economies of scale where everybody, including the environment, wins.

As for housing and food, while the incentive structure there is more fragile, we at least have homeless shelters that are free, and once again, everybody wins: the costs are very low, and cities are far safer and cleaner.


If transportation, housing, and food were provided by private corporations that demanded to invade our privacy as payment, then I would be against that too, actually.

But in this case, they are public services, under the public administration's control.

Successful socialism if you want.


So where does the money come from?

You are acting as if that is crazy talk. It IS possible. And desirable, to many of us.

Just not going to happen. But the post above was discussing wants.


Where does the money come from?

How is paying for a product instead of being the product a bad thing?

That view is overly simplistic. People find real utility in ad-supported tools and apps.

But the question is whether it's a net negative for society. People found real utility in leaded gasoline too, but we rightfully had to ban that.

I'm OK with that. For too long the parasites have hidden behind "advertising" as a way to collect data.

Say it loud so the kids in the back can hear:

- IF IT IS FREE YOU ARE THE PRODUCT -


Nonsense.

You could have a mail client with a static banner ad at the top.


Those exist! People choose to use gmail because of the scale, stability, and feature set paid for by targeted advertising.

How often are mongo instances exposed to the internet? I'm more of an SQL person and for those I know it's pretty uncommon, but does happen.


From my experience, MongoDB's entire raison d'être is "laziness".

* Don't worry about a schema.

* Don't worry about persistence or durability.

* Don't worry about reads or writes.

* Don't worry about connectivity.

This is basically the entire philosophy, so it's not surprising at all that users would also not worry about basic security.


Although interestingly, for all the mongo deployments I managed, the first time I saw a cluster publicly exposed without SSL was postgres :)


To the extent that any of this was ever true, it hasn’t been true for at least a decade. After the WiredTiger acquisition they really got their engineering shit together. You can argue it was several years too late but it did happen.


I got heavily burned pre-wiredtiger and swore to never use it again. Started a new job which uses it and it’s been… Painless, stable and fast with excellent support and good libraries. They did turn it around for sure.


Not only that, but authentication is much harder than it needs to be to set up (and is off by default).


I'm sure there are publicly exposed MySQLs too


There are many more exposed MySQLs than MongoDBs:

https://www.shodan.io/search?query=mongodb
https://www.shodan.io/search?query=mysql
https://www.shodan.io/search?query=postgresql

But this should be weighed against each database's overall popularity.


Most of your points are wrong. Maybe only the first one is valid-ish.


Ultimate webscale!


A highly cited reason for using mongo is that people would rather not figure out a schema. (N=3/3 for “serious” orgs I know using mongo).

That sort of inclination to put off doing the right thing because it's a headache right now probably overlaps with “let’s just make the db publicly exposed” instead of doing the work of setting up an internal network.


> A highly cited reason for using mongo is that people would rather not figure out a schema.

Which is such a cop-out, because there is always a schema. The only questions are whether it is designed and documented, and where it's implemented. Mongo requires some very explicit schema decisions, otherwise performance will quickly degrade.
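
For example (a hedged sketch using pymongo; the collection and field names are invented):

  from pymongo import ASCENDING, MongoClient

  db = MongoClient("mongodb://localhost:27017")["shop"]

  # "Schemaless" or not, you still decide which fields exist and
  # which ones queries filter on -- and index them, or every
  # query degrades into a full collection scan.
  db.orders.create_index([("customer_id", ASCENDING),
                          ("created_at", ASCENDING)])

That index is a schema decision in everything but name.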


Fowler describes it as Implicit vs Explicit schema, which feels right.

Kleppmann chooses "schema-on-read" vs "schema-on-write" for the same concept, which I find harder to grasp mentally, but it describes when schema validation needs to occur.
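
A rough illustration of the two (a sketch; the validator uses MongoDB's $jsonSchema, everything else is made-up application code):

  from pymongo import MongoClient

  db = MongoClient("mongodb://localhost:27017")["example"]

  # Schema-on-write: the database validates at insert time
  db.create_collection("users", validator={
      "$jsonSchema": {
          "required": ["email"],
          "properties": {"email": {"bsonType": "string"}},
      }
  })
  db.users.insert_one({"email": "a@b.com"})  # ok; {"email": 42} would be rejected

  # Schema-on-read: anything goes in; every reader checks shape itself
  doc = db.scratch.find_one() or {}
  email = doc.get("email")  # may be absent or the wrong type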


I would have hoped that there would be no important data in mongoDB.

But now we can at least rest assured that the important data in mongoDB is just very hard to read, given the lack of schemas.

Probably all of that nasty "schema" work and tech debt will finally be done by hackers trying to make use of that information.


There is a surprising amount of important data in various Mongo instances around the world. Particularly within high finance, with multi-TB setups sprouting up here and there.

I suspect that this is in part due to historical inertia and exposure to SecDB designs.[0] Financial instruments can be hideously complex and they are certainly ever-evolving, so I can imagine a fixed schema for an essentially constantly shifting time-series universe would be challenging. When financial institutions began to adopt the SecDB model, MongoDB was available as a high-volume, "schemaless" KV store with a reasonably good scaling story.

Combine that with the relatively incestuous nature of finance (they tend to poach and hire from within their own ranks) and the average tenure of an engineer in one organisation being less than 4 years, and you have an osmotic process of spreading "this at least works in this type of environment" knowledge. Add the naturally risk-averse nature of finance[ß] and you can see how one successful early adoption will quickly proliferate across the industry.

0: This was discussed at HN back in the day too: https://calpaterson.com/bank-python.html

ß: For an industry that loves to take financial risks - with other people's money of course, they're not stupid - the players in high finance are remarkably risk-averse when it comes to technology choices. Experimentation with something new and unknown carries a potentially unbounded downside with limited, slowly emerging upside.


I'd argue that there's a schema; it's just defined dynamically by the queries themselves. Given how much of the industry seems fine with dynamic typing in languages, it's always been weird to me how diehard people seem to be about this with databases. There have been plenty of legitimate reasons to be skeptical of mongodb over the years (especially in the early days), but this one really isn't any more of a big deal than using Python or JavaScript.


Yes there's a schema, but it's hard to maintain. You end up with 200 separate code locations rechecking that the data is in the expected shape. I've had to fix too many such messes at work after a project ground to a halt. Ironically, some people will do schemaless but use a statically typed lang for regular backend code, which doesn't buy you much. I'd totally do dynamic there. But a DB schema is so little effort for the strong foundation it sets for your code.

Sometimes it comes from a misconception that your schema should never have to change as features are added, and so you need to cover all cases with 1-2 omni tables. Often named "node" and "edge."


> Ironically some people will do schemaless but use a statically typed lang for regular backend code, which doesn't buy you much. I'd totally do dynamic there.

I honestly feel the opposite, at least if you're the only consumer of the data. I'd never really go out of my way to use a dynamically typed language, and I'm already going to have to do something to get the data into my own language's types, at which point it doesn't really make a huge difference to me what format it used to be in. When there are a variety of clients being used, though, this logic might not apply.


If you're only consuming, yes. It might as well be a totally separate service. If it's your database that you read/write on, it's closely tied to your code.


We just sit a data persistence service in front of mongo so we can enforce some controls for everything there if we need them, but quite often we don’t.

It’s probably better to check what you’re working on than blindly assuming this thing you’ve gotten from somewhere is the right shape anyway.


The "DAO" way like this is usually how it goes. It tends to become bloated. Best case, you're reimplementing what the schema would've done for you anyway.


The adage I always tell people is that in any successful system, the data will far outlive the code. People throw away front ends and middle layers all the time. This becomes so much harder to do if the schema is defined across a sprawling middle layer like you describe.


As someone who has done a lot of Ruby coding, I would say using a statically typed database is almost a must when using a dynamically typed language. The database enforces the data model and the Ruby code was mostly just glue on top of that data model.


That's fair, I could see an argument for "either the schema or the language needs to enforce the schema". It's not obvious to me that one of the two "only one of them is" models deserves much more criticism than the other, though.


What's weird to me is when dynamic typers don't acknowledge the tradeoff of quality vs upfront work.

I never said mongodb was wrong in that post, I just said it accumulated tech debt.

Let's stop feeling attacked over the negatives of tradeoffs


It's possible you didn't intend it, but your parent comment definitely came off as snarky, so I don't think you should be surprised that people responded in kind. You're honestly doing it again with the "let's stop feeling attacked" bit; whether you mean it or not, your phrasing comes across as pretty patronizing, and overall combined with the apparent dislike of people disagreeing with you after the snark it comes across as passive-aggressive. In general it's not going to go over well if you dish out criticism but can't take it.

In any case, you quite literally said there was a "lack of schemas", and I disagreed with that characterization. I certainly didn't feel attacked by it; I just didn't think it was the most accurate way to view things from a technical perspective.


Whatever horrors there are with mongo, it's still better than the shitshow that is Zope's ZODB.


The article links to a shodan scan reporting 213K exposed instances https://www.shodan.io/search?query=Product%3A%22MongoDB%22


It could be because when you leave an SQL server exposed it often turns into much worse things. For example, without additional configuration, PostgreSQL defaults into a configuration that can be used to own the entire host machine. There is probably some obscure feature that allows system process management, uploading a shell script, or something else that isn't disabled by default.

The end result is "everyone" kind of knows that if you put a PostgreSQL instance up publicly facing without a password, or with a weak/default password, it will be popped in minutes, and you'll find out about it because the attackers are lazy and just running crypto-mining malware, etc.
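
(The "obscure feature" is likely `COPY ... TO PROGRAM`, which lets a superuser run shell commands through SQL.) For what it's worth, the usual hardening is only a couple of settings; a sketch, with exact file locations varying by distro:

  # postgresql.conf -- don't listen on public interfaces
  listen_addresses = 'localhost'

  # pg_hba.conf -- require real password auth, never 'trust'
  host  all  all  127.0.0.1/32  scram-sha-256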


My university has one exposed to the internet, and it's still not patched. Everyone is on holiday and I have no idea who to contact.


No one. If you aren't in the administration's good graces and something shitty happens, even if it's unrelated to you, you've put a target on your back as suspect #1.


For a long time, the default install had it binding to all interfaces and with authentication disabled.
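
(Current packages bind to loopback by default, but authorization is still opt-in; a sketch of the relevant mongod.conf settings:)

  # /etc/mongod.conf
  net:
    bindIp: 127.0.0.1        # loopback only; the old default listened everywhere
  security:
    authorization: enabled   # off unless explicitly enabled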


Often. Lots of data leaks happened because of this. People spin it up in a cloud VM and forget it has a public IP all the time.




Because nobody uses mongo for the reasons you listed. They use redis, dynamo, scylla or any number of enriched KV stores.

Mongo has spent its entire existence pretending to be a SQL database by poorly reinventing everything you get for free in postgres or mysql or cockroach.


False. Mongo never pretended to be a SQL database. But some dimwits insisted on using it for transactions, for whatever reason, and so it got transactional support way later in life, and only for non-sharded clusters in the initial release. People who know what they are doing have been using MongoDB for reliable horizontally-scalable document storage basically since 3.4, with proper complex indexing.

Scylla! Yes, it will store and fetch your simple data very quickly with very good operational characteristics. Not so good for complex querying and indexing.


Yeah fair, I was being a bit lazy here when writing my comment. I've used nosql professionally quite a bit, but always set up by others. When working on personal projects I reach for SQL first because I can throw something together and don't need ideal performance. You're absolutely right that they both have their place.

That being said, the question was genuine: because I don't keep up with the ecosystem, I don't know whether it's ever valid practice to have a nosql db exposed to the internet.


What they wrote was pretty benign. They just asked how common it is for Mongo to be exposed. You seem to have taken that as a completely different statement




They did not say it's rarely used at all.


Google (currently) pays Mozilla $400-500 million a year to be the default search engine in Firefox.

edit: in 2023 they took in $653M in total, $555M of which was from Google. They spent $260M on software development, and $236M on other things.


The "other things" is what most people seem to have problem with.

Mozilla burns a batshit amount of money on feel-good fancies.

If it were focused on its core mission -- building great software in key areas -- it would see that it can't afford this, because that's the same money that, if saved, would make them financially independent of Google.


> Mozilla burns a batshit amount of money on feel good fancies.

How much?


  > In 2018, Baker received $2,458,350 in compensation from Mozilla.
  > In 2020, after returning to the position of CEO, Baker's salary was more than $3 million.
  > In 2021, her salary rose again to more than $5.5 million,
  > and again to over $6.9 million in 2022.
  >
  > https://en.wikipedia.org/wiki/Mitchell_Baker#Mozilla_Foundation_and_Mozilla_Corporation


And what percent of revenue was this?


0.55% in 2018, rising to 1.1% in 2022
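
(Working backwards from those percentages: $2.46M / 0.0055 ≈ $447M implied revenue for 2018, and $6.9M / 0.011 ≈ $627M for 2022, roughly consistent with the $653M 2023 figure cited upthread.)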


Saving 1.1% of revenue would make them financially independent of Google?


> $236M on other things

This is from another poster. I'm guessing it's stuff not related to Firefox development.


$236M included facilities, administration, marketing, and so on.


Yes, they should trim most of that fat.


How much is fat?


Took more than a minute to load on my MacBook. Ouch!

I really love C# and the .NET ecosystem, but they just haven't made it work for the web.


Blazor is incredibly productive.

I wouldn't use it for consumer apps because it requires a WebSocket connection to maintain state and probably doesn't scale very cheaply... but for business applications or personal tools it's actually kind of insane how much functionality you get out of the box (at least by the standards of statically typed languages).

To replicate this example in TypeScript, I'd probably still be installing packages in the time it took to write the 20 lines of code it contains: https://learn.microsoft.com/en-us/aspnet/core/blazor/compone...


.NET works amazingly on the web. This is just not the UI framework you would use.

There is ASP.NET of course and Razor Pages. We all use apps built with these every day without even realizing it. There are other great frameworks as well.

I do not even see Blazor as a real web technology but of course it is positioned that way.

MAUI is a "cross-platform" and frankly mobile-first UI framework. It was never meant for the web.


What do you see as net negative about it? I’m familiar with the product but not that aware of how it’s been used.


It's basically a way for people to externalize tasks that require a human but pay fractions of what it would cost to actually employ those humans.

Mechanical Turk was one of the early entrants into "how can we rebrand outsourcing low skill labor to impoverished people and pay them the absolute bare minimum as the gig economy".


Much of the low skill labor was things like writing transcripts and turning receipts into plaintext. It was at a point where OCR wasn't reliable. There were a few specialist tasks.

The gig economy was very much a net positive here. Some people used it to quit factory work and make twice the income; some used it as negotiation terms against the more tyrannical factories. Factories were sometimes a closed ecosystem here - factory workers would live in hostels, eat the free factory food or the cheap street food that cropped up near the area. They'd meet and marry other factory workers, have kids, who'd also work there. They were a modern little serfdom. Same goes for plantations.

Things like gig work and mturk were an exit from that. Not always leaving an unhappy or dangerous life, but making their own life.

If it paid badly, just don't work there. These things push wages down for this kind of work, but this work probably shouldn't be done in service economies anyway.


> If it paid badly, just don't work there. These things push wages down for this kind of work, but this work probably shouldn't be done in service economies anyway.

This paragraph is so tantalizingly close to putting its finger on the issue. The fact that a company found someone willing to do a job for what they want to pay does not mean that it's ethical or moral for them to do so.

In this case (as in many others), one of the predicates was finding groups of people whose existing options, financial literacy, living conditions, or some combination of the three were already so bad that becoming digital serfs was a minor step up.


I got paid $11 an hour to enter handwritten applications into a database, as a temp job back in the early 2010s. It was "low-skill" inasmuch as, "Locking in and moving efficiently through entire filing cabinets of forms, often written by people whose first script was not Latin, for 6-7 hours straight, every weekday, for 2 months, with no prior training," is "low-skill" (and I apparently did it much faster than my supervisors expected). $11/hr was less than it should have paid, and yet I have to commend the company I was working with, because they sourced local labor and paid still multiple times what the job would have commanded through outsourcing via Mturk.

The conditions you're describing were caused by the systemic globalist status quo that Mturk is a part of; Mturk did not fix that, it perpetuated it.


It's not a fraction of what it would cost to actually employ those humans, since there were humans who clearly chose to do that work when presented with the opportunity.

I think this is a very first-world oriented take. It efficiently distributed low-value workloads to people who were willing to do it for the pay provided. The market was efficient, and the wages were clearly acceptable to those doing the work, considering they did (and still do) the work for the wages provided.


Yes, and "use the output of MTurk workers to make themselves redundant."


Do you think there _should_ be a legal mechanism for enforcing the kind of rules they're trying to create here? I have mixed feelings about it.


"Evil" feels strong? Small companies benefit from having the basic feature set subsidized by big cos. It's kind of hard for me to imagine a scenario where pricing of a SaaS product could be _evil_. You can just choose not to do business with them!

