Hopefully the AI knows whether 9.11 is greater or less than 9.9, something ChatG...

spencerchubb · on July 16, 2024

that's a tokenization issue. every tool has strengths and weaknesses. why does it matter whether an LLM can compare numbers? that can be done trivially in any programming language
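
To illustrate the "done trivially in any programming language" point, here is a minimal Python sketch. The function name `compare_decimals` is made up for illustration; it uses the standard library's `decimal` module so the strings are compared as exact decimal values rather than as text.

```python
from decimal import Decimal


def compare_decimals(a: str, b: str) -> str:
    """Compare two decimal strings exactly and return '<', '>' or '='."""
    x, y = Decimal(a), Decimal(b)
    if x < y:
        return "<"
    if x > y:
        return ">"
    return "="


print(compare_decimals("9.11", "9.9"))  # prints "<"
```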

Ylpertnodi · on July 16, 2024

As an AI layman (I downloaded Claude for Android just today, as a result of HN), "why does it matter whether an LLM can compare numbers?" is rather important to me. Probably to others, also.

awwaiid · on July 17, 2024

I was going to say beware, because there isn't an official Anthropic Claude app... but I checked, and I guess as of today there is one, hah. https://www.anthropic.com/news/android-app

spencerchubb · on July 16, 2024

i can understand that perspective. as an end user, you would like your application to handle math questions correctly. it's true that llms are not the best at math

as an application developer, if we need an llm to be good at math, one solution is to give it access to a python interpreter
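
A rough sketch of what that could look like on the application side, assuming the model emits a plain expression string that the app evaluates on its behalf. The name `run_math_tool` is hypothetical and not part of any real LLM API; instead of a full interpreter, this version deliberately accepts only numbers, basic arithmetic, and single comparisons, so arbitrary code from the model is rejected.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_CMP_OPS = {
    ast.Lt: operator.lt,
    ast.Gt: operator.gt,
    ast.LtE: operator.le,
    ast.GtE: operator.ge,
    ast.Eq: operator.eq,
}


def _eval(node):
    """Recursively evaluate a whitelisted subset of Python expressions."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in _CMP_OPS
    ):
        left = _eval(node.left)
        right = _eval(node.comparators[0])
        return _CMP_OPS[type(node.ops[0])](left, right)
    raise ValueError("unsupported expression")


def run_math_tool(expression: str):
    """Hypothetical tool handler: evaluate an expression string from the model."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)


print(run_math_tool("9.11 > 9.9"))  # prints False
```

The app would pass the result back to the model (or straight to the user), so the model never has to do the arithmetic itself.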

wrsh07 · on July 16, 2024

It doesn't seem super relevant to karpathy announcing he's created a company so that he can increase the production value of his AI YouTube videos

I mean, sure, the company is ostensibly going to also teach math at some point, but karpathy will not be using GPT-4o for that when it launches (what do you think his timelines are? Do you think he is going to be able to solve trivial things like "having the llm use something like function calling to do math"? If you're unfamiliar with his work, karpathy is a very good engineer, and this is a small problem that anybody building production apps on LLMs can easily deal with)

ryandrake · on July 16, 2024

If I'm going to augment my education with AI, I'd at least want to know it could get basic numerical facts right. If a computer program struggles with the concept of a number being greater than another number, how do I have any confidence that it can teach physics?

spencerchubb · on July 16, 2024

your concern would be valid if an llm were the ONLY tool being used. applications use multiple tools so you can use the appropriate tool for the job. if you're doing math, you don't want a standalone llm

ryandrake · on July 17, 2024

As a student, how would I know what things the LLM tutor can provide correct answers for, and what things I will need to "use appropriate tools" for? Should I rely on the LLM to help teach me spelling, or US history, or are there more appropriate tools for these, too?

spencerchubb · on July 17, 2024

if the product is great, hopefully you as the user would not have to worry about which tool is doing which task. the developers would worry about that. it's the same in any app. the user doesn't know or care what tool is used to render the frontend or store the data in the backend

j2kun · on July 17, 2024

Except that the LLM is actively dismissing its discrepancy with the other tool. Just adding to the confusion.

Karrot_Kream · on July 17, 2024

Since we're all just speculating to the wind here, I can see multiple ways LLMs can be used. Maybe it'll help simplify TA triage, maybe it'll just be a Discord bot. Maybe a classifier will sample from multiple models.

I think if anyone can give this idea a fair shot it's Andrew Karpathy, an ML expert and a person known to be passionate about education.

Karrot_Kream · on July 17, 2024

*Andrej

Darn autocorrect

SrslyJosh · on July 16, 2024

AI bros: "Why does it matter whether [a program that people want to use to teach children] can [tell whether one number is bigger than another]?"

=)

spencerchubb · on July 16, 2024

applications can use multiple tools. an llm is just one tool, and it will not cover every use case, such as math. that does not detract from the utility of llms

ryandrake · on July 16, 2024

"But, teacher, it was a tokenization issue!"

freejazz · on July 16, 2024

Are you saying we shouldn't teach children things that "can be done trivially in any programming language?"

How will they know what they are doing in that language?

layer8 · on July 16, 2024

At least LLMs know when to use upper case.