Experts Have World Models. LLMs Have Word Models
www.latent.space - 37 points - 38 comments - 17987 seconds ago
Comments (9)
- nwhnwh - 9819 seconds ago
[flagged]
- swyx - 12956 seconds ago
editor here! all questions welcome - this is a topic i've been pursuing in the podcast for much of the past year... links inside.
- D-Machine - 17064 seconds ago
Fun play on words. But yes, LLMs are Large Language Models, not Large World Models. This matters because (1) the world cannot be modeled anywhere close to completely with language alone, and (2) language only somewhat models the world (much of language is convention, wrong, or not concerned with modeling the world at all, but with other aims like persuasion, evoking emotions, or fantasy / imagination).
It is somewhat complicated by the fact that LLMs (and VLMs) are in some cases also trained on more than the plain language found on the internet (e.g. code, math, images / videos), but the same insight remains true. The interesting question is simply to see how far we can get with (2) anyway.
- akomtu - 7968 seconds ago
Llame Word Models.
- darepublic - 11241 seconds ago
Large embedding model
- SecretDreams - 10320 seconds ago
Are people really using AI just to write a Slack message??
Also, Priya is in the same "world" as everyone else. They have the context that the new person is three weeks in and probably needs help precisely because they're new, that they're actually reaching out, and that impressions matter, even if they said "not urgent". "Not urgent" is seldom taken at face value; it doesn't necessarily mean it's urgent, but it does mean "I need help, but I'm being polite".
- calf - 8057 seconds ago
My Sunday morning speculation is that LLMs, and sufficiently complex neural nets in general, are a kind of Frankenstein phenomenon: they are heavily statistical, yet also partly and subtly performing novel computational, cognitive-like processes (such as world models). Dismissing either aspect is a false binary; the scientific question is distinguishing which part of an LLM is which, and at our current level of scientific understanding that is rather like asking when an electron is a wave and when it is a particle.
- naasking - 12161 seconds ago
I think it's correct to say that LLMs have word models, and since words are correlated with the world, they also have degenerate world models, just with lots of inconsistencies and holes. Tokenization issues aside, LLMs will likely also have some limitations because of this. Multimodality should address many of these holes.
- measurablefunc - 10733 seconds ago
Makes the same mistake as all other prognostications: programming is not like chess. Chess is a finite & closed domain w/ finitely many rules. The same is not true for programming b/c the domain of programs is not finitely axiomatizable the way chess is. There is also no win condition in programming, and there are lots of interesting programs that do not have a clear-cut specification (games being one obvious category).