This is a post from Robin Sloan’s lab blog & notebook.

Reasons-ing models

February 15, 2025

It often happens for me that, after I write something, the most important part turns out to be a stop along the way, an incidental phrase. Case in point:

In the warmup to my recent post about the foundational question of language models, I wrote about what the models are doing. I think “modeling text” undersells the mechanism. The systems I tinkered with back in the late 2010s “modeled text”; these new ones, while mechanically very similar, are qualitatively, viscerally different.

So, instead, I decided to say that

language models collate and precipitate all the diverse reasons for writing, across a huge swath of human activity and aspiration

and I chose the word “reasons” with care. In fact, I think the big language models “see through” the veil of text, into the diverse reasons humans had, and have, for producing it. It’s precisely the diversity of those reasons that makes those models so capable — diversity not primarily of style, but of intention. Why was this text composed? What was its writer trying to accomplish in the world?

Much has been made of next-token prediction, the hamster wheel at the heart of everything. (Has a simpler mechanism ever attracted richer investments?) But, to predict the next token, a model needs a probable word, a likely sentence, a virtual reason — a beam running out into the darkness. This ghostly superstructure, which informs every next-token prediction, is the model, the thing that grows on the trellis of code; I contend it is a map of potential reasons.

In this view, the emergence of super-capable new models is less about reasoning and more about “reasons-ing”: modeling the different things humans can want, along with the different ways they can pursue them … in writing.

Reasons-ing, not reasoning. Playful turns of this kind can seem airy and frivolous, entirely linguistic … and, okay, they usually are … but this one has changed the way I think about these models, so I offer it to you, frivolous or not.

Naturally, a language model’s reasons-ing is bounded by its training data. As it happens, a useful fraction of human desire and action is encoded in writing, produced by someone, somewhere, sometime. But of course this map of reasons is far from complete.

One can easily imagine a vast trove of video, showing humans doing all sorts of different things for different reasons. If it were sufficiently diverse, and if it could be processed, such a trove could also inform a process of reasons-ing, and the reasons would be different. Presently, both of those “if”s are very far off. What distinguishes text is its availability and tractability.

A good question might be, can language models develop and pursue truly new reasons for writing? Probably not. How do humans develop and pursue truly new reasons for writing? I’m not sure. I do know it’s one of the most interesting and important things humans can do. Think of the emergence of written law, the birth of the novel; think of double-entry bookkeeping, haiku. (It’s interesting to note the degree to which those things all connect to, and rely on, the physical world. I mean, maybe that’s essential: the reasons exist before, and/or closer to ground-floor reality than, the writing itself.)

I think this playful turn cuts both ways. On one hand, it grants the language models richer internal universes, as they “see through” the veil of text into deeper causes, underlying reasons. On the other hand, it cautions you that the models can, for sure, fool you into thinking they’re reasoning, when they are only nimbly reasons-ing.
