This is a post from Robin Sloan’s lab blog & notebook. You can visit the blog’s homepage, or learn more about me.

The teacher lies sometimes

March 1, 2025
Student at a Table by Candlelight, 1642-45, Salomon Savery

Worth acknowledging, if one must go on and on about language models: they still constantly make up the most ridiculous bullshit.

Looking for books about a semi-obscure subject, I gamely asked Claude for recs and promptly received a list of six, none of which exist. Yet the authors of three were real — and one of those authors spent a whole career studying precisely the semi-obscure subject of my inquiry. Five minutes later, I had an AbeBooks cart ready to go.

It’s difficult to call that experience a success, yet it clearly wasn’t a failure, either.

A similar story, with code: recently I was trying to render some images using Blender, and I wanted to do so entirely from the command line — all scripts. I knew this was possible, but I’d never used Blender’s Python API, and basic web research was proving difficult; there aren’t a ton of examples out there, and those I could find focused on building extensions for the app’s GUI rather than sidestepping it entirely.

Claude got me started. Not without inventing many dozens of nonexistent functions; not without endlessly jumbling versions of the API; but with enough sense and structure to help me understand the Blender Way. Now, the raw API docs make sense, and I’m off to the races.

Language models have been framed as insurgent competitors to search, but presently the experiences are pretty similar, not in form but in requirement. In both cases, successful use demands confident navigation and quick triage. Woe unto the Google user who clicks the first search result, and woe unto the Claude user who believes it.

It’s dizzying for a machine to be so powerful, yet so clearly unsuitable for any kind of decision-making with actual consequences. “Yes, this is the most complex and broadly capable computer program ever deployed. No, you can only ask it about silly stuff that doesn’t really matter.”

I have to say, if it was me, I would be too embarrassed to release a product that so confidently produces so much bullshit. I suppose I’m glad it’s not me, though, because there’s plenty of value to be snatched from the jaws of confabulation.
