
Phase change

March 10, 2023
Snowflake, Wilson Bentley, ca. 1910

Earlier this week, in my newsletter, I praised a new project from Matt Webb. Here, I want to come at it from a different angle.

Briefly: Matt has built the Braggoscope, a fun and useful application for exploring the archives of the beloved BBC radio show In Our Time, hosted by the inimitable Melvyn Bragg.

In Our Time only provides HTML pages for each episode — there’s no structured data, no sense of “episode X is connected to episode Y because of shared feature Z”.

As Matt explains in his write-up, he fed the plain-language content of each episode page into the GPT-3 API, cleverly prompting it to extract basic metadata, along with a few subtler properties — including a Dewey Decimal number!?

(Explaining how and why a person might prompt a language model is beyond the scope of this newsletter; you can read up about it here.)

Here’s a bit of Matt’s prompt:

Extract the description and a list of guests from the supplied episode notes from a podcast.

Also provide a Dewey Decimal Classification code and label for the description

Return valid JSON conforming to the following Typescript type definition:

{
  "description": string,
  "guests": {"name": string, "affiliation": string | null}[],
  "dewey_decimal": {"code": string, "label": string}
}

Episode synopsis (Markdown):

{notes}

Valid JSON:

Important to say: it doesn’t work perfectly. Matt reports that GPT-3 doesn’t always return valid JSON, and if you browse the Braggoscope, you’ll find plenty of questionable filing choices.
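
For the curious, here is the rough shape such a pipeline might take. To be clear: this is my sketch, not Matt’s code. It assumes the 2023-era OpenAI Python client, abbreviates the prompt above, and invents a simple retry loop for the invalid-JSON case:

import json

import openai  # 2023-era client; assumes openai.api_key is already set

PROMPT_TEMPLATE = """Extract the description and a list of guests from the
supplied episode notes from a podcast. Return valid JSON.

Episode synopsis (Markdown):

{notes}

Valid JSON:"""

def extract_metadata(notes: str, retries: int = 3) -> dict:
    # Ask the model for structured metadata; re-ask when the JSON is bad.
    for _ in range(retries):
        completion = openai.Completion.create(
            model="text-davinci-003",
            prompt=PROMPT_TEMPLATE.format(notes=notes),
            temperature=0,
            max_tokens=512,
        )
        try:
            return json.loads(completion.choices[0].text)
        except json.JSONDecodeError:
            continue  # the model flubbed the JSON; ask again
    raise ValueError("model never returned valid JSON")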

And yet! What a technique. (Matt credits Noah Brier for the insight.)

It fits into a pattern I’ve noticed: while the buzzy application of the GPT-alikes is chat, the real workhorse might be text transformation.

As Matt writes:

Sure Google is all-in on AI in products, announcing chatbots to compete with ChatGPT, and synthesised text in the search engine. BUT.

Using GPT-3 as a function call.

Using GPT-3 as a universal coupling.

It brings a lot within reach.

I think the magnitude of this shift … I would say it’s on the order of the web from the mid 90s? There was a radical simplification and democratisation of software (architecture, development, deployment, use) that took decades to really unfold.

For me, 2022 and 2023 have presented two thick strands of inquiry: the web and AI, AI and the web. This is evidenced by the structure of these lab newsletters, which have tended towards bifurcation.

Matt’s thinking is interesting to me because it brings the strands together.

One of the pleasures of HTTP (the original version) is that it’s almost plain language, though a very simple kind. You can execute an HTTP request “by hand”: telnet www.google.com 80 followed by GET /.
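
If telnet isn’t handy, a few lines of Python make the same point; the request really is just text typed at a socket:

import socket

# HTTP by hand: open a TCP connection and type at the server.
# (HTTP/1.0 here; the original HTTP/0.9 request was just "GET /".)
sock = socket.create_connection(("www.google.com", 80))
sock.sendall(b"GET / HTTP/1.0\r\nHost: www.google.com\r\n\r\n")
print(sock.recv(4096).decode("latin-1"))
sock.close()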

Language models as universal couplers begin to suggest protocols that really are plain language. What if the protocol of the GPT-alikes is just a bare TCP socket carrying free-form requests and instructions? What if the RSS feed of the future is simply my language model replying to yours when it asks, “What’s up with Robin lately?”
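
Purely as a provocation, here is a sketch of that non-protocol: a bare socket server whose only spec is “send me some text”. The ask_model function is a hypothetical stand-in for whatever model you happen to run:

import socket

def ask_model(question: str) -> str:
    # Hypothetical stand-in: route the question to a language model.
    return "Robin has been writing about phase changes lately."

# A plain-language protocol: a bare TCP socket, free-form text in,
# free-form text out. No headers, no schema, no spec.
server = socket.create_server(("0.0.0.0", 7077))  # arbitrary port
while True:
    conn, _ = server.accept()
    question = conn.recv(4096).decode("utf-8", errors="replace").strip()
    conn.sendall(ask_model(question).encode("utf-8"))
    conn.close()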

I like this because I hate it; because it’s weird, and makes me feel uncomfortable.


I think it’s really challenging to find the appropriate stance towards this stuff.

On one hand, I find critical deflation, of the kind you’ll hear from Ted Chiang, Simon Willison, and Claire Leibowicz in this recent episode of KQED Forum, appropriate and useful. The hype is so powerful that any corrective is welcome.

However! On the critical side, the evaluation of what’s before us isn’t sufficient; not even close. If we demand humility from AI engineers, then we ought to match it with imagination.

An important fact about these language models — one that sets them apart from, say, the personal computer, or the iPhone — is that their capabilities have been surprising, often confounding, even to their creators.

AI at this moment feels like a mash-up of programming and biology. The programming part is obvious; the biology part becomes apparent when you see AI engineers probing their own creations the way you might probe a mouse in a lab.

The simple fact is: even at the highest levels of theory and practice, no one knows how these language models are doing what they’re doing.

Over the past few years, in the evolution from GPT-2-alikes to GPT-3-alikes and beyond, it’s become clear that the “returns to scale” — both in terms of (1) a model’s size and (2) the scope of its training data — are exponential and nonlinear. Simply adding more works better, and works weirder, than it should.

The nonlinearity is, to me, the most interesting part. As these models have grown, they have undergone widely observed “phase changes” in capability, just as sudden and surprising as water frozen or cream whipped.

At the moment, my deepest engagement with a language model is in a channel on a Discord server, where our gallant host has set up a ChatGPT-powered bot and laced a simple personality into its prompt. The sociability has been a revelation — multiplayer ChatGPT is much, MUCH more fun than single player — and, of course, the conversation tends towards goading the bot, testing its boundaries, luring it into absurdities.

The bot writes poems, sure, and song lyrics, and movie scenes.

The bot also produces ASCII art, and SVG code, and PICO-8 programs, though they don’t always run.

I find myself deeply ambivalent, in the original sense of: thinking many things at once. I’m very aware of the bot’s limitations, but/and I find myself stunned by its fluency, its range.

Listen: you can be a skeptic. In some ways, I am! But these phase changes have happened, and that probably means they will keep happening, and no one knows (the AI engineers least of all) what might suddenly become possible.

As ever, Jack Clark is my guide. He’s a journalist turned AI practitioner, involved in policy and planning at the highest levels, first at OpenAI, now at Anthropic. And if he’s no longer a disinterested observer, he remains deeply grounded and moral, which makes me trust him when he says, with confidence: this is the biggest thing going, and we had all better brace for weird times ahead.


What does that mean, to brace for it?

I’ve found it helpful, these past few years, to frame my anxieties and dissatisfactions as questions. For example, fed up with the state of social media, I asked: what do I want from the internet, anyway?

It turns out I had an answer to that question.

Where the GPT-alikes are concerned, a question that’s emerging for me is:

What could I do with a universal function — a tool for turning just about any X into just about any Y with plain language instructions?

I don’t pose that question with any sense of wide-eyed expectation; a reasonable answer might be, nothing much. Not everything in the world depends on the transformation of symbols. But I think that IS the question, and I think it takes some legitimate work, some strenuous imagination, to push yourself to believe it really will be “just about any X” into “just about any Y”.
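
To make the question concrete, a universal function might look something like this; a sketch under the same assumptions as before (the 2023-era OpenAI completions API), with the prompt format and naming being mine:

import openai  # assumes openai.api_key is set

def transform(x: str, instruction: str) -> str:
    # A "universal function": turn just about any X into just about
    # any Y, steered by a plain-language instruction.
    completion = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"{instruction}\n\nInput:\n{x}\n\nOutput:\n",
        temperature=0,
        max_tokens=1024,
    )
    return completion.choices[0].text.strip()

# For example:
# transform(episode_notes, "Summarize in one sentence, in French.")
# transform(messy_address, "Rewrite as JSON with street, city, zip.")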

I help operate a small olive oil company, and I have spent a bit of time lately considering this question in the context of our business. What might a GPT-alike do for us? What might an even more capable system do?

My answer, so far, is indeed: nothing much! It’s a physical business, after all, mainly concerned with moving and transforming matter. The “obvious” application is customer support, which I handle myself, and which I am unwilling to cede to a computer or, indeed, anyone who isn’t me. The specific quality and character of our support is important.

(As an aside: every customer support request I receive is a miniature puzzle, usually requiring deduction across several different systems. Many of these puzzles are challenging even to the general intelligence that is me; if it comes to pass that a GPT-alike can handle them without breaking a sweat, I will be very, very impressed.)

(Of course, it’s not going to happen like that, is it? Long before GPT-alikes can solve the same problems Robin can, using the tools Robin has, the problems themselves will change to meet the GPT-alikes halfway. The systems will all learn to “speak GPT”, in some sense.)

The simple act of asking and answering the question was clarifying and calming. It plucked AI out of the realm of abstract dread and plunked it down on the workbench.


Jack Clark includes, in all of his AI newsletters, a piece of original micro-fiction. One of them, sent in December, has stayed with me. I’ll reproduce it here in full:

Reality Authentication

[The internet, 2034]

“To login, spit into the bio-API”

I took a sip of water and swirled it around my mouth a bit, then hawked some spit into the little cup on my desk, put its lid on, then flipped over the receptacle and plugged it into the bio-API system.

“Authenticating … authentication successful, human-user identified. Enjoy your time on the application!”

I spent a couple of hours logged-on, doing a mixture of work and pleasure. I was part of an all-human gaming league called the No-Centaurs; we came second in a mini tournament. I also talked to my therapist sans his augment, and I sent a few emails over the BioNet protocol.

When I logged out, I went back to the regular internet. Since the AI models had got miniaturized and proliferated a decade ago, the internet had radically changed. For one thing, it was so much faster now. It was also dangerous in ways it hadn’t been before — Attention Harvesters were everywhere and the only reason I was confident in my browsing was I’d paid for a few protection programs.

I think “brace for it” might mean imagining human-only spaces, online and off. We might be headed, paradoxically, for a golden age of “get that robot out of my face”.

In the extreme case, if AI doesn’t wreck the world, language models could certainly wreck the internet, like Jack’s Attention Harvesters above. Maybe we’ll look back at the Web Parenthesis, 1990–2030. It was weird and fun, though no one in the future will quite understand the appeal.

We are living and thinking together in an interesting time. My recommendation is to avoid chasing the ball of AI around the field, always a step behind. Instead, set your stance a little wider and form a question that actually matters to you.

It might be as simple as: is this kind of capability, extrapolated forward, useful to me and my work? If so, how?

It might be as wacky as: what kind of protocol could I build around plain language, the totally sci-fi vision of computers just TALKING to each other?

It might even be my original question, or a version of it: what do I want from the internet, anyway?
