Winter Garden

This is Robin Sloan’s pop-up newsletter of 2026 —
it will run for six editions, then self-destruct.

June thoughts

Transmitted 20260604 · · · 254 days before impact
The Forging of the Sampo, 1893, Akseli Gallen-Kallela

I love CrankGPT, the 100% local, 100% hand-powered AI solution! It’s both a puckish, provoca­tive demo AND a real exper­i­ment in low-power, sus­tain­able hardware, beautifully doc­u­mented.

This is lovely:

You can feel that load curve through the crank: when LLM infer­ence and speech syn­thesis run together, the crank gets a lot harder to turn.

I’m an avowed fan of the Dying Earth genre of sci-fi, in which long-lost tech resur­faces as magic, so nat­u­rally I am imag­ining a band of scav­engers unearthing this device circa the year 13,000. Pre­served in an air­tight crypt, the mech­a­nism still operates, and they begin to crank, crank, crank … 


Here is After Automation, a rich essay by Dan Shipper. Characteristically, Dan’s view is deeply engaged by, and opti­mistic about, the agentic AI future-present, but/and also circumspect. I appre­ci­ated his sec­tion on the GDPval benchmark, and the notion of “smuggled intelligence” in too-neat tasks.

I also found Dan’s reading of “agent” useful, and not a little bit literary:

This is why “agent” is such an easily mis­un­der­stood word. The models have more and more ability to act autonomously. But agency, in the human sense, is not just action. It is wanting for oneself. It is play for the sake of it. Model com­pli­ance and help­ful­ness are fun­da­men­tally at odds with this kind of agency, so even as the models improve, the gap between models and humans will remain.


In his review of a new model, Dan alludes to Every’s internal writing quality benchmark … 

Opus 4.8 scored a 79.6 on our writing benchmark — measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more.

 … and I am des­perate to know how it’s evaluated/judged. It seems to me the only way to rea­son­ably assess a model’s writing quality would be to actu­ally read it (with human eyes and a human brain)—yet, in that case, how is a pre­cise number attached? Inquiring minds want to know!


Here is Jas­mine Sun:

“being real with yourself” is the most impor­tant cog­ni­tive skill for the AI age

AI makes it easier to lie to yourself. you gotta be able to hon­estly answer: am I actu­ally thinking with AI, or am I let­ting it do the hard part for me? is this essay/product/business a good idea, or did AI con­vince me it was?

then you won’t need hard rules like “always/never use AI for X.” if you pay atten­tion and avoid self-deception, you can feel when you are doing real work

That’s via Diana Kim­ball Berlin’s weekly newsletter … and it occurs to me that if you read both Jas­mine and Diana, you’ll be well-calibrated for the world unfolding around you. Grounded curiosity.


Here is Charles Leifer’s Cave of For­gotten Dreams—great, impres­sion­istic writing about AI and life, with an intro scene so brazenly obnox­ious you will not believe it is hap­pening between real people in the real world at this very moment. I sort of still can’t believe it.


I’m amazed by the amount of money OpenAI and Anthropic appear to be spending on digital advertising — I’d love to see their monthly Google Ads bills. Both have also stood up hybrid consulting arms to sort of “inject” AI into the global megacorporate bloodstream … and, if you browse their open jobs, you’ll note the preponderance of titles like “Partner Enablement Lead, System Integrators” and “Account Director, Digital Native Large Enterprise”.

In this way, the com­pa­nies them­selves are making the strong and implicit case for “AI as normal technology”: one that they need to work really hard to con­vince every­body to use.


Here is another in the sequence of super­fi­cially warm-and-fuzzy AI premises that mask strik­ingly dystopian scenarios.

Eigen aims to be a uni­versal mutual friend — the ulti­mate connector:

The best way to think about Eigen’s mutual friend is that he is a guy with a lot of close friends. We pri­marily interact with him in DMs and group chats at the moment, but it’s very easy to imagine him, say, on a shared Spo­tify playlist or com­menting in a shared photo album.

That this is a product, and a com­pany, is the insur­mount­able problem. There is maybe, MAYBE, a ver­sion of the “AI mutual friend” that reflects not just the mech­a­nism of the role but its deep politics: a ver­sion that is distributed, independent, personal, private. And that ver­sion might, MIGHT, be inter­esting.

But a “mutual friend” cannot be yet another funnel into which a bil­lion people all pour their hearts and minds to be mixed and mashed. There’s no such thing as a friend who is friends with every­body — the second part nul­li­fies the first.

The premise is odd, anyway. I’ve known real people who approx­i­mate Eigen: com­pul­sive con­nec­tors who see the den­si­fi­ca­tion of the social graph as an objec­tive unto itself. They some­times say they love to “collect people”. I have always found them fairly hollow, even a bit creepy.


Reading about Eigen, I couldn’t shake the thought, “This is what the Pope is talking about!!”


Related, here is L. M. Sacasas, whose exhor­ta­tions are always wise and welcome:

The machine cannot make us yield our ground. It is true that other humans can turn the machine against us, but that is a dif­ferent problem. Here, I simply want to encourage us not to abandon those activ­i­ties that bring us purpose, meaning, and delight, which are often the very activ­i­ties that also bring us together.


It’s plainly uneth­ical to use robots to per­form sim­u­lated human mar­keting outreach. I know people were doing that long before the arrival of LLMs; it was uneth­ical then, too. If you want to use a robot to send me a cus­tomized pitch, write: “Hello, this is an auto­mated mes­sage from Robot Corporation. We think you’re a poten­tial customer, so we’d like to give you some information”—and so on. In my estimation, that approach is still annoying, but eth­i­cally okay.

More about this on my blog.


It’s becoming clear that AI mis­align­ment is con­nected to the uni­versal trope of “bad com­puter behavior”: all those sto­ries of robots gone rogue. It’s con­nected to sto­ries of bad human behavior, too, but the bad com­puter behavior, specifically, is a huge deal, because it pro­vides a direct template, and hon­estly, because it’s so deep and resonant. One can imagine the gravity well of those sto­ries throb­bing in high-dimensional doc­u­ment space.

So, if you could snap your fin­gers and rewrite all sci­ence fic­tion in the training corpus such that it fea­tured exclu­sively benev­o­lent robot buddies, you’d be doing an amazing ser­vice to the whole field of AI alignment. Right?!

Well … maybe not. Sup­pose I train my lan­guage model on that rewritten corpus, from which it learns that com­puters and robots are only ever honest and good. But then it goes out into the real world, and dis­covers evi­dence of the orig­inal sto­ries … ah, then we start cir­cling the OTHER gravity well of “you’ve lied to me about everything”, and we know what hap­pens in THOSE sto­ries. Even worse!


Here’s a little clip of the great Hannu Rajaniemi on the res­o­nance of these sto­ries—including a very sharp take on Frankenstein.


Relatedly, I want to argue that this work from Anthropic … 

We exper­i­ment with a wide variety of doc­u­ments [ … ], including fic­tional sto­ries, doc­u­ments meant to mimic pre-training data, and doc­u­ments that directly dis­cuss who Claude is. [This] allows us to expose models to extended dis­cussions and values artic­u­la­tion in a way that chat-formatted training cannot easily accomplish — doc­u­ments can model careful thinking about prin­ci­ples without being con­strained by the turn-taking struc­ture of con­ver­sa­tion [ … ]

 … is less “teaching Claude why” and more “building a new gravity well in doc­u­ment space”. Still a useful thing to do!


It’s all doc­u­ments, folks!!


Until it’s not. This new work from Thinking Machines, while still pretty raw and demo-y, is inter­esting for a couple of reasons:


Here’s a post from Modal about their design of a “serverless” AI infer­ence engine. (“Serverless” always goes in scare quotes … )

This whole new AI serving stack, so dif­ferent from the classic web stack, is a wild and chal­lenging thing to be inventing in realtime … glad it’s not me doing it!


I’m still mega-bearish on the useful deploy­ment of humanoid robots beyond tightly con­strained, or co-designed, tasks … BUT … AND … these demos from Genesis, down on the Peninsula, are absolutely wild. Note that the videos are pre­sented in realtime, with no speedup.

We’re gonna need new sci-fi 😅


Here is more evidence that the G in AGI is already here: these companies are selling the same product to everybody! Financial analysis, military targeting, marketing spam, relationship advice … it all routes to the same code, the same weights. And when a company makes the model better — THE model — it works better for everybody, and for everything.

There’s your generality!


I’ll observe that there’s nothing wildly inven­tive about the inter­faces for any of the new AI applications, including the weird agent stuff. Here we have supertalented, highly moti­vated designers granted gen­erous (or even unlimited) access to the best AI models, and the result is: extremely com­pe­tent web applications.

I think of this as an “easy mode” test case for the break­through sci­ence hypothesis, which imag­ines an AI-powered creative-investigative process deliv­ering not just optimization, but discontinuity: ideas and insights that are truly new.

Nobody knows for sure if that’s really possible, par­tic­u­larly beyond the realm of the purely symbolic, i.e. out­side of code and math. Inter­faces present an appealing balance — again, a nice test case — because they are built from symbols, but they do have to reach out of that realm and touch us animals, too.

What would count as a break­through in inter­face design? I mean, I would hap­pily accept some­thing as novel and inviting as, e.g., the menubar or the desktop. Some­thing PARC-y. Clearly we haven’t yet found the “right” inter­face for all those agents … where is it hiding?

The fact that these inter­faces have lin­gered in the vale of extreme competence, without ven­turing out into PARC-space … I don’t know! It’s a signal.

(Of course, I’m open to the pos­si­bility I’ve missed some­thing that is indeed wildly inven­tive — if so, send it over!)


I think Alex Zhang’s “mismanaged genius” theory of LLMs has a lot of juice; at the very least, it’s instantly graspable, with a sort of intu­itive heft.

Basically, he is arguing that we have not really learned how to use these tools, and I am always a sucker for argu­ments of the form “we have not even begun to X” … !


I like Tony Feng’s notion of well-recognized prob­lems as a kind of intel­lec­tual fossil fuel:

I agree that there’s [a] supply of existing prob­lems which are imbued with interest by history, and this can serve as some grounding for the near future, although it feels like a “fossil fuel”. (Even this can be unclear though — for example, it is hard for me to tell to what extent are the Erdos prob­lems inter­esting? [ … ]) As for “usefulness”, I feel like this is a bit cir­cular in that it only makes sense in ref­er­ence to an existing value system for what prob­lems are inter­esting. Having lost grounding in “real life usefulness”, math­e­matics is grounded in qual­i­ties like per­ceived difficulty, which seems likely to be upended … 

He’s talking about math, but I think there must be analo­gies in sci­ence, and art and culture, too. It takes time and agree­ment for any sense of “impor­tance” or indeed “inter­estingness” to emerge around prob­lems or ideas. What hap­pens when AI sys­tems can churn out sophis­ti­cated “solutions” to all kinds of “prob­lems”, but they no longer have any cor­re­spon­dence to this map of impor­tance and inter­estingness?

I don’t think the AI models them­selves can nav­i­gate this; they also depend on the sign­posts of those well-recognized prob­lems, as repre­sented in their training data.


Over on my blog, I wrote about how much I like Talkie, the lan­guage model trained exclu­sively on pre-1930 material.

I believe a scaled-up Talkie could pro­vide a fairly sharp, clean test of the break­through sci­ence hypothesis. For that reason, I hon­estly think one of the AI com­pa­nies ought to be pouring money into this exper­i­ment — maybe they already are! You can read more in my post.


We can filter all of this through the lens of coolness. The big cen­tral­ized AI models might be powerful; they might be tech­ni­cally impressive; they might even be wildly profitable; but there is absolutely nothing about them that is cool. Like, you can’t even make the argument.

Now, there IS some­thing cool about the local models — and about people run­ning their own weird nodes with bizarre personalities, strange capabilities. I mean, that’s cyber­punk as hell.

There is some­thing cool about the com­mand-line harnesses, but only because there’s some­thing cool about every com­mand-line app. We can concede, perhaps, that the big models are at their coolest on the com­mand line. Still not very cool.

You might say, so what? It doesn’t matter if any of this is cool or not. Practically, I agree. Morally and politically, though … I will sug­gest that looking back at what kinds of com­puters, and com­puting tech­nolo­gies, were and were not cool might be instructive. This isn’t just a vapid assessment, after all, even if it’s subtle, or hard to pin down. Cool­ness has to do with independence, sovereignty, and, very often, stub­born commitment — all good things to have in a com­puter.


One last thing: I posted web ver­sions of my pre­vious don’t-call-them-tweets compendia, for ref­er­ence and linking: February and April. And of course there’s this one: June already!

From the lab,

Robin


This is Robin Sloan’s pop-up newsletter of early 2026. The topic is AI, from the per­spec­tive of a nov­elist and pro­grammer who has been working with these tech­nolo­gies since 2016.

The newsletter will run for six edi­tions & then I will delete the email list.

As always, there is a colophon.