This is a post from Robin Sloan’s lab blog & notebook. You can visit the blog’s homepage, or learn more about me.

Ghost faves in the mystery machine

July 21, 2021

Hamlet’s friends restrain him while a ghost beckons; a monochrome etching, fine-lined, almost like a comic book illustration.

Hamlet, Horatio, Marcellus and the Ghost, 1796, Robert Thew

Recently, I was chatting with a friend about a problem they’d encountered after running my script to delete their old Twitter faves. (These are technically “likes” now, I know, but I will stubbornly call them faves, because it is the superior word.)

The problem was that, although the API reported a successful unfaving, many old faves were still attached to my friend’s profile. Here’s the strange part: when viewed, the tweets themselves showed hollow hearts, visibly unfaved … yet there they remained, a ghostly list, somehow both fave and not.

Even stranger — at this point it’s getting delicious — the ghost faves could be banished at last by refaving and then unfaving them; by power-cycling the little heart.

Cursory investigation indicates this is a widespread problem. A search for “twitter phantom likes” reveals many people describing exactly the same behavior, with no evidence of a solution anywhere. Twitter even released a fresh new API endpoint for managing faves — and still, the ghosts are beyond its reach.

This is clearly a bug — the API is not doing what it’s supposed to do, or even what it “thinks it’s doing” — but I am not here to discuss a bug; rather, what the bug made me think about.

My understanding of large internet systems was transformed when I read about Facebook’s Mystery Machine. This was a tool documented back in 2014; I assume it is not in use anymore, but/and, to me, it’s still exemplary of the way these systems work, or don’t.

Facebook’s paper lays out the problem:

Current debugging and optimization methods scale poorly to deal with the complexity of modern Internet services, in which a single request triggers parallel execution of numerous heterogeneous software components over a distributed set of computers.

In other words: how do you analyze or debug a system made up of many different components, written by many different people in many different programming languages, running in many different places at many different times … that activate each other in complicated cascades? A system that grew organically, and very quickly? That was perhaps, heh, not perfectly documented along the way?

A “rational” answer might be: you write some code that captures an X-ray of what’s happening, and you put that code into all those components! I mean, this kind of software exists. But the reality of Facebook circa 2014, its messiness and scale, meant that wasn’t going to happen.

So what did they do instead?

Consequently, we develop a technique that generates a causal model of system behavior without the need to add substantial new instrumentation or manually generate a schema of application behavior. Instead, we generate the model via large-scale reasoning over individual software component logs.

In plain language: they watched the whole system as it ran, the parts of it they could see — the log messages — and inferred its operation from that activity, the way it played out in time. The Mystery Machine worked by literally hypothesizing “maybe X causes Y … ” for a few million different Xs and Ys, testing each hypothesis against the logs until enough had been “disproven” that a clear-ish picture remained. It sounds to me more like botany or zoology than engineering or architecture!
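To make the shape of that refutation process concrete, here is a deliberately tiny sketch — my own toy version, not Facebook’s actual implementation, with invented event names — of inferring “X always happens before Y” relationships by assuming everything and then discarding hypotheses that the logs contradict:

```python
from itertools import permutations

def infer_happens_before(traces):
    """Infer 'X always precedes Y' edges by refutation.

    Start by hypothesizing every ordered pair of event types as a
    happens-before relationship, then discard any hypothesis that an
    observed trace contradicts. The surviving pairs are the candidate
    causal model.
    """
    event_types = {e for trace in traces for e in trace}
    # Hypothesize everything: X happens-before Y, for all pairs.
    hypotheses = set(permutations(event_types, 2))
    for trace in traces:
        position = {e: i for i, e in enumerate(trace)}
        for x, y in list(hypotheses):
            # A counterexample: both events appear, but Y did not come after X.
            if x in position and y in position and position[x] >= position[y]:
                hypotheses.discard((x, y))
    return hypotheses

# Each trace is one request's log events, in timestamp order (all invented).
traces = [
    ["request", "db_write", "cache_update", "response"],
    ["request", "db_write", "response"],                  # cache step skipped
    ["request", "cache_update", "db_write", "response"],  # opposite order
]
model = infer_happens_before(traces)
# "request" precedes everything in every trace, so those edges survive;
# db_write and cache_update appear in both orders, so neither edge does.
assert ("request", "response") in model
assert ("db_write", "cache_update") not in model
```

The real system ran this kind of reasoning over millions of log lines rather than three toy traces, but the core move — mass hypothesis, then mass falsification — is the same.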

I find this totally cosmic: Facebook in 2014, by then already a VERY large and rich company — a powerful force in the world — did not understand, in a pretty deep way, how Facebook worked.

Maybe Facebook at the height of its “move fast and break things” era represents an extreme case. But the problem described in the Mystery Machine paper is not uncommon; the heterogeneous, asynchronous nature of large internet systems seems to produce it almost inevitably. An “application” like Facebook or Fortnite is just VERY different from an application like Photoshop or Street Fighter II.

So, at this point, I assume there are similar uncertainties at play in every large internet system, and I assume that the people who build and maintain those systems don’t totally understand them. In that way, I think they operate as much like custodians and caretakers as designers and engineers.

(It’s worth noting that this effect is even more pronounced when it comes to AI models, which are sort of naturally mysterious; for this reason, “AI explainability” is a knotty and important sub-field.)

Anyway, it feels to me like the ghost fave bug must have something to do with two or more of the many components in Twitter’s own mystery machine not talking to each other correctly; the API endpoint dutifully signals the database, but, elsewhere, a cache isn’t reset … something along those lines. But that’s just a guess. Honestly, I’d love to know if, inside Twitter HQ, the bug is totally understood, just not a priority … or if there’s some mystery to it. I’m rooting for mystery.
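Since it is just a guess, here is only the shape of such a bug, in miniature — a toy fave service I invented for illustration, in no way Twitter’s actual architecture — where the unfave path updates the source of truth but forgets to invalidate a cached list, producing exactly the ghost-then-power-cycle behavior:

```python
class ToyFaveService:
    """Toy model of a faves system with a stale-cache bug.

    The set is the source of truth; the per-user list is a lazily
    rebuilt cache. All names and structure here are invented for
    illustration -- this is not Twitter's actual architecture.
    """

    def __init__(self):
        self.db = set()   # source of truth: (user, tweet) pairs
        self.cache = {}   # user -> cached list of faved tweets

    def fave(self, user, tweet):
        self.db.add((user, tweet))
        self.cache.pop(user, None)  # the fave path invalidates correctly

    def unfave_buggy(self, user, tweet):
        self.db.discard((user, tweet))
        # Bug: forgets to invalidate the cached list, so the profile
        # page keeps serving the stale "ghost" fave.

    def likes_page(self, user):
        if user not in self.cache:
            self.cache[user] = sorted(t for u, t in self.db if u == user)
        return self.cache[user]

svc = ToyFaveService()
svc.fave("robin", "tweet-1")
svc.likes_page("robin")                # warms the cache: ["tweet-1"]
svc.unfave_buggy("robin", "tweet-1")

assert svc.likes_page("robin") == ["tweet-1"]  # a ghost fave!

# The "power-cycle" fix: refaving invalidates the cache, so the
# next unfave finally takes effect everywhere.
svc.fave("robin", "tweet-1")
svc.unfave_buggy("robin", "tweet-1")
assert svc.likes_page("robin") == []           # ghost banished
```

Note how refaving works as a fix only by accident: the fave path happens to invalidate the cache, so it cleans up the mess the unfave path left behind.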
