Selective Temporal Training
Hayk Grigorian is training small language models with a corpus of Victorian text.
This kind of work was widespread circa 2016-2019, in the years before GPT-2. After that, the race was on to pile more text into the training corpora, basically regardless of provenance. That was the path to the emergent general capabilities of modern LLMs … at the expense of interesting, human-scale experiments like this one.
Hayk’s coinage of “Selective Temporal Training” is perhaps a bit puffed-up, and I love it. He writes:
[ … ] If I fine-tune something like GPT-2, it’s already pre-trained and that information won’t go away. If I train from scratch, the language model won’t pretend to be old; it just will be. The goal for this project right now is to create something that can reason exclusively using knowledge from London books published between 1800 and 1875.
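Mechanically, the from-scratch recipe is simple enough to sketch. Here’s a rough, hypothetical version, assuming a Hugging Face-style stack (transformers, tokenizers, datasets); the corpus records, field names, and hyperparameters are all illustrative, not Hayk’s actual code or data:

```python
# Hypothetical sketch of "selective temporal training": a small GPT-2-style
# model trained from scratch on texts filtered to London, 1800-1875.
# Nothing here is from the actual project; records and settings are made up.
from tokenizers import ByteLevelBPETokenizer
from transformers import (GPT2Config, GPT2LMHeadModel, PreTrainedTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import Dataset

# Stand-in corpus; a real run would load thousands of scanned books.
records = [
    {"text": "It was the best of times, it was the worst of times ...", "year": 1859, "city": "London"},
    {"text": "Whether I shall turn out to be the hero of my own life ...", "year": 1850, "city": "London"},
    {"text": "Call me Ishmael.", "year": 1851, "city": "New York"},  # dropped by the filter below
]
texts = [r["text"] for r in records
         if 1800 <= r["year"] <= 1875 and r["city"] == "London"]

# Train the tokenizer on the period corpus only, so even the subword
# vocabulary comes from 19th-century usage rather than modern web text.
bpe = ByteLevelBPETokenizer()
bpe.train_from_iterator(texts, vocab_size=16_000, special_tokens=["<|endoftext|>"])
bpe.save("victorian-tokenizer.json")
tok = PreTrainedTokenizerFast(tokenizer_file="victorian-tokenizer.json",
                              eos_token="<|endoftext|>", pad_token="<|endoftext|>")

# Randomly initialized weights: nothing modern to "unlearn".
config = GPT2Config(vocab_size=len(tok), n_positions=512,
                    n_embd=384, n_layer=6, n_head=6)
model = GPT2LMHeadModel(config)

# Tokenize the filtered texts for causal language modeling.
ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tok(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="victorian-lm", num_train_epochs=3,
                           per_device_train_batch_size=8, report_to="none"),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tok, mlm=False),
)
trainer.train()
```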
I fear Hayk won’t get to the “reason” he’s after, but that hardly matters: I spent a lot of my time (too much) circa 2016-2019 training custom/weird AI models, and I can report that it was a fun and interesting activity!