Selective Temporal Training
Hayk Grigorian is training small language models with a corpus of Victorian text.
This kind of work was widespread circa 2016-2019, in the years before GPT-2. After that, the race was on to pile more text into the training corpora, basically regardless of provenance. That was the path to the emergent general capabilities of modern LLMs … at the expense of interesting, human-scale experiments like this one.
Hayk’s coinage of “Selective Temporal Training” is perhaps a bit puffed-up, and I love it. He writes:
[ … ] If I fine-tune something like GPT-2, it’s already pre-trained and that information won’t go away. If I train from scratch, the language model won’t pretend to be old; it just will be. The goal for this project right now is to create something that can reason exclusively using knowledge from London books published between 1800 and 1875.
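Mechanically, the from-scratch recipe is simple enough to sketch. Here’s a rough, hypothetical version, assuming a Hugging Face-style stack (transformers, tokenizers, datasets); the corpus records, field names, and hyperparameters are all illustrative, not Hayk’s actual code or data:

```python
# Hypothetical sketch of "selective temporal training": a small GPT-2-style
# model trained from scratch on texts filtered to London, 1800-1875.
# Nothing here is from the actual project; records and settings are made up.
from tokenizers import ByteLevelBPETokenizer
from transformers import (GPT2Config, GPT2LMHeadModel, PreTrainedTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import Dataset

# Stand-in corpus; a real run would load thousands of scanned books.
records = [
    {"text": "It was the best of times, it was the worst of times ...", "year": 1859, "city": "London"},
    {"text": "Whether I shall turn out to be the hero of my own life ...", "year": 1850, "city": "London"},
    {"text": "Call me Ishmael.", "year": 1851, "city": "New York"},  # dropped by the filter below
]
texts = [r["text"] for r in records
         if 1800 <= r["year"] <= 1875 and r["city"] == "London"]

# Train the tokenizer on the period corpus only, so even the subword
# vocabulary comes from 19th-century usage rather than modern web text.
bpe = ByteLevelBPETokenizer()
bpe.train_from_iterator(texts, vocab_size=16_000, special_tokens=["<|endoftext|>"])
bpe.save("victorian-tokenizer.json")
tok = PreTrainedTokenizerFast(tokenizer_file="victorian-tokenizer.json",
                              eos_token="<|endoftext|>", pad_token="<|endoftext|>")

# Randomly initialized weights: nothing modern to "unlearn".
config = GPT2Config(vocab_size=len(tok), n_positions=512,
                    n_embd=384, n_layer=6, n_head=6)
model = GPT2LMHeadModel(config)

# Tokenize the filtered texts for causal language modeling.
ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tok(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="victorian-lm", num_train_epochs=3,
                           per_device_train_batch_size=8, report_to="none"),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tok, mlm=False),
)
trainer.train()
```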
I fear Hayk won’t get to the “reason” he’s after, but that hardly matters: I spent a lot of my time (too much) circa 2016-2019 training custom/weird AI models, and I can report that it was a fun and interesting activity!