WarmPrompt Journal

Welcome to the WarmPrompt blog. New posts will land here as we share ideas and experiments.

My 16 Bits

The Return of the 32K Ghost

There is something wonderfully, darkly comic about the fact that in 2026—an era in which we train trillion-parameter models, simulate protein folding, and ask machines to hallucinate legal arguments with supreme confidence—we are once again being told:

“Sorry, your input is too long.”

Not philosophically too long.
Not aesthetically indulgent.
But numerically unacceptable.

If you were around computers in the 1990s, this should feel eerily familiar. Like bumping into an old acquaintance in an airport bar who looks exactly the same and still owes you money.

The 32-kilobyte limit—once the calling card of Win16 text boxes and earnest GUI toolkits—has staged a quiet comeback. This time wearing a Patagonia gilet and calling itself context length.


The tyranny of convenient numbers

The original 32K limit was not designed. It emerged.

It came from a perfect storm of:

  • 16-bit signed integers
  • Cursor positions stored as a short
  • The charming belief that “no one would ever need more text than this”

In other words, it wasn’t ideology. It was accounting.
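The accounting is easy to see. A signed 16-bit integer maxes out at 2^15 − 1 = 32,767, and anything past that wraps negative. A back-of-the-envelope sketch (using Python's ctypes to mimic a C short, purely for illustration):

```python
import ctypes

# The ceiling of a 16-bit signed counter: 2**15 - 1 characters.
MAX_SHORT = 2**15 - 1
print(MAX_SHORT)  # 32767

# Store one position past the ceiling in a C 'short' and the
# cursor position wraps around to a large negative number.
wrapped = ctypes.c_int16(MAX_SHORT + 1).value
print(wrapped)  # -32768
```

Hence the famous "32K" ceiling: not a design decision, just the largest value that fit in the variable somebody happened to pick.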

And now, decades later, we are doing exactly the same thing—only with embeddings, tokenizers, and attention windows.

The modern equivalent is not:

“Documents should be short”

but:

“We need a number that keeps inference fast, predictable, and cheap.”

So we pick one.

8K tokens.
16K tokens.
32K tokens.

Lovely round numbers. Deeply reassuring. Entirely arbitrary.


AI didn’t resurrect the limit — it rebranded it

What’s fascinating is that AI didn’t inherit the limit from old systems. It reinvented it for the same reasons, which is much more embarrassing.

The constraint now comes from:

  • Quadratic attention costs
  • GPU memory ceilings
  • Latency expectations disguised as “UX”
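The first of those is worth making concrete. Self-attention compares every token with every other token, so the score matrix grows with the square of the context length. A rough sketch of the scaling:

```python
# Self-attention builds an n x n matrix of pairwise token scores,
# so doubling the context window quadruples the work and memory.
for n in (8_192, 16_384, 32_768):
    pairs = n * n  # entries in the n x n attention score matrix
    print(f"{n:>6} tokens -> {pairs:,} attention scores")
```

Going from 8K to 32K tokens multiplies the score matrix by sixteen, which is why vendors reach for a comfortable cutoff rather than letting the window float.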

But the experience is identical.

You paste in something big and meaningful—perhaps a real document, written by a human, with history and context—and the system responds with the digital equivalent of a Victorian footman:

“I’m terribly sorry, sir, but that is quite beyond what we were expecting.”

This is how you know we haven’t escaped the past. We’ve merely wrapped it in better metaphors.


Context is the new clipboard

In the 90s, the clipboard was the choke point.

You could write a long document.
You could store a long document.
But try to move it between systems and—bang—hard stop.

Today, context is the clipboard.

AI systems are astonishingly capable at:

  • Generating text
  • Transforming text
  • Summarising text

But strangely fragile when asked to:

“Hold this entire thing in your head for a moment.”

Which is ironic, because holding context is literally the thing we describe them as doing.


Why this matters more than it seems

Limits don’t just constrain behaviour. They reshape it.

Just as 32K limits trained people to:

  • Split documents unnaturally
  • Abuse file formats
  • Invent baroque workarounds

AI context limits are now teaching people to:

  • Over-summarise too early
  • Strip nuance for token efficiency
  • Optimise for model digestion rather than human meaning

We are, once again, designing for the system instead of for ourselves.

And as before, we will pretend this is good discipline.

(It wasn’t then. It isn’t now.)


The real danger: premature compression

The most damaging thing about hard limits isn’t that they block expression.

It’s that they encourage early loss.

Once you summarise something to fit a window:

  • The discarded details are gone
  • The framing hardens
  • The narrative ossifies

This is exactly what happened with early document systems: structure calcified around constraints that had nothing to do with comprehension and everything to do with implementation convenience.

AI is now doing the same thing—only faster, and at planetary scale.


The punchline

The 32K limit never died.

It simply waited patiently for us to invent systems complex enough that we’d once again say:

“Surely no one will ever need more than this.”

And just like last time, we’re wrong—not because humans have changed, but because meaning doesn’t compress cleanly, no matter how clever your math is.

The lesson from the 90s wasn’t “make limits bigger.”

It was:

Beware of constraints that feel technical but behave psychologically.

Because once a number sneaks into the interface, it stops being an implementation detail and starts becoming a belief.

And beliefs, unlike integers, don’t overflow gracefully.

The Heated Hoodie and the Tyranny of Cleverness

(or: how technology keeps solving problems we didn't have, while inventing new ones we do)

I bought a heated hoodie. Not a jet engine. Not a nuclear reactor. A hoodie. The kind of garment whose historical job description has been remarkably stable for several centuries: keep the human vaguely warm while looking mildly dishevelled.

And yet, within minutes, the thing stopped working.

Not because it was broken. Oh no. That would have been honest. It stopped because the power bank decided nothing was happening.

This is modern technology in its purest form: a system so clever it concludes that reality itself is mistaken.


When warmth becomes suspicious

The hoodie warms up nicely for about sixty seconds, then goes cold. The power bank, having performed a brief audit of the universe, shuts itself down. Why? Because the hoodie does not look like a phone.

Phones, apparently, are the moral centre of the electrical universe. Anything else - especially something as unfashionable as heat - is treated with deep suspicion.

The hoodie draws a small, steady current. Sensible. Civilised. Thermodynamically literate. The power bank, meanwhile, is hunting for dramatic spikes of activity - a sort of dopamine-driven attention economy for electrons.

No spike? No scrolling? No frantic signalling? Then clearly nothing of value is occurring.


The manual's finest moment: "warm prompt"

Now let us pause to admire the manual.

It suggests pressing a button to issue a "warm prompt."

This is a phrase of breathtaking ambition.

A hoodie, historically, does not require prompting. It is not a shy intern. You put it on, and it warms you. That was the contract.

But here we are, borrowing language from behavioural psychology and AI interfaces to describe what is, in essence, asking a jumper to please do its job.

"Warm prompt" implies:

  • The hoodie might be reluctant
  • The hoodie might need encouragement
  • The hoodie might be context-aware

This is not clothing. This is a poorly managed middle manager.


Automation that mistrusts humans

What's really happening here is familiar. Someone, somewhere, optimised for the wrong outcome.

The power bank is not designed to deliver energy. It is designed to avoid liability. To shut down. To protect itself from imagined misuse. To assume the user is an idiot unless proven otherwise - and even then, only briefly.

So the human adapts:

  • Pressing buttons periodically to reassure the machine
  • Adding fake electrical loads to trick it
  • Carrying extra cables like ritual offerings

This is not progress. This is cargo cult engineering.


The crime: mistaking intelligence for value

The heated hoodie commits the cardinal sin of modern tech: it assumes intelligence is inherently useful.

But warmth does not need intelligence. Warmth needs reliability.

The most successful heating technologies in history include:

  • Fire
  • Wool
  • Sitting closer to other people

None of these require firmware updates.


The punchline

The hoodie works perfectly when plugged into a wall socket. In other words, when connected to stupid power.

The moment you introduce a "smart" intermediary, it fails - not randomly, but confidently. It fails because it believes it knows better than you whether you are cold.

Which is, in many ways, the defining problem of contemporary technology.

We are no longer warm. But the system is very pleased with itself.

First entry

WarmPrompt is a place to explore how gentle constraints can lead to better prompts. This first entry outlines the direction and the kind of experiments we'll run.