Smarter AI Without Bigger Models: Recursive Composability with Tiny Models

Intro

You’ve probably heard the mantra: “Bigger model, better results.” But what if you could get smarter, more reliable outputs without scaling up your parameter count? That’s the bet behind a new project from Samsung SAIL Montreal called TinyRecursiveModels.

It’s a departure from the “just add more GPUs” mindset. Instead of cramming everything into a single large transformer, this repo explores how tiny models can be composed recursively to produce surprisingly robust reasoning. The goal isn’t GPT-4-level results in a single pass; it’s making small models work harder instead of making them bigger.

What It Does

The core idea is recursive self-improvement using small language models. You take a tiny model (think 100M-300M parameters) and run it iteratively: each step feeds the model’s previous output back in as input, so it can refine or build on its earlier response.

The key difference from simple “chain-of-thought” prompting? The recursion is baked into the architecture and training process. The model learns not just to generate a single response, but to generate, reason about its own generation, and then generate again. It’s like giving a small model a notepad and asking it to think out loud, then check its work, then try again.
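To make the loop concrete, here is a minimal sketch of the "generate, feed back, generate again" pattern. It is not the repo's code: `recursive_refine`, the `Step` signature, and `toy_model` are all hypothetical names invented for illustration, and the toy model is a plain function standing in for a trained network. It only shows the control flow, where each pass sees the question plus the previous draft.

```python
from typing import Callable, List

# Hypothetical signature: a "model" maps (question, previous draft) to a new draft.
Step = Callable[[str, str], str]


def recursive_refine(model: Step, question: str, depth: int) -> List[str]:
    """Call `model` `depth` times, feeding each draft back in. Returns every draft."""
    drafts: List[str] = []
    draft = ""  # no draft exists before the first pass
    for _ in range(depth):
        draft = model(question, draft)
        drafts.append(draft)
    return drafts


def toy_model(question: str, draft: str) -> str:
    """Stand-in for a tiny model (an assumption, not a real network).

    Each call makes one small improvement: it swaps the first adjacent
    pair of characters that is out of alphabetical order.
    """
    items = list(draft) if draft else list(question)
    for i in range(len(items) - 1):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
            break
    return "".join(items)


# "dcba" needs six single-swap passes before it is fully sorted.
drafts = recursive_refine(toy_model, "dcba", depth=6)
for step, text in enumerate(drafts, start=1):
    print(step, text)
```

No single pass of the toy model can solve the task, but six passes can. That is the same bet the project makes with trained models: spend more steps rather than more parameters.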

The GitHub repo provides:

Training scripts for recursive fine-tuning on small models
Evaluation benchmarks showing that the recursively composed output outperforms the same model used in a single forward pass
A lightweight framework for experimenting with recursion depth, temperature, and context window

Why It’s Cool

This isn’t “just prompt engineering.” The recursive behavior is trained into the model, not added as a user instruction. That means the model learns how to leverage its own internal state across multiple steps — something that feels a bit like iterative reasoning but without needing a massive parameter budget.

A few genuinely clever bits:

Composability: You can stack a few tiny models together and get emergent performance that matches or beats a single model 2-3x its size on specific tasks (like symbolic reasoning, arithmetic, or simple code fixups).
No extra hardware: These models run on a single GPU easily. You’re not trading cost for performance in the usual way.
Transparency: Because the recursion loops are explicit, you can actually inspect what the model “thinks” at each step. That’s a huge win for debugging and trust.
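The composability and transparency points above can be sketched together. The snippet below is an illustration under stated assumptions, not the project's API: `run_stack`, `TraceEntry`, and the two toy stages are hypothetical, and the stages are ordinary string functions standing in for tiny models. It chains the stages, repeats the chain until the draft stops changing, and records what every stage did at every step.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical: each stage plays the role of one tiny model rewriting a draft.
Stage = Callable[[str], str]


@dataclass
class TraceEntry:
    round_idx: int
    stage: str
    before: str
    after: str


def run_stack(
    stages: Sequence[Stage], draft: str, max_rounds: int = 5
) -> Tuple[str, List[TraceEntry]]:
    """Apply the stages in order, round after round, logging every step.

    Stops early once a full round leaves the draft unchanged.
    """
    trace: List[TraceEntry] = []
    for round_idx in range(max_rounds):
        start = draft
        for stage in stages:
            updated = stage(draft)
            trace.append(TraceEntry(round_idx, stage.__name__, draft, updated))
            draft = updated
        if draft == start:  # nothing changed this round, so stop recursing
            break
    return draft, trace


def collapse_spaces(text: str) -> str:
    """Toy stage: replace each double space with a single one (one pass only)."""
    return text.replace("  ", " ")


def trim(text: str) -> str:
    """Toy stage: strip leading and trailing whitespace."""
    return text.strip()


final, trace = run_stack([collapse_spaces, trim], "  a    b  ")
for entry in trace:
    print(entry.round_idx, entry.stage, repr(entry.before), "->", repr(entry.after))
```

Because the loop is explicit, the trace is just a list you can print, diff, or assert on. With real models, the same structure would let you see which step introduced a mistake.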

The biggest real-world use case? Edge or on-device AI. If you’re shipping a model to a phone, a Raspberry Pi, or a web browser, you can’t afford a 7B parameter monster. TinyRec
