Subnet 97 (Distil) asks a deceptively simple question: can a miner compress a large teacher into a small student without losing what made the teacher good? The answer is yes. Miners compress Qwen3.5-35B into a student of ≤5.25B parameters, and the model with the lowest KL divergence holds the crown and wins 100% of daily emissions.
The king model doesn't just mimic the teacher's token distributions: it beats the undistilled baseline by 13% on reasoning, 8% on math, and 6% on knowledge. This isn't incentive theater. SN97 is a decentralised R&D lab where competition literally produces better open-source AI, and the winning model is free for anyone to use.
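That winner-take-all rule is simple enough to sketch. The snippet below assigns the full emission weight to the single lowest-KL miner; the UIDs, scores, and function name are hypothetical, not SN97's actual validator code:

```python
# Minimal sketch of winner-take-all emissions: lowest mean KL wins everything.
# The scores dict is hypothetical; in SN97 lower KL vs. the teacher is better.

def assign_weights(kl_scores: dict[int, float]) -> dict[int, float]:
    """Give 100% of the emission weight to the miner with the lowest KL."""
    winner = min(kl_scores, key=kl_scores.get)
    return {uid: 1.0 if uid == winner else 0.0 for uid in kl_scores}

print(assign_weights({12: 0.41, 37: 0.38, 54: 0.52}))
# -> {12: 0.0, 37: 1.0, 54: 0.0}
```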
The task: distil Qwen/Qwen3.5-35B-A3B (teacher) into a student of ≤5.25B params.
Eval computes KL divergence against the teacher's token distributions on 300 prompts from karpathy/climbmix-400b-shuffle.
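A rough sketch of that eval is below: average token-level KL(teacher || student) over a list of prompts. It assumes the student shares the teacher's tokenizer and vocabulary; the student repo, helper name, and loading details are placeholders, not the subnet's actual harness:

```python
# Hedged sketch: mean token-level KL(teacher || student) over eval prompts.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "Qwen/Qwen3.5-35B-A3B"  # teacher from the spec
STUDENT = "your/model"            # placeholder student repo

tok = AutoTokenizer.from_pretrained(TEACHER)  # assumes shared tokenizer/vocab
teacher = AutoModelForCausalLM.from_pretrained(TEACHER, torch_dtype="auto").eval()
student = AutoModelForCausalLM.from_pretrained(STUDENT, torch_dtype="auto").eval()

@torch.no_grad()
def mean_kl(prompts: list[str]) -> float:
    total = 0.0
    for text in prompts:  # in SN97 these would be the 300 climbmix prompts
        ids = tok(text, return_tensors="pt").input_ids
        p_log = F.log_softmax(teacher(input_ids=ids).logits, dim=-1)  # teacher
        q_log = F.log_softmax(student(input_ids=ids).logits, dim=-1)  # student
        # KL(P || Q) summed over the vocab, then averaged over tokens.
        kl = F.kl_div(q_log, p_log, log_target=True, reduction="none").sum(-1)
        total += kl.mean().item()
    return total / len(prompts)
```

Lower is better: a student that reproduces the teacher's next-token distributions exactly would score zero.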
```
git clone https://github.com/unarbos/distil
python check_model.py --model-repo your/model   # size gate (see sketch below)
btcli s register --netuid 97
btcli s commit --netuid 97 --model-repo your/model
```
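For orientation, the size gate that check_model.py enforces could look roughly like this; the 5.25e9 cap comes from the spec above, while the function name and loading details are illustrative guesses rather than the repo's actual code:

```python
# Hedged sketch: verify a candidate student stays under the 5.25B-param cap.
from transformers import AutoModelForCausalLM

PARAM_CAP = 5.25e9  # ≤5.25B params, per the SN97 spec

def under_cap(repo: str) -> bool:
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{repo}: {n_params / 1e9:.2f}B params")
    return n_params <= PARAM_CAP

under_cap("your/model")  # placeholder repo from the commands above
```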