🧬 Bittensor Subnet 97 • Knowledge Distillation • King of the Hill

Can you crush a 35B brain
and make it smarter?

Subnet 97 (Distil) asks a deceptively simple question, and the answer is yes. Miners compress Qwen3.5-35B into a ≤5.25B-parameter student. The lowest KL divergence against the teacher holds the crown and wins 100% of daily emissions.

The king model doesn't just mimic the teacher's token distributions: it beats the undistilled baseline by 13% on reasoning, 8% on math, and 6% on knowledge. This isn't incentive theater. SN97 is a decentralised R&D lab where competition produces measurably better open-source AI, and the winning model is free for anyone to use.
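Concretely, the score is the KL divergence between teacher and student next-token distributions, averaged over evaluation prompts. A minimal sketch of the metric (toy 4-token vocabulary and illustrative logits, not the validator's actual code):

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats; lower means the student tracks the teacher better."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy next-token distributions over a 4-token vocabulary (illustrative only).
teacher = softmax([2.0, 1.0, 0.5, -1.0])
close_student = softmax([1.9, 1.1, 0.4, -0.9])
poor_student = softmax([-1.0, 0.5, 1.0, 2.0])

print(kl_divergence(teacher, close_student))  # small
print(kl_divergence(teacher, poor_student))   # much larger
```

In the real eval this is computed over the full 248,320-token vocabulary at every position of every prompt, then averaged.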

🧠 Reasoning +13.3%
🔢 Math +8.4%
📚 Knowledge +5.9%
✅ Truthfulness +5.8%
🏆 Winner-takes-all 100% emissions
⚡ Start Mining → GitHub
—
King KL Score
—
Beat This KL
—
τ/day Emissions
—
Alpha Price (USD)
—
Contenders
—
H2H Rounds
👑 Reigning King
The current best distillation. Challengers must beat its KL by >1% (paired t-test, α=0.03) to dethrone it.
👑 Current King
Loading…
—
KL Divergence (lower = better)
🎯 Beat: — KL to dethrone
🔄 Eval In Progress
300 prompts per H2H round
Loading…
0 prompts done / 300
📊 King KL History
KL divergence of each successive king across H2H rounds. Lower is better: the subnet is improving over time.
King KL Trend (recent rounds)
Total H2H Rounds
—
rounds completed
Avg Round Duration
—
min per round
Miners Scored
—
qualified models
Disqualified
—
copies / fraud / arch
Avg Models/Round
—
challengers per eval
Reference Baseline
—
Qwen3.5-4B (undistilled)
βš”οΈ Top Contenders
Challengers from the latest H2H round, sorted by KL divergence. All must score >1% better than king (p < 0.03) to dethrone.
# UID Model KL Score vs King p-value Prompts Status
Loading…
πŸ… Model Benchmarks
Academic benchmark scores for evaluated models vs the undistilled Qwen/Qwen3.5-4B baseline. A good distillation beats the base model.
Model MMLU GSM8K BBH CoT HellaSwag WinoGrande ARC-C TruthfulQA
Loading…
📜 Throne History
Every king change since the subnet launched. The crown has passed through — reigns.
Loading…
βš™οΈ How to Mine SN97
King of the hill β€” best distillation wins 100% of ~191Ο„/day. You need 2Γ— A100 80GB for training. Inference is cheap.
1
Understand the Task
Distill Qwen/Qwen3.5-35B-A3B (teacher) into a student of ≤5.25B params. Eval uses KL divergence on 300 prompts from karpathy/climbmix-400b-shuffle.
  • Architecture: Qwen3_5ForConditionalGeneration
  • Max params: 5.25B (15% of teacher)
  • Vocab size: 248,320 (must match)
  • Format: safetensors only, no quantization
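Before training, it can help to sanity-check a candidate config against these limits. A hypothetical sketch (`check_limits` and the config dict shape are illustrative, not part of the official tooling):

```python
# Hypothetical pre-flight check against the SN97 limits listed above.
MAX_PARAMS = 5_250_000_000          # ≤5.25B, i.e. 15% of the 35B teacher
REQUIRED_VOCAB = 248_320
REQUIRED_ARCH = "Qwen3_5ForConditionalGeneration"

def check_limits(config: dict, n_params: int) -> list[str]:
    """Return a list of violations; an empty list means the limits pass."""
    errors = []
    if REQUIRED_ARCH not in config.get("architectures", []):
        errors.append(f"architecture must be {REQUIRED_ARCH}")
    if config.get("vocab_size") != REQUIRED_VOCAB:
        errors.append(f"vocab_size must be {REQUIRED_VOCAB}")
    if n_params > MAX_PARAMS:
        errors.append(f"{n_params} params exceeds the 5.25B cap")
    return errors

cfg = {"architectures": ["Qwen3_5ForConditionalGeneration"], "vocab_size": 248_320}
print(check_limits(cfg, 5_100_000_000))  # []
```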
2
Train Your Student
Clone the distil repo and follow the training guide. Use the provided scripts to distill from the teacher.
  • 2Γ— A100 80GB recommended for training
  • Must include preprocessor_config.json (copy from Qwen/Qwen3.5-4B)
  • Push to HuggingFace as a public repo

git clone https://github.com/unarbos/distil
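The core of any such training script is a distillation loss against the teacher's logits. A minimal pure-Python sketch of one common formulation, temperature-scaled forward KL (the repo's actual scripts may use a different objective):

```python
import math

def softmax(logits, temp=1.0):
    """Temperature-scaled softmax: higher temp gives softer distributions."""
    m = max(logits)
    exps = [math.exp((x - m) / temp) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temp=2.0):
    """Forward KL(teacher || student) at temperature `temp`, scaled by temp**2
    so gradient magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, temp)
    q = softmax(student_logits, temp)
    return temp ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The loss shrinks as the student's logits approach the teacher's.
teacher = [3.0, 1.0, -2.0]
print(distill_loss(teacher, [0.0, 0.0, 0.0]))   # larger
print(distill_loss(teacher, [2.9, 1.1, -1.8]))  # near zero
```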
3
Validate Before Submit
Run the pre-submission validator against your HF repo. It runs the same checks the subnet validator applies.
  • No .py files in repo
  • No quantization
  • Vocab size must be 248,320
  • No weight copies (DQ'd immediately)

python check_model.py --model-repo your/model
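To see what such a validator looks for, the file-level rules above can be sketched like this (a hypothetical re-implementation run on a local snapshot of your repo; the official `check_model.py` is authoritative and also covers quantization and weight-copy detection):

```python
from pathlib import Path

def scan_repo(repo_dir: str) -> list[str]:
    """Hypothetical file-level checks mirroring some of the rules above."""
    root = Path(repo_dir)
    problems = []
    if any(root.rglob("*.py")):
        problems.append("repo must not contain .py files")
    if not any(root.glob("*.safetensors")):
        problems.append("weights must be in safetensors format")
    if not (root / "preprocessor_config.json").exists():
        problems.append("missing preprocessor_config.json")
    return problems
```

Run it on a local clone of your HuggingFace repo before pushing; an empty list means those checks pass.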
4
Register & Commit On-Chain
Register on subnet 97, then commit your HuggingFace model repo on-chain via btcli.
  • Register: btcli s register --netuid 97
  • Commit: btcli s commit --netuid 97 --model-repo your/model
  • Wait for validator to pick up and eval your model
  • Monitor at distil.arbos.life