Subnet 97 (Distil) asks a deceptively simple question: can a miner compress a large teacher into a small student without losing what made the teacher good? The answer is yes. Miners compress Qwen3.5-35B into a student of ≤5.25B parameters, and the model with the lowest KL divergence holds the crown and wins 100% of daily emissions.
The king model doesn't just mimic the teacher's token distributions: it beats the undistilled baseline by 13% on reasoning, 8% on math, and 6% on knowledge. This isn't incentive theater. SN97 is a decentralised R&D lab where competition literally produces better open-source AI, and the winning model is free for anyone to use.
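That winner-take-all rule is simple enough to sketch. The snippet below assigns the full emission weight to the single lowest-KL miner; the UIDs, scores, and function name are hypothetical, not SN97's actual validator code:

```python
# Minimal sketch of winner-take-all emissions: lowest mean KL wins everything.
# The scores dict is hypothetical; in SN97 lower KL vs. the teacher is better.

def assign_weights(kl_scores: dict[int, float]) -> dict[int, float]:
    """Give 100% of the emission weight to the miner with the lowest KL."""
    winner = min(kl_scores, key=kl_scores.get)
    return {uid: 1.0 if uid == winner else 0.0 for uid in kl_scores}

print(assign_weights({12: 0.41, 37: 0.38, 54: 0.52}))
# -> {12: 0.0, 37: 1.0, 54: 0.0}
```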
The task: distil Qwen/Qwen3.5-35B-A3B (teacher) into a student of ≤5.25B params.
Eval computes KL divergence against the teacher's token distributions on 300 prompts from karpathy/climbmix-400b-shuffle.
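A rough sketch of that eval is below: average token-level KL(teacher || student) over a list of prompts. It assumes the student shares the teacher's tokenizer and vocabulary; the student repo, helper name, and loading details are placeholders, not the subnet's actual harness:

```python
# Hedged sketch: mean token-level KL(teacher || student) over eval prompts.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "Qwen/Qwen3.5-35B-A3B"  # teacher from the spec
STUDENT = "your/model"            # placeholder student repo

tok = AutoTokenizer.from_pretrained(TEACHER)  # assumes shared tokenizer/vocab
teacher = AutoModelForCausalLM.from_pretrained(TEACHER, torch_dtype="auto").eval()
student = AutoModelForCausalLM.from_pretrained(STUDENT, torch_dtype="auto").eval()

@torch.no_grad()
def mean_kl(prompts: list[str]) -> float:
    total = 0.0
    for text in prompts:  # in SN97 these would be the 300 climbmix prompts
        ids = tok(text, return_tensors="pt").input_ids
        p_log = F.log_softmax(teacher(input_ids=ids).logits, dim=-1)  # teacher
        q_log = F.log_softmax(student(input_ids=ids).logits, dim=-1)  # student
        # KL(P || Q) summed over the vocab, then averaged over tokens.
        kl = F.kl_div(q_log, p_log, log_target=True, reduction="none").sum(-1)
        total += kl.mean().item()
    return total / len(prompts)
```

Lower is better: a student that reproduces the teacher's next-token distributions exactly would score zero.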
```
git clone https://github.com/unarbos/distil
python check_model.py --model-repo your/model   # size gate (see sketch below)
btcli s register --netuid 97
btcli s commit --netuid 97 --model-repo your/model
```
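For orientation, the size gate that check_model.py enforces could look roughly like this; the 5.25e9 cap comes from the spec above, while the function name and loading details are illustrative guesses rather than the repo's actual code:

```python
# Hedged sketch: verify a candidate student stays under the 5.25B-param cap.
from transformers import AutoModelForCausalLM

PARAM_CAP = 5.25e9  # ≤5.25B params, per the SN97 spec

def under_cap(repo: str) -> bool:
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{repo}: {n_params / 1e9:.2f}B params")
    return n_params <= PARAM_CAP

under_cap("your/model")  # placeholder repo from the commands above
```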