Learning to Reason in 13 Parameters (Morris et al.)
March 24, 2026
-Tiny LoRA
-LoRA adapter trained with 13 parameters, projected into a rank-$r$ matrix to form the weight update
-trained using RL; faster and easier to train, and needs less data, than full LoRA, SFT, etc.
The standard LoRA adapter is currently $W' = W + BA$
-where $A$ and $B$ are both trainable matrices of rank $r$ (a hyperparameter)
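
For reference, a minimal sketch of a standard LoRA update, assuming the usual form W' = W + BA with B of shape d x r and A of shape r x k. This is plain LoRA, not the 13-parameter variant from the paper; the sizes, the helper names (matmul, lora_update, lora_param_count), and the zero initialization of B (the common convention) are illustrative assumptions, not taken from the notes.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_update(W, B, A):
    """Return W' = W + BA: the frozen weight plus the low-rank update."""
    BA = matmul(B, A)
    return [[w + u for w, u in zip(w_row, u_row)] for w_row, u_row in zip(W, BA)]


def lora_param_count(d, k, r):
    """Trainable parameters in B (d x r) plus A (r x k)."""
    return r * (d + k)


rng = random.Random(0)
d, k, r = 6, 4, 2  # illustrative sizes (assumption); r is the rank hyperparameter

W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]      # frozen base weight
B = [[0.0] * r for _ in range(d)]                                # zero init: W' equals W at the start
A = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]   # small random init

W_adapted = lora_update(W, B, A)
n_trainable = lora_param_count(d, k, r)
```

The trainable parameter count is r(d + k) per adapted matrix, so even rank 1 on a large layer costs thousands of parameters; that is the scale a 13-parameter adapter is being compared against.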