Distilling the Knowledge in a Neural Network (Hinton et al.)
March 9, 2026
"soft distillation"
-distillation on the pre-softmax logits of a model (softened with a temperature), so the student learns the teacher's full output distribution as opposed to just the hard labels
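A rough sketch of what the soft-target loss might look like, in plain Python so it runs without dependencies. The function names, the default temperature of 2.0, and the `alpha` mixing weight are my own illustrative choices, not values from the paper. The T^2 scaling and the weighted mix with the hard-label loss follow the paper's description.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities of softmax(logits / temperature), computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]

def softmax(logits, temperature=1.0):
    """Softened class probabilities; a higher temperature gives a flatter distribution."""
    return [math.exp(lp) for lp in log_softmax(logits, temperature)]

def soft_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Multiplied by temperature**2 so the gradient magnitude stays comparable
    to the hard-label loss when the temperature changes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    cross_entropy = -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))
    return temperature ** 2 * cross_entropy

def combined_loss(student_logits, teacher_logits, label,
                  temperature=2.0, alpha=0.5):
    """Weighted mix of soft-target loss and ordinary hard-label cross-entropy.

    alpha is a hypothetical mixing weight (assumption, not from the notes).
    The hard-label term uses temperature 1.
    """
    soft = soft_distillation_loss(student_logits, teacher_logits, temperature)
    hard = -log_softmax(student_logits)[label]
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]   # made-up logits for a 3-class example
    student = [1.5, 0.5, 0.3]
    print(softmax(teacher, temperature=1.0))
    print(softmax(teacher, temperature=4.0))  # flatter than at temperature 1
    print(combined_loss(student, teacher, label=0))
```

The soft term is smallest when the student's softened distribution matches the teacher's, which is the "learn the distribution, not just the label" idea in the note above.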