Diffusion models achieve high-quality image generation but are limited by slow iterative sampling. Distillation methods alleviate this by enabling one- or few-step generation. Flow matching, originally introduced as a distinct framework, has since been shown to be theoretically equivalent to diffusion under Gaussian assumptions, raising the question of whether distillation techniques such as score distillation transfer directly. We provide a simple derivation, based on Bayes' rule and conditional expectations, that unifies Gaussian diffusion and flow matching without relying on ODE/SDE formulations. Building on this view, we extend Score identity Distillation (SiD) to pretrained text-to-image flow-matching models, including SANA, SD3-Medium, SD3.5-Medium/Large, and FLUX.1-dev, all with DiT backbones. Experiments show that, with only modest flow-matching- and DiT-specific adjustments, SiD works out of the box across these models, in both data-free and data-aided settings, without requiring teacher finetuning or architectural modifications. This provides the first systematic evidence that score distillation applies broadly to text-to-image flow-matching models, resolving prior concerns about stability and soundness and unifying acceleration strategies across diffusion- and flow-based generators.
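
As a brief illustrative sketch of the kind of Gaussian-path relation referenced above (the notation $\alpha_t$, $\sigma_t$, and $\nabla_{x_t}\log p_t$ is assumed here, not taken from the paper's derivation), Bayes' rule in the form of Tweedie's formula expresses the conditional expectations through the score, and the flow-matching velocity then follows as an affine function of the score:
\[
% assumed forward path: x_t = \alpha_t x_0 + \sigma_t \epsilon, \ \epsilon \sim \mathcal{N}(0, I)
\mathbb{E}[\epsilon \mid x_t] = -\sigma_t \,\nabla_{x_t}\log p_t(x_t),
\qquad
\mathbb{E}[x_0 \mid x_t] = \frac{x_t + \sigma_t^{2}\,\nabla_{x_t}\log p_t(x_t)}{\alpha_t},
\]
\[
v_t(x_t) = \dot{\alpha}_t\,\mathbb{E}[x_0 \mid x_t] + \dot{\sigma}_t\,\mathbb{E}[\epsilon \mid x_t]
= \frac{\dot{\alpha}_t}{\alpha_t}\,x_t
- \sigma_t\!\left(\dot{\sigma}_t - \frac{\dot{\alpha}_t\,\sigma_t}{\alpha_t}\right)\nabla_{x_t}\log p_t(x_t).
\]
Because the velocity and the score differ only by scale- and shift-factors depending on $(\alpha_t, \sigma_t)$, a score-based distillation objective can, in principle, be applied to a flow-matching teacher by this reparameterization.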
- †The University of Texas at Austin
