diffusion (probabilistic model) - parameterized Markov chain trained using variational inference to produce samples matching the data after finite time
-transitions are learned to reverse a diffusion process
-Markov chain gradually adds noise to the data, and the model denoises it
forward process (diffusion) - add Gaussian noise to shrink signal at each step
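$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)$$
-$\beta_t$ - noise schedule parameter for time step $t$, $\alpha_t = 1-\beta_t$ - amount of signal
-this gives the closed form
$$x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon \quad \text{for } \bar\alpha_t = \prod_{s=1}^{t}\alpha_s,\ \epsilon \sim \mathcal{N}(0, I)$$
-a good schedule:
-make signal-to-noise ratio decrease smoothly: $\frac{\bar\alpha_t}{1-\bar\alpha_t}$
-ensures each time step is equally informative to learn
-evaluated using loss per time step
-gradient norms across $t$ to verify
reverse process (generative) - produce a cleaner output from a noisy input (invert forward process)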
-this is approximated, not analytical
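$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$
-$\hat\epsilon_\theta(x_t, t)$ - given the noisy image $x_t$ after $t$ steps, what noise was added to the original image to produce it
-plug this into the forward noising equation at step $t$ to get $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\hat\epsilon_\theta(x_t, t)$
-then, the original image is
$$x_0 = \frac{1}{\sqrt{\bar\alpha_t}}\left(x_t - \sqrt{1-\bar\alpha_t}\,\hat\epsilon_\theta(x_t, t)\right)$$
-since the $x_0$ estimate is noisier for larger $t$, we compute the step change from $t \to t-1$ using the $x_0$ estimate
$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}}\,\hat\epsilon_\theta(x_t, t)\right)$$
-$\mu_\theta(x_t, t)$ is the most likely (average) image at the previous time step
-the previous time step $x_{t-1}$ is sampled around $\mu_\theta(x_t, t)$
$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z, \quad z \sim \mathcal{N}(0, I)$$
-$\sigma_t^2 = \beta_t$ - noise schedule controls variance
-the noise prediction neural network is trained on the error in predicted noise instead of the negative log likelihood between image distributions
$$L(\theta) = \left\lVert \epsilon - \hat\epsilon_\theta(x_t, t) \right\rVert^2$$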
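The equations in these notes can be sketched in plain Python as a sanity check. This is a minimal illustration, not a reference implementation. The function names, the linear schedule, and its endpoints (1e-4 to 0.02) are assumptions that do not come from the notes. The noise-prediction network is left out, so the predicted noise `eps_hat` is passed in as an argument, and images are flat lists of floats.

```python
import math
import random


def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the notes do not name a specific one."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of alpha_s, with alpha_s = 1 - beta_s."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def snr(alpha_bar_t):
    """Signal-to-noise ratio alpha_bar_t / (1 - alpha_bar_t)."""
    return alpha_bar_t / (1.0 - alpha_bar_t)


def q_sample(x0, alpha_bar_t, eps):
    """Closed-form forward noising: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def predict_x0(xt, alpha_bar_t, eps_hat):
    """Invert the forward equation using the predicted noise eps_hat."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [(x - b * e) / a for x, e in zip(xt, eps_hat)]


def posterior_mean(xt, beta_t, alpha_bar_t, eps_hat):
    """mu_theta = (x_t - (1 - alpha_t) / sqrt(1 - alpha_bar_t) * eps_hat) / sqrt(alpha_t)."""
    alpha_t = 1.0 - beta_t
    coef = (1.0 - alpha_t) / math.sqrt(1.0 - alpha_bar_t)
    return [(x - coef * e) / math.sqrt(alpha_t) for x, e in zip(xt, eps_hat)]


def reverse_step(xt, beta_t, alpha_bar_t, eps_hat, rng, add_noise=True):
    """Sample x_{t-1} = mu_theta + sigma_t * z with sigma_t^2 = beta_t."""
    mean = posterior_mean(xt, beta_t, alpha_bar_t, eps_hat)
    if not add_noise:  # assumed convention: skip the noise on the final step
        return mean
    sigma_t = math.sqrt(beta_t)
    return [m + sigma_t * rng.gauss(0.0, 1.0) for m in mean]


def noise_loss(eps, eps_hat):
    """Squared error between the true noise and the predicted noise."""
    return sum((e - eh) ** 2 for e, eh in zip(eps, eps_hat))
```

A possible usage, with the true noise standing in for a trained predictor: build `betas = linear_beta_schedule(1000)` and `alpha_bars = cumulative_alphas(betas)`, draw `eps` with `random.Random(0).gauss`, noise a vector with `q_sample(x0, alpha_bars[t], eps)`, and then call `reverse_step(xt, betas[t], alpha_bars[t], eps, rng)`. Passing the true `eps` to `predict_x0` recovers `x0` up to floating-point error, which is a quick way to check that the forward and inverse equations agree.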