Notes on AI Diffusion Models

- Diffusion models work by progressively adding noise to images (e.g., the training data), a process inspired by diffusion in physics.
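
  As a concrete illustration, here is a minimal sketch of a DDPM-style forward noising step. The schedule values, tensor shapes, and function names are my own illustrative assumptions, not something specified in these notes.

  ```python
  import torch

  # Linear beta schedule (illustrative values) and its cumulative alpha products.
  T = 500
  betas = torch.linspace(1e-4, 0.02, T)
  alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

  def add_noise(x0, t, alphas_cumprod):
      """Forward diffusion: blend a clean image batch with Gaussian noise at timestep t."""
      noise = torch.randn_like(x0)                    # noise drawn from N(0, I)
      a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)     # how much of the original signal survives
      x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
      return x_t, noise
  ```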

- A neural network learns to denoise images, effectively reversing the noising process so that new images can be generated from pure noise.
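
  A hedged sketch of the usual noise-prediction training objective, reusing the `add_noise` helper and `alphas_cumprod` schedule from the block above; the interface `model(x_t, t)` is an assumption about how the network is called.

  ```python
  import torch
  import torch.nn.functional as F

  def training_step(model, x0, alphas_cumprod):
      """Noise a clean batch, then train the network to predict the injected noise."""
      t = torch.randint(0, len(alphas_cumprod), (x0.shape[0],), device=x0.device)
      x_t, noise = add_noise(x0, t, alphas_cumprod)   # forward-noising sketch from above
      pred_noise = model(x_t, t)                      # network sees the noisy image and the timestep
      return F.mse_loss(pred_noise, noise)            # denoising = learning to undo the noise
  ```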

- Noise is typically sampled from the normal distribution. Curious whether different domains of creative generation could be tapped into by using different probability distributions.

- It is important to add extra noise at each sampling step; otherwise the output collapses toward a generic reconstruction of the average data (blobs). Essentially a symmetry-breaking type of operation.
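
  A sketch of one DDPM-style reverse step showing where that extra noise enters; the choice of sigma_t = sqrt(beta_t) and the function signature are assumptions for illustration.

  ```python
  import torch

  @torch.no_grad()
  def denoise_step(model, x_t, t, betas, alphas_cumprod):
      """One reverse step: remove the predicted noise, then re-inject a bit of fresh noise."""
      t_batch = torch.full((x_t.shape[0],), t, device=x_t.device)
      pred_noise = model(x_t, t_batch)
      alpha_t = 1.0 - betas[t]
      coef = betas[t] / (1 - alphas_cumprod[t]).sqrt()
      mean = (x_t - coef * pred_noise) / alpha_t.sqrt()
      if t == 0:
          return mean                                 # final step: no extra noise
      z = torch.randn_like(x_t)                       # the symmetry-breaking noise
      return mean + betas[t].sqrt() * z               # without z, samples drift toward blobs
  ```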

- The architecture is typically a UNet (taking in an image and outputting an image of the same size), augmented with embeddings for time and context. These context embeddings are key: they carry text descriptions that can be leveraged to produce novel images from prompts, e.g., "avocado armchair."
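
  A skeletal sketch of a UNet-style module showing one common place where time and context embeddings are injected; the layer sizes and the broadcast-add conditioning are illustrative assumptions, not a specific published architecture.

  ```python
  import torch
  import torch.nn as nn

  class TinyUNet(nn.Module):
      """Skeletal UNet: image in, same-size image out, conditioned on time and context."""
      def __init__(self, channels=3, hidden=64, t_dim=32, ctx_dim=32):
          super().__init__()
          self.down = nn.Conv2d(channels, hidden, 3, stride=2, padding=1)
          self.up = nn.ConvTranspose2d(hidden, channels, 4, stride=2, padding=1)
          self.t_embed = nn.Linear(t_dim, hidden)       # timestep embedding
          self.ctx_embed = nn.Linear(ctx_dim, hidden)   # context (e.g., text) embedding

      def forward(self, x, t_emb, ctx_emb):
          h = torch.relu(self.down(x))
          # Embeddings are broadcast-added to the feature map (one common conditioning style).
          h = h + self.t_embed(t_emb)[:, :, None, None] + self.ctx_embed(ctx_emb)[:, :, None, None]
          return self.up(h)                             # output matches the input's spatial size
  ```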

- The model is really predicting the noise field of an image; that prediction is then subtracted (with appropriate rescaling) from the noisy input to recover an estimate of the clean image the user is interested in.
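
  A sketch of that recovery step, inverting the forward-noising equation from the first block; the clamping range assumes images scaled to [-1, 1], which is an assumption on my part.

  ```python
  import torch

  def predict_clean_image(model, x_t, t, alphas_cumprod):
      """Estimate the clean image implied by the model's noise prediction."""
      pred_noise = model(x_t, t)                        # network outputs the noise field
      a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
      x0_hat = (x_t - (1 - a_bar).sqrt() * pred_noise) / a_bar.sqrt()
      return x0_hat.clamp(-1, 1)                        # assuming images are scaled to [-1, 1]
  ```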


References:

- Check out the short course on Diffusion Models on the DeepLearning.AI learning platform (deeplearning.ai).
