Notes on AI Diffusion Models
- Training works by progressively adding noise to images in the training data, a process inspired by diffusion in physics.
- A neural network learns to de-noise images, effectively reversing the diffusion process to generate new images.
- Noise is sampled from the standard normal distribution. Curious whether different domains of creative generation could be tapped into by sampling from other probability distributions.
- Important to add fresh noise at each sampling step; without it, the model collapses toward a generic reconstruction of the average training data (blobs). Essentially a symmetry-breaking operation.
- Uses U-Nets (taking in an image and outputting an image of the same size) augmented with embeddings for the time step and context. These context embeddings are key: they encode text descriptions, which can be leveraged to produce novel images from text prompts, e.g., "avocado armchair."
- The model is really predicting the noise in the image; that predicted noise is then subtracted from the input image to recover an estimate of the clean image the user is after.
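The forward noising described above has a convenient closed form: the image at step t is a weighted blend of the clean image and standard-normal noise. A minimal NumPy sketch, where the schedule values and function names are my own choices (not from any specific library):

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule; abar[t] is the cumulative signal-retention factor."""
    betas = np.linspace(beta_start, beta_end, T)
    abar = np.cumprod(1.0 - betas)
    return betas, abar

def q_sample(x0, t, abar, rng):
    """Forward diffusion: blend the clean image x0 with Gaussian noise at step t."""
    eps = rng.standard_normal(x0.shape)                  # noise ~ N(0, 1)
    xt = np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * eps
    return xt, eps                                       # eps is the network's training target
```

At small t the image is nearly clean (abar close to 1); at large t it is nearly pure noise, which is why sampling can start from the normal distribution.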
References:
- Check out the short course on Diffusion Models on the DLAI Learning Platform (deeplearning.ai).