A comparative Study of Deep Generative Models for Image Generation

  • In recent years, generative models have attracted wide public attention because of the high quality of the images they generate. In short, a generative model learns a distribution from a finite number of samples and can then generate an unlimited number of new samples. This can be applied to image data. In the past, generative models were not able to generate realistic images, but today the results are almost indistinguishable from real images. This work provides a comparative study of three generative models: Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs) and Diffusion Models (DMs). The goal is not to provide a definitive ranking indicating which one of them is the best, but to decide qualitatively, and where possible quantitatively, how well each model performs with respect to a given criterion. The criteria are realism, generalization and diversity, sampling, training difficulty, parameter efficiency, interpolation and inpainting capabilities, semantic editing, and implementation difficulty. After a brief introduction to how each model works internally, the models are compared against each other. The provided images help to show the differences among the models with respect to each criterion. As a short outlook on the results of the comparison, DMs generate the most realistic images. They seem to generalize best and show high variation among the generated images. However, they are based on an iterative process, which makes them the slowest of the three models in terms of sample generation time. GANs and VAEs, on the other hand, generate their samples in a single forward pass. The images generated by GANs are comparable to those of DMs, whereas the images from VAEs are blurry, which makes them less desirable than those of GANs or DMs.
However, both the VAE and the GAN stand out from the DMs with respect to interpolation and semantic editing: they have a latent space, which makes latent space walks possible, and the resulting changes are not as chaotic as in the case of DMs. Furthermore, concept vectors can be found that transform a given image along a given feature while leaving other features and structures mostly unchanged, which is difficult to achieve with DMs.
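The latent space walks and concept vectors mentioned above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names are hypothetical, latent codes are plain lists of floats, and the concept vector is estimated as a difference of mean latent codes, which is only one common choice and an assumption here. The decoder or generator that would turn each latent code into an image is deliberately left out.

```python
from typing import List

Vector = List[float]


def lerp(z_a: Vector, z_b: Vector, t: float) -> Vector:
    """Linear interpolation between two latent codes, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def latent_walk(z_a: Vector, z_b: Vector, steps: int) -> List[Vector]:
    """Evenly spaced latent codes from z_a to z_b (both endpoints included)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def concept_vector(with_feature: List[Vector], without_feature: List[Vector]) -> Vector:
    """Assumed estimator: mean code with the feature minus mean code without it."""
    return [w - wo for w, wo in zip(mean(with_feature), mean(without_feature))]


def apply_concept(z: Vector, direction: Vector, strength: float) -> Vector:
    """Move a latent code along a concept direction by the given strength."""
    return [zi + strength * di for zi, di in zip(z, direction)]


if __name__ == "__main__":
    # Each code in the walk would be passed to a VAE decoder or GAN generator.
    for z in latent_walk([0.0, 2.0], [1.0, 0.0], steps=3):
        print(z)
```

In such a setup, decoding every code returned by latent_walk would give the interpolation sequence, and decoding apply_concept(z, direction, strength) for increasing strength would give a semantic edit along one feature.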

Author:Albert Szeliga
Advisor:Adrian Pigors, Holger Billhardt
Document Type:Master's Thesis
Year of Completion:2023
Publishing Institution:Hochschule Hannover
Granting Institution:Hochschule Hannover, Fakultät IV - Wirtschaft und Informatik
Date of final exam:2023/10/10
Release Date:2023/10/11
Tag:AI; Computer Vision; Diffusion Models; GAN
GND Keyword:Deep learning; Maschinelles Sehen; Generative Adversarial Network
Link to catalogue:1869393139
Institutes:Fakultät IV - Wirtschaft und Informatik
DDC classes:004 Informatik
Licence (German):Creative Commons - CC BY - Namensnennung 4.0 International