Multimodal VAE Comparison Toolkit
This is the official documentation for the Multimodal VAE Comparison Toolkit GitHub repository.
The purpose of the Multimodal VAE Comparison Toolkit is to offer a systematic and unified way to train, evaluate and compare state-of-the-art multimodal variational autoencoders (VAEs). The toolkit can be used with arbitrary datasets and in both unimodal and multimodal settings. By default, we provide implementations of the MVAE (paper), MMVAE (paper), MoPoE (paper) and DMVAE (paper) models, but anyone is welcome to contribute their own implementation.
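As a quick illustration of how these models differ conceptually (this is not code from the toolkit itself), the sketch below contrasts the product-of-experts posterior fusion used by MVAE with the mixture-of-experts sampling used by MMVAE; MoPoE combines both ideas by mixing product-of-experts posteriors over modality subsets. Function names, tensor shapes and the `eps` constant are assumptions made for the example.

```python
# Illustrative sketch (not the toolkit's API): fusing per-modality Gaussian
# posteriors q(z|x_m), given as lists of means and log-variances with shape
# (batch, latent_dim), into a joint posterior.
import torch

def product_of_experts(mus, logvars, eps=1e-8):
    """PoE fusion (MVAE-style), including a standard-normal prior expert."""
    prior_mu = torch.zeros_like(mus[0])
    prior_logvar = torch.zeros_like(logvars[0])
    all_mu = torch.stack([prior_mu] + list(mus))          # (experts, batch, dim)
    all_logvar = torch.stack([prior_logvar] + list(logvars))
    precision = 1.0 / (all_logvar.exp() + eps)            # precision of each expert
    joint_var = 1.0 / precision.sum(dim=0)                # combined variance
    joint_mu = joint_var * (all_mu * precision).sum(dim=0)
    return joint_mu, joint_var.log()

def mixture_of_experts_sample(mus, logvars):
    """MoE fusion (MMVAE-style): sample z from one uniformly chosen expert."""
    k = torch.randint(len(mus), (1,)).item()              # pick a modality at random
    std = (0.5 * logvars[k]).exp()
    return mus[k] + std * torch.randn_like(std)            # reparameterised sample
```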
We also provide a custom synthetic bimodal dataset, called CdSprites+, designed specifically for comparing the joint- and cross-generation capabilities of multimodal VAEs. You can read about the design and use of the dataset in the accompanying paper (link will be added soon). The dataset offers 5 levels of difficulty (based on the number of attributes), making it possible to find the minimal functioning scenario for each model. Moreover, its rigid structure enables automatic qualitative evaluation of the generated samples. For more info, see below.
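The following is only a conceptual sketch of what such a bimodal sample with difficulty levels looks like; it is not the CdSprites+ loader, and the class name, attribute names and tensor shapes are placeholders chosen for illustration.

```python
# Conceptual sketch: a bimodal dataset pairs each image with a text description,
# and the difficulty level controls how many attributes the text describes.
import torch
from torch.utils.data import Dataset

class ToyBimodalDataset(Dataset):
    ATTRIBUTES = ["shape", "size", "color", "position", "background"]  # placeholder names

    def __init__(self, num_samples=100, level=1):
        assert 1 <= level <= 5                      # 5 difficulty levels
        self.level = level
        self.num_samples = num_samples
        self.images = torch.rand(num_samples, 3, 64, 64)   # dummy image modality

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        # The text modality describes only the first `level` attributes.
        caption = " ".join(f"{a}_{idx % 3}" for a in self.ATTRIBUTES[: self.level])
        return self.images[idx], caption
```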
The toolkit is built on the PyTorch Lightning framework.
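For orientation, here is a minimal, hedged sketch of how a VAE plugs into PyTorch Lightning; the toolkit's actual model classes, configs and training entry points may differ, and the network sizes below are arbitrary.

```python
# Minimal Lightning sketch (illustrative only, not the toolkit's model classes).
import torch
import torch.nn as nn
import pytorch_lightning as pl

class ToyVAE(pl.LightningModule):
    def __init__(self, in_dim=784, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

    def training_step(self, batch, batch_idx):
        x = batch[0].flatten(1)                               # assume (image, ...) batches
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)  # reparameterisation trick
        recon = self.decoder(z)
        recon_loss = nn.functional.mse_loss(recon, x)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        loss = recon_loss + kl                                # ELBO (up to constants)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Usage sketch:
# trainer = pl.Trainer(max_epochs=10)
# trainer.fit(ToyVAE(), train_dataloaders=some_dataloader)
```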
Note
This page is currently a work in progress.
Sub-Modules:
Paper Results
Tutorials