diff --git a/dall e b/dall e
new file mode 100644
index 0000000..c72d28a
--- /dev/null
+++ b/dall e
@@ -0,0 +1,124 @@
+# Model Card: DALL·E dVAE
+
+Following [Model Cards for Model Reporting (Mitchell et al.)](https://arxiv.org/abs/1810.03993) and [Lessons from
+Archives (Jo & Gebru)](https://arxiv.org/pdf/1912.10389.pdf), we're providing some information about the discrete
+VAE (dVAE) that was used to train DALL·E.
+
+## Model Details
+
+The dVAE was developed by researchers at OpenAI to reduce the memory footprint of the transformer trained on the
+text-to-image generation task. The details involved in training the dVAE are described in [the paper][dalle_paper]. This
+model card describes the first version of the model, released in February 2021. The model consists of a convolutional
+encoder and decoder whose architectures are described [here](dall_e/encoder.py) and [here](dall_e/decoder.py), respectively.
+For questions or comments about the models or the code release, please file a GitHub issue.
+
+## Model Use
+
+### Intended Use
+
+The model is intended for others to use for training their own generative models.
+
+### Out-of-Scope Use Cases
+
+This model is inappropriate for high-fidelity image processing applications. We also do not recommend its use as a
+general-purpose image compressor.
+
+## Training Data
+
+The model was trained on publicly available text-image pairs collected from the internet. This data consists partly of
+[Conceptual Captions][cc] and a filtered subset of [YFCC100M][yfcc100m]. We used a subset of the filters described in
+[Sharma et al.][cc_paper] to construct this dataset; further details are described in [our paper][dalle_paper]. We will
+not be releasing the dataset.
+
+## Performance and Limitations
+
+The heavy compression from the encoding process results in a noticeable loss of detail in the reconstructed images. This
+renders it inappropriate for applications that require fine-grained details of the image to be preserved.
+
+[dalle_paper]: https://arxiv.org/abs/2102.12092
+[cc]: https://ai.google.com/research/ConceptualCaptions
+[cc_paper]: https://www.aclweb.org/anthology/P18-1238/
+[yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/
+
+# Overview
+
+[[Blog]](https://openai.com/blog/dall-e/) [[Paper]](https://arxiv.org/abs/2102.12092) [[Model Card]](model_card.md) [[Usage]](notebooks/usage.ipynb)
+
+This is the official PyTorch package for the discrete VAE used for DALL·E. The transformer used to generate the images from the text is not part of this code release.
+
+# Installation
+
+Before running [the example notebook](notebooks/usage.ipynb), you will need to install the package using
+
+    pip install DALL-E
+
+Pillow
+blobfile
+mypy
+numpy
+pytest
+requests
+torch
+torchvision
+
+from setuptools import setup
+
+def parse_requirements(filename):
+    lines = (line.strip() for line in open(filename))
+    return [line for line in lines if line and not line.startswith("#")]
+
+setup(name='DALL-E',
+      version='0.1',
+      description='PyTorch package for the discrete VAE used for DALL·E.',
+      url='http://github.com/openai/DALL-E',
+      author='Aditya Ramesh',
+      author_email='aramesh@openai.com',
+      license='BSD',
+      packages=['dall_e'],
+      install_requires=parse_requirements('requirements.txt'),
+      zip_safe=True)
+
+# OS specific
+*.DS_Store
+
+# Python
+/build
+/dist
+__pycache__
+*.ipynb_checkpoints
+*.egg-info
+
+# Vim
+*.vim
+*.swk
+*.swl
+*.swm
+*.swn
+*.swo
+*.swp
+
+Modified MIT License
+
+Software Copyright (c) 2021 OpenAI
+
+We don’t claim ownership of the content you create with the DALL-E discrete VAE, so it is yours to
+do with as you please. We only ask that you use the model responsibly and clearly indicate that it
+was used.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
+associated documentation files (the "Software"), to deal in the Software without restriction,
+including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+The above copyright notice and this permission notice need not be included
+with content created by the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
+OR OTHER DEALINGS IN THE SOFTWARE.