A Survey of Instruction-Guided Image and Media Editing in the LLM Era
A collection of academic articles, published methodology, and datasets on the subject of Instruction-Guided Image and Media Editing.
Paper Title | Venue | Year | Focus |
---|---|---|---|
A Survey of Multimodal Composite Editing and Retrieval | arXiv | 2024 | Media Retrieval |
InFoBench: Evaluating Instruction Following Ability in Large Language Models | arXiv | 2024 | Text Editing |
Multimodal Image Synthesis and Editing: The Generative AI Era | TPAMI | 2023 | X-to-Image Generation |
LLM-driven Instruction Following: Progresses and Concerns | EMNLP | 2023 | Text Editing |
Title | Year | Venue | Editing | Method | Code |
---|---|---|---|---|---|
Guiding Instruction-based Image Editing via Multimodal Large Language Models | 2024 | ICLR | Image | Diffusion | [Code] |
HIVE: Harnessing Human Feedback for Instructional Visual Editing | 2024 | CVPR | Image | Diffusion | [Code] |
EffiVED: Efficient Video Editing via Text-instruction Diffusion Models | 2024 | arXiv | Video | Diffusion | -- |
InstructPix2Pix: Learning to Follow Image Editing Instructions | 2023 | CVPR | Image | Diffusion | [Code] |
Disclaimer
Feel free to contact us if you have any questions or exciting news. We also welcome all researchers to contribute to this repository and help advance knowledge in this field.
If you know of other related references, please feel free to create a GitHub issue with the paper information. We will gladly update the repository according to your suggestions. (You can also create pull requests, but it might take some time for us to merge them.)