- 1-hour in-depth review per paper
Date | Topic | Presenter | Video |
---|---|---|---|
01.25 2024 | Diffusion Model Alignment Using Direct Preference Optimization | 형준하 | Video |
01.18 2024 | ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings | 백유진 | Video |
01.11 2024 | DayDreamer: World Models for Physical Robot Learning | 이병근 | Video |
01.04 2024 | Computer Vision in The Wild | 송준하 | Video |
12.21 2023 | Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting | 최민석 | Slide |
12.07 2023 | DALL-E 3: Improving Image Generation with Better Captions | 황성원 | Video |
~ 2023 | Link |
- 5-minute quick review per paper
Date | Topic | Presenter | Video |
---|---|---|---|
01.25 2024 | Bad Students Make Great Teachers; Rethinking FID: Towards a Better Evaluation Metric for Image Generation; InstantID: Zero-shot Identity-Preserving Generation in Seconds; How Are AI Cover Songs Made? | 박민호, 조영우 | Video |
01.18 2024 | Tokenizer is Key to Visual Generation; Divide and not forget: Ensemble of selectively trained experts in Continual Learning; FITS: Modeling Time Series with 10k Parameters; ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs; Pixart-alpha and Pixart-delta; Generative Models: What do they know? Do they know things?; Instruct-Imagen: Image Generation with Multi-modal Instruction; MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation; Boundary Attention: Learning to Find Faint Boundaries at Any Resolution; TrustLLM: Trustworthiness in Large Language Models; Tuning Language Models by Proxy; Improving Text Embeddings with Large Language Models | 조호준, 윤주열, 박준우, 최승환 | Video |
01.11 2024 | Are Emergent Abilities of Large Language Models a Mirage?; Scaling Data-Constrained Language Models; Direct Preference Optimization: Your Language Model is Secretly a Reward Model; Mixtral of Experts; SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling; LLaMA Pro: Progressive LLaMA with Block Expansion; Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets; FreeU: Free Lunch in Diffusion U-Net; Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation; TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models; ITI-GEN: Inclusive Text-to-Image Generation; Fair Text-to-Image Diffusion via Fair Mapping | 양소영, 정하원, 정채연, 김정호 | Video |
01.04 2024 | Siamese Masked Autoencoders; Learning to Reason and Memorize with Self-Notes; Video Prediction Models as Rewards for Reinforcement Learning; Pixel Aligned Language Models; Gradient-based Parameter Selection for Efficient Fine-Tuning; SegGPT: Segmenting Everything In Context; Gemini vs GPT-4V: A Preliminary Comparison and Combination; Large Language Model Bias Index; GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models; I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Model; DreamTuner: Single Image is Enough for Subject-Driven Generation; StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation | 이승일, 이상현, 황동윤, 정소현 | Video |
12.21 2023 | ERM++: An Improved Baseline for Domain Generalization; DATACOMP: In search of the next generation of multimodal datasets; AI2. Does progress on imagenet transfer to real-world datasets?; Aligning Large Language Models through Synthetic Feedback; Self-Evaluation Improves Selective Generation in Large Language Models; Large Language Models as Optimizers; EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything; SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing; Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision | 이도현, 조영우, 임혜수, 최새미 | Video |
12.14 2023 | Analyzing and Improving the Training Dynamics of Diffusion Models; VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence; Cache Me if You Can: Accelerating Diffusion Models through Block Caching; DreaMoving: A Human Video Generation Framework based on Diffusion Models; Vision Transformers Need Registers; DeepCache: Accelerating Diffusion Models for Free; Kandinsky 3.0 Technical Report; FreeInit: Bridging Initialization Gap in Video Diffusion Models; Alpha-CLIP: A CLIP Model Focusing on Wherever You Want; The mechanistic basis of data dependence and abrupt learning in an in-context classification task; Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction; LRM: Large Reconstruction Model for Single Image to 3D | 최승환, 박민호, 박준우, 김태성 | Video |
12.07 2023 | Towards Accurate Differential Diagnosis with Large Language Models; Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models; Communicative Agents for Software Development; IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models; LCM-LoRA: A Universal Stable-Diffusion Acceleration Module; Adversarial Diffusion Distillation; Training Chain-of-Thought via Latent-Variable Inference; The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning; GAIA: A Benchmark for General AI Assistants; FaceStudio: Put Your Face Everywhere in Seconds; ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation; Describing Differences in Image Sets with Natural Language | 조호준, 윤주열, 김진희 | Video |
~ 2023 | Link |