[TOC]

## Knowledge Distillation of Diffusion Models

Title | Venue | Note |
---|---|---|
A Comprehensive Survey on Knowledge Distillation of Diffusion Models | 2023 | Weijian Luo. [pdf] |
Knowledge distillation in iterative generative models for improved sampling speed | 2021 | Eric Luhman, Troy Luhman. [pdf] |
Progressive Distillation for Fast Sampling of Diffusion Models | ICLR 2022 | Tim Salimans and Jonathan Ho. [pdf] |
On Distillation of Guided Diffusion Models | CVPR 2023 | Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans. [pdf] |
TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation | 2023 | David Berthelot, Arnaud Autef, Jierui Lin, Dian Ang Yap, Shuangfei Zhai, Siyuan Hu, Daniel Zheng, Walter Talbott, Eric Gu. [pdf] |
BK-SDM: Architecturally Compressed Stable Diffusion for Efficient Text-to-Image Generation | ICML 2023 | Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi. [pdf] |
On Architectural Compression of Text-to-Image Diffusion Models | 2023 | Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi. [pdf] |
Knowledge Diffusion for Distillation | 2023 | Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu. [pdf] |
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds | 2023 | Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, Jian Ren. [pdf] |
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping | 2023 | Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Josh Susskind. [pdf] |
Consistency Models | 2023 | Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. [pdf] |
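
For orientation, here is a minimal PyTorch sketch of the core trick in progressive distillation (Salimans & Ho, above): the student is trained so that one of its DDIM steps matches two DDIM steps of the frozen teacher. The x-prediction parameterization, the cosine `alpha_sigma` schedule, and the stand-in networks are illustrative assumptions, and the paper's SNR-based loss weighting is omitted; this is a sketch, not the authors' implementation.

```python
import torch

def alpha_sigma(t):
    """Illustrative cosine schedule: z_t = alpha_t * x + sigma_t * eps, t in [0, 1]."""
    return torch.cos(0.5 * torch.pi * t), torch.sin(0.5 * torch.pi * t)

def ddim_step(x_pred, z_t, t, s):
    """Deterministic DDIM update from time t to s, given an x-prediction."""
    a_t, s_t = alpha_sigma(t)
    a_s, s_s = alpha_sigma(s)
    return a_s * x_pred + (s_s / s_t) * (z_t - a_t * x_pred)

def progressive_distillation_loss(student, teacher, x0, delta=1.0 / 64):
    """Student target: reach in ONE DDIM step what the teacher reaches in TWO."""
    b = x0.shape[0]
    t = torch.rand(b, 1, 1, 1) * (1.0 - 2 * delta) + 2 * delta   # keep t - 2*delta >= 0
    a_t, s_t = alpha_sigma(t)
    z_t = a_t * x0 + s_t * torch.randn_like(x0)

    with torch.no_grad():                                        # teacher is frozen
        t1, t2 = t - delta, t - 2 * delta
        z_t1 = ddim_step(teacher(z_t, t), z_t, t, t1)
        z_t2 = ddim_step(teacher(z_t1, t1), z_t1, t1, t2)
        a_t2, s_t2 = alpha_sigma(t2)
        # x-target such that one student DDIM step from z_t lands exactly on z_t2
        x_target = (z_t2 - (s_t2 / s_t) * z_t) / (a_t2 - (s_t2 / s_t) * a_t)

    # plain MSE here; the paper additionally applies an SNR-based loss weighting
    return torch.mean((student(z_t, t) - x_target) ** 2)

# Toy usage with a stand-in x-prediction network shared by teacher and student.
net = lambda z, t: 0.9 * z
print(progressive_distillation_loss(net, net, torch.randn(4, 3, 8, 8)))
```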

## Knowledge Distillation for Semantic Segmentation

Title | Venue | Note |
---|---|---|
Structured knowledge distillation for semantic segmentation | CVPR 2019 | |
Intra-class feature variation distillation for semantic segmentation | ECCV 2020 | |
Channel-wise knowledge distillation for dense prediction | ICCV 2021 | |
Double Similarity Distillation for Semantic Image Segmentation | TIP 2021 | |
Cross-Image Relational Knowledge Distillation for Semantic Segmentation | CVPR 2022 | |
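
For reference, a minimal sketch of the plain per-pixel distillation objective that the structured, intra-class, channel-wise, and cross-image variants above build on: KL divergence between softened teacher and student class distributions at every pixel. The temperature and shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pixelwise_kd_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened class distributions at every pixel.

    student_logits, teacher_logits: (B, C, H, W) segmentation logits.
    """
    s = F.log_softmax(student_logits / T, dim=1)
    t = F.softmax(teacher_logits / T, dim=1)
    # sum over classes, mean over B*H*W pixel positions; T^2 keeps the gradient scale stable
    kl = F.kl_div(s, t, reduction="none").sum(dim=1).mean()
    return (T ** 2) * kl

# Toy usage
student_logits = torch.randn(2, 19, 64, 64, requires_grad=True)
teacher_logits = torch.randn(2, 19, 64, 64)
loss = pixelwise_kd_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```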

## Knowledge Distillation for Object Detection

Title | Venue | Note |
---|---|---|
Mimicking very efficient network for object detection | CVPR 2017 | |
Distilling object detectors with fine-grained feature imitation | CVPR 2019 | |
General instance distillation for object detection | CVPR 2021 | |
Distilling object detectors via decoupled features | CVPR 2021 | |
Distilling object detectors with feature richness | NeurIPS 2021 | |
Focal and global knowledge distillation for detectors | CVPR 2022 | |
Rank Mimicking and Prediction-guided Feature Imitation | AAAI 2022 | |
Prediction-Guided Distillation | ECCV 2022 | |
Masked Distillation with Receptive Tokens | ICLR 2023 | |
Structural Knowledge Distillation for Object Detection | NeurIPS 2022 | OpenReview |
Dual Relation Knowledge Distillation for Object Detection | IJCAI 2023 | |
GLAMD: Global and Local Attention Mask Distillation for Object Detectors | ECCV 2022 | ECVA |
G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation | ICCV 2021 | CVF |
PKD: General Distillation Framework for Object Detectors via Pearson Correlation Coefficient | NeurIPS 2022 | OpenReview |
MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection | ECCV 2020 | ECVA |
LabelEnc: A New Intermediate Supervision Method for Object Detection | ECCV 2020 | ECVA |
HEtero-Assists Distillation for Heterogeneous Object Detectors | ECCV 2022 | HEAD |
LGD: Label-Guided Self-Distillation for Object Detection | AAAI 2022 | LGD |
When Object Detection Meets Knowledge Distillation: A Survey | TPAMI | |
ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector | CVPR 2023 | ScaleKD |
CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection | arXiv:2306.11369 | CrossKD |
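
A recurring idea across the detection entries above is to imitate the teacher's feature maps only in informative regions (e.g., around ground-truth boxes) rather than everywhere. Below is a hedged, generic sketch of such a masked feature-imitation loss; the box-to-mask heuristic, the 1x1 adaptation layer, and all shapes are illustrative and not any single paper's recipe.

```python
import torch
import torch.nn as nn

class MaskedFeatureImitation(nn.Module):
    """MSE between adapted student and teacher features, restricted to a
    foreground mask derived from ground-truth boxes (generic sketch)."""

    def __init__(self, student_ch, teacher_ch):
        super().__init__()
        self.adapt = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)

    def forward(self, f_student, f_teacher, boxes, stride):
        # boxes: list (per image) of (N, 4) tensors in x1, y1, x2, y2 image coordinates
        B, _, H, W = f_teacher.shape
        mask = torch.zeros(B, 1, H, W, device=f_teacher.device)
        for b, bxs in enumerate(boxes):
            for x1, y1, x2, y2 in (bxs / stride).long().tolist():
                mask[b, :, max(y1, 0):min(y2 + 1, H), max(x1, 0):min(x2 + 1, W)] = 1.0
        diff = (self.adapt(f_student) - f_teacher) ** 2
        return (diff * mask).sum() / mask.sum().clamp(min=1.0)

# Toy usage on one FPN level (stride 8); channel counts are illustrative.
imit = MaskedFeatureImitation(student_ch=128, teacher_ch=256)
fs, ft = torch.randn(2, 128, 32, 32), torch.randn(2, 256, 32, 32)
boxes = [torch.tensor([[16.0, 16.0, 120.0, 96.0]]), torch.tensor([[40.0, 8.0, 200.0, 180.0]])]
print(imit(fs, ft, boxes, stride=8).item())
```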

## Knowledge Distillation for Vision Transformers

Title | Venue | Note |
---|---|---|
Training data-efficient image transformers & distillation through attention | ICML 2021 | |
Co-advise: Cross inductive bias distillation | CVPR 2022 | |
TinyViT: Fast pretraining distillation for small vision transformers | arXiv:2207.10666 | |
Attention Probe: Vision Transformer Distillation in the Wild | ICASSP 2022 | |
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers | CVPR 2022 | |
Efficient vision transformers via fine-grained manifold distillation | NeurIPS 2022 | |
Cross-Architecture Knowledge Distillation | arXiv:2207.05273 | |
MiniViT: Compressing Vision Transformers with Weight Multiplexing | CVPR 2022 | |
ViTKD: Practical Guidelines for ViT feature knowledge distillation | arXiv 2022 | code |
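
As a reference point for the entries above, a minimal sketch of DeiT-style hard distillation through a distillation token: the class-token head is trained on the labels and the distillation-token head on the teacher's hard predictions. The two-head interface and the equal weighting are assumptions of this sketch rather than a faithful reproduction of the DeiT code.

```python
import torch
import torch.nn.functional as F

def deit_hard_distill_loss(cls_logits, dist_logits, teacher_logits, labels):
    """cls_logits / dist_logits: outputs of the student's class and
    distillation heads; teacher_logits: outputs of a frozen teacher."""
    teacher_labels = teacher_logits.argmax(dim=1)          # hard teacher targets
    loss_cls = F.cross_entropy(cls_logits, labels)
    loss_dist = F.cross_entropy(dist_logits, teacher_labels)
    return 0.5 * loss_cls + 0.5 * loss_dist

# Toy usage
B, num_classes = 8, 1000
cls_logits = torch.randn(B, num_classes, requires_grad=True)
dist_logits = torch.randn(B, num_classes, requires_grad=True)
teacher_logits = torch.randn(B, num_classes)
labels = torch.randint(0, num_classes, (B,))
loss = deit_hard_distill_loss(cls_logits, dist_logits, teacher_logits, labels)
loss.backward()
print(loss.item())
```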

## General Knowledge Distillation Methods

Title | Venue | Note |
---|---|---|
Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher | AAAI 2020 | |
Search to Distill: Pearls are Everywhere but not the Eyes | CVPR 2020 | |
Reducing the Teacher-Student Gap via Spherical Knowledge Distillation | arXiv 2020 | |
Knowledge Distillation via the Target-aware Transformer | CVPR 2022 | |
Decoupled Knowledge Distillation | CVPR 2022 | code |
Prune Your Model Before Distill It | ECCV 2022 | code |
Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again | NeurIPS 2022 | |
Weighted Distillation with Unlabeled Examples | NeurIPS 2022 | |
Respecting Transfer Gap in Knowledge Distillation | NeurIPS 2022 | |
Knowledge Distillation from A Stronger Teacher | arXiv:2205.10536 | |
Masked Generative Distillation | ECCV 2022 | code |
Curriculum Temperature for Knowledge Distillation | AAAI 2023 | code |
Knowledge distillation: A good teacher is patient and consistent | CVPR 2022 | |
Knowledge Distillation with the Reused Teacher Classifier | CVPR 2022 | |
Scaffolding a Student to Instill Knowledge | ICLR 2023 | |
Function-Consistent Feature Distillation | ICLR 2023 | |
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation | ICLR 2023 | |
Supervision Complexity and its Role in Knowledge Distillation | ICLR 2023 |

## Knowledge Distillation from Logits

Title | Venue | Note |
---|---|---|
Distilling the knowledge in a neural network | arXiv:1503.02531 | |
Deep Model Compression: Distilling Knowledge from Noisy Teachers | arXiv:1610.09650 | |
Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data | ICLR 2017 | |
Knowledge Adaptation: Teaching to Adapt | arXiv:1702.02052 | |
Learning from Multiple Teacher Networks | KDD 2017 | |
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results | NIPS 2017 | |
Training Deep Neural Networks in Generations: A More Tolerant Teacher Educates Better Students | arXiv:1805.05551 | |
Moonshine: Distilling with Cheap Convolutions | NIPS 2018 | |
Positive-Unlabeled Compression on the Cloud | NIPS 2019 | |
Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework | arXiv:1910.12061 | |
Preparing Lessons: Improve Knowledge Distillation with Better Supervision | arXiv:1911.07471 | |
Adaptive Regularization of Labels | arXiv:1908.05474 | |
Learning Metrics from Teachers: Compact Networks for Image Embedding | CVPR 2019 | |
Diversity with Cooperation: Ensemble Methods for Few-Shot Classification | ICCV 2019 | |
Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher | arXiv:1902.03393 | |
MEAL: Multi-Model Ensemble via Adversarial Learning | AAAI 2019 | |
Revisit Knowledge Distillation: a Teacher-free Framework | CVPR 2020 [code] | |
Ensemble Distribution Distillation | ICLR 2020 | |
Noisy Collaboration in Knowledge Distillation | ICLR 2020 | |
Self-training with Noisy Student improves ImageNet classification | CVPR 2020 | |
QUEST: Quantized embedding space for transferring knowledge | CVPR 2020(pre) | |
Meta Pseudo Labels | ICML 2020 | |
Subclass Distillation | ICML 2020 | |
Boosting Self-Supervised Learning via Knowledge Transfer | CVPR 2018 | |
Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model | CVPR 2020 [code] | |
Regularizing Class-wise Predictions via Self-knowledge Distillation | CVPR 2020 [code] | |
Rethinking Data Augmentation: Self-Supervision and Self-Distillation | ICLR 2020 | |
What it Thinks is Important is Important: Robustness Transfers through Input Gradients | CVPR 2020 | |
Role-Wise Data Augmentation for Knowledge Distillation | ICLR 2020 [code] | |
Distilling Effective Supervision from Severe Label Noise | CVPR 2020 | |
Learning with Noisy Class Labels for Instance Segmentation | ECCV 2020 | |
Self-Distillation Amplifies Regularization in Hilbert Space | arXiv:2002.05715 | |
MINILM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers | arXiv:2002.10957 | |
Hydra: Preserving Ensemble Diversity for Model Distillation | arXiv:2001.04694 | |
Teacher-Class Network: A Neural Network Compression Mechanism | arXiv:2004.03281 | |
Learning from a Lightweight Teacher for Efficient Knowledge Distillation | arXiv:2005.09163 | |
Self-Distillation as Instance-Specific Label Smoothing | arXiv:2006.05065 | |
Self-supervised Knowledge Distillation for Few-shot Learning | arXiv:2006.09785 | |
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation | arXiv:2007.01951 | |
Few Sample Knowledge Distillation for Efficient Network Compression | CVPR 2020 | |
Learning What and Where to Transfer | ICML 2019 | |
Transferring Knowledge across Learning Processes | ICLR 2019 | |
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval | ICCV 2019 | |
Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation | arXiv:1911.05329 | |
Progressive Knowledge Distillation For Generative Modeling | ICLR 2020 | |
Few Shot Network Compression via Cross Distillation | AAAI 2020 |
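
Since the table above starts from the original soft-target formulation, a minimal sketch of that loss may be a useful anchor: temperature-scaled KL to the teacher plus the usual cross-entropy. The temperature and mixing weight shown are illustrative defaults, not values prescribed by any particular paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style distillation: CE to labels + T^2-scaled KL to soft targets."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage
student_logits = torch.randn(16, 100, requires_grad=True)
teacher_logits = torch.randn(16, 100)
labels = torch.randint(0, 100, (16,))
kd_loss(student_logits, teacher_logits, labels).backward()
```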

## Knowledge Distillation from Intermediate Features

Title | Venue | Note |
---|---|---|
Fitnets: Hints for thin deep nets | arXiv:1412.6550 | |
Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer | ICLR 2017 | |
Knowledge Projection for Effective Design of Thinner and Faster Deep Neural Networks | arXiv:1710.09505 | |
A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning | CVPR 2017 | |
Paraphrasing complex network: Network compression via factor transfer | NIPS 2018 | |
Knowledge transfer with jacobian matching | ICML 2018 | |
Like What You Like: Knowledge Distill via Neuron Selectivity Transfer | CVPR 2018 | |
An Embarrassingly Simple Approach for Knowledge Distillation | MLR 2018 | |
Self-supervised knowledge distillation using singular value decomposition | ECCV 2018 | |
Learning Deep Representations with Probabilistic Knowledge Transfer | ECCV 2018 | |
Correlation Congruence for Knowledge Distillation | ICCV 2019 | |
Similarity-Preserving Knowledge Distillation | ICCV 2019 | |
Variational Information Distillation for Knowledge Transfer | CVPR 2019 | |
Knowledge Distillation via Instance Relationship Graph | CVPR 2019 | |
Knowledge Distillation via Route Constrained Optimization | ICCV 2019 | |
Stagewise Knowledge Distillation | arXiv:1911.06786 | |
Distilling Object Detectors with Fine-grained Feature Imitation | CVPR 2019 | |
Knowledge Squeezed Adversarial Network Compression | AAAI 2020 | |
Knowledge Distillation from Internal Representations | AAAI 2020 | |
Knowledge Flow: Improve Upon Your Teachers | ICLR 2019 | |
LIT: Learned Intermediate Representation Training for Model Compression | ICML 2019 | |
A Comprehensive Overhaul of Feature Distillation | ICCV 2019 | |
Residual Knowledge Distillation | arXiv:2002.09168 | |
Knowledge distillation via adaptive instance normalization | arXiv:2003.04289 | |
Channel Distillation: Channel-Wise Attention for Knowledge Distillation | arXiv:2006.01683 | |
Matching Guided Distillation | ECCV 2020 | |
Differentiable Feature Aggregation Search for Knowledge Distillation | ECCV 2020 | |
Local Correlation Consistency for Knowledge Distillation | ECCV 2020 |
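
Most of the feature-based methods above extend the FitNets-style "hint" objective sketched below: a small learned regressor maps a student feature map to the teacher's channel width, and the two are matched with an L2 loss. The 1x1 regressor and the layer choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HintLoss(nn.Module):
    """FitNets-style hint: regress student features onto teacher features."""

    def __init__(self, student_ch, teacher_ch):
        super().__init__()
        self.regressor = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)

    def forward(self, f_student, f_teacher):
        return nn.functional.mse_loss(self.regressor(f_student), f_teacher)

# Toy usage: intermediate feature maps of matching spatial size.
hint = HintLoss(student_ch=64, teacher_ch=256)
f_s, f_t = torch.randn(4, 64, 14, 14), torch.randn(4, 256, 14, 14)
print(hint(f_s, f_t).item())
```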

## Mutual, Online and Self-Distillation

Title | Venue | Note |
---|---|---|
Deep Mutual Learning | CVPR 2018 | |
Born-Again Neural Networks | ICML 2018 | |
Knowledge distillation by on-the-fly native ensemble | NIPS 2018 | |
Collaborative learning for deep neural networks | NIPS 2018 | |
Unifying Heterogeneous Classifiers with Distillation | CVPR 2019 | |
Snapshot Distillation: Teacher-Student Optimization in One Generation | CVPR 2019 | |
Deeply-supervised knowledge synergy | CVPR 2019 | |
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation | ICCV 2019 | |
Distillation-Based Training for Multi-Exit Architectures | ICCV 2019 | |
MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks | arXiv:1911.09418 | |
FEED: Feature-level Ensemble for Knowledge Distillation | AAAI 2020 | |
Stochasticity and Skip Connection Improve Knowledge Transfer | ICLR 2020 | |
Online Knowledge Distillation with Diverse Peers | AAAI 2020 | |
Online Knowledge Distillation via Collaborative Learning | CVPR 2020 | |
Collaborative Learning for Faster StyleGAN Embedding | arXiv:2007.01758 | |
Feature-map-level Online Adversarial Knowledge Distillation | ICML 2020 | |
Knowledge Transfer via Dense Cross-layer Mutual-distillation | ECCV 2020 | |
MetaDistiller: Network Self-boosting via Meta-learned Top-down Distillation | ECCV 2020 | |
ResKD: Residual-Guided Knowledge Distillation | arXiv:2006.04719 | |
Interactive Knowledge Distillation | arXiv:2007.01476 |
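
A minimal two-peer sketch of the deep mutual learning setup that opens the table above: each network minimizes its own cross-entropy plus a KL term toward the other's current (detached) predictions, so the cohort trains without a pretrained teacher. The toy networks, optimizer settings, and alternating update order are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mutual_step(net_a, net_b, opt_a, opt_b, x, y):
    """One deep-mutual-learning update: each peer mimics the other's (detached) output."""
    for net, opt, peer in ((net_a, opt_a, net_b), (net_b, opt_b, net_a)):
        opt.zero_grad()
        logits = net(x)
        with torch.no_grad():
            peer_probs = F.softmax(peer(x), dim=1)
        loss = F.cross_entropy(logits, y) + F.kl_div(
            F.log_softmax(logits, dim=1), peer_probs, reduction="batchmean"
        )
        loss.backward()
        opt.step()

# Toy usage with two small peers.
net_a, net_b = nn.Linear(32, 10), nn.Linear(32, 10)
opt_a = torch.optim.SGD(net_a.parameters(), lr=0.1)
opt_b = torch.optim.SGD(net_b.parameters(), lr=0.1)
x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))
mutual_step(net_a, net_b, opt_a, opt_b, x, y)
```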

## Understanding and Analyzing Knowledge Distillation

Title | Venue | Note |
---|---|---|
Do deep nets really need to be deep? | NIPS 2014 | |
When Does Label Smoothing Help? | NIPS 2019 | |
Towards Understanding Knowledge Distillation | AAAI 2019 | |
Harnessing deep neural networks with logical rules | ACL 2016 | |
Adaptive Regularization of Labels | arXiv:1908.05474 | |
Knowledge Isomorphism between Neural Networks | arXiv:1908 | |
Understanding and Improving Knowledge Distillation | arXiv:2002.03532 | |
The State of Knowledge Distillation for Classification | arXiv:1912.10850 | |
Explaining Knowledge Distillation by Quantifying the Knowledge | CVPR 2020 | |
DeepVID: deep visual interpretation and diagnosis for image classifiers via knowledge distillation | IEEE TVCG 2019 | |
On the Unreasonable Effectiveness of Knowledge Distillation: Analysis in the Kernel Regime | arXiv:2003.13438 | |
Why distillation helps: a statistical perspective | arXiv:2005.10419 | |
Transferring Inductive Biases through Knowledge Distillation | arXiv:2006.00555 | |
Does label smoothing mitigate label noise? | ICML 2020 | Lukasik et al. |
An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation | arXiv:2006.03810 | |
Does Adversarial Transferability Indicate Knowledge Transferability? | arXiv:2006.14512 | |
On the Demystification of Knowledge Distillation: A Residual Network Perspective | arXiv:2006.16589 | |
Teaching To Teach By Structured Dark Knowledge | ICLR 2020 | |
Inter-Region Affinity Distillation for Road Marking Segmentation | CVPR 2020 [code] | |
Heterogeneous Knowledge Distillation using Information Flow Modeling | CVPR 2020 [code] | |
Local Correlation Consistency for Knowledge Distillation | ECCV 2020 | |
Few-Shot Class-Incremental Learning | CVPR 2020 | |
Unifying distillation and privileged information | ICLR 2016 |

## Knowledge Distillation with Pruning, Quantization and NAS

Title | Venue | Note |
---|---|---|
Accelerating Convolutional Neural Networks with Dominant Convolutional Kernel and Knowledge Pre-regression | ECCV 2016 | |
N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning | ICLR 2018 | |
Slimmable Neural Networks | ICLR 2019 | |
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy | NIPS 2018 | |
MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning | ICCV 2019 | |
LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning | ICLR 2020 | |
Pruning with hints: an efficient framework for model acceleration | ICLR 2020 | |
Knapsack Pruning with Inner Distillation | arXiv:2002.08258 | |
Training convolutional neural networks with cheap convolutions and online distillation | arXiv:1909.13063 | |
Cooperative Pruning in Cross-Domain Deep Neural Network Compression | IJCAI 2019 | |
QKD: Quantization-aware Knowledge Distillation | arXiv:1911.12491 | |
Neural Network Pruning with Residual-Connections and Limited-Data | CVPR 2020 | |
Training Quantized Neural Networks with a Full-precision Auxiliary Module | CVPR 2020 | |
Towards Effective Low-bitwidth Convolutional Neural Networks | CVPR 2018 | |
Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations | arXiv:1908.04680 | |
Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation | arXiv:2006.11487 | |
Knowledge Distillation Beyond Model Compression | arXiv:2007.01493 | |
Teacher Guided Architecture Search | ICCV 2019 | |
Distillation Guided Residual Learning for Binary Convolutional Neural Networks | ECCV 2020 | |
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution | ECCV 2020 | |
Improving Neural Architecture Search Image Classifiers via Ensemble Learning | arXiv:1903.06236 | |
Blockwisely Supervised Neural Architecture Search with Knowledge Distillation | arXiv:1911.13053 | |
Towards Oracle Knowledge Distillation with Neural Architecture Search | AAAI 2020 | |
Search for Better Students to Learn Distilled Knowledge | arXiv:2001.11612 | |
Circumventing Outliers of AutoAugment with Knowledge Distillation | arXiv:2003.11342 | |
Network Pruning via Transformable Architecture Search | NIPS 2019 | |
Search to Distill: Pearls are Everywhere but not the Eyes | CVPR 2020 | |
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks | ICML 2020 [code] |
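
Several entries above combine distillation with pruning or quantization; the simplest such combination is sketched below: a student whose weights go through straight-through fake quantization while its logits are distilled from a full-precision teacher. The bit-width, the per-tensor quantizer, and the loss weights are illustrative assumptions, not any specific paper's scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FakeQuantLinear(nn.Linear):
    """Linear layer with symmetric per-tensor weight fake-quantization (STE)."""

    def __init__(self, in_f, out_f, bits=4):
        super().__init__(in_f, out_f)
        self.levels = 2 ** (bits - 1) - 1

    def forward(self, x):
        scale = self.weight.abs().max().clamp(min=1e-8) / self.levels
        q = torch.round(self.weight / scale).clamp(-self.levels, self.levels) * scale
        w = self.weight + (q - self.weight).detach()   # straight-through estimator
        return F.linear(x, w, self.bias)

def qat_kd_loss(student, teacher, x, y, T=4.0, alpha=0.7):
    """Distill a quantized student from a full-precision teacher."""
    s_logits = student(x)
    with torch.no_grad():
        t_logits = teacher(x)
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=1), F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T
    return alpha * soft + (1 - alpha) * F.cross_entropy(s_logits, y)

# Toy usage: low-bit student distilled from a full-precision teacher.
student = FakeQuantLinear(32, 10, bits=4)
teacher = nn.Linear(32, 10)
x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))
qat_kd_loss(student, teacher, x, y).backward()
```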

## Applications of Knowledge Distillation

| Sub | Title | Venue |
|---|---|---|
| Graph | Graph-based Knowledge Distillation by Multi-head Attention Network | arXiv:1907.02226 |
| | Graph Representation Learning via Multi-task Knowledge Distillation | arXiv:1911.05700 |
| | Deep geometric knowledge distillation with graphs | arXiv:1911.03080 |
| | Better and faster: Knowledge transfer from multiple self-supervised learning tasks via graph distillation for video classification | IJCAI 2018 |
| | Distilling Knowledge from Graph Convolutional Networks | CVPR 2020 |
| Face | Face model compression by distilling knowledge from neurons | AAAI 2016 |
| | MarginDistillation: distillation for margin-based softmax | arXiv:2003.02586 |
| ReID | Distilled Person Re-Identification: Towards a More Scalable System | CVPR 2019 |
| | Robust Re-Identification by Multiple Views Knowledge Distillation | ECCV 2020 [code] |
| Detection | Learning efficient object detection models with knowledge distillation | NIPS 2017 |
| | Distilling Object Detectors with Fine-grained Feature Imitation | CVPR 2019 |
| | Relation Distillation Networks for Video Object Detection | ICCV 2019 |
| | Learning Lightweight Face Detector with Knowledge Distillation | IEEE 2019 |
| | Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection | ICCV 2019 |
| | Learning Lightweight Lane Detection CNNs by Self Attention Distillation | ICCV 2019 |
| | A Multi-Task Mean Teacher for Semi-Supervised Shadow Detection | CVPR 2020 [code] |
| | Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer | ECCV 2020 |
| | Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection | IEEE 2020 [code] |
| | Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings | CVPR 2020 |
| | Distilling Knowledge from Refinement in Multiple Instance Detection Networks | arXiv:2004.10943 |
| | Enabling Incremental Knowledge Transfer for Object Detection at the Edge | arXiv:2004.05746 |
| Pose | DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild | ECCV 2020 |
| | Fast Human Pose Estimation | CVPR 2019 |
| | Distill Knowledge From NRSfM for Weakly Supervised 3D Pose Learning | ICCV 2019 |
| Segmentation | ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes | CVPR 2018 |
| | Knowledge Distillation for Incremental Learning in Semantic Segmentation | arXiv:1911.03462 |
| | Geometry-Aware Distillation for Indoor Semantic Segmentation | CVPR 2019 |
| | Structured Knowledge Distillation for Semantic Segmentation | CVPR 2019 |
| | Self-similarity Student for Partial Label Histopathology Image Segmentation | ECCV 2020 |
| | Knowledge Distillation for Brain Tumor Segmentation | arXiv:2002.03688 |
| Low-Vision | Lightweight Image Super-Resolution with Information Multi-distillation Network | ICCVW 2019 |
| | Collaborative Distillation for Ultra-Resolution Universal Style Transfer | CVPR 2020 [code] |
| Video | Efficient Video Classification Using Fewer Frames | CVPR 2019 |
| | Relation Distillation Networks for Video Object Detection | ICCV 2019 |
| | Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection | ICCV 2019 |
| | Progressive Teacher-student Learning for Early Action Prediction | CVPR 2019 |
| | MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | arXiv:1910.12295 |
| | AWSD: Adaptive Weighted Spatiotemporal Distillation for Video Representation | ICCV 2019 |
| | Dynamic Kernel Distillation for Efficient Pose Estimation in Videos | ICCV 2019 |
| | Online Model Distillation for Efficient Video Inference | ICCV 2019 |
| | Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer | ECCV 2020 |
| | Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition | ECCV 2020 |
| | Object Relational Graph with Teacher-Recommended Learning for Video Captioning | CVPR 2020 |
| | Spatio-Temporal Graph for Video Captioning with Knowledge Distillation | CVPR 2020 [code] |
| | TA-Student VQA: Multi-Agents Training by Self-Questioning | CVPR 2020 |

## Data-Free Knowledge Distillation

Title | Venue | Note |
---|---|---|
Data-Free Knowledge Distillation for Deep Neural Networks | NIPS 2017 | |
Zero-Shot Knowledge Distillation in Deep Networks | ICML 2019 | |
DAFL: Data-Free Learning of Student Networks | ICCV 2019 | |
Zero-shot Knowledge Transfer via Adversarial Belief Matching | NIPS 2019 | |
Dream Distillation: A Data-Independent Model Compression Framework | ICML 2019 | |
Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion | CVPR 2020 | |
Data-Free Adversarial Distillation | CVPR 2020 | |
The Knowledge Within: Methods for Data-Free Model Compression | CVPR 2020 | |
Knowledge Extraction with No Observable Data | NIPS 2019 | |
Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN | CVPR 2020 | |
DeGAN : Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier | arXiv:1912.11960 | |
Generative Low-bitwidth Data Free Quantization | arXiv:2003.03603 | |
This dataset does not exist: training models from generated images | arXiv:1911.02888 | |
MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation | arXiv:2005.03161 | |
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data | ECCV 2020 | |
Billion-scale semi-supervised learning for image classification | arXiv:1905.00546 | |
Data-free Parameter Pruning for Deep Neural Networks | arXiv:1507.06149 | |
Data-Free Quantization Through Weight Equalization and Bias Correction | ICCV 2019 | |
DAC: Data-free Automatic Acceleration of Convolutional Networks | WACV 2019 |
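
Most data-free entries above either invert the teacher to synthesize inputs or train a generator adversarially against the student. Below is a hedged sketch of the second recipe (in the spirit of adversarial belief matching / DAFL): the generator maximizes teacher-student disagreement and the student then minimizes it on fresh generated samples. Architectures, the L1 discrepancy, and step counts are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def discrepancy(student, teacher, x):
    """Teacher-student disagreement on synthetic inputs (L1 over probabilities)."""
    return (F.softmax(teacher(x), dim=1) - F.softmax(student(x), dim=1)).abs().mean()

def data_free_round(generator, student, teacher, opt_g, opt_s, z_dim=64, batch=32):
    # 1) generator step: synthesize inputs where the student disagrees with the teacher
    opt_g.zero_grad()
    x = generator(torch.randn(batch, z_dim))
    (-discrepancy(student, teacher, x)).backward()
    opt_g.step()
    # 2) student step: match the teacher on freshly generated inputs
    opt_s.zero_grad()
    with torch.no_grad():
        x = generator(torch.randn(batch, z_dim))
    discrepancy(student, teacher, x).backward()
    opt_s.step()

# Toy usage with stand-in networks (a real setup would use conv nets / images).
generator = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
teacher, student = nn.Linear(32, 10), nn.Linear(32, 10)
teacher.requires_grad_(False)                       # teacher stays frozen
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(3):
    data_free_round(generator, student, teacher, opt_g, opt_s)
```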

## Cross-Modal Knowledge Distillation and Domain Adaptation

Title | Venue | Note |
---|---|---|
SoundNet: Learning Sound Representations from Unlabeled Video | ECCV 2016 | |
Cross Modal Distillation for Supervision Transfer | CVPR 2016 | |
Emotion recognition in speech using cross-modal transfer in the wild | ACM MM 2018 | |
Through-Wall Human Pose Estimation Using Radio Signals | CVPR 2018 | |
Compact Trilinear Interaction for Visual Question Answering | ICCV 2019 | |
Cross-Modal Knowledge Distillation for Action Recognition | ICIP 2019 | |
Learning to Map Nearly Anything | arXiv:1909.06928 | |
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval | ICCV 2019 | |
UM-Adapt: Unsupervised Multi-Task Adaptation Using Adversarial Cross-Task Distillation | ICCV 2019 | |
CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency | CVPR 2019 | |
XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings | | |
Effective Domain Knowledge Transfer with Soft Fine-tuning | arXiv:1909.02236 | |
ASR is all you need: cross-modal distillation for lip reading | arXiv:1911.12747 | |
Knowledge distillation for semi-supervised domain adaptation | arXiv:1908.07355 | |
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition | arXiv:2001.01798 | |
Cluster Alignment with a Teacher for Unsupervised Domain Adaptation | ICCV 2019 | |
Attention Bridging Network for Knowledge Transfer | ICCV 2019 | |
Unpaired Multi-modal Segmentation via Knowledge Distillation | arXiv:2001.03111 | |
Multi-source Distilling Domain Adaptation | arXiv:1911.11554 | |
Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | CVPR 2020 | |
Improving Semantic Segmentation via Self-Training | arXiv:2004.14960 | |
Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation | arXiv:2005.08213 | |
Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation | arXiv:2005.07839 | |
Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge | CVPR 2020 | |
Large-Scale Domain Adaptation via Teacher-Student Learning | arXiv:1708.05466 | |
Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data | IJCAI 2020 | |
Distilling Cross-Task Knowledge via Relationship Matching | CVPR 2020 [code] | |
Modality distillation with multiple stream networks for action recognition | ECCV 2018 | |
Domain Adaptation through Task Distillation | ECCV 2020 |
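
A minimal sketch of the supervision-transfer pattern behind many of the cross-modal entries above: a frozen teacher trained on one modality provides soft targets for a student that sees the paired sample from another modality. The pairing, temperature, and stand-in networks are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cross_modal_kd_loss(student, teacher, x_student_modality, x_teacher_modality, T=2.0):
    """Soft-target transfer across paired modalities (teacher is frozen)."""
    with torch.no_grad():
        t_logits = teacher(x_teacher_modality)
    s_logits = student(x_student_modality)
    return F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

# Toy usage: an "RGB" teacher supervises a "depth" student on paired features.
teacher = nn.Linear(128, 10)   # stand-in for an RGB network
student = nn.Linear(64, 10)    # stand-in for a depth network
rgb, depth = torch.randn(32, 128), torch.randn(32, 64)
cross_modal_kd_loss(student, teacher, depth, rgb).backward()
```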

## Knowledge Distillation with Adversarial Learning and GANs

Title | Venue | Note |
---|---|---|
Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks | arXiv:1709.00513 | |
KTAN: Knowledge Transfer Adversarial Network | arXiv:1810.08126 | |
KDGAN: Knowledge Distillation with Generative Adversarial Networks | NIPS 2018 | |
Adversarial Learning of Portable Student Networks | AAAI 2018 | |
Adversarial Network Compression | ECCV 2018 | |
Cross-Modality Distillation: A case for Conditional Generative Adversarial Networks | ICASSP 2018 | |
Adversarial Distillation for Efficient Recommendation with External Knowledge | TOIS 2018 | |
Training student networks for acceleration with conditional adversarial networks | BMVC 2018 | |
DAFL: Data-Free Learning of Student Networks | ICCV 2019 | |
MEAL: Multi-Model Ensemble via Adversarial Learning | AAAI 2019 | |
Exploiting the Ground-Truth: An Adversarial Imitation Based Knowledge Distillation Approach for Event Detection | AAAI 2019 | |
Adversarially Robust Distillation | AAAI 2020 | |
GAN-Knowledge Distillation for one-stage Object Detection | arXiv:1906.08467 | |
Lifelong GAN: Continual Learning for Conditional Image Generation | arXiv:1908.03884 | |
Compressing GANs using Knowledge Distillation | arXiv:1902.00159 | |
Feature-map-level Online Adversarial Knowledge Distillation | ICML 2020 | |
MineGAN: effective knowledge transfer from GANs to target domains with few images | CVPR 2020 | |
Distilling portable Generative Adversarial Networks for Image Translation | AAAI 2020 | |
GAN Compression: Efficient Architectures for Interactive Conditional GANs | CVPR 2020 |
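
To ground the adversarial line of work above, a generic sketch of the common pattern: a discriminator learns to tell teacher outputs from student outputs, and the student is trained both to match the teacher directly and to fool the discriminator. Architectures, the logit-level discriminator, and loss weights are illustrative, not any single paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def adversarial_kd_step(student, teacher, disc, opt_s, opt_d, x, beta=0.1):
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)

    # 1) discriminator: real = teacher logits, fake = student logits
    opt_d.zero_grad()
    d_loss = F.binary_cross_entropy_with_logits(disc(t_logits), torch.ones(x.size(0), 1)) + \
             F.binary_cross_entropy_with_logits(disc(s_logits.detach()), torch.zeros(x.size(0), 1))
    d_loss.backward()
    opt_d.step()

    # 2) student: match the teacher and fool the discriminator
    opt_s.zero_grad()
    s_loss = F.mse_loss(s_logits, t_logits) + \
             beta * F.binary_cross_entropy_with_logits(disc(s_logits), torch.ones(x.size(0), 1))
    s_loss.backward()
    opt_s.step()

# Toy usage with stand-in networks.
teacher, student = nn.Linear(32, 10), nn.Linear(32, 10)
teacher.requires_grad_(False)
disc = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
adversarial_kd_step(student, teacher, disc, opt_s, opt_d, torch.randn(64, 32))
```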