To keep track of the growing number of papers at the intersection of Transformers and Neural Architecture Search (NAS), we have curated this awesome list of papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:
- General Transformer search
- Domain-specific, applied Transformer search (divided into NLP, Vision, and ASR)
- Transformers Knowledge: Insights / Searchable parameters / Attention
- Transformer Surveys
- Foundation Models
- Misc Resources
This repository is maintained by Yash Mehta. Please feel free to reach out, create a pull request, or open an issue to add papers. See this Google Doc for a comprehensive list of ICML 2023 papers on foundation models / large language models.
### Vision

Title | Venue | Group |
---|---|---|
𝛼NAS: Neural Architecture Search using Property Guided Synthesis | ACM Programming Languages'22 | MIT, Google |
NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training | ICLR'22 | Meta Reality Labs |
AutoFormer: Searching Transformers for Visual Recognition | ICCV'21 | MSR |
GLiT: Neural Architecture Search for Global and Local Image Transformer | ICCV'21 | University of Sydney |
Searching for Efficient Multi-Stage Vision Transformers | ICCV'21 workshop | MIT |
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers | CVPR'21 | Bytedance Inc. |
### NLP

Title | Venue | Group |
---|---|---|
AutoBERT-Zero: Evolving the BERT backbone from scratch | AAAI'22 | Huawei Noah’s Ark Lab |
Primer: Searching for Efficient Transformers for Language Modeling | NeurIPS'21 | Google Research |
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models | ACL'21 | Tsinghua, Huawei Noah's Ark Lab |
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search | KDD'21 | MSR, Tsinghua University |
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | ACL'20 | MIT |
### ASR

Title | Venue | Group |
---|---|---|
SFA: Searching faster architectures for end-to-end automatic speech recognition models | Computer Speech and Language'23 | Chinese Academy of Sciences |
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | ICASSP'21 | MSR |
Efficient Gradient-Based Neural Architecture Search For End-to-End ASR | ICMI-MLMI'21 | NPU, Xi'an |
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition | INTERSPEECH'20 | VUNO Inc. |
### Transformer Surveys

Title | Venue | Group |
---|---|---|
Transformers in Vision: A Survey | ACM Computing Surveys'22 | MBZ University of AI |
A Survey of Vision Transformers | TPAMI'22 | CAS |
Efficient Transformers: A Survey | ACM Computing Surveys'22 | Google Research |
Neural Architecture Search for Transformers: A Survey | IEEE Xplore [Sep'22] | Iowa State University |
### Foundation Models

Title | Venue | Group |
---|---|---|
Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models | arxiv'23 | Amazon Alexa AI |