GitHub - DmitryRyumin/INTERSPEECH-2023-24-Papers: INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!


INTERSPEECH 2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2024 conference. Explore the latest advances in speech and language processing. Code included. ⭐ the repository to support the advancement of speech technology!

Tip

The PDF version of the INTERSPEECH 2024 Conference Programme comprises a list of all accepted full papers, their presentation order, and the designated presentation times.

Other collections of the best AI conferences

Important

The conference table is kept up to date.

Conference	2023	2024
Computer Vision (CV)
CVPR
ICCV
ECCV
WACV	➖
FG	➖
Speech/Signal Processing (SP/SigProc)
ICASSP
INTERSPEECH
ISMIR		➖
Natural Language Processing (NLP)
EMNLP
Machine Learning (ML)
AAAI	➖
ICLR	➖
ICML	➖
NeurIPS	➖

Contributors

Note

Contributions to improve the completeness of this list are greatly appreciated. If you come across any overlooked papers, please feel free to create pull requests, open issues or contact me via email. Your participation is crucial to making this repository even better.

Papers-2024 (`In progress`)

Section	Papers
L2 Speech, Bilingualism and Code-Switching
Speaker Diarization
Speech and Audio Analysis and Representations
Acoustic Event Detection, Segmentation and Classification
Detection and Classification of Bioacoustic Signals

Papers-2023

Section	Papers
Resources for Spoken Language Processing
Speech Synthesis: Prosody and Emotion
Statistical Machine Translation
Self-Supervised Learning in ASR
Prosody
Speech Production
Dysarthric Speech Assessment
Speech Coding: Transmission
Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation
Analysis of Speech and Audio Signals
Speech Recognition: Architecture, Search, and Linguistic Components
Speech Recognition: Technologies and Systems for New Applications
Lexical and Language Modeling for ASR
Language Identification and Diarization
Speech Quality Assessment
Feature Modeling for ASR
Interfacing Speech Technology and Phonetics
Speech Synthesis: Multilinguality
Speech Emotion Recognition
Spoken Dialog Systems and Conversational Analysis
Speech Coding and Enhancement
Paralinguistics
Speech Enhancement and Denoising
Speech Synthesis: Evaluation
End-to-End Spoken Dialog Systems
Biosignal-enabled Spoken Communication
Neural-based Speech and Acoustic Analysis
DiGo - Dialog for Good: Speech and Language Technology for Social Good
Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources, and Evaluation
Speech, Voice, and Hearing Disorders
Spoken Term Detection and Voice Search
Models for Streaming ASR
Source Separation
Speech Perception
Phonetics and Phonology: Languages and Varieties
Speaker and Language Identification
Speech Synthesis and Voice Conversion
Speech and Language in Health: from Remote Monitoring to Medical Conversations
Novel Transformer Models for ASR
Speaker Recognition
Cross-lingual and Multilingual ASR
Voice Conversion
Pathological Speech Analysis
Multimodal Speech Emotion Recognition
Phonetics, Phonology, and Prosody
Speech Coding: Privacy
Analysis of Neural Speech Representations
End-to-end ASR
Spoken Language Understanding, Summarization, and Information Retrieval
Invariant and Robust Pre-trained Acoustic Models
Speech Synthesis: Representation Learning
Speech Perception, Production, and Acquisition
Acoustic Model Adaptation for ASR
Speech Synthesis: Expressivity
Multi-modal Systems
Question Answering from Speech
Multi-talker Methods in Speech Processing
Sociophonetics
Speaker and Language Diarization
Anti-Spoofing for Speaker Verification
Speech Coding: Intelligibility
New Computational Strategies for ASR Training and Inference
MERLIon CCS Challenge: Multilingual Everyday Recordings - Language Identification On Code-Switched Child-Directed Speech
Health-Related Speech Analysis
Automatic Audio Classification and Audio Captioning
Speech Synthesis
Speech Synthesis: Controllability and Adaptation
Search Methods and Decoding Algorithms for ASR
Speech Signal Analysis
Connecting Speech-science and Speech-technology for Children's Speech
Dialog Management
Speech Activity Detection and Modeling
Multilingual Models for ASR
Speech Enhancement and Bandwidth Expansion
Articulation
Neural Processing of Speech and Language: Encoding and Decoding the Diverse Auditory Brain
Perception of Paralinguistics
Technologies for Child Speech Processing
Speech Synthesis: Multilinguality; Evaluation
Show and Tell: Health Applications and Emotion Recognition
Show and Tell: Speech Tools, Speech Enhancement, Speech Synthesis
Show and Tell: Language Learning and Educational Resources
Show and Tell: Media and Commercial Applications

Key Terms

To be added soon

