Neural Networks course - PWR Winter Semester 2023

1. Introduction:

The human voice holds a vast amount of information beyond its role in communication. Through subtle nuances and patterns, valuable insights into an individual's health can be uncovered. Our project aims to utilize the power of neural networks to unlock the untapped diagnostic potential within vocal data.

2. Scientific objective:

The primary scientific objective of our project is to develop and optimize a neural network-based system for the accurate classification of various pathologies through the analysis of vocal patterns. This involves the following key components:

Dataset Preprocessing:

Gather a diverse and comprehensive dataset encompassing a range of pathologies.
Implement preprocessing techniques to extract relevant features from the vocal data, and evaluate the utility of spectrograms as an input representation.
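
The spectrogram step mentioned above can be sketched as a log-magnitude short-time Fourier transform. This is a minimal illustration with NumPy, not the project's actual pipeline: the function name `log_spectrogram`, the frame length, the hop size, and the synthetic 1 kHz tone sampled at 16 kHz are all assumptions made for the example.

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Return a (n_frames, frame_len // 2 + 1) log-magnitude spectrogram in dB."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # one-sided spectrum per frame
    return 20.0 * np.log10(magnitude + eps)  # eps avoids log(0)

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = log_spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 512  # bin index -> frequency in Hz
```

The resulting 2-D array can be treated like a single-channel image, which is what makes it a natural input for a CNN.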

Neural Network Architecture Design:

Explore and experiment with different neural network architectures, such as convolutional neural networks (CNNs), to identify the most effective model for pathology classification based on voice analysis.
Optimize hyperparameters, including learning rates, layer configurations, and activation functions, to enhance the model's accuracy and generalization capabilities.
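
One simple way to organize the hyperparameter search described above is an exhaustive grid. The sketch below only enumerates candidate configurations; the specific values and the names `SEARCH_SPACE` and `iter_configs` are illustrative assumptions, and training and scoring each configuration is left out.

```python
from itertools import product

# Hypothetical search space: the values are placeholders, not the project's settings.
SEARCH_SPACE = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "conv_layers": [2, 3],
    "activation": ["relu", "leaky_relu"],
}

def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

configs = list(iter_configs(SEARCH_SPACE))
```

Each element of `configs` is a plain dictionary, so it can be passed to whatever model-building function the project defines.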

Training and Validation:

Train the neural network on the prepared dataset, employing rigorous cross-validation techniques to ensure robust model performance.
Implement transfer learning strategies, if applicable, to leverage pre-trained models.
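
The cross-validation step above could be implemented as a stratified k-fold split, so that every fold keeps roughly the same class proportions. The following is a standard-library sketch under stated assumptions: the function name `stratified_kfold`, the number of folds, and the toy label list are hypothetical and do not come from the project code.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with per-class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the shuffled indices of each class round-robin into k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, val_idx

# Toy labels for illustration only (not the real dataset).
labels = ["healthy"] * 20 + ["laryngitis"] * 10
splits = list(stratified_kfold(labels, k=5))
```

Every sample appears in exactly one validation fold, so averaging the per-fold scores uses each recording once for evaluation.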

Summary

3. Saarbrücken Dataset Description

The Saarbrücken dataset is a curated collection designed for the analysis and classification of various vocal pathologies [Barry, W.J., Pützer, M.: Saarbrücken Voice Database, Institute of Phonetics, Univ. of Saarland, http://www.stimmdatenbank.coli.uni-saarland.de/].

Pathological Categories and Distribution:

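Dysphonie (dysphonia): 101 samples
Funktionelle Dysphonie (functional dysphonia): 112 samples
Hyperfunktionelle Dysphonie (hyperfunctional dysphonia): 212 samples
Laryngitis: 140 samples
Rekurrensparese (recurrent laryngeal nerve paralysis): 213 samples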

For this study, we have chosen the five most common pathologies.

Healthy Samples:

In addition to the pathological recordings, the dataset includes 657 samples from healthy individuals.
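
The counts listed in this section show a clear class imbalance: the healthy class is several times larger than any single pathology. One common way to account for this during training is inverse-frequency class weighting. The source does not state whether the project uses it, so the sketch below is only an illustration; the name `class_weights` and the key `Healthy` are assumptions.

```python
# Sample counts as listed in this README; "Healthy" is a label chosen for this example.
CLASS_COUNTS = {
    "Dysphonie": 101,
    "Funktionelle Dysphonie": 112,
    "Hyperfunktionelle Dysphonie": 212,
    "Laryngitis": 140,
    "Rekurrensparese": 213,
    "Healthy": 657,
}

def class_weights(counts):
    """Weight each class by total / (n_classes * count).

    With these weights, weight * count is the same for every class,
    so rare classes contribute as much to the loss as common ones.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * count) for name, count in counts.items()}

weights = class_weights(CLASS_COUNTS)
```

Weights of this form can typically be passed to a weighted loss function so that errors on the smaller classes are not drowned out by the healthy class.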

Subdivision by Speech Elements:

Vowels: The dataset includes recordings of the sustained vowels /a/, /i/, and /e/, enabling a detailed examination of vowel-specific characteristics.
Utterance: A set of recordings captures the phrase "Guten Morgen, wie geht es Ihnen?" (Good morning, how are you?), offering insight into how different pathologies affect the pronunciation of a common phrase.

4. Visual Representation

Visualizing the intricate patterns and relationships within vocal data is crucial for understanding the effectiveness of our neural network-based pathology classification system. The following image provides a snapshot of the spectrogram analysis.
