This repository contains the three group final projects and the weekly exercises prepared during the course FYS-STK4155 - Applied Data Analysis and Machine Learning, taught in the fall semester of the 2023/24 academic year at the University of Oslo (UiO).
Authors of projects and exercises: Alicja K. Terelak, Giorgio Chiro, Eyyüb Güven
The information below is taken from the description on the course's official GitHub page.
I made some small edits to the original content myself (noted in comments by A. K. Terelak).
Course teachers and contact information:
- Karl Henrik Fredly, k.h.fredly@fys.uio.no
- Daniel Haas Becattini Lima, d.h.b.lima@fys.uio.no
- Morten Hjorth-Jensen, mhjensen@uio.no
- Adam Jakobsen, adam.jakobsen@fys.uio.no
- Fahimeh Najafi, fahimeh.najafi@fys.uio.no
- Ida Torkjellsdatter Storehaug, i.t.storehaug@fys.uio.no
- Mia-Katrin Ose Kvalsund, m.k.o.kvalsund@fys.uio.no
The course focuses on two main themes:
- Statistical analysis and optimization of data
- Machine learning algorithms and Deep Learning
The following topics are normally covered:
- Basic concepts, expectation values, variance, covariance, correlation functions and errors;
- Simpler models: the binomial distribution, the Poisson distribution, and simple and multivariate normal distributions;
- Central elements of Bayesian statistics and modeling;
- Gradient methods for data optimization;
- Monte Carlo methods, Markov chains, Gibbs sampling and Metropolis-Hastings sampling;
- Estimation of errors and resampling techniques such as cross-validation, blocking, bootstrapping and jackknife methods;
- Principal Component Analysis (PCA) and its mathematical foundation.
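As a small taste of the resampling topic listed above, here is a minimal bootstrap sketch in NumPy. The data are synthetic (drawn here for illustration, not taken from the course material): we estimate the standard error of the sample mean by resampling with replacement and compare it with the familiar analytical estimate s / sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(2023)

# Synthetic sample (illustrative only): 200 draws from N(5, 2^2)
sample = rng.normal(loc=5.0, scale=2.0, size=200)

# Bootstrap: resample with replacement, record the mean of each resample
n_boot = 1000
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(n_boot)
])

# The spread of the bootstrap means estimates the standard error of the mean;
# it should land close to the analytical estimate s / sqrt(n)
boot_se = boot_means.std(ddof=1)
analytical_se = sample.std(ddof=1) / np.sqrt(sample.size)
print(f"bootstrap SE:  {boot_se:.3f}")
print(f"analytical SE: {analytical_se:.3f}")
```

The same resampling loop works for any statistic (median, variance, regression coefficients), which is what makes the bootstrap such a general error-estimation tool.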
The following topics are typically covered:
- Linear regression and logistic regression;
- Neural networks and deep learning, including convolutional and recurrent neural networks;
- Decision trees, random forests, bagging and boosting;
- Support vector machines;
- Bayesian linear and logistic regression;
- Boltzmann machines and generative models;
- Unsupervised learning: dimensionality reduction, PCA, k-means and clustering;
- Autoencoders;
- Generative algorithms.
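To illustrate the first item on the list, a minimal ordinary least squares fit in plain NumPy (synthetic data chosen here for illustration): we generate noisy points from a known line and recover its coefficients with the closed-form OLS solution.

```python
import numpy as np

rng = np.random.default_rng(42)

# Noisy data from a known linear model y = 2x + 1 (illustrative)
x = rng.uniform(0.0, 1.0, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix with an intercept column; OLS: beta = (X^T X)^+ X^T y
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.pinv(X.T @ X) @ X.T @ y

print(f"intercept = {beta[0]:.2f}, slope = {beta[1]:.2f}")
```

Using the pseudo-inverse (`np.linalg.pinv`) rather than a plain inverse keeps the computation stable even when the design matrix is nearly singular, a point that recurs throughout the course's regression material.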
Hands-on demonstrations, exercises and projects aim at deepening your understanding of these topics.
Computational aspects play a central role, and you are expected to work on numerical examples and projects that illustrate the theory and the various algorithms discussed during the lectures.
The lecture notes are collected as a Jupyter Book.
Recommended textbooks:
- Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, https://www.springer.com/gp/book/9780387310732.
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press; the individual chapters are available for free at https://www.deeplearningbook.org/.
- Kevin Murphy, Probabilistic Machine Learning, an Introduction, https://probml.github.io/pml-book/book1.html
Additional textbooks:
- Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning, Springer, https://www.springer.com/gp/book/9780387848570.
- Aurelien Geron, Hands‑On Machine Learning with Scikit‑Learn and TensorFlow, O'Reilly, https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/.
General books on statistical analysis:
- Christian Robert and George Casella, Monte Carlo Statistical Methods, Springer.
- Peter Hoff, A First Course in Bayesian Statistical Methods, Springer.
General Machine Learning Books:
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press
- David J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press
- David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press