This is a project for RPI's Extended Cognition course. Our goal is to create a language model that can, in some meaningful way, produce content that might appear in a Cognitive Science department class at RPI. We will train the model on slides and content from actual classes, as well as on papers and research in relevant fields.
- Learn basic Machine Learning concepts
- Learn how to implement them in PyTorch
- Build a database for input
- Build a transformer model and train it on data
- Test trained model
- Fine-tune and tweak
- (Optional) Make the model publicly available
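As a rough illustration of the "build a database for input" step, the sketch below turns raw text into integer token IDs and splits them into training and validation sets. It assumes character-level tokenization, which is only one possible choice and not necessarily the one this project uses. The `corpus` string and the names `stoi`, `itos`, `encode`, and `decode` are hypothetical placeholders.

```python
# Hypothetical stand-in for the course material; the real corpus would be
# loaded from the collected slides and papers.
corpus = "extended cognition extends the mind"

# Vocabulary: every distinct character in the corpus, in sorted order.
chars = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer ID
itos = {i: ch for ch, i in stoi.items()}       # integer ID -> character


def encode(text):
    """Map a string to a list of integer IDs (assumes all chars are known)."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map a list of integer IDs back to a string."""
    return "".join(itos[i] for i in ids)


# Hold out the last 10% of the token stream for validation.
data = encode(corpus)
split = int(0.9 * len(data))
train_ids, val_ids = data[:split], data[split:]
```

A character-level vocabulary keeps the example small. A subword tokenizer would likely be a better fit for a larger corpus, at the cost of an extra dependency.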
We will be using a form of the transformer model, the architecture used by mainstream generative large language models (LLMs).
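The core operation inside a transformer is self-attention. The sketch below shows causal scaled dot-product attention in plain Python, using only the standard library so it runs without PyTorch installed. It is a simplified illustration under stated assumptions, not this project's implementation: a real model would use tensors and learned query/key/value projections, whereas here the same toy vectors are passed in for all three. The function names and input numbers are made up for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention with a causal mask.

    queries, keys, values: lists of equal-length vectors, one per token.
    Token i may only attend to tokens 0..i.
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Scores against the current and earlier tokens only (causal mask).
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: 3 tokens with 2-dimensional vectors (made-up numbers).
# Assumption: x is used directly as queries, keys, and values; a real
# transformer would first apply learned linear projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_self_attention(x, x, x)
```

Because of the causal mask, the first token can only attend to itself, so its output equals its own value vector. Later tokens receive a weighted mix of everything up to and including their own position.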
Setting up CUDA and Anaconda:
Machine Learning Guide Podcast
A really excellent resource that takes you from the very start of machine learning. Also linked are the compiled, curated resources mentioned in the podcast.
https://open.spotify.com/show/5M9yZpSyF1jc7uFp2MlhP9
Training a CNN:
https://pyimagesearch.com/2021/07/19/pytorch-training-your-first-convolutional-neural-network-cnn/
What is a Transformer:
https://www.youtube.com/watch?v=4Bdc55j80l8
Unnecessary, but interesting use of transformers in computer vision: https://www.youtube.com/watch?v=TrdevFK_am4
Building an LM from scratch:
https://www.youtube.com/watch?v=kCc8FmEb1nY
Reinforcement Learning
https://towardsdatascience.com/reinforcement-learning-101-e24b50e1d292
Fun Videos: https://www.youtube.com/@aiwarehouse/videos
The AI Dilemma
A real "must watch", with many valuable and realistic perspectives on the state and future of AI: https://www.youtube.com/watch?v=xoVJKj8lcNQ
PyTorch Example Code
https://colab.research.google.com/drive/1g_WnSYFj1BYPd6CMUODxdPcvlTNp01DJ?usp=sharing
Presentation slides for Extended Cognition
Transformer overview: https://docs.google.com/presentation/d/12W713sYhEkLEY6iv2_k5WEVPbe0LPerbw0JRTdNASUg/edit?usp=sharing