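This project explores and implements efficient sequence modeling methods that go beyond the traditional Transformer architecture, focusing on newer architectures such as the state space model (SSM) Mamba and linear attention mechanisms. Built on the PyTorch framework, it designs and implements from scratch a complete pipeline for model training, evaluation, logging, and visualization, and adapts it to the GLUE Benchmark and LRA.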
Updated Jun 11, 2025 - Python
This repository contains my coursework, assignments, and projects from the Deep Learning Specialization by Andrew Ng on Coursera. It includes five courses covering neural networks, improving deep neural networks, structuring machine learning projects, convolutional neural networks, and sequence models.
Discord Chatbot using a simple neural network architecture
Projects completed during the Coursera Deep Learning Specialization by Andrew Ng
A basic project to predict sentences using an RNN
Coursera Deep Learning Specialization Repository
Predict 1D sequences on stock and currency data