Skip to content

Data Analysis and Sequential modeling with BERT to predict scores given students logs in AI assisted tutor

Notifications You must be signed in to change notification settings

pagand/AITutor_SeqModeling

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Overview

This project provides a structured approach to understand and predict user scores based purely on their interaction patterns, employing advanced machine learning techniques to achieve meaningful insights.

Data Analysis

The data analysis is conducted on the log file. The log file contains valuable information about the user's actions and interactions, which are used to predict their scores.

Feature Engineering

Feature engineering is performed to transform raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy.

Sequence Modeling

Sequence modeling involves using the BERT model to predict the sequence of actions. The BERT model is a transformer-based machine learning technique for natural language processing pre-training.

In this project, we use BERT to generate special tokens that represent the sequence of actions. These tokens are then used to predict the user's score.

Data

The data used in this project is stored in the data directory. It includes various files such as data.xlsx, group_a_raw.csv, Groupa_scores.xlsx, Groupb_scores.xlsx, integration_log_group_a.xlsx, and integration_log_group_b.xlsx. (raw Data including scores and integration log are not available for public, please contact to get more information. )

Note:You can add syn_ to the beginning of the files to have synthesised data instead of actual data.

Model Training and Evaluation

The model is trained on the sequences and scores data, and its performance is evaluated on the test set.

Weights and Biases Integration

We use Weights and Biases for experiment tracking, model optimization, and dataset versioning. It helps us to keep track of our experiments, visualize our results, and share our findings with others.

Part 1: Analysis of User Score Prediction

This project leverages actual user interaction logs within an educational environment to predict scores. By capturing the sequence and nature of user actions without direct knowledge of the answers chosen or their correctness, the model aims to infer user scores based on behavioral patterns.

Refer to Part1 for more information.

Part 2: Sequential Modeling with BERT

This project aims to predict user scores based on sequences of their interaction logs with a quiz system, without knowledge of the correct answers or the choices made by the user. The primary challenge is to infer the score from patterns in user behavior during the quiz.

Refer to Part2 for more information.

About

Data Analysis and Sequential modeling with BERT to predict scores given students logs in AI assisted tutor

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published