R-Yin-217/DS4420_ML-DM2_CourseProject

DS4420_ML-DM2_CourseProject

This is the repository for the DS4420 Machine Learning and Data Mining 2 course project at Northeastern University.

  • Data

    • For Sentiment Analysis

      • sentiment_dataset

        This dataset contains text collected from Reddit.

      • sentiment_result

        These are the rankings produced by the sentiment analysis.

      • grab_reddit.py

        This script uses an API to collect posts from Reddit.

    • For Feature Engineering

      • amenities_raw_data

        • PDF data for New York
      • feature_dataset

        • FURMAN, FURMANE_csv

          These are the original data for feature engineering.

        • year_dataset

          This dataset is derived from FURMAN and organized by year.

          • year_2023

            The dataset with Walk Score, Bike Score, and Transit Score.

        • NYC_data.csv, dataset_joined.csv

          The final and pre-final versions of the feature data.

        • boston_features_final.csv, nyc_features_final.csv

          These are the final datasets.

      • generate_dataset.ipynb

        This notebook converts the FURMAN data into the dataset.

  • Sentiment Analysis

    • sentiment_analysis.ipynb

      This notebook runs the sentiment analysis on sentiment_dataset. Here are the tools we used in this file.

  • Feature Engineering

    • Classic_Model.ipynb

      This notebook runs the traditional machine learning models.

    • NN_model.ipynb

      This notebook runs the neural network model.
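
The year_dataset entry in the listing describes data derived from FURMAN and organized by year. As a rough illustration of that kind of split, here is a minimal sketch using only the Python standard library. It is not the code in generate_dataset.ipynb: the function name `split_rows_by_year`, the `year` column name, and the sample rows are all hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_year(rows, year_column="year"):
    """Group dict rows by the value of a (hypothetical) year column."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row[year_column]].append(row)
    return dict(by_year)


# Made-up sample standing in for a real table; not taken from the project data.
sample_csv = io.StringIO(
    "region,year,value\n"
    "A,2022,1\n"
    "B,2022,2\n"
    "A,2023,3\n"
)

groups = split_rows_by_year(csv.DictReader(sample_csv))

# Report one group per year, mirroring the year_2023-style naming in the listing.
for year, rows in sorted(groups.items()):
    print(f"year_{year}: {len(rows)} rows")
```

Each group could then be written to its own file with `csv.DictWriter`, if a per-year layout like the one listed under year_dataset is wanted.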
