This project was completed as the capstone for the HarvardX Professional Certificate in Data Science. The goal is to classify headlines as click-bait or not click-bait. Three approaches were tried:

  • classification based on linguistic features, such as headline length or the presence of exclamation marks
  • logistic regression with count-based string vectorization
  • logistic regression with tf-idf vectorization
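
To make the difference between the three feature representations concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not the project's actual code: the toy headlines, the function names, the `starts_with_digit` feature, and the plain `log(N / df)` idf formula are all assumptions (libraries typically use smoothed idf variants). It covers feature extraction only; the resulting vectors are what a logistic regression would be trained on.

```python
import math
from collections import Counter

# Hypothetical toy headlines; not taken from the project's data set.
headlines = [
    "You won't believe what happened next!",
    "Parliament approves annual budget",
    "10 tricks you need to try!",
]

def linguistic_features(headline):
    """Hand-crafted features such as length and exclamation marks."""
    return {
        "n_words": len(headline.split()),
        "has_exclamation": "!" in headline,
        "starts_with_digit": headline[:1].isdigit(),  # assumed extra feature
    }

def tokenize(headline):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (word.strip(".,!?\"'") for word in headline.lower().split())
    return [t for t in tokens if t]

def count_vectors(docs):
    """Bag-of-words: one raw term count per vocabulary word, per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    vectors = [[Counter(doc)[term] for term in vocab] for doc in tokenized]
    return vocab, vectors

def tfidf_vectors(docs):
    """Weight each count by idf = log(N / df) (assumed unsmoothed variant)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]
```

With count vectors, a word like "you" contributes the same weight wherever it appears; with tf-idf, words that occur in many headlines are down-weighted relative to rarer ones, so the two logistic regression models see the same vocabulary but differently scaled inputs.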