This project was built as a learning project at UIT. It focuses on:
- Crawl data: Scrape motorbike listings from classic-trader.com using Requests + BeautifulSoup (Python).
- Clean data: Fill in missing values and standardize data formats.
- Normalize and scale the data with a suitable method.
- Visualize data: Charts are drawn with matplotlib and combined into a dashboard using Python scripts in Power BI. We designed several charts (histogram, pie chart, bubble chart, boxplot, ...) for analysis.
- Create data model and train: After selecting the variables that affect the dependent variable using the Pearson correlation coefficient and ANOVA, we built our model with the following pipeline:
  - Use 90% of the dataset as the training set.
  - Scale numeric features with StandardScaler()
  - Encode categorical features with OneHotEncoder()
  - Apply PolynomialFeatures(degree = 3) followed by Ridge regression (alpha = 85)
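
The pipeline steps above could be wired together roughly as in the sketch below. It is not the project's actual code. The synthetic data, the column names (`year`, `power_hp`, `make`, `price`), the random seed, and the choice to apply the polynomial expansion to all preprocessed features (rather than only the numeric ones) are assumptions made for illustration. Only the 90/10 split, `StandardScaler()`, `OneHotEncoder()`, `PolynomialFeatures(degree=3)` and `Ridge(alpha=85)` come from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Synthetic stand-in for the crawled dataset; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "year": rng.integers(1950, 2020, n),
    "power_hp": rng.uniform(10, 150, n),
    "make": rng.choice(["Honda", "BMW", "Ducati"], n),
})
df["price"] = 50 * df["power_hp"] + 20 * (df["year"] - 1950) + rng.normal(0, 100, n)

numeric_cols = ["year", "power_hp"]
categorical_cols = ["make"]
X = df[numeric_cols + categorical_cols]
y = df["price"]

# 90% of the rows go to the training set, 10% are held out.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# Scale numeric columns and one-hot encode categorical columns.
# sparse_threshold=0 forces a dense array for the polynomial step.
preprocess = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

# Assumption: the degree-3 expansion is applied to every preprocessed feature.
model = Pipeline([
    ("preprocess", preprocess),
    ("poly", PolynomialFeatures(degree=3)),
    ("ridge", Ridge(alpha=85)),
])

model.fit(X_train, y_train)
r2_test = model.score(X_test, y_test)  # R^2 on the held-out 10%
print(f"R^2 on held-out data: {r2_test:.3f}")
```

Keeping the scaler and encoder inside the `Pipeline` means they are fitted on the training rows only, so no statistics from the held-out rows leak into training.
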
- Test the model, adjust it, and evaluate: