Cross Industry Standard Process for Data Mining

During the pandemic, online businesses grew, bringing an influx of new revenue and ample opportunities for growth and expansion. Machine learning and data science grew rapidly as well, with the number of projects and personnel doubling. But not all data science (DS) projects meet expectations: according to one study [1, 2], only 15 to 20 percent of projects do. One cause is the lack of a clear standard for developing and maintaining data science projects; without an industry-standard process, every firm follows its own machine learning process flow. The Cross Industry Standard Process for Data Mining (CRISP-DM) is one such standard that can be followed. Variants of CRISP-DM specific to machine learning projects have been suggested and adopted as well. According to a poll by KDnuggets, CRISP-DM is a widely used methodology for implementing data science and analytics projects. Other popular methodologies include the KDD process.

The model has six steps, followed here by a seventh, data-science-specific step for monitoring and maintenance:

  • Business understanding – What’s the Airbnb business model?

  • Data understanding – Is the data complete and reliable?

  • Data preparation – Preprocessing data for EDA and predictive analytics.

  • Modelling – Which modelling techniques should we apply: multivariate regression, deep learning, or tree-based models?

  • Evaluation – What are the business objectives, and have they been sufficiently met? For example: provide the best possible price for the top customers, or match super hosts with super customers.

  • Deployment – Ensure adoption and compliance, and provide support for solution adoption.

  • Monitoring and maintenance – A data-science-specific step: track data drift, evaluate long-term model performance, and maintain model explainability.
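
The data-drift tracking mentioned in the monitoring step can be sketched with a Population Stability Index (PSI), one common way to compare a feature's distribution at training time against what the model sees later. This is only an illustrative sketch, not the project's actual monitoring code: the function name, the equal-width binning, the `eps` smoothing value, and the sample nightly prices are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a current sample.

    Bins are equal-width over the reference range; values that fall
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical nightly prices: training snapshot vs. a later period
# in which every price has risen by 50 percent.
train_prices = [80, 95, 100, 110, 120, 125, 130, 150, 160, 200]
live_prices = [p * 1.5 for p in train_prices]

print(population_stability_index(train_prices, train_prices))  # 0.0 for identical samples
print(population_stability_index(train_prices, live_prices))   # positive when the distribution shifts
```

A PSI of zero means the two binned distributions match; larger values indicate a bigger shift. Where to set the alert threshold is a judgment call that depends on the feature and the business context.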
