In this report I use the dataset from the Kaggle challenge Forest Cover Type Prediction to predict forest cover type (the predominant kind of tree cover) from strictly cartographic variables.
These independent variables were derived from data obtained from the US Geological Survey (USGS) and the US Forest Service (USFS).
The report covers six models: LDA (Linear Discriminant Analysis), Naïve Bayes, kNN (k-Nearest Neighbors), Decision Trees, Random Forest, and Boosted Trees.
The report can also serve as a step-by-step tutorial on feature selection, cross-validation, and building models with the main machine learning algorithms, and as a starting point for Kaggle challenges.
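
To give a rough idea of what cross-validating one of these models involves, here is a minimal sketch in Python using only the standard library. It is not the code used in this report: the function names `knn_predict` and `k_fold_accuracy` are hypothetical, the fold assignment (row index modulo the number of folds) is a simplifying assumption, and the toy data are a stand-in for the Kaggle dataset.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training rows by Euclidean distance to x and keep the k closest.
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds=5, k=3):
    """Mean kNN accuracy over n_folds folds; fold f holds out rows with index % n_folds == f."""
    scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        # Score the held-out rows using a model that never saw them.
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / n_folds


# Toy stand-in data (hypothetical, NOT the Kaggle dataset): two well-separated clusters.
X = [(float(i), float(i % 3)) for i in range(10)] + [(100.0 + i, float(i % 3)) for i in range(10)]
y = ["low"] * 10 + ["high"] * 10

accuracy = k_fold_accuracy(X, y, n_folds=5, k=3)
print(accuracy)
```

The same loop works for any of the six models by swapping `knn_predict` for another classifier's fit-and-predict step; on real data the rows would normally be shuffled before being assigned to folds.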