R Scripts and Markdown for model selection
The model relates the duration of amphibian species in the fossil record to certain ecological and morphological traits. To find a suitable model, I try several approaches, including linear models (LM), generalized additive models (GAM) and Random Forest.
The data is a matrix containing species names and several variables. All species are extinct, and the data were collected from the literature, the Paleobiology Database and fosFARbase (the database of Vertebrates: fossil Fishes, Amphibians, Reptiles, Birds). The dataset has incomplete cases as well as some highly skewed variables. Because of these properties, I compare several modelling techniques rather than relying on a single one.
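As a rough illustration of the kind of workflow described above, the following minimal sketch simulates a small trait table, counts incomplete cases, log-transforms a skewed variable, and compares a linear model with a GAM by AIC. Everything in it is an assumption made for illustration: the column names (`body_size`, `aquatic`, `duration`), the simulated values, and the choice of AIC as the comparison criterion are hypothetical and are not taken from the project's data or scripts. Random Forest is left out here only because it needs an additional package.

```r
# Illustrative sketch only: simulated data with hypothetical column names,
# not the project's dataset or its actual analysis.
library(mgcv)  # provides gam(); ships with standard R installations

set.seed(1)
n <- 100
traits <- data.frame(
  species   = paste0("sp_", seq_len(n)),
  body_size = rlnorm(n, meanlog = 3, sdlog = 1),  # right-skewed trait
  aquatic   = factor(sample(c("yes", "no"), n, replace = TRUE))
)

# Simulated duration depends on log body size plus noise
traits$duration <- exp(0.5 + 0.3 * log(traits$body_size) + rnorm(n, sd = 0.5))

# Introduce missing values in 10 distinct rows to mimic incomplete cases
traits$body_size[sample(n, 10)] <- NA

# Count incomplete cases, then keep only complete rows for model fitting
n_incomplete <- sum(!complete.cases(traits))
complete <- traits[complete.cases(traits), ]

# Log-transform the skewed variables before modelling
complete$log_size     <- log(complete$body_size)
complete$log_duration <- log(complete$duration)

# Linear model and GAM on the same response, so their AIC values are comparable
fit_lm  <- lm(log_duration ~ log_size + aquatic, data = complete)
fit_gam <- gam(log_duration ~ s(log_size) + aquatic, data = complete)

AIC(fit_lm, fit_gam)
```

Dropping incomplete rows is the simplest option and is used here only to keep the sketch short; imputation or methods that tolerate missing values may be more appropriate for real data.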
Disclaimer: This is not the real dataset that will be used for publication, but a test dataset that closely resembles the real data's characteristics.
Modelling is done in R. The code is provided as an R Markdown file. The script, including output and some discussion, can be found on the project webpage.