The purpose of this study is to determine whether a machine learning model can be produced that correctly distinguishes a poisonous mushroom from an edible one. The model must achieve an accuracy score of over 90% to be deemed successful. The approach involves six stages.
First, the dataset is imported and read. The dataset is then visualised. Next, the dataset is split into training and testing sets. A Gaussian Naive Bayes algorithm is then used to build a model. An accuracy score is then produced. Finally, the accuracy score is visualised using a confusion matrix and a heat map. The results show that the model is successful, as its accuracy score exceeds 90%.
The model correctly distinguishes a poisonous mushroom from an edible one 92.24% of the time.
The word mushroom is used to describe a variety of fungi, including those that have a stem and those that do not. The mushrooms discussed in this paper belong to the Agaricus and Lepiota family.
These mushrooms have a wide range of characteristics, which allows mycologists to categorise them. The majority of mushrooms are poisonous for humans to consume. Because there are over 64,000 types of mushrooms in the Ascomycota phylum, it is extremely difficult for the public to distinguish which ones are poisonous and which are edible.
In 2011, the Health Protection Agency's National Poisons Information Service received over 209 calls from NHS staff trying to treat mushroom poisoning; 147 of these concerned adults seeking medical attention after eating mushrooms that they had picked on walks. The U.K.’s most commonly eaten poisonous mushroom is the yellow stainer, because it is easily confused with common edible varieties that look very similar.
This study aims to determine whether a mushroom can be correctly classified as poisonous or edible when given enough data on its physical attributes. This is done by creating a database of known poisonous and edible mushrooms, each described by 22 different physical attributes. The algorithm then predicts the likelihood of a mushroom being poisonous or edible from the given attributes.
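
The classification stages described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it is a from-scratch Gaussian Naive Bayes written with the Python standard library only, and it runs on a small invented toy dataset with three attributes rather than the 22 in the real data. The function names (`encode_columns`, `split_train_test`, `fit_gaussian_nb`, `predict`), the integer encoding of categorical values, the 25% test fraction, and the variance floor are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def encode_columns(rows):
    """Replace each categorical value with an integer code, column by column."""
    codebooks = [{} for _ in rows[0]]
    return [[codebooks[j].setdefault(value, len(codebooks[j]))
             for j, value in enumerate(row)] for row in rows]


def split_train_test(X, y, test_fraction=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction for testing (assumed 25%)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def fit_gaussian_nb(X, y, var_floor=1e-3):
    """Estimate a log prior plus per-attribute mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The floor keeps the variance positive when an attribute is constant.
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one encoded row."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


# Invented toy data: (odour, gill size, cap colour), labelled "e" or "p".
colours = ["brown", "white", "grey"]
rows, y = [], []
for i in range(20):
    rows.append(("almond", "broad", colours[i % 3]))
    y.append("e")
    rows.append(("foul", "narrow", colours[i % 3]))
    y.append("p")

X = encode_columns(rows)
X_train, y_train, X_test, y_test = split_train_test(X, y)
model = fit_gaussian_nb(X_train, y_train)

# Confusion matrix: rows are actual classes, columns are predicted classes.
labels = ["e", "p"]
matrix = [[0, 0], [0, 0]]
for row, actual in zip(X_test, y_test):
    matrix[labels.index(actual)][labels.index(predict(model, row))] += 1

accuracy = (matrix[0][0] + matrix[1][1]) / len(y_test)
print("confusion matrix:", matrix)
print("accuracy:", accuracy)
```

The toy classes are deliberately easy to separate, so the score here says nothing about real mushrooms. Note also that Gaussian Naive Bayes treats the integer codes as continuous values, so the arbitrary ordering of the codes can affect results on real categorical data.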