A major problem faced by businesses in asset-heavy industries such as manufacturing is the significant cost of production delays caused by mechanical problems. Most of these businesses want to predict such problems in advance so that they can prevent them before they occur, reducing the costly impact of downtime.
This example brings together common data elements observed across many predictive maintenance use cases; the data itself is generated by simulation.
The business problem in this example is to predict problems caused by component failures, so that the question "What is the probability that a machine will fail in the near future due to a failure of a certain component?" can be answered.
The purpose of the project is to build a classification model using the 'Predictive maintenance' dataset, which consists of 10 000 data points stored as rows with 8 features in columns. The classifier must predict the target variable, which takes the value '0' if the machine has no failure and therefore needs no maintenance, and '1' if, on the contrary, some kind of damage has been detected in the machine and it needs maintenance.
The columns of the dataset are:
UID: unique identifier ranging from 1 to 10000
productID: consisting of a letter L, M, or H for low (50% of all products), medium (30%), and high (20%) as product quality variants and a variant-specific serial number
air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K
process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K.
rotational speed [rpm]: calculated from a power of 2860 W, overlaid with normally distributed noise
torque [Nm]: torque values are normally distributed around 40 Nm with σ = 10 Nm and no negative values.
tool wear [min]: the quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process.
machine failure: label that indicates whether the machine has failed at this particular data point for any of the failure modes.
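The generation rules listed above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the procedure used to create the actual dataset: the function names, the dictionary keys, the unit-variance step of the random walk, the resampling of non-positive torque values, and the absence of tool replacement are all choices made here for illustration. Rotational speed is left out because the text does not give the formula that links it to the 2860 W power figure.

```python
import random
import statistics

# Minutes of tool wear added per process, by quality variant (from the text).
WEAR_PER_PROCESS = {"H": 5, "M": 3, "L": 2}


def normalized_random_walk(n, target_std, rng):
    """Random walk rescaled to zero mean and the requested standard deviation.

    Assumes n >= 2 so that the walk has a non-zero spread.
    """
    walk, position = [], 0.0
    for _ in range(n):
        position += rng.gauss(0.0, 1.0)  # assumed step distribution
        walk.append(position)
    mean = statistics.fmean(walk)
    std = statistics.pstdev(walk)
    return [(x - mean) / std * target_std for x in walk]


def simulate(n=1000, seed=0):
    """Return n simulated rows as dictionaries (hypothetical key names)."""
    rng = random.Random(seed)
    # Air temperature: random walk with a standard deviation of 2 K around 300 K.
    air = [300.0 + x for x in normalized_random_walk(n, 2.0, rng)]
    # Process temperature: random walk with a standard deviation of 1 K,
    # added to the air temperature plus 10 K.
    offset = normalized_random_walk(n, 1.0, rng)
    process = [a + 10.0 + o for a, o in zip(air, offset)]

    rows, wear = [], 0
    for i in range(n):
        # Quality variants: L 50%, M 30%, H 20%.
        quality = rng.choices(["L", "M", "H"], weights=[50, 30, 20])[0]
        # Torque: normal around 40 Nm with sigma 10 Nm; resampling is one
        # (assumed) way to enforce "no negative values".
        torque = rng.gauss(40.0, 10.0)
        while torque <= 0:
            torque = rng.gauss(40.0, 10.0)
        # Simplification: wear accumulates and the tool is never replaced.
        wear += WEAR_PER_PROCESS[quality]
        rows.append({
            "quality": quality,
            "air_temperature_K": air[i],
            "process_temperature_K": process[i],
            "torque_Nm": torque,
            "tool_wear_min": wear,
        })
    return rows


rows = simulate(n=1000, seed=0)
```

Normalizing the walk after it has been generated means the sample standard deviation matches the stated value for any seed, at the cost of each value depending on the whole series.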
Targets:
Target: Failure or Not
Failure Type: Type of Failure
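Since the dataset carries two targets, a first preparation step is to separate them from the features. The sketch below shows one way to do this with the standard library. The column headers, the helper name split_features_and_targets, and the two sample rows (including the failure-type labels) are hypothetical placeholders, not values taken from the real file. Failure Type is kept out of the features because it already implies the value of Target, and the quality letter is extracted from productID as described in the column list.

```python
import csv
import io

# Hypothetical column names; adjust them to the headers of the actual file.
ID_COLUMNS = ("UID", "productID")
TARGET_COLUMNS = ("Target", "Failure Type")


def split_features_and_targets(rows):
    """Separate feature columns from identifier and target columns."""
    features, targets = [], []
    for row in rows:
        targets.append({name: row[name] for name in TARGET_COLUMNS})
        feature_row = {
            name: value
            for name, value in row.items()
            if name not in ID_COLUMNS and name not in TARGET_COLUMNS
        }
        # The first character of productID is the quality variant (L, M or H).
        feature_row["quality"] = row["productID"][0]
        features.append(feature_row)
    return features, targets


# Two made-up rows, used only to exercise the function; this is not real data.
SAMPLE_CSV = """UID,productID,air temperature [K],torque [Nm],Target,Failure Type
1,M00001,298.1,42.8,0,placeholder no-failure label
2,L00002,298.9,65.7,1,placeholder failure label
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features, targets = split_features_and_targets(rows)

# Binary target for the classifier: 0 = no failure, 1 = failure.
binary_target = [int(t["Target"]) for t in targets]
```

A classifier trained on features and binary_target addresses the "failure or not" question, while the Failure Type column can serve as the label for a separate multi-class model.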