Intro to automated machine learning directly from SQL using IntegratedML feature in InterSystems IRIS. You can find more information in the documentation
Have a look at the slides in Workshop-IntegratedML-Intro.pdf
Run the container we will use as our IRIS instance:
docker-compose up -d
After that, you should be able to access to Management Portal.
We are going to use a dataset about health risks for pregnant patients. Have a look at the details here: https://www.kaggle.com/csafrit2/maternal-health-risk-data
We have already included in the repository a data file.
We need to split the data into training and test datasets. Prepare data simply split the data and generate a train and test CSV files. Have a look at the code using Embedded Python.
Run the method in a WebTerminal session.
do ##class(Workshop.Util).PrepareData()
Create train table:
CREATE TABLE Workshop_Data.MaternalTrain (
Age INT,
SystolicBP INT,
DiastolicBP INT,
BS DOUBLE,
BodyTemp DOUBLE,
HeartRate INT,
RiskLevel VARCHAR(255)
)
Load train data:
LOAD DATA FROM FILE '/app/data/train.csv'
INTO Workshop_Data.MaternalTrain (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
VALUES (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
USING {"from":{"file":{"charset": "UTF-8"}}}
Display train data:
SELECT * FROM Workshop_Data.MaternalTrain
Create test table:
CREATE TABLE Workshop_Data.MaternalTest (
Age INT,
SystolicBP INT,
DiastolicBP INT,
BS DOUBLE,
BodyTemp DOUBLE,
HeartRate INT,
RiskLevel VARCHAR(255)
)
Load test data:
LOAD DATA FROM FILE '/app/data/test.csv'
INTO Workshop_Data.MaternalTest (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
VALUES (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
USING {"from":{"file":{"charset": "UTF-8"}}}
Display test data:
SELECT * FROM Workshop_Data.MaternalTest
IntegratedML feature works directly in SQL. You can use the SQL Explorer in the Management Portal or use an external JDBC tool like DBeaver.
Create a model to predict RiskLevel
:
CREATE MODEL MaternalModel PREDICTING (RiskLevel) FROM Workshop_Data.MaternalTrain
You can now train the model using the training data
TRAIN MODEL MaternalModel
Evaluate the performance of the predictions of your model
VALIDATE MODEL MaternalModel FROM Workshop_Data.MaternalTest
SELECT * FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS
Finally you can run predictions on RiskLevel. Here you can compare the predictions Vs. real data:
SELECT *, PREDICT(MaternalModel) AS PredictedRisk,PROBABILITY(MaternalModel FOR 'mid risk') As Probability
FROM Workshop_Data.MaternalTest
SELECT *, PREDICT(MaternalModel) AS PredictedRisk, PROBABILITY(MaternalModel FOR 'mid risk') As Probability
FROM Workshop_Data.MaternalTest
WHERE PROBABILITY(MaternalModel FOR 'mid risk') > 0.7