GitHub - farhadGithub/customer-churn-in-SQL-with-R: This repository provides the files to set up and run a customer churn machine learning platform in SQL server with R services.

CUSTOMER CHURN TEMPLATE on MICROSOFT SQL SERVER R SERVICES

This template demonstrates how to build and deploy a customer churn model using a Microsoft SQL Server R Services. For a full description of the template in Cortana Analytics Suite, visit the project page.

REQUIREMENTS

This template requires the following components:

Microsoft SQL Server R Services: Visit this link for steps to configure a server.
User name and password (set up separately or via Windows account) to access the server.
Data on the profile of users who interact with the business as well as their activities: In this template, we haved provided sample datasets and instructions on how to load these datasets to the server.
Source files: They can be download via this link.

WORKFLOW

The end-to-end workflow to set up the template can be managed via a Windows PowerShell script. The script can be invoked remotely from the PowerShell console by running the following command from the directory where the source files have been downloaded:

MSR-SQL-Churn.ps1

The PowerShell script invokes a number of SQL scripts through the steps described below. Each step can be also skipped if not needed. The PowerShell script is mainly provided as a convenient way for the user to deploy the template.
An experienced user may directly run, modifiy or intergrate the provided SQL scripts in their your system.

STEP 1: DATA PREPARATION

The template requires two datasets as input: a dataset of the user profiles and a dataset of the user activities. As part of this repository, we have provided sample files for these datasets. These files can be downloaded from here and here. The schemas for the datasets are described here.

In this step, the user is first asked to enter the following information:

Server name (or its IP address).
Database name: If the user does not provide a name and simply presses enter when answering this question, a default name is used.
User name and password: If the user does not provide a user name and password, the script assumes a Windows Authnetication connection must be established and employes the Windows account informtion to connect to the server.
URL address for the users and activities files: If the user does not specify a URL, the files are downloaded from a default location.
Churn period and threshold: In order to identify churners and non-churners, the templates needs these two parameters. The churn period specifies the number of days at the end of the activities period for which we observe the user activities to identify churners. Within this period, those users who have activities more than the churn threshold are considered as non-churners and otherwise as churners.

Using the bcp utility, the script then retrieves the users and activities files from the URL address and uploads them into the server. It then envokes CreateDBTables.sql to create the database with the following tables:

Table	Purpose
`Activities`	Customer activities
`Users`	Customer profiles
`ChurnVars`	Churn period and threshold
`ChurnModelR`	Churn model trained using open-source R
`ChurnModelRx`	Churn model trained using Microsoft R Server
`ChurnPredictR`	Prediction results based on open-source R model
`ChurnPredictRx`	Prediction results based on Microsoft R Server model

STEP 2: FEATURE ENGINEERING

In the second step, CreateFeatures.sql and CreateTag.sql are invoked to create features and tags. The output of these two scripts is stored in these tables: Features and Tags. For more details on feature engineering, visit this link.

STEP 3 (a and b): MODEL TRAINING

In the third step, TrainModelR.sql or TrainModelRx.sql are invoked to train the churn models. TrainModelR.sql employs the open-source R functions whereas TrainModelRx.sql employs the Microsoft R Server functions. The trained models are stored in ChurnModelR and ChurnModelRx tables.

STEP 4 (a and b): PREDICTION

In the fourth step, PredictR.sql or PredictRx.sql are invoked to make predictions based on the models trained in the previous steps. PredictR.sql employs the open-source R model whereas PredictRx.sql employs the Microsoft R Server model. The results are stored in a table with the following columns:

Column	Description
`UserId`	User Id
`Tag`	True customer status (churner or non-churner)
`Score`	Model score
`Auc`	Model AUC on test dataset (identical for all columns)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CUSTOMER CHURN TEMPLATE on MICROSOFT SQL SERVER R SERVICES

REQUIREMENTS

WORKFLOW

STEP 1: DATA PREPARATION

STEP 2: FEATURE ENGINEERING

STEP 3 (a and b): MODEL TRAINING

STEP 4 (a and b): PREDICTION

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
Activities.csv		Activities.csv
CreateDBTables.sql		CreateDBTables.sql
CreateFeatures.sql		CreateFeatures.sql
CreateTag.sql		CreateTag.sql
MSR-SQL-Churn.ps1		MSR-SQL-Churn.ps1
PredictR.sql		PredictR.sql
PredictRx.sql		PredictRx.sql
README.md		README.md
TrainModelR.sql		TrainModelR.sql
TrainModelRx.sql		TrainModelRx.sql
Users.csv		Users.csv

farhadGithub/customer-churn-in-SQL-with-R

Folders and files

Latest commit

History

Repository files navigation

CUSTOMER CHURN TEMPLATE on MICROSOFT SQL SERVER R SERVICES

REQUIREMENTS

WORKFLOW

STEP 1: DATA PREPARATION

STEP 2: FEATURE ENGINEERING

STEP 3 (a and b): MODEL TRAINING

STEP 4 (a and b): PREDICTION

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages