Predict, Manage and Monitor the call drops of cell towers using Cloud Pak for Data

Note: This code pattern is part of a series that explores telecom call-drop predictions using IBM Cloud Pak for Data, Data Virtualization, Watson OpenScale, and Cognos Analytics. Other assets in the series include:

A call drop is a situation where a call on a wireless network is disconnected before the caller ends the call. Some of the main reasons for call drops are:

  • Inadequate coverage, which can be due to multiple reasons:

    • Lack of tower infrastructure
    • Improper network planning
    • Non-optimization of network
  • Overloaded cell towers – the number of subscribers grows each day, and most of them are on smartphones. Network capacity is simply not being ramped up at the same pace, which results in overloaded networks.

  • Cityscape changes – there have been instances where newly-built multi-storied buildings cause an adjacent building's subscribers to lose cell reception. Such instances are very common with rapidly changing cityscapes and call for routine network data analysis by service providers.

  • Switching between towers – this situation occurs when a person is traveling or moving around while talking. The chances of a dropped call increase if a call handover takes place (a transfer from one base transceiver station (BTS) to another), especially on overloaded networks.

  • Technical failures – this is beyond anyone's control and operators generally monitor down-times through well-equipped network operation centers.

This code pattern aims to create a model to predict call drops, trained on data covering the failure modes described above. With the help of an interactive dashboard, we use a time-series model to better understand call drops. This benefits telecom providers and their customers because issues can be identified at an earlier stage, allowing more time to take the necessary measures to mitigate problems. The main features of the solution include:

  • Built on IBM Cloud Pak for Data.
  • Data can come from multiple DB sources, for example an internal DB2 Warehouse (SMP) within the Cloud Pak for Data instance, or external sources like DB2 on Cloud, Oracle DB, Postgres DB, SingleStore, and so on. Data Virtualization is used to integrate them all into one DB source.
  • A time-series model, created using the built-in notebook service, that predicts call drops over the next 24 hours.
  • A call-drop prediction model for each cell tower. These models are monitored for quality and fairness using Watson OpenScale.
  • A Cognos Analytics dashboard that provides the user with an overall region-wise view of the call-drop scenarios. With the help of Watson OpenScale, the time-series model's output is shown in a graph, along with the model's performance improvements.

After completing this code pattern, you'll learn how to:

  • Use Data Virtualization.
  • Create connections from DBs hosted on multiple clouds (AWS, Azure, or IBM Cloud) or on-premises environments.
  • Create views from joins and publish data to your current Project.
  • Store custom models built with open-source technology in Watson Machine Learning.
  • Deploy a model and connect the model deployment to Watson OpenScale on Cloud Pak for Data and IBM Cloud.
  • Set up Model Fairness and Model Quality monitors in Watson OpenScale on Cloud Pak for Data, and on IBM Cloud, using a Python notebook.
  • Create a project and set up a Python notebook on Cloud Pak for Data.

architecture

Flow

  1. Data stored across various sources, like AWS Cloud and IBM Cloud, is virtualized and joined as needed by the AI models.
  2. The joined data is stored back to the internal DB (DB2 or SingleStore) of Cloud Pak for Data and assigned to the current working project.
  3. Machine learning models are created using Jupyter Python notebooks to predict call drops per tower, along with a time-series model that projects a call-drop percentage based on real-time conditions (a minimal sketch of this kind of forecast follows this list).
  4. The models are trained and/or stored in Watson Machine Learning, which is also connected to Watson OpenScale.
  5. Insights from the trained models and the data are visualized and analyzed using Cognos Analytics dashboards.
  6. Fairness, quality, and explainability monitors are configured for each tower's model.
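
To make the flow concrete, here is a minimal, hypothetical sketch of a 24-hour-ahead call-drop forecast on synthetic data. The column names, model choice (Holt-Winters via statsmodels), and values are illustrative assumptions, not the notebooks' actual code.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic hourly call-drop percentages for one tower (stand-in for the real data)
idx = pd.date_range("2020-01-01", periods=24 * 14, freq="H")
rng = np.random.default_rng(42)
call_drop_pct = 5 + 2 * np.sin(np.arange(len(idx)) * 2 * np.pi / 24) + rng.normal(0, 0.3, len(idx))
series = pd.Series(call_drop_pct, index=idx)

# Fit a simple model with a daily seasonal component and forecast the next 24 hours
model = ExponentialSmoothing(series, trend="add", seasonal="add", seasonal_periods=24).fit()
forecast_24h = model.forecast(24)
print(forecast_24h.head())
```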

Featured technologies

  • Jupyter Notebooks: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
  • Pandas: An open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
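
As a quick illustration of how Pandas is used here, the merged training file in the repo's data directory can be inspected with a few lines (the path and columns below are assumptions; adjust to your copy):

```python
import pandas as pd

# Load the merged call-drop training data shipped with this repo
df = pd.read_csv("data/Telco_training_final.csv")
print(df.shape)    # number of records and features
print(df.dtypes)   # column types
print(df.head())   # first few rows
```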

Prerequisites

Note: As mentioned earlier, this code pattern is part of a series of assets around predicting call drops from cell towers. It assumes that you have completed the other assets in the series (listed above). Those assets take you through set-up steps that must be completed before starting this code pattern.

Steps

  1. Clone the repository
  2. Obtain your data from Data Virtualization
  3. Create a new project in Cloud Pak for Data
  4. Upload the dataset to Cloud Pak for Data
  5. Import notebook to Cloud Pak for Data
  6. Follow the steps in the notebook
  7. Set up your notebook for call-drop monitoring
  8. Set up the Cognos Analytics dashboard on your Cloud Pak for Data instance for visualizations

1. Clone the repository

git clone https://github.com/IBM/icp4d-telco-manage-ml-project
cd icp4d-telco-manage-ml-project

2. Obtain your data from Data Virtualization

To create the data set required for this code pattern, you have two options:

  1. You can build the data yourself:
     • Follow the Data Virtualization Tutorial to virtualize and join the data.
     • Select the weather.csv and tower.csv files as the two files to be joined and virtualized. Both files are located in the /data directory.
  2. As a convenience, we have created a CSV file version of this merged data which you can use directly. The file is named Telco_training_final.csv and is also located in the /data directory. (If you would rather produce the merged file locally with Pandas, see the sketch after this list.)
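
If you want to produce the merged file locally instead of through Data Virtualization, a hedged sketch is shown below. The join key towerid is a hypothetical column name; inspect both CSVs and adjust before running.

```python
import pandas as pd

towers = pd.read_csv("data/tower.csv")
weather = pd.read_csv("data/weather.csv")

# 'towerid' is a placeholder join key -- replace with the actual shared column(s)
merged = towers.merge(weather, on="towerid", how="inner")
merged.to_csv("data/Telco_training_merged_local.csv", index=False)
print(merged.shape)
```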

Note: Execute the following section (Steps for obtaining DB Credentials) ONLY if you want to use DB2 on Cloud to store the output of the time-series model.

Steps for obtaining DB Credentials

For DB2 on Cloud

  • Create a DB2 instance on your Cloud account.

  • Once the instance is created, in the Service Credentials tab, click on Open Console.

    db2cred

  • Click on New Credential and then Add.

    addcred

  • The service credential will now be created. Click on the copy button and save the credentials. (A sketch of how these credentials can be used from Python follows below.)

    copycred
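
As an optional sanity check, the saved credentials can be tested from Python with the ibm_db driver. This is only an illustrative sketch; substitute the hostname, port, database name, user, and password from the credentials you just copied.

```python
import ibm_db

# Placeholder values -- fill in from your saved service credentials
dsn = (
    "DATABASE=<database-name>;"
    "HOSTNAME=<your-db2-hostname>;"
    "PORT=<port>;"
    "PROTOCOL=TCPIP;"
    "UID=<username>;"
    "PWD=<password>;"
    "SECURITY=SSL;"
)

conn = ibm_db.connect(dsn, "", "")
print("Connected to DB2 on Cloud")
ibm_db.close(conn)
```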

For SingleStore

  • The SingleStore user ID is admin, and the password is whatever you used when launching the cluster. Make a note of the credentials configured for SingleStore during installation and data loading: hostname (cluster IP), password (cluster password), port (3306), and database name (as configured during the data load). A quick connectivity check is sketched below.
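
SingleStore speaks the MySQL wire protocol, so any standard MySQL client can be used to verify these credentials. The snippet below is a hedged illustration using PyMySQL with placeholder values; it is not part of the code pattern's notebooks.

```python
import pymysql

# Placeholder values -- use the credentials you noted above
conn = pymysql.connect(
    host="<cluster-ip>",
    user="admin",
    password="<cluster-password>",
    port=3306,
    database="<database-name>",
)
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())
conn.close()
```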

3. Create a new project in Cloud Pak for Data

  • Once you log in to your Cloud Pak for Data instance, click on the (☰) menu icon in the top-left corner of the screen and click Projects.

    select_project

  • When you reach the project list, click on New Project. In the pop-up that appears, make sure the Analytics Project option is selected and enter the desired name. Once you click on OK, you will go to a new screen; click on Create to complete your project creation.

    name_project

  • Once the start-up page opens, click on the Add to Project menu.

  • In the pop-up that opens, click on Connection.

    connection

  • Click on the appropriate DB option and enter the saved credentials.

For a DB2 database, follow these steps:

connection

NOTE: Click on DB2 on Cloud if you completed the Steps for obtaining DB Credentials for DB2 on Cloud. For any other DB, click the appropriate option and enter its credentials.

  • Click on Test and, once it is successful, click on Create.

For SingleStore, follow these steps:

  • Select the Compose for MySQL Database connection.

    connect_to_db

  • Enter the database credentials. Click on Test and then click on Create.

    Enter_Cred

4. Upload the dataset to Cloud Pak for Data

  • From your project page, click on Data set > Add Data Set. Select the Telco_training_final.csv file from the /data directory.

    add_data_set

5. Import notebook to Cloud Pak for Data

Note: Choose the Python 3.6 environment.

6. Follow the steps in the notebook

You will run cells individually by highlighting each cell and then clicking the Run button at the top of the notebook. While a cell is running, an asterisk ([*]) appears to the left of the cell. When the cell has finished executing, a sequential number appears in its place (for example, [17]).

NOTE: For reference, we have included a completed notebook in the /examples directory of this repo. This version of the notebook includes all the executed steps and outputs. See https://github.com/IBM/icp4d-telco-manage-ml-project/blob/master/examples/Multivariate_Time_Series-Example.ipynb

Configure in Notebook

Insert the DB credentials you saved earlier below the section 2.1 Insert the DB Credentials in the notebook. (A hypothetical example of the credential shape is shown after the screenshot below.)

configure_db_creds
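
The exact key names the notebook expects under section 2.1 may differ between versions; the dictionary below is only a hypothetical example of the shape, with placeholder values taken from the credentials you saved earlier.

```python
# Hypothetical credential dictionary -- adjust key names to match the notebook cell
db_credentials = {
    "hostname": "<db-hostname-or-cluster-ip>",
    "port": 50000,                  # 3306 for SingleStore
    "database": "<database-name>",
    "username": "<username>",       # 'admin' for SingleStore
    "password": "<password>",
}
```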

Note: Run the following step (Add the Dataset) ONLY if you have not completed the Data Virtualization tutorial.

Add the Dataset

In section 2.2 Add Dataset, highlight the blank cell by clicking on it. Click on the 10/01 button to select a specific data set.

Choose the Local tab, and select the call_drop_data_train.csv file that you added to the project. Under the Insert to code option, click Insert Pandas DataFrame.

add_dataset

IMPORTANT: Ensure the variable name is set to df_data_1.
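
The Insert Pandas DataFrame option generates this cell for you, and the generated code differs between Watson Studio versions. Conceptually, it loads the selected file into a Pandas DataFrame; a simplified, hypothetical equivalent looks like this:

```python
import pandas as pd

# Simplified stand-in for the auto-generated "Insert Pandas DataFrame" cell;
# the variable name must stay df_data_1 for the rest of the notebook to work
df_data_1 = pd.read_csv("call_drop_data_train.csv")
df_data_1.head()
```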

Run the notebook

  • Go to the next-to-last cell in the notebook. Replace the schema name with an existing schema name from your DB (a hedged sketch of this kind of write follows these steps).
  • Run the notebook to completion by clicking on Cell > Run all.
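
For orientation, the kind of write that final cell performs looks roughly like the following: persisting forecast rows to a table under an existing schema. The SQLAlchemy URL, table name, and DataFrame below are placeholders (shown for DB2 via the ibm-db-sa dialect), not the notebook's actual code.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string and data -- substitute your own DB details
engine = create_engine("db2+ibm_db://<user>:<password>@<host>:<port>/<database>")
forecast_df = pd.DataFrame({"tower_id": ["T1"], "predicted_call_drop_pct": [4.2]})

# Write under an existing schema, matching the schema name edited in the notebook
forecast_df.to_sql(
    "CALL_DROP_FORECAST",
    engine,
    schema="<YOUR_SCHEMA>",
    if_exists="replace",
    index=False,
)
```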

7. Set up your notebook for call-drop monitoring

Complete the previous code pattern in this code series - Monitor your Open Source ML Models using Watson OpenScale.

It performs the setup of the deployments on Watson OpenScale.

8. Set up the Cognos Analytics dashboard on your Cloud Pak for Data instance for visualizations

Once data has been generated and stored in the DB by running the Time Series notebook, follow the tutorial Build dashboards in Cognos Analytics on IBM Cloud Pak for Data to generate the output shown in the next section.

Sample Output

A map-based view lets you select each tower. When a tower is selected, the graphs indicate the call-drop prediction for the next 24 hours, generated by the time-series model. They also show which factors affect the call-drop percentage at the tower, and by how much.

output_1

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.