The anomaly detection tool (CyberDefense) is used to detect anomalous BGP events based on routing records collected worldwide from major Internet exchange points. It also facilitates creating new machine learning models based on historical anomalous BGP events. CyberDefense integrates various stages of the anomaly detection process. The tool consists of the following modules: data download, real-time data retrieval, data container, feature extraction, label refinement, data partition, data processing, machine learning algorithms, hyper-parameter selection, machine learning models, and classification. It is implemented in Python and JavaScript as a terminal-based or web-based application. CyberDefense consists of the BGPGuard and ModelOcean subsystems.
The web-based CyberDefense version offers an interactive interface for monitoring and performing experiments. It includes several functions from the terminal-based application and is developed using additional programming languages, frameworks, and functions. Its front-end is built with HTML, CSS (Bootstrap: an open-source CSS framework), and Socket.IO (a transport protocol implemented in JavaScript for real-time web applications). Its back-end is developed using Flask (a micro web framework written in Python).
BGPGuard is used for real-time anomaly detection or off-line data classification based on messages collected from RIPE or Route Views. For real-time anomaly detection, BGP update messages are retrieved, processed, and analyzed using available pre-trained models. The off-line classification is based on the specified start and end dates and times of the anomalous event, the partitioning of the training and test datasets, and the selected machine learning algorithm (GBDT, CNN, RNN, or VFBLS).
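The off-line workflow described above (labeling records by the anomaly window, then partitioning into training and test sets) can be illustrated with a minimal sketch. The function names `label_records` and `partition`, the one-minute bins, and the dates are hypothetical and chosen only for illustration; they are not taken from the CyberDefense source code, whose partitioning scheme may differ.

```python
from datetime import datetime, timedelta


def label_records(timestamps, anomaly_start, anomaly_end):
    """Return 1 for timestamps inside [anomaly_start, anomaly_end], else 0."""
    return [1 if anomaly_start <= ts <= anomaly_end else 0 for ts in timestamps]


def partition(rows, train_fraction):
    """Split rows in their given (chronological) order into train and test."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical data: ten one-minute bins, anomaly spanning bins 3 to 6.
first_bin = datetime(2020, 1, 1, 0, 0)
timestamps = [first_bin + timedelta(minutes=i) for i in range(10)]
labels = label_records(timestamps, timestamps[3], timestamps[6])

# Pair each timestamp with its label and keep 60% for training.
rows = list(zip(timestamps, labels))
train, test = partition(rows, 0.6)

print(labels)
print(len(train), len(test))
```

A chronological split is used here so that test records always follow training records in time; a random split is another option when temporal order does not matter.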
Terminal-based: https://github.com/zhida-li/BGPGuard-terminal-app
ModelOcean enables processing various datasets such as NSL-KDD and CIC datasets to create models for various types of intrusion attacks such as DDoS, User to Root (U2R), Remote to Local (R2L), and probing. It has an additional module (data container) for custom datasets.
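As an illustration of grouping dataset labels into attack types, the sketch below maps a few NSL-KDD attack names to the categories commonly used with that dataset. The mapping is deliberately abbreviated, the `to_category` helper and the sample labels are hypothetical, and the denial-of-service class is labeled `DoS` following the dataset's usual convention; none of this is taken from the ModelOcean code.

```python
from collections import Counter

# Abbreviated, commonly used grouping of NSL-KDD attack labels.
# Assumption of this sketch: only a subset of labels is listed.
ATTACK_CATEGORY = {
    "neptune": "DoS", "smurf": "DoS", "back": "DoS", "teardrop": "DoS",
    "buffer_overflow": "U2R", "rootkit": "U2R", "loadmodule": "U2R",
    "guess_passwd": "R2L", "ftp_write": "R2L", "warezclient": "R2L",
    "ipsweep": "Probe", "portsweep": "Probe", "nmap": "Probe", "satan": "Probe",
}


def to_category(label):
    """Map a raw label to its category; unlisted labels become 'Unknown'."""
    label = label.strip().lower()
    if label == "normal":
        return "Normal"
    return ATTACK_CATEGORY.get(label, "Unknown")


# Hypothetical sample of raw labels.
sample_labels = ["normal", "neptune", "satan", "guess_passwd",
                 "rootkit", "smurf", "normal"]
counts = Counter(to_category(label) for label in sample_labels)
print(dict(counts))
```

Returning `Unknown` for unlisted labels keeps the sketch from silently assigning unseen attacks to an existing class.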
CyberDefense
├── LICENSE
├── README.md
├── app.py
├── app_offline.py
├── app_realtime.py
├── config.py
├── requirements.txt
├── database
│   ├── database.py
│   └── db_files
├── src
│   ├── __init__.py
│   ├── check_versions.py
│   ├── dataDownload.py
│   ├── data_partition.py
│   ├── data_process.py
│   ├── featureExtraction.py
│   ├── input_exp.txt
│   ├── label_generation.py
│   ├── progress.py
│   ├── progress_bar.py
│   ├── subprocess_cmd.py
│   ├── time_tracker.py
│   ├── CSharp_Tool_BGP
│   ├── STAT
│   ├── VFBLS_v110
│   ├── data_historical
│   ├── data_ripe
│   ├── data_routeviews
│   ├── data_split
│   ├── data_test
│   ├── parmSel
│   └── playground
├── static
│   ├── css
│   │   └── style.css
│   ├── imgs
│   └── js
└── templates
    ├── bgp_ad_offline.html
    ├── bgp_ad_realtime.html
    ├── contact.html
    ├── index.html
    └── layout.html
The src directory contains the source code for the real-time detection and off-line classification tasks.
Various Python functions have been developed to implement and incorporate the anomaly detection steps.
The web-based CyberDefense version relies on additional external libraries.
The external CSS and JavaScript libraries provided by jsDelivr have been included in layout.html.
The Python libraries installed by pip are:
- NumPy: used to perform mathematical operations on multi-dimensional arrays and on matrices generated during the process.
- SciPy: dependency of the scikit-learn library. SciPy's zscore function is used to perform normalization.
- scikit-learn: employed for processing data and calculating performance metrics.
- PyTorch: used for developing deep learning models.
- Flask: web framework based on Werkzeug and Jinja. (Flask functions are used to transfer variables and render web pages. Flask also processes the GET/POST requests from the front-end.)
- Werkzeug: web application library used to create a web server gateway interface (WSGI).
- Jinja: web template engine that allows variables, statements, and expressions to be included in HTML files.
- Flask-SocketIO: offers bi-directional communications with low latency between the clients (front-end) and the server (back-end) for Flask applications.
- python-socketio: implementation of Socket.IO in Python.
- python-engineio: implementation of Engine.IO in Python, used for real-time communication between client and server based on WebSocket.
- Eventlet: networking library for executing asynchronous tasks.
Note: a virtual environment is recommended.
pip install -r requirements.txt
- Mono: an open-source implementation of the Microsoft .NET Framework. Mono includes a C# compiler for several operating systems such as macOS, Linux, and Windows.
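The z-score normalization mentioned in the library list can be sketched with the standard library alone. The helper `zscore_columns` and the feature matrix are hypothetical. The sketch uses the population standard deviation, which corresponds to the default `ddof=0` of `scipy.stats.zscore`, and it maps constant columns to 0.0 instead of dividing by zero, which is a choice made for this example only.

```python
import statistics


def zscore_columns(rows):
    """Standardize each column to zero mean and unit population variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # Constant columns (std == 0) are mapped to 0.0 in this sketch.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical feature matrix: three samples, two features.
features = [
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
]
normalized = zscore_columns(features)
for row in normalized:
    print(row)
```

In practice the means and standard deviations would typically be computed on the training set and reused for the test set, so that test data does not influence the normalization.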
The Python file app.py is used to execute the application.
The following command is used to start the application:
export FLASK_APP=app.py
export FLASK_ENV=development
flask run
Running on:
http://127.0.0.1:5000/
Sample code for gradient boosting machines may be run without launching app.py:
./src/playground/gbdt_offline_sample
Please see details in README.md.
CyberDefense is an open-source tool released under the terms of the MIT license.