WHALE: Web-Scale Hybrid Explainable Machine Learning

WHALE (Web-Scale Hybrid Explainable Machine Learning) is a project that aims to develop time-efficient, explainable machine learning models that operate on web-scale RDF knowledge graphs. By combining large language models with hybrid class expression learning (CEL) approaches, WHALE seeks to create scalable, reusable models that make AI decisions more transparent and trustworthy, particularly in complex web environments.

Overview

The Web has become the largest and most widely used information infrastructure globally, hosting vast amounts of data in the form of RDF knowledge graphs. These knowledge graphs play a critical role in various applications, from scientific research to everyday use on platforms like Google and Facebook.

Key Goals

Explainability: Enhance trust in AI systems by making their decisions explainable, especially when working with massive web-scale knowledge graphs.
Efficiency: Develop time-efficient methods for CEL on expressive description logics (DLs) like SROIQ(D), which are commonly used in RDF knowledge bases.
Scalability: Create models that can efficiently process and link extremely large datasets, ensuring they are applicable to a broad range of web-scale data applications.
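
To make the CEL goal concrete, here is a minimal sketch of the core idea: given positive and negative example individuals, candidate class expressions are scored by how well the individuals they retrieve match the examples. The function names, the F1 scoring choice, and the toy data are assumptions for illustration only; this is not WHALE's implementation, and the candidates here are precomputed instance sets rather than expressions evaluated by a reasoner.

```python
from __future__ import annotations


def f1_score(retrieved: set[str], positives: set[str], negatives: set[str]) -> float:
    """F1 of a candidate's retrieved individuals against the labelled examples."""
    tp = len(retrieved & positives)   # positives the candidate covers
    fp = len(retrieved & negatives)   # negatives the candidate wrongly covers
    fn = len(positives - retrieved)   # positives the candidate misses
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_expression(
    candidates: dict[str, set[str]],
    positives: set[str],
    negatives: set[str],
) -> tuple[str, float]:
    """Return the highest-scoring candidate expression and its F1 score."""
    scored = {name: f1_score(inst, positives, negatives) for name, inst in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical candidates, each mapped to the individuals it retrieves.
candidates = {
    "Person": {"alice", "bob", "carol"},
    "Professor": {"alice"},
    "Person ⊓ ∃worksAt.University": {"alice", "carol"},
}
positives = {"alice", "carol"}
negatives = {"bob"}

print(best_expression(candidates, positives, negatives))
```

The learned expression is itself the explanation: it states in description-logic terms what separates the positive examples from the negative ones. The expensive part in practice is retrieving the instances of each candidate, which is where the efficiency goal applies.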

Features

Hybrid Class Expression Learning (CEL): Combines multiple representations of knowledge to accelerate the learning process, making it feasible to apply CEL on large-scale knowledge graphs.
Universal Knowledge Graph Embeddings: Development of embeddings for large-scale knowledge graphs, enabling the training of deep learning models that act as efficient function approximators during the CEL process.
Tensor-Based Querying: Use of tensor representations to improve the runtime of queries on RDF data, facilitating instance retrieval on a web scale.
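
As a rough illustration of tensor-based instance retrieval, the sketch below stores a small set of triples as a binary tensor indexed by relation, subject, and object, then answers the query "which individuals have a given relation to some member of a given set" with a matrix-vector product. Plain Python lists stand in for a real tensor library so the example has no dependencies; the function names and the toy triples are assumptions, not part of WHALE.

```python
from __future__ import annotations


def build_tensor(triples: list[tuple[str, str, str]]):
    """Encode (subject, predicate, object) triples as tensor[relation][subject][object]."""
    entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    relations = sorted({p for _, p, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: i for i, r in enumerate(relations)}
    n = len(entities)
    tensor = [[[0] * n for _ in range(n)] for _ in relations]
    for s, p, o in triples:
        tensor[r_idx[p]][e_idx[s]][e_idx[o]] = 1
    return tensor, e_idx, r_idx


def retrieve_exists(tensor, e_idx, r_idx, relation: str, fillers: set[str]) -> set[str]:
    """Individuals with at least one `relation` edge into `fillers` (∃relation.fillers)."""
    indicator = [0] * len(e_idx)
    for name in fillers:
        if name in e_idx:
            indicator[e_idx[name]] = 1
    matrix = tensor[r_idx[relation]]
    # Matrix-vector product: each entry counts a subject's edges into the filler set.
    hits = [sum(a * b for a, b in zip(row, indicator)) for row in matrix]
    return {e for e, i in e_idx.items() if hits[i] > 0}


# Hypothetical toy knowledge graph.
triples = [
    ("alice", "worksAt", "upb"),
    ("bob", "worksAt", "acme"),
    ("carol", "knows", "alice"),
]
tensor, e_idx, r_idx = build_tensor(triples)
print(retrieve_exists(tensor, e_idx, r_idx, "worksAt", {"upb"}))  # -> {'alice'}
```

A dense tensor like this grows with the square of the number of entities per relation, so a web-scale setting would need sparse storage; the sketch only shows why retrieval reduces to linear algebra.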

Installation

To install and run WHALE, follow these steps:

Clone the repository:

```
git clone https://github.com/dice-group/WHALE.git
cd WHALE
```

Set up a virtual environment (optional but recommended):
```
python3 -m venv venv
source venv/bin/activate
```
Install dependencies:
```
pip install -r requirements.txt
```

Research and Development

WHALE is the result of collaborative efforts involving experts in multi-processing deep learning techniques, knowledge graph embeddings, and tensor representations. The project follows a multi-step workflow:

Data Gathering: Collection of large-scale RDF knowledge graphs.
Preprocessing: Transformation of RDF data for compatibility with various tools and libraries.
Training: Development of knowledge graph embeddings using hybrid CEL approaches.
Linking: Integration of different knowledge graphs to create a unified dataset.
Model Training: Training of large language models (LLMs) on the unified embeddings.
Benchmarking: Evaluation of the trained models to ensure performance and accuracy.
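
The Training step above produces knowledge graph embeddings. As a hedged illustration of what such embeddings are used for, the sketch below scores triples with a TransE-style translation model, where a triple is plausible when the head vector plus the relation vector lands near the tail vector. TransE is a well-known embedding model chosen here only as an example; the workflow above does not say which embedding model WHALE uses, and the vectors are hand-picked rather than trained.

```python
from __future__ import annotations

import math


def transe_score(head: list[float], relation: list[float], tail: list[float]) -> float:
    """Negative L2 distance between head + relation and tail (higher is more plausible)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(
    head: list[float],
    relation: list[float],
    candidates: dict[str, list[float]],
) -> list[str]:
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_score(head, relation, candidates[name]),
        reverse=True,
    )


# Hypothetical 2-dimensional embeddings, hand-picked for illustration.
alice = [0.0, 0.0]
works_at = [1.0, 0.0]
tails = {"upb": [1.0, 0.0], "acme": [0.0, 1.0]}

print(rank_tails(alice, works_at, tails))  # -> ['upb', 'acme']
```

Because scoring is a few vector operations, a trained model of this kind can stand in for more expensive exact computation, which is the sense in which the Features section describes embeddings as function approximators.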

Acknowledgements

WHALE is supported by the Lamarr Fellowship and developed at Paderborn University by Prof. Dr. Axel Ngonga and his team. The project also collaborates with the Lamarr Network and various other academic and research institutions.

For any inquiries or support, please contact the maintainers at sshivam@mail.uni-paderborn.de.

