Skip to content

This repository contains a Jupyter Notebook (test.ipynb) that provides a practical guide to understanding and building a Knowledge Graph (KG) from unstructured text data using Natural Language Processing (NLP) techniques.

Notifications You must be signed in to change notification settings

Rakesh-Seenu/Knowledge-Graph-Building-From-Scratch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Knowledge-Graph-Building-From-Scratch

This repository contains a Jupyter Notebook (test.ipynb) that provides a practical guide to understanding and building a Knowledge Graph (KG) from unstructured text data using Natural Language Processing (NLP) techniques.

What is a Knowledge Graph?

A Knowledge Graph is a structured representation of information that describes interlinked entities and their relationships. At its core, a KG consists of:

  • Nodes (Entities): Representing real-world objects, concepts, or events (e.g., "Germany," "Holy Roman Empire," "1815").
  • Edges (Relationships): Representing the connections or associations between entities (e.g., "began in," "inhabited," "formed in").

The smallest unit in a Knowledge Graph is often referred to as a "triple," which comprises two entities connected by a single relationship (e.g., (Germanic tribes, inhabited, region)).

Why Build a Knowledge Graph?

Knowledge Graphs are powerful tools for:

  • Enhanced Search: Providing more relevant and contextual search results.
  • Data Integration: Connecting disparate data sources.
  • AI Applications: Powering intelligent applications like recommendation systems, chatbots, and question-answering systems.
  • Semantic Understanding: Enabling machines to understand the meaning and relationships within data.

How to Build a Knowledge Graph from Text

The test.ipynb notebook demonstrates a step-by-step approach to extracting information from text and transforming it into a structured Knowledge Graph. This process primarily leverages various NLP techniques:

  1. Sentence Segmentation: Breaking down raw text into individual sentences.
  2. Dependency Parsing: Analyzing the grammatical structure of sentences to identify relationships between words.
  3. Parts of Speech (POS) Tagging: Identifying the grammatical role of each word (e.g., noun, verb, adjective).
  4. Entity Recognition: Identifying and classifying key entities within the text.

The notebook provides practical examples and Python code using popular NLP libraries to illustrate these concepts.

Notebook Highlights

  • Introduction to KGs: Clear definitions of nodes, edges, and triples.
  • NLP for KG Construction: Explanation of essential NLP techniques like dependency parsing, POS tagging, and entity recognition.
  • Code Examples: Practical Python code snippets demonstrating how to implement these techniques.
  • SpaCy Integration: Utilizes the spaCy library for efficient linguistic processing.
  • Relationship Extraction: Illustrates how to extract meaningful relationships between entities from text.

Getting Started

To run the notebook and explore the code:

  1. Clone this repository:
    git clone [YOUR_REPOSITORY_URL]
  2. Navigate to the repository directory:
    cd [YOUR_REPOSITORY_NAME]
  3. Install the necessary libraries (e.g., spaCy, pandas):
    pip install -r requirements.txt
    python -m spacy download en_core_web_sm
  4. Open the test.ipynb notebook using Jupyter Lab or Jupyter Notebook:
    jupyter lab test.ipynb
    or
    jupyter notebook test.ipynb

Feel free to explore the code, experiment with different texts, and adapt it for your own Knowledge Graph projects!


About

This repository contains a Jupyter Notebook (test.ipynb) that provides a practical guide to understanding and building a Knowledge Graph (KG) from unstructured text data using Natural Language Processing (NLP) techniques.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published