Mounir-charef/ri-indexer

RI-indexer

The RI-indexer is a Python-based tool for indexing and searching text documents.

Features

1. Tokenization

  • Offers two tokenization methods: Split and the NLTK tokenizer.
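
As a rough illustration of how the two options differ, the sketch below compares plain splitting with an NLTK tokenizer. It makes several assumptions that are not taken from the project: that "Split" means Python's `str.split`, that NLTK's `RegexpTokenizer` with the pattern `\w+` is a reasonable stand-in for the NLTK tokenizer (the README does not say which one is used), and the function names `tokenize_split` and `tokenize_regexp` are hypothetical.

```python
import re

try:
    # RegexpTokenizer needs no extra NLTK data downloads.
    from nltk.tokenize import RegexpTokenizer
except ImportError:
    # Fall back to the standard library if NLTK is not installed.
    RegexpTokenizer = None

# Assumed pattern: runs of word characters count as tokens.
WORD_PATTERN = r"\w+"


def tokenize_split(text):
    """Split on whitespace only; punctuation stays attached to words."""
    return text.split()


def tokenize_regexp(text):
    """Extract word tokens, dropping punctuation."""
    if RegexpTokenizer is not None:
        return RegexpTokenizer(WORD_PATTERN).tokenize(text)
    # Same matching behaviour for this pattern, without NLTK.
    return re.findall(WORD_PATTERN, text)


text = "Indexing, searching and ranking documents."
split_tokens = tokenize_split(text)
regexp_tokens = tokenize_regexp(text)
print(split_tokens)
print(regexp_tokens)
```

The practical difference is that splitting keeps tokens such as `Indexing,` with the comma attached, while the pattern-based tokenizer yields `Indexing`, which affects which terms end up in the index.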

2. Stemming

  • Supports two stemming algorithms: Porter and Lancaster.
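
A minimal sketch of how a stemmer might be selected and applied, assuming the project relies on NLTK's `PorterStemmer` and `LancasterStemmer` classes (the README names the algorithms but not the library). The helper names `make_stemmer` and `stem_tokens` are hypothetical.

```python
def make_stemmer(name):
    """Return an NLTK stemmer for "porter" or "lancaster"."""
    # Imported lazily so this module loads even without NLTK installed.
    from nltk.stem import LancasterStemmer, PorterStemmer

    stemmers = {"porter": PorterStemmer, "lancaster": LancasterStemmer}
    try:
        return stemmers[name.lower()]()
    except KeyError:
        raise ValueError(f"unknown stemmer: {name!r}") from None


def stem_tokens(tokens, stemmer):
    """Stem each token; any object with a .stem(str) method works."""
    return [stemmer.stem(token) for token in tokens]
```

With NLTK installed, `stem_tokens(["indexing", "documents"], make_stemmer("porter"))` returns the Porter stems of both words. Lancaster is generally the more aggressive of the two algorithms, so it tends to produce shorter stems and merge more word forms into one term.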

3. Document Indexing

  • Generates descriptors and inverted indices for documents.
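
The sketch below shows one common way to build these two structures. It assumes that a descriptor is the per-document table of term frequencies and that the inverted index maps each term to the documents containing it. The project may store more (for example term weights), and the names `build_descriptors` and `build_inverted_index` are hypothetical.

```python
from collections import Counter


def build_descriptors(docs):
    """Map each document id to a Counter of its term frequencies."""
    return {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}


def build_inverted_index(descriptors):
    """Map each term to a {doc_id: frequency} dictionary."""
    inverted = {}
    for doc_id, freqs in descriptors.items():
        for term, freq in freqs.items():
            inverted.setdefault(term, {})[doc_id] = freq
    return inverted


# Documents are assumed to be already tokenized and stemmed.
docs = {
    "d1": ["index", "search", "index"],
    "d2": ["search", "document"],
}
descriptors = build_descriptors(docs)
inverted = build_inverted_index(descriptors)
print(descriptors["d1"])
print(inverted["search"])
```

The descriptor answers "which terms does this document contain?", while the inverted index answers the reverse question, which is the lookup a search needs.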

4. Search Functionality

  • Searches the indexed documents by index term.
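
A term lookup over an inverted index can be sketched as follows. This assumes the index maps each term to a `{doc_id: frequency}` dictionary and that results are ordered by frequency. The project's actual ranking is not described in the README, and the `search` function here is hypothetical.

```python
def search(inverted_index, term):
    """Return (doc_id, frequency) pairs for a term, most frequent first."""
    postings = inverted_index.get(term, {})
    # Sort by descending frequency, then by doc id for a stable order.
    return sorted(postings.items(), key=lambda item: (-item[1], item[0]))


# Small hand-written index, used for illustration only.
inverted_index = {
    "index": {"d1": 2},
    "search": {"d1": 1, "d2": 3},
}
results = search(inverted_index, "search")
print(results)
```

A query term would normally need the same tokenization and stemming as the documents, otherwise it may not match the stored index terms.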

Usage

This project provides an interface to:

  • Tokenize text documents
  • Perform stemming
  • Generate document descriptors and inverted indices
  • Search through indexed documents

Contributing

Contributions are welcome! If you want to contribute to this project:

  1. Fork the repository
  2. Create a new branch
  3. Make your changes and submit a pull request
