This is an educational implementation of a simple text search engine using MongoDB, FastAPI, and NLTK. It demonstrates basic concepts of text indexing and search but is not suitable for production use.
This project is for educational purposes only. It is not:
- A production-ready search solution
- Optimized for performance or scale
- Secure enough for real-world use
- A replacement for dedicated search technologies
Use only for learning about basic search concepts.
- Basic document indexing with term frequency counting
- Simple search functionality
- Stemming and stopword removal using NLTK
- Bulk document indexing
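The indexing features above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `STOPWORDS`, `naive_stem`, and `term_frequencies` are hypothetical names, and the tiny stopword set and crude suffix stripper stand in for the NLTK stopword list and stemmer the project uses.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (assumption); the project relies on NLTK's list.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "this", "to"}


def naive_stem(token: str) -> str:
    """Crude suffix stripper standing in for an NLTK stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def term_frequencies(text: str) -> Counter:
    """Lowercase, tokenize, drop stopwords, stem, then count each stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    stems = [naive_stem(t) for t in tokens if t not in STOPWORDS]
    return Counter(stems)


print(term_frequencies("This is an example document to index"))
```

With a real stemmer the stems would differ, but the tokenize, filter, stem, count pipeline has the same shape.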
- Clone the repository:
git clone https://github.com/patw/atlasbadsearch.git
cd atlasbadsearch
- Install dependencies:
pip install -r requirements.txt
- Create a .env file with your MongoDB connection string:
echo "MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/?retryWrites=true&w=majority" > .env
echo "MONGODB_DBNAME=search_db" >> .env
- Run the FastAPI server:
uvicorn app.main:app --reload
- POST /index - Index a single document
- POST /bulk_index - Index multiple documents at once
- GET /search - Search documents by query terms
Index a document:
curl -X POST "http://localhost:8000/index" \
-H "Content-Type: application/json" \
-d '{"text":"This is an example document to index"}'
Bulk index documents from file:
curl -X POST "http://localhost:8000/bulk_index" \
-H "Content-Type: application/json" \
-d @sample_bulk.txt
Search documents:
curl "http://localhost:8000/search?query=example%20document"
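When calling the search endpoint from a script, the query string needs URL encoding so that spaces survive the request. The sketch below assumes the server is running locally on port 8000 as in the curl examples; `build_search_url` and `search` are hypothetical helper names, and the response is returned as parsed JSON without assuming its exact shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed local dev server


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the GET /search URL with the query safely encoded."""
    return f"{base_url}/search?{urlencode({'query': query})}"


def search(query: str):
    """Call the search endpoint and return the decoded JSON body."""
    with urlopen(build_search_url(query)) as response:
        return json.load(response)


print(build_search_url("example document"))
# http://localhost:8000/search?query=example+document
```

`urlencode` encodes the space as `+`, which is equivalent to `%20` inside a query string.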
- Tokenization and stemming
- Stopword removal
- Term frequency counting
- Basic search scoring
- MongoDB document structure for search
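To make the last two concepts concrete, here is a small in-memory sketch of how documents with precomputed term counts might be scored. Both the document shape (`_id`, `text`, `terms`) and the scoring rule (sum of the matching term frequencies) are assumptions chosen for illustration; the project's actual MongoDB schema and scoring may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical stored documents: raw text plus a stem -> count map (assumption).
DOCS: List[Dict] = [
    {"_id": 1, "text": "example document to index",
     "terms": {"example": 1, "document": 1, "index": 1}},
    {"_id": 2, "text": "document about documents",
     "terms": {"document": 2, "about": 1}},
    {"_id": 3, "text": "unrelated note",
     "terms": {"unrelated": 1, "note": 1}},
]


def score(doc: Dict, query_terms: List[str]) -> int:
    """Sum the stored frequency of every query term found in the document."""
    return sum(doc["terms"].get(term, 0) for term in query_terms)


def search(docs: List[Dict], query_terms: List[str]) -> List[Tuple[int, int]]:
    """Return (_id, score) pairs for matching documents, best score first."""
    scored = [(doc["_id"], score(doc, query_terms)) for doc in docs]
    hits = [(doc_id, s) for doc_id, s in scored if s > 0]
    # Sort by descending score, breaking ties by ascending _id.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


print(search(DOCS, ["example", "document"]))
# [(1, 2), (2, 2)]
```

Query terms would need the same stemming and stopword removal as the indexed text for the lookups to match.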
MIT License - See LICENSE for details.