This project is a simple Semantic Search API built with FastAPI, Sentence Transformers, and FAISS. Its goal is to give a practical understanding of how AI models find meaning-based similarities between sentences, a foundational concept behind systems such as Retrieval-Augmented Generation (RAG) and ChatGPT search.
Semantic Search is the process of finding text that has a similar meaning rather than relying on exact keyword matches.
For example:
The query “Doctor helps patients” will be considered semantically similar to the stored document “Physicians treat people.”
This works through vector embeddings, which represent a piece of text as a high-dimensional numerical vector that captures its meaning. Texts with similar meanings end up with vectors that lie close together.
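To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The three vectors are hand-made toy values chosen for illustration (an assumption); a real embedding model would produce vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional toy vectors, NOT output from a real model.
doctor_helps_patients = [0.9, 0.8, 0.1]
physicians_treat_people = [0.85, 0.75, 0.2]
fastapi_serves_requests = [0.1, 0.2, 0.95]

sim_related = cosine_similarity(doctor_helps_patients, physicians_treat_people)
sim_unrelated = cosine_similarity(doctor_helps_patients, fastapi_serves_requests)

# The two medical sentences point in almost the same direction.
print(sim_related > sim_unrelated)
```

The related pair scores higher than the unrelated pair, which is the property a semantic search relies on when ranking results.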
Semantic Search relies on two core components:
| Component | Purpose |
|---|---|
| Sentence Transformers | Converts sentences into embeddings (dense numeric vectors) that capture semantic meaning. |
| FAISS (Facebook AI Similarity Search) | A highly efficient library used to store and rapidly search through these embeddings to find semantically similar entries. |
- Encoding: Encode the source text sentences using Sentence Transformers.
- Indexing: Store the resulting embeddings inside a FAISS index.
- Searching: When a new query arrives, it is encoded with the same model and then searched against the FAISS index to retrieve the most similar entries.
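
The three steps above can be sketched without any dependencies. In this sketch, `toy_embed` is a hypothetical stand-in for a Sentence Transformers model (it only counts words from a fixed vocabulary, so it sees keyword overlap rather than meaning), and `FlatIndex` is a hypothetical brute-force class that mimics, conceptually, what an exact nearest-neighbor index does. Neither is the project's actual code.

```python
import math

# Tiny fixed vocabulary (assumption, for illustration only).
VOCAB = ["ai", "healthcare", "doctors", "fastapi", "api", "learning"]

def toy_embed(text):
    """Hypothetical stand-in for a sentence encoder: counts vocabulary words."""
    words = text.lower().replace(".", " ").replace("?", " ").split()
    return [float(words.count(term)) for term in VOCAB]

class FlatIndex:
    """Hypothetical brute-force index: exact nearest neighbors by L2 distance."""

    def __init__(self):
        self.vectors = []

    def add(self, vectors):
        self.vectors.extend(vectors)

    def search(self, query_vector, k):
        # Compare the query against every stored vector, closest first.
        scored = sorted(
            (math.dist(query_vector, vec), i) for i, vec in enumerate(self.vectors)
        )
        return scored[:k]

documents = [
    "AI is transforming healthcare and diagnostics.",
    "Deep learning assists in image-based medical analysis.",
    "FastAPI enables rapid API development for AI apps.",
]

# 1. Encoding: turn every document into a vector.
embeddings = [toy_embed(doc) for doc in documents]

# 2. Indexing: store the vectors.
index = FlatIndex()
index.add(embeddings)

# 3. Searching: encode the query the same way, then look up its neighbors.
results = index.search(toy_embed("How does AI help doctors?"), k=2)
for distance, doc_id in results:
    print(f"{distance:.3f}  {documents[doc_id]}")
```

In a real setup, a Sentence Transformers model would take the place of `toy_embed` and a FAISS index would take the place of `FlatIndex`; the encode, add, search flow stays the same.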
| Library | Purpose |
|---|---|
| fastapi | For building and serving the high-performance REST API. |
| uvicorn | ASGI server for running the FastAPI application. |
| pydantic | For data validation of request and response models. |
| sentence-transformers | For generating meaning-based text embeddings. |
| faiss-cpu | For fast nearest-neighbor (vector similarity) search. |
| google-genai | For text generation and integration with the Gemini LLM. |
| python-dotenv | For securely loading environment variables like API keys. |
💡 Note: You can switch faiss-cpu to faiss-gpu if your system supports CUDA for accelerated vector search.
git clone https://github.com/Aashish2k1S/Semantic_Search_API.git
cd Semantic_Search_API
pip install -r requirements.txt

Create a .env file in the root folder of the project:
GEMINI_API_KEY=your_api_key_here

uvicorn main:app --reload

The API documentation will be live at → http://127.0.0.1:8000/docs
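
Besides the interactive docs page, the POST /search endpoint described below can be called from a script. The following is a minimal sketch using only the standard library; the helper name `build_search_request` is hypothetical, and the base URL assumes the default local uvicorn address shown above.

```python
import json
import urllib.request

def build_search_request(question, base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST /search request carrying the question as JSON."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request("How does AI help doctors?")
print(request.get_method(), request.full_url)

# To actually send it, the server must be running:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
#     print(result["answer"])
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before sending anything.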
Endpoint: POST /search
Request Body:
{
"question": "How does AI help doctors?"
}

Example Response:
{
"question": "How does AI help doctors?",
"context": [
"AI is transforming healthcare and diagnostics.",
"Deep learning assists in image-based medical analysis.",
"FastAPI enables rapid API development for AI apps."
],
"answer": "AI is transforming healthcare and diagnostics, offering significant assistance to doctors. It helps by revolutionizing the way medical information is processed and understood. Specifically, deep learning, a key AI technology, assists doctors with medical analysis. This support is particularly beneficial for image-based medical analysis, aiding in more accurate and efficient diagnoses."
}

This mini-project served to solidify the following concepts:
- What embeddings are and how text can be represented numerically.
- The function of FAISS in performing rapid similarity search.
- How to expose an ML workflow through an API using FastAPI.
- The foundation of RAG (Retrieval-Augmented Generation): retrieve relevant text first, then generate an answer from it.
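
The RAG idea in the last item comes down to placing the retrieved passages in front of the question before handing both to an LLM. Below is a minimal sketch of that prompt-assembly step; the function name `build_prompt` and the prompt wording are assumptions for illustration, not the project's actual template, and no LLM is called here.

```python
def build_prompt(question, context):
    """Combine retrieved passages and the user question into one LLM prompt."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
    )

retrieved = [
    "AI is transforming healthcare and diagnostics.",
    "Deep learning assists in image-based medical analysis.",
]
prompt = build_prompt("How does AI help doctors?", retrieved)
print(prompt)
```

The resulting string is what would be sent to a text-generation model, so the quality of the retrieved context directly shapes the answer.
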

Planned improvements:
- Add real dataset ingestion (e.g., text files, articles, PDF content) instead of mock data.
- Integrate the Gemini or OpenAI LLM to generate a context-based answer from the retrieved documents.
- Implement saving and loading of the FAISS index for persistent storage.
- Deploy the application on platforms like Render or Hugging Face Spaces.
Aashish Gupta
.NET & Python Developer | Exploring AI Systems
🌱 “I’m not chasing perfection — I’m learning by doing, one project at a time.”