Welcome to Reading Bee, where every book finds its buzz! ✨
Reading Bee is an online database and personalized book recommendation platform. It integrates an LLM chatbot, advanced search, and similarity-based recommendations into one unified system.
- LLM Chatbot for book suggestions
- Advanced Book Search & Filtering
- Save favorite books to "My List"
- Similar Book Recommendations based on semantic search
- Full-Stack: backend + frontend + PostgreSQL database
- Data engineering and sentiment analysis
- 💬 A conversational interface based on Retrieval-Augmented Generation (RAG). Users can ask for book suggestions in natural language, and the Ollama LLM chatbot returns relevant titles and summarizes results using embeddings and metadata.
- ⚙ The LLM dynamically triggers tool calls to the backend. User messages (such as book descriptions or ISBNs) are parsed and used to perform a vector search (PostgreSQL + pgvector/FAISS) that retrieves similar books.
- ⚡️ Quick search by title, author, or ISBN with validation checks (e.g. minimum text length, 13-digit ISBN requirement).
- 🔍 Advanced search and filtering by year, rating, price, publisher, category, etc. Results are ranked by relevance and rating, with book detail pages showing metadata, cover, and reader reviews.
- 📚 Registered users can create a personal bookshelf, "My List", to organize their favorite books and add or remove items at any time.
- ❤️ On each book detail page, a "You Might Also Like" section suggests related books with similar themes, authors, or content (via semantic search).
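The 13-digit ISBN check mentioned above can go beyond a length test: ISBN-13 carries a standard check digit (digits weighted 1, 3, 1, 3, …, total divisible by 10). A minimal sketch of such a validator in Python (illustrative only, not the project's actual code):

```python
def is_valid_isbn13(isbn: str) -> bool:
    """Validate an ISBN-13 string using the standard check-digit rule.

    Digits are weighted 1, 3, 1, 3, ... and the weighted sum must be
    divisible by 10. Hyphens and spaces are tolerated.
    """
    digits = [c for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0
```

For example, `is_valid_isbn13("978-0439708180")` accepts a well-formed ISBN, while a single wrong digit makes the check fail.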
- 🖥️ Backend APIs – RESTful endpoints built with FastAPI + Pydantic, connecting to a PostgreSQL database. 👉 Backend API
- 🎨 Frontend UI – Responsive React interface with dynamic components (grids, filters, hover effects, book cards). Designed in Figma with user-friendly layouts. 👉 Frontend & UI design
- 🗄️ Database – A normalized PostgreSQL schema (3NF) with junction tables for many-to-many relationships. Supports efficient joins, aggregated views, and complex SQL queries. Semantic similarity search is powered by FAISS (Facebook AI Similarity Search) with Sentence-BERT embeddings. 👉 Database Docs
- 🔐 Authentication – Secure sign-up/login with JWT tokens, where each user account is identified by a UUID (uuid4). Tokens include the user ID, issue time, and expiration; only authorized users can manage their profiles.
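The token contents described above (user ID, issue time, expiration) can be sketched with a minimal HS256 JWT built from the standard library alone. This is an illustration of the token structure, not the project's implementation (which would typically use a library such as PyJWT); the `SECRET` value and function names here are hypothetical:

```python
import base64
import hashlib
import hmac
import json
import time
import uuid

SECRET = b"change-me"  # hypothetical signing key; a real app loads this from config


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def issue_token(user_id: str, ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT carrying the user id, issue time, and expiration."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    now = int(time.time())
    payload = _b64url(json.dumps(
        {"sub": user_id, "iat": now, "exp": now + ttl_seconds}
    ).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"


token = issue_token(str(uuid.uuid4()))
```

The backend would verify the signature and reject tokens whose `exp` has passed before letting a user touch their profile.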
- 📊 Integrated Data Sources – Combines Amazon Books and Book-Crossing raw data into a unified dataset, with large-scale metadata joins and review aggregation.
- 🧹 Processing & Normalization – Cleaning, merging, handling missing values, data standardization, book title canonicalization, author deduplication, and more.
- ⚙️ Feature Engineering – Sentence-BERT embeddings, a FAISS vector index for semantic similarity, and sentiment scoring (positivity/negativity with VADER + GPT for multilingual reviews). 👉 Data Processing Docs
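The similarity search behind "You Might Also Like" boils down to ranking book embeddings by closeness to a query embedding. FAISS does this at scale with optimized indexes over Sentence-BERT vectors; the toy sketch below shows the same idea as a plain-Python linear scan over tiny made-up vectors (nothing here is project code):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def top_k_similar(query_vec, book_vecs, k=3):
    """Return the ids of the k books whose embeddings are closest to the query.

    A FAISS index performs this ranking efficiently over hundreds of
    thousands of vectors; this linear scan is the idea in miniature.
    """
    scored = sorted(
        ((cosine(query_vec, vec), book_id) for book_id, vec in book_vecs.items()),
        reverse=True,
    )
    return [book_id for _, book_id in scored[:k]]


# Toy 3-dimensional "embeddings" standing in for Sentence-BERT vectors
books = {"a": [1, 0, 0], "b": [0.9, 0.1, 0], "c": [0, 1, 0]}
print(top_k_similar([1, 0, 0], books, k=2))  # ['a', 'b']
```

Swapping the linear scan for a FAISS `IndexFlatIP` (or an approximate index) changes the performance, not the ranking concept.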
- Frontend: React, HTML, CSS, JavaScript
- Backend: FastAPI, JWT, Pydantic, Pytest, Postman
- Database: PostgreSQL
- Vector Search: FAISS (Facebook AI Similarity Search)
- Data Processing: Python, Pandas, scikit-learn, Google Colab
- Version Control: Git, GitHub
- Recommendation System:
- Retrieval-Augmented Generation (RAG) pipeline
- Sentence-BERT embeddings
- GPT + Ollama LLM
- VADER sentiment analysis
- FAISS (Facebook AI Similarity Search)
- Deployment: Docker, Docker Compose for DevOps
root/
│
├── reading-bee-data-private/        ← Data repository (separate)
│   ├── *.csv                        ← Book metadata CSVs
│   ├── description_embeddings.npy
│   └── description_index.faiss
│
└── reading-bee/
    ├── backend/                     ← FastAPI backend service
    │   ├── routes/                  ← API route handlers
    │   ├── main.py                  ← App entry point
    │   ├── db.py                    ← Database connection
    │   └── ...
    ├── frontend/                    ← React frontend app
    │   ├── assets/                  ← Static assets
    │   ├── components/              ← Source code (JSX, CSS)
    │   ├── index.html               ← Main entry point for the website
    │   └── ...
    ├── database/                    ← SQL scripts and schema
    ├── docker/                      ← Docker-related configs
    │   └── ...
    ├── data/                        ← Raw data and processing notebooks
    │   ├── raw/                     ← Raw data files
    │   ├── data-processing/         ← Data analysis and processing
    │   └── ...
    ├── docker-compose.yml           ← Main Docker Compose config
    ├── start_dev.sh                 ← Development launcher script
    └── README.md                    ← Project overview
This project uses Docker + Ollama for local development and the LLM; please follow the instructions below.
Cloning the data repo:
git clone https://github.com/Chengyuli33/reading-bee-data-private.git
cd ..
Cloning the main Reading Bee repo:
git clone https://github.com/Chengyuli33/reading-bee.git
cd reading-bee
reading-bee-data-private/ should be placed under the same root as the reading-bee/ folder:
root/
├── reading-bee
└── reading-bee-data-private
Download Docker Desktop to run containers locally:
brew install --cask docker
Download Ollama LLM:
brew install ollama
Then, start Docker Desktop manually:
⌘ + Space → Docker
Start Ollama service in background and pull the model (first time only):
ollama serve &
ollama pull llama3
This will download and launch the llama3 model (first time ≈ 4 GB, may take a few minutes).
# Make the script executable
chmod +x start_dev.sh
# Run the development environment
./start_dev.sh
You will see the following logs if everything is running smoothly:
🐝 Welcome to Reading Bee Dev Environment!
🚀 Starting up...
✅ Ollama service is already running
✅ llama3 model ready
🐳 Starting Docker services...
📦 Using docker compose (v2+)
[+] Building
[+] Running 7/7
...
✨ All services started!
📱 Frontend Website: http://localhost:3000
🔧 Backend FastAPI: http://localhost:8000/docs
🤖 Ollama: http://localhost:11434
If you see port conflicts:
# Check what's using the port
lsof -i :3000 # or :8000, :5432
# Stop existing containers
docker compose down
After starting all services, verify everything works:
docker exec -it readingbee-db psql -U postgres -d reading_bee \
-c "SELECT COUNT(*) FROM all_book_full_details_view;"
Expected output: 273225
Visit http://localhost:8000/docs or use curl:
curl "http://localhost:8000/books/search?title=harry+potter" | jq
Expected output: JSON with a total of 25706 book search results.
Open http://localhost:3000 in the browser.
curl http://localhost:11434/api/tags
This should return a list of models {"models": ...} including llama3.
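Beyond `curl`, the same local server can be exercised from Python via Ollama's `/api/generate` endpoint. The sketch below is a minimal, illustrative client (function names are ours, not the project's); `ask_ollama` only works while the Ollama server from the steps above is running:

```python
import json
import urllib.request


def build_payload(prompt: str, model: str = "llama3") -> bytes:
    """JSON body for Ollama's /api/generate endpoint (non-streaming)."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()


def ask_ollama(prompt: str, host: str = "http://localhost:11434") -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream` set to `False`, Ollama returns one JSON object whose `response` field holds the full completion, which is the simplest shape for a quick smoke test.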
This project is licensed under the Apache License 2.0. See the LICENSE file for details.