📘 Support Memory Weave

LLM-powered pipeline for structuring unstructured customer support data


Support Memory Weave is a FastAPI + LangGraph + PostgreSQL system that transforms raw customer support messages into structured, searchable, and LLM-ready tickets.

The system includes an ingestion pipeline, automated ticket structuring, semantic vector retrieval, and suggested reply generation, all backed by realistic conversational data.


🚀 Features Implemented

1. FastAPI Ingestion Pipeline

  • Accepts raw customer support messages
  • Reconstructs conversation context
  • Creates or extends conversation threads
  • Writes conversations, messages, and tickets into PostgreSQL
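The create-or-extend threading step above can be sketched in plain Python. This is a minimal illustration, not the project's code: names like `Thread` and `ingest_message` are invented here, and the real pipeline persists to PostgreSQL rather than an in-memory dict.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Thread:
    """In-memory stand-in for a conversation row in PostgreSQL."""
    customer_id: str
    messages: list = field(default_factory=list)


# Stand-in for the database-backed conversation store.
_threads: dict[str, Thread] = {}


def ingest_message(customer_id: str, text: str) -> Thread:
    """Extend the customer's open thread, or create a new one."""
    thread = _threads.get(customer_id)
    if thread is None:
        thread = Thread(customer_id=customer_id)
        _threads[customer_id] = thread
    thread.messages.append({"text": text, "at": datetime.now(timezone.utc)})
    return thread
```

Two messages from the same customer land in the same thread, which is what lets the structuring step see full conversation context instead of isolated tweets.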

2. LangGraph-Based Ticket Structuring

A multi-step workflow that generates a structured support ticket:

  • Issue type classification
  • Severity assessment
  • Sentiment detection
  • Short and long summaries
  • Action recommendations
  • Fully deterministic, extensible state-machine design
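The workflow shape can be pictured as a chain of node functions that each enrich a shared state dict. This is a plain-Python analogue of the graph, not the LangGraph API itself; the real graph lives in `app/graphs/ticket_structurer_graph.py`, and the keyword rules below are invented purely for illustration.

```python
def classify_issue(state: dict) -> dict:
    # Toy rule standing in for the LLM classification node.
    state["issue_type"] = "billing" if "charge" in state["message"].lower() else "technical"
    return state


def assess_severity(state: dict) -> dict:
    state["severity"] = "high" if "urgent" in state["message"].lower() else "normal"
    return state


def detect_sentiment(state: dict) -> dict:
    state["sentiment"] = "negative" if "!" in state["message"] else "neutral"
    return state


def summarize(state: dict) -> dict:
    state["short_summary"] = state["message"][:60]
    return state


NODES = [classify_issue, assess_severity, detect_sentiment, summarize]


def run_graph(message: str) -> dict:
    """Deterministic: same input → same node order → same ticket."""
    state = {"message": message}
    for node in NODES:
        state = node(state)
    return state
```

Adding a new field to the ticket is just appending another node to the chain, which is the "expandable" property the list above refers to.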

3. SQL-Backed Long-Term Memory

Using PostgreSQL + SQLAlchemy:

  • Stores conversations, messages, structured tickets
  • Ensures relational consistency
  • Provides a backend for retrieval and analytics
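A minimal sketch of the relational layout using SQLAlchemy's declarative mapping. Table and column names here are illustrative; the actual schema lives in `app/db/models.py`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Conversation(Base):
    __tablename__ = "conversations"
    id = Column(Integer, primary_key=True)
    customer_id = Column(String, nullable=False)
    # One conversation holds many messages; the FK below enforces it.
    messages = relationship("Message", back_populates="conversation")


class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    conversation_id = Column(Integer, ForeignKey("conversations.id"), nullable=False)
    body = Column(Text, nullable=False)
    conversation = relationship("Conversation", back_populates="messages")
```

The foreign key plus paired `relationship()` declarations are what give the "relational consistency" mentioned above: a message cannot exist without a parent conversation row.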

4. Dataset Ingestion Support

  • Script to ingest curated subsets of the Customer Support on Twitter dataset
  • Loads real inbound user tweets
  • Automatically transforms them into structured tickets
  • Enables realistic testing and evaluation

5. Local Embeddings & Semantic Vector Search

  • Uses SentenceTransformers (all-MiniLM-L6-v2) to embed ticket summaries
  • Stores vector embeddings directly in PostgreSQL (JSON format)
  • Retrieval powered by cosine similarity
  • `/tickets/{id}/suggest-reply` returns semantically similar tickets
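Because embeddings are stored as JSON arrays, retrieval reduces to parsing them back and ranking by cosine similarity. A minimal sketch, assuming the vectors were produced upstream by `all-MiniLM-L6-v2` (the toy 3-d vectors below are placeholders, and `top_k` is a hypothetical helper name):

```python
import json
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], rows: list[tuple], k: int = 3) -> list[tuple]:
    """rows: (ticket_id, embedding_json) pairs as read from PostgreSQL."""
    scored = [(tid, cosine_similarity(query_vec, json.loads(vec))) for tid, vec in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

This brute-force scan is fine at the scale of a curated dataset subset; a pgvector index would be the natural upgrade path if the corpus grows.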

6. Suggested Reply Generation

  • Retrieves similar tickets as evidence
  • Synthesizes a suggested reply using contextual patterns
  • Modular design — ready to be swapped with real LLM calls later
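The "swappable" part of the design can be as simple as keeping synthesis behind one function whose body a real LLM call can later replace. A hedged sketch, with invented function and field names:

```python
def synthesize_reply(ticket: dict, similar: list[dict]) -> str:
    """Compose a suggested reply from patterns seen in similar tickets.

    Deliberately a single seam: swapping this template logic for an
    LLM call would not touch any caller.
    """
    actions = {t["recommended_action"] for t in similar if t.get("recommended_action")}
    lines = [f"Hi, thanks for reaching out about your {ticket['issue_type']} issue."]
    if actions:
        lines.append("Based on similar cases, we suggest: " + "; ".join(sorted(actions)) + ".")
    lines.append("We'll follow up shortly.")
    return " ".join(lines)
```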

7. Automatic API Documentation

Interactive Swagger UI for all routes:

👉 http://localhost:8000/docs


🧠 Architecture Overview

Raw message
     ↓
FastAPI → PostgreSQL storage
     ↓
LangGraph Ticket Structurer
     ↓
Structured Ticket (issue type, severity, sentiment, summaries, action)
     ↓
Embedding generation → stored in PostgreSQL
     ↓
Semantic vector search → similar tickets
     ↓
Suggested reply generation

🗂️ Project Structure

app/
├── api/
│   └── v1/
│       └── tickets.py
├── core/
│   ├── config.py
│   └── embedding_client.py
├── db/
│   ├── base.py
│   ├── models.py
│   └── session.py
├── graphs/
│   └── ticket_structurer_graph.py
├── schemas/
│   └── tickets.py
├── services/
│   ├── structuring.py
│   └── retrieval.py
└── main.py

scripts/
├── load_twitter_dataset.py
└── embed_all_tickets.py

🛠 Tech Stack

| Layer | Tool |
| --- | --- |
| API | FastAPI |
| Workflow | LangGraph |
| Database | PostgreSQL + SQLAlchemy |
| Embeddings | SentenceTransformers |
| Retrieval | Cosine similarity |
| LLM Integration | OpenAI / Gemini / Vertex (planned) |
| Dataset | Customer Support on Twitter (Kaggle) |

📚 Dataset

This project uses a curated subset of the Customer Support on Twitter dataset:

Axelbrooke, S. (2017). Customer Support on Twitter. Kaggle.
DOI: 10.34740/KAGGLE/DSV/8841
https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter

The full dataset contains 2.8M+ tweets across a wide range of brands and support scenarios.
For local development, this project uses a filtered subset consisting only of inbound customer messages (inbound=True), prepared via a Kaggle notebook and exported as a lightweight CSV.

This dataset provides realistic, modern conversational support data for evaluating ticket structuring, memory, embedding-based retrieval, and reply suggestion workflows.
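Filtering the Kaggle export down to inbound customer messages needs nothing beyond the `csv` module. A sketch, assuming the published `twcs.csv` column names (`author_id`, `inbound`, `text`); the function name and path handling are illustrative:

```python
import csv


def load_inbound_tweets(path: str, limit: int = 1000) -> list[dict]:
    """Read the Kaggle CSV and keep only inbound (customer-authored) tweets."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("inbound") == "True":  # stored as the string "True"/"False"
                rows.append({"author_id": row["author_id"], "text": row["text"]})
                if len(rows) >= limit:
                    break
    return rows
```

The `limit` cap is what keeps the local subset lightweight relative to the 2.8M-tweet full dataset.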


🧪 Running Locally

1. Virtual environment

python3 -m venv venv
source venv/bin/activate

2. Install dependencies

pip install -r requirements.txt

3. Create .env

POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=support_memory_weave
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
REDIS_HOST=localhost
REDIS_PORT=6379

4. Start Postgres (local or Docker)

docker-compose up -d

5. Launch API

uvicorn app.main:app --reload

6. API Docs

👉 http://localhost:8000/docs


🏷️ Versioning

v0.2.0

  • Added dataset ingestion pipeline
  • Added local embeddings (SentenceTransformers)
  • Added semantic vector search via cosine similarity
  • Enhanced suggested reply engine
  • Retains all v0.1.0 functionality

v0.1.0

  • Initial MVP: FastAPI + LangGraph structuring + PostgreSQL storage
