A real-time AI-powered document processing system. It demonstrates streaming ETL, dynamic indexing, and live RAG capabilities for logistics and finance operations.

bunnysunny24/Logistics

🚛 Logistics Pulse Copilot - Real-Time AI Assistant

Pathway Hackathon Real-Time RAG Track 2

🏆 Pathway Hackathon Submission - Track 2: Logistics Pulse Copilot

A real-time AI-powered logistics and finance document processing system that detects anomalies, ensures compliance, and provides instant insights using Pathway's streaming ETL pipeline.

🎯 Problem Statement

In logistics operations, critical updates happen every few minutes:

  • 8:07 AM: Driver safety status changes from "Low" to "High risk"
  • 8:12 AM: Finance publishes new payout rules with updated rates
  • 8:18 AM: Shipment scan flags "Exception: package missing"

The Challenge: If these updates don't surface instantly, bad decisions follow: unsafe drivers stay on the road, wrong rates get quoted, and customers wait in the dark.

Our Solution: A real-time RAG application that watches live data sources, indexes every new record through Pathway, and proves its currency with instant, up-to-date responses.

✨ Hackathon Requirements Compliance

✅ Pathway-Powered Streaming ETL

  • Core Engine: Pathway framework handles all data ingestion and processing
  • Real-Time Processing: Continuously ingests from file directories, APIs, and webhooks
  • Streaming Pipeline: backend/pipeline/pathway_ingest.py implements the backbone ETL

✅ Dynamic Indexing (No Rebuilds)

  • On-the-Fly Integration: New data indexed automatically without manual reloads
  • Real-Time Updates: Data changes flow through to answers immediately
  • No Manual Rebuilds: Pathway's incremental processing eliminates rebuild needs
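The idea of indexing without rebuilds can be illustrated with a small, standard-library sketch. This is not Pathway's API and not the project's code; the `IncrementalIndex` class and the sample document ids are hypothetical. It only shows the principle: each new document updates the index in place, so the next query sees it.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that accepts new documents without a rebuild."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of documents containing it
        self.docs = {}                    # doc id -> original text

    def add(self, doc_id, text):
        # Remove stale postings first, so re-adding an id acts as an update.
        if doc_id in self.docs:
            for token in set(self.docs[doc_id].lower().split()):
                self.postings[token].discard(doc_id)
        self.docs[doc_id] = text
        for token in set(text.lower().split()):
            self.postings[token].add(doc_id)

    def search(self, query):
        # Return ids of documents that contain every query token.
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


index = IncrementalIndex()
index.add("shipment-1", "shipment delivered on time")
before = index.search("package missing")   # nothing matches yet
index.add("shipment-2", "exception package missing")
after = index.search("package missing")    # the new record is visible immediately
```

The same query returns different results before and after the second `add`, with no rebuild step in between.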

✅ Live Retrieval/Generation Interface

  • Multiple Interfaces: FastAPI endpoints, React frontend, and direct API access
  • Real-Time Responses: Answers reflect latest data within seconds
  • Live Updates: T+0 data changes included in T+1 queries

✅ Demo Video Ready

  • Before/After Proof: System designed to showcase live update flow
  • Real-Time Demo: Add file → trigger update → see new answers immediately
  • Hackathon Validation: Built-in demo endpoints for judges

🏗️ Architecture Overview

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Data Sources  │    │  Pathway Engine  │    │  AI Interface   │
│                 │    │                  │    │                 │
│ • CSV Files     │───▶│ • Streaming ETL  │───▶│ • FastAPI       │
│ • PDF Docs      │    │ • Dynamic Index  │    │ • React UI      │
│ • API Feeds     │    │ • Anomaly Engine │    │ • RAG Queries   │
│ • Webhooks      │    │ • Vector Stores  │    │ • Live Updates  │
└─────────────────┘    └──────────────────┘    └─────────────────┘

🔄 Real-Time Data Flow

  1. Ingestion: Pathway monitors ./data/ directories for new files
  2. Processing: Streaming ETL extracts, transforms, and indexes data
  3. Anomaly Detection: AI engine flags suspicious patterns in real-time
  4. Vector Indexing: Documents automatically added to searchable stores
  5. Query Response: Users get answers reflecting latest data instantly
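As a rough illustration of the ingestion step, the sketch below polls a directory and reports files that are new or rewritten since the last scan. It uses only the standard library and is an assumption-laden stand-in: in the actual system Pathway's streaming connectors do the watching, and the `scan_new_files` helper here is hypothetical.

```python
import os
import tempfile


def scan_new_files(directory, seen):
    """Return paths that are new or modified in `directory` since the last scan.

    `seen` maps path -> modification time and is updated in place, so a
    file is reported again only if it is rewritten.
    """
    new_paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        mtime = os.path.getmtime(path)
        if seen.get(path) != mtime:
            seen[path] = mtime
            new_paths.append(path)
    return new_paths


# A temporary directory stands in for ./data/uploads/.
with tempfile.TemporaryDirectory() as uploads:
    seen = {}
    first_scan = scan_new_files(uploads, seen)    # nothing to ingest yet
    with open(os.path.join(uploads, "shipments.csv"), "w") as handle:
        handle.write("shipment_id,status\n1027,exception\n")
    second_scan = scan_new_files(uploads, seen)   # picks up the new file
    third_scan = scan_new_files(uploads, seen)    # unchanged, so nothing new
    new_names = [os.path.basename(p) for p in second_scan]
```

Each path returned by a scan would then be handed to the processing and indexing steps.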

⚡ Getting Started - 30 Second Setup

For Hackathon Judges: Here's the fastest way to see the system in action!

One-Click Windows Start

# Clone and start (all-in-one command; `&&` needs PowerShell 7+)
git clone <repository-url> logistics-pulse-copilot && cd logistics-pulse-copilot && .\start.ps1

One-Click macOS/Linux Start

# Clone and start (all-in-one command)
git clone <repository-url> logistics-pulse-copilot && cd logistics-pulse-copilot && chmod +x start.sh && ./start.sh

Manual Start (3 commands)

# 1. Setup
pip install -r requirements.txt && python setup_enhanced.py

# 2. Start backend
python backend/main_enhanced.py &

# 3. Test it works
curl http://localhost:8000/api/status

🎯 Verification: Visit http://localhost:8000/docs to see the live API, or http://localhost:3000 for the frontend UI.

🚀 Detailed Setup Guide

Prerequisites

  • Python 3.8+
  • Node.js 16+ (for frontend)
  • Git

1. Clone Repository

git clone https://github.com/bunnysunny24/Logistics.git logistics-pulse-copilot
cd logistics-pulse-copilot

2. Backend Setup

# Install Python dependencies
pip install -r requirements.txt

# Set up environment
cp .env.example .env

# Initialize data directories
python setup_enhanced.py

3. Start the System

Option A: Quick Start (Recommended for Demo)

# Windows
.\start.bat

# macOS/Linux
./start.sh

Option B: Manual Start

# Start backend
cd backend
python main_enhanced.py

# Start frontend (new terminal)
cd frontend
npm install
npm start

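4. Verify Installation

# Check that the backend responds
curl http://localhost:8000/api/status

Then open http://localhost:8000/docs for the live API, or http://localhost:3000 for the frontend UI.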

🎮 Demo Instructions

Real-Time Update Demo

  1. Initial Query: Ask "What high-risk shipments do we have?"
  2. Add New Data: Drop a CSV with anomalies into ./data/uploads/
  3. Watch Magic: Same query now returns updated results instantly!

Demo Endpoints

# Check system status
GET /api/status

# Upload new document
POST /api/upload

# Query with real-time data
POST /api/query

# Trigger anomaly detection
POST /api/detect-anomalies

# Get current anomalies
GET /api/anomalies

Demo Scenarios

Scenario 1: Invoice Compliance Alert

curl -X POST "http://localhost:8000/api/query" \
  -H "Content-Type: application/json" \
  -d '{"message": "Are there any non-compliant invoices?"}'

Scenario 2: Shipment Risk Assessment

curl -X POST "http://localhost:8000/api/query" \
  -H "Content-Type: application/json" \
  -d '{"message": "Show me shipments with route deviations"}'

Scenario 3: Real-Time Policy Updates

# Update policy file, then query
curl -X POST "http://localhost:8000/api/query" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are the current late fee rates?"}'
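The curl scenarios above can also be driven from Python. The following is a minimal sketch using only the standard library; the endpoint path and the `message` field come from the curl examples, while the helper names and the assumption that the backend answers with a JSON body are mine.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address used in the curl examples


def build_query_request(message, base_url=BASE_URL):
    """Build the same POST /api/query request that the curl scenarios send."""
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(message, base_url=BASE_URL, timeout=10):
    """Send the query and decode the reply (assumes a running backend that returns JSON)."""
    request = build_query_request(message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need the server; calling ask() does.
request = build_query_request("What are the current late fee rates?")
```

Calling `ask(...)` before and after updating a policy file is one way to reproduce Scenario 3 in a script.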

🔧 Key Components

Pathway Integration

| Component | File | Purpose |
| --- | --- | --- |
| Streaming ETL | backend/pipeline/pathway_ingest.py | Core Pathway pipeline for data processing |
| Pipeline Manager | backend/pipeline/pathway_manager.py | Controls and monitors Pathway operations |
| Real-Time RAG | backend/models/rag_model.py | Integrates Pathway with vector stores |

RAG System

| Component | File | Purpose |
| --- | --- | --- |
| Local LLM | backend/models/local_llm.py | Hugging Face model integration |
| Vector Stores | backend/models/rag_model.py | FAISS + Pathway vector indexing |
| Anomaly Detection | backend/pipeline/enhanced_anomaly_detector.py | Real-time anomaly flagging |

API Layer

| Endpoint | Purpose | Real-Time Feature |
| --- | --- | --- |
| /api/upload | Document ingestion | Immediate processing via Pathway |
| /api/query | Natural language queries | Latest data always included |
| /api/anomalies | Risk alerts | Real-time anomaly detection |
| /api/status | System health | Live pipeline monitoring |

📊 Use Cases Implemented

1. Driver Safety Monitoring

  • Real-Time Updates: Driver risk status changes trigger immediate alerts
  • Example: "Driver Maya moved from Low to High risk - recommend reassignment"
  • Data Sources: Safety files, incident reports, performance metrics
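A hedged sketch of how a risk-status change could be turned into an alert like the example above. The risk levels, the snapshot format, and the `detect_risk_escalations` function are illustrative assumptions, not the project's actual detector.

```python
RISK_ORDER = {"Low": 0, "Medium": 1, "High": 2}  # assumed risk levels


def detect_risk_escalations(previous, current):
    """Compare two {driver: risk_level} snapshots and describe escalations."""
    alerts = []
    for driver, new_level in sorted(current.items()):
        old_level = previous.get(driver)
        if old_level is None:
            continue  # new driver: nothing to compare against
        if RISK_ORDER[new_level] > RISK_ORDER[old_level]:
            message = f"Driver {driver} moved from {old_level} to {new_level} risk"
            if new_level == "High":
                message += " - recommend reassignment"
            alerts.append(message)
    return alerts


# Hypothetical snapshots taken before and after a safety file update.
alerts = detect_risk_escalations(
    previous={"Maya": "Low", "Omar": "Medium"},
    current={"Maya": "High", "Omar": "Medium"},
)
```

Only the driver whose level rose produces an alert; unchanged drivers are ignored.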

2. Invoice & Payment Compliance

  • Policy Tracking: System cross-checks invoices against up-to-date contract terms
  • Example: "Invoice #234 is non-compliant: late-fee clause #4 now applies"
  • Real-Time: Finance updates → instant policy application
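The effect of a policy update on compliance checks can be sketched as follows. All field names (`grace_days`, `approval_limit`, and the invoice keys) and the sample values are hypothetical; the point is only that the check reads the current policy at call time, so a changed rule applies to the very next evaluation.

```python
from datetime import date


def check_invoice(invoice, policy, today):
    """Return a list of compliance issues for one invoice under the current policy."""
    issues = []
    days_overdue = (today - invoice["due_date"]).days
    if not invoice["paid"] and days_overdue > policy["grace_days"]:
        issues.append(f"late-fee clause applies ({days_overdue} days overdue)")
    if invoice["amount"] > policy["approval_limit"] and not invoice["approved"]:
        issues.append("amount exceeds approval limit without approval")
    return issues


# Illustrative data only.
policy = {"grace_days": 5, "approval_limit": 10_000}
invoice = {
    "id": "234",
    "due_date": date(2025, 1, 10),
    "paid": False,
    "amount": 4_200,
    "approved": False,
}
issues_before = check_invoice(invoice, policy, today=date(2025, 1, 14))
policy["grace_days"] = 2  # finance publishes a stricter rule
issues_after = check_invoice(invoice, policy, today=date(2025, 1, 14))
```

The same invoice is compliant under the old rule and flagged under the new one.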

3. Shipment Anomaly & Fraud Detection

  • Live Monitoring: Real-time shipment feeds flag suspicious patterns
  • Example: "Shipment #1027 shows significant route deviation; possible fraud"
  • Instant Investigation: Pulls relevant policies and historical data immediately
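One simple way to flag a "significant route deviation" is a relative-distance threshold. This is a sketch under stated assumptions: the 20% threshold, the `planned_km` and `actual_km` fields, and the sample records are hypothetical, and the real detector may use entirely different signals.

```python
def flag_route_deviations(shipments, threshold=0.20):
    """Return (id, deviation) pairs where actual distance exceeds the plan by more than `threshold`."""
    flagged = []
    for shipment in shipments:
        planned = shipment["planned_km"]
        actual = shipment["actual_km"]
        if planned <= 0:
            continue  # skip records without a usable plan
        deviation = (actual - planned) / planned
        if deviation > threshold:
            flagged.append((shipment["id"], round(deviation, 2)))
    return flagged


# Illustrative records only.
sample = [
    {"id": "1027", "planned_km": 400.0, "actual_km": 560.0},  # 40% longer than planned
    {"id": "1028", "planned_km": 250.0, "actual_km": 260.0},  # 4% longer than planned
]
flagged = flag_route_deviations(sample)
```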

🔍 Technical Deep Dive

Pathway Streaming Architecture

# Core streaming pipeline (excerpt; the helper methods are defined elsewhere in the class)
import pathway as pw

class PathwayIngestPipeline:
    def build_pipeline(self):
        # 1. Input connectors, one per data type. Streaming mode keeps watching
        #    each directory for new or changed files.
        #    Depending on the Pathway version, CSV sources may also need a schema= argument.
        invoices = pw.io.fs.read("./data/invoices", format="csv", mode="streaming")
        shipments = pw.io.fs.read("./data/shipments", format="csv", mode="streaming")

        # 2. Real-time processing
        processed_docs = self._process_documents(invoices, shipments)

        # 3. Anomaly detection
        anomalies = self._detect_anomalies(processed_docs)

        # 4. Vector indexing
        self._index_documents(processed_docs)

Dynamic Indexing

# RAG model with Pathway integration
class LogisticsPulseRAG:
    def add_document_to_index(self, content, doc_type, metadata):
        if self.pathway_enabled:
            # Route through Pathway for real-time processing
            self._route_through_pathway(content, doc_type, metadata)
        
        # Also update local store for immediate access
        self._add_to_local_vector_store(content, doc_type, metadata)

Real-Time Query Processing

# Live retrieval with latest data
def process_query(self, query):
    # 1. Sync with Pathway's latest output
    self.sync_with_pathway_index()
    
    # 2. Hybrid retrieval (semantic + keyword)
    docs = self._hybrid_search(query)
    
    # 3. Generate response with fresh data
    return self._generate_response(query, docs)
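To make the hybrid retrieval step more concrete, here is a small self-contained sketch. It is not the project's `_hybrid_search`: the function names, the `alpha` weighting, and the sample documents are assumptions, and bag-of-words cosine similarity stands in for the embedding-based semantic score that a FAISS-backed store would provide.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_score(query_tokens, doc_tokens):
    """Fraction of distinct query tokens that occur in the document."""
    query_set = set(query_tokens)
    return len(query_set & set(doc_tokens)) / len(query_set) if query_set else 0.0


def hybrid_search(query, docs, alpha=0.5, top_k=2):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    q_tokens = tokenize(query)
    q_vec = Counter(q_tokens)
    scored = []
    for doc_id, text in docs.items():
        d_tokens = tokenize(text)
        vector_part = cosine(q_vec, Counter(d_tokens))
        keyword_part = keyword_score(q_tokens, d_tokens)
        scored.append((alpha * vector_part + (1 - alpha) * keyword_part, doc_id))
    # Highest score first; ties broken by id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


# Illustrative documents only.
docs = {
    "policy": "late fee rates apply after thirty days",
    "shipment": "shipment 1027 shows route deviation",
    "driver": "driver risk status changed to high",
}
results = hybrid_search("late fee rates", docs)
```

The `alpha` parameter controls the balance: 1.0 uses only the vector score, 0.0 only the keyword overlap.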

📈 Performance & Scalability

  • Latency: Sub-second query responses with real-time data
  • Throughput: Handles hundreds of documents per minute
  • Scalability: Pathway enables horizontal scaling
  • Memory: Efficient vector store management with FAISS

🧪 Testing

Run Tests

# Backend tests
python -m pytest backend/tests/

# Integration tests
python test_complete_workflow.py

# Real-time demo
python demo_causal_flow.py

Test Real-Time Updates

# 1. Start system
python start_system.py

# 2. Upload test data
python test_upload_data.py

# 3. Verify real-time processing
python test_dashboard_update.py

📋 Data Formats Supported

| Format | Examples | Real-Time Processing |
| --- | --- | --- |
| CSV | Invoices, shipments, driver data | ✅ Streaming ETL |
| PDF | Policies, contracts, reports | ✅ Text extraction + indexing |
| JSON | API feeds, webhook data | ✅ Direct processing |
| Markdown | Policy documents | ✅ Chunked indexing |
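Handling several formats usually comes down to dispatching on the file type. The sketch below shows one possible shape using only the standard library; the handler names and the paragraph-based Markdown chunking are assumptions, and PDF text extraction is deliberately left out because it needs a third-party library.

```python
import csv
import io
import json
from pathlib import Path


def parse_csv(text):
    # One dict per row, keyed by the header line.
    return list(csv.DictReader(io.StringIO(text)))


def parse_json(text):
    return json.loads(text)


def parse_markdown(text):
    # Split on blank lines so each paragraph becomes one chunk.
    return [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]


HANDLERS = {".csv": parse_csv, ".json": parse_json, ".md": parse_markdown}


def parse_document(filename, text):
    """Pick a parser from the file extension; unsupported types raise ValueError."""
    suffix = Path(filename).suffix.lower()
    handler = HANDLERS.get(suffix)
    if handler is None:
        raise ValueError(f"Unsupported format: {suffix or filename}")
    return handler(text)


rows = parse_document("shipments.csv", "shipment_id,status\n1027,exception\n")
chunks = parse_document("policy.md", "Late fees\n\nApplied after 30 days")
```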

🎥 Demo Video Highlights

Our demo video showcases:

  1. Initial State: System answers query with existing data
  2. Live Update: New document added to watched directory
  3. Pathway Processing: Real-time ETL pipeline processes new data
  4. Updated Response: Same query now includes new information
  5. Proof of Real-Time: Timestamps show sub-second updates

🔮 Future Enhancements

Agentic RAG (Optional Implementation)

  • LangGraph Integration: Multi-step reasoning workflows
  • Agent Orchestration: Intelligent query routing and escalation
  • REST API: /api/agents endpoint for agentic workflows

Advanced Features

  • Multi-modal Processing: Images, videos, audio files
  • Webhook Integrations: Real-time API feeds
  • Advanced Analytics: Predictive risk modeling
  • Multi-tenant Support: Enterprise deployment ready

📞 Support & Contact

  • Issues: Open GitHub issues for bugs or questions
  • Discussions: Use GitHub Discussions for feature requests
  • Documentation: Check docs/ folder for detailed guides

🏆 Hackathon Submission Checklist

  • ✅ Working Prototype: Fully functional system with real-time updates
  • ✅ Code Repository: Complete source code with clear documentation
  • ✅ Pathway Integration: Core streaming ETL using Pathway framework
  • ✅ Dynamic Indexing: No manual rebuilds required
  • ✅ Live Interface: API and UI for real-time queries
  • ✅ Demo Ready: Built-in demonstration capabilities
  • ✅ Setup Instructions: Clear installation and running guide

📄 License

MIT License

Copyright (c) 2025 Logistics Pulse Copilot

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Built for the Pathway Hackathon 2025 🚀

Demonstrating the power of real-time RAG for logistics operations
