Architecture Document for Supermarket Receipt Parsers and Nutri-Scanorama v2

Overview

This document outlines the strategy for implementing parsers for various German supermarkets in the receipt processing application. The goal is to extend the current functionality to support multiple supermarket receipt formats, ensuring accurate data extraction and processing.

List of Major German Supermarkets

REWE
Aldi (Aldi Nord and Aldi Süd)
Lidl
Edeka
Penny
Netto
Kaufland
Real
dm (dm-drogerie markt)
Rossmann

Strategy for Implementing Supermarket Parsers

Define Parser Requirements
- Each parser should extract relevant data: store name, address, purchase date, item details (name, quantity, price), total amount, and tax details.
- Analyze sample receipts to identify patterns and common elements for each supermarket.
Create a Generic Parser Interface
- Define a common interface or base class for all parsers to ensure consistency and easier management.
- Example interface:
```
interface SupermarketParser {
  parseReceipt(text: string, receiptId: number): Promise<ParsedReceipt>;
}
```
Implement Individual Parsers
- Implement a dedicated parser for each supermarket that adheres to the defined interface, handling the specific receipt format.
- Example functions: parseAldiReceipt, parseLidlReceipt, etc.
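
To make the interface concrete, here is a minimal sketch of one parser. The `ParsedReceipt` shape, the `SketchAldiParser` class, the `parseItemLine` helper and the assumed line layout (`NAME  1,99 A`, total on a `SUMME` line) are illustrative assumptions, not the project's actual types or the real Aldi format.

```typescript
// Hypothetical result shape; the project's real ParsedReceipt may differ.
interface ParsedItem {
  name: string;
  price: number; // euros
  taxClass?: string;
}

interface ParsedReceipt {
  receiptId: number;
  storeName: string;
  items: ParsedItem[];
  totalAmount: number;
}

interface SupermarketParser {
  parseReceipt(text: string, receiptId: number): Promise<ParsedReceipt>;
}

// Assumed layout: item name, whitespace, price with a decimal comma,
// optional tax class letter, e.g. "BIO BANANEN   1,99 A".
const ITEM_LINE = /^(.+?)\s+(-?\d+,\d{2})(?:\s+([A-Z]))?$/;

function parseItemLine(line: string): ParsedItem | null {
  const match = ITEM_LINE.exec(line.trim());
  if (!match) return null;
  const parsed: ParsedItem = {
    name: match[1].trim(),
    price: Number(match[2].replace(',', '.')), // "1,99" -> 1.99
  };
  if (match[3]) parsed.taxClass = match[3];
  return parsed;
}

class SketchAldiParser implements SupermarketParser {
  async parseReceipt(text: string, receiptId: number): Promise<ParsedReceipt> {
    const items: ParsedItem[] = [];
    let totalAmount = 0;
    for (const line of text.split('\n')) {
      const parsed = parseItemLine(line);
      if (!parsed) continue; // header, footer, or unrecognized line
      if (/^summe\b/i.test(parsed.name)) {
        totalAmount = parsed.price; // the total line, not an item
      } else {
        items.push(parsed);
      }
    }
    return { receiptId, storeName: 'Aldi', items, totalAmount };
  }
}
```

Each chain would get its own class with its own patterns, while callers only depend on `SupermarketParser`.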

Modify the Existing Codebase

Update the upload and parsing logic to accommodate the new parsers. Modify the handleFileUpload function to check for keywords or patterns that identify the supermarket and call the appropriate parser.

Example modification:

```
// Compare case-insensitively: receipts often print the store name in capitals.
const receiptText = result.data.text;
const lowerText = receiptText.toLowerCase();
const isAldi = lowerText.includes('aldi');
const isLidl = lowerText.includes('lidl');
const parsedData = await (isAldi
  ? parseAldiReceipt(receiptText, receiptId)
  : isLidl
  ? parseLidlReceipt(receiptText, receiptId)
  : parseReweReceipt(receiptText, receiptId));
```
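
As more chains are added, a nested ternary becomes hard to read. One possible alternative, sketched here under assumptions, is a keyword table checked case-insensitively. `StoreId`, `STORE_KEYWORDS` and `detectStore` are hypothetical names, and returning `'other'` for unrecognized text is an assumption chosen to match the default parser described later in this document.

```typescript
type StoreId = 'aldi' | 'lidl' | 'rewe' | 'other';

// Hypothetical keyword table; add one entry per supported chain.
const STORE_KEYWORDS: Array<{ store: StoreId; keywords: string[] }> = [
  { store: 'aldi', keywords: ['aldi'] },
  { store: 'lidl', keywords: ['lidl'] },
  { store: 'rewe', keywords: ['rewe'] },
];

function detectStore(receiptText: string): StoreId {
  // Lowercase once so "ALDI", "Aldi" and "aldi" all match.
  const haystack = receiptText.toLowerCase();
  for (const entry of STORE_KEYWORDS) {
    for (const keyword of entry.keywords) {
      if (haystack.indexOf(keyword) !== -1) return entry.store;
    }
  }
  return 'other'; // no recognized store: hand over to the default parser
}
```

Plain substring matching can produce false positives when a keyword appears inside an unrelated word, so restricting the search to the first few lines of the receipt may be worth considering.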

Receipt Processing Architecture

Store Name Handling

The logic for determining the store name based on parsed receipt data will be handled in the UploadButton component after the parsing step.
If the store name is identified as 'Other', the user will be prompted to enter the correct store name.
This change ensures that the parsers remain focused solely on parsing tasks without incorporating business logic related to store identification.

Default Receipt Parser Implementation

Purpose: The default receipt parser handles receipts from unrecognized stores. If items are found in the receipt, the user is prompted for missing information such as the store name and address.
Error Handling: If no items are found, the parser throws a ReceiptValidationError indicating that no valid items were detected.
Integration: The default parser is invoked when no recognized store is identified during the parsing process, ensuring that user input is captured for completeness.
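
The behaviour described above could look roughly like the following sketch. `ReceiptValidationError` is named in this document, but its constructor, the `parseUnknownReceipt` function, the `missingFields` list and the item-line pattern are assumptions made for illustration.

```typescript
// Assumed to be a plain Error subclass; the real class may carry more fields.
class ReceiptValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ReceiptValidationError';
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, ReceiptValidationError.prototype);
  }
}

interface DefaultParseResult {
  storeName: 'Other';
  items: Array<{ name: string; price: number }>;
  // Fields the UI should ask the user to fill in.
  missingFields: Array<'storeName' | 'storeAddress'>;
}

function parseUnknownReceipt(text: string): DefaultParseResult {
  const items: Array<{ name: string; price: number }> = [];
  for (const line of text.split('\n')) {
    // Hypothetical generic pattern: "NAME 1,09".
    const match = /^(.+?)\s+(\d+,\d{2})$/.exec(line.trim());
    if (match) {
      items.push({ name: match[1], price: Number(match[2].replace(',', '.')) });
    }
  }
  if (items.length === 0) {
    throw new ReceiptValidationError('No valid items detected');
  }
  return { storeName: 'Other', items, missingFields: ['storeName', 'storeAddress'] };
}
```

Returning the list of missing fields, rather than prompting from inside the parser, keeps the prompt in the UI layer, in line with the Store Name Handling section.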

AI Integration

Overview

The AI integration has been fully implemented with the following features:

Automatic item extraction from receipt text
Smart retry system with up to 3 attempts
Real-time progress tracking and user feedback
Automatic category assignment
Integration with sync queue and local database
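
The retry behaviour with up to 3 attempts can be sketched as a small wrapper. `withRetry`, `RetryOutcome` and the `onProgress` callback are hypothetical names; the actual service may count attempts or report progress differently.

```typescript
const MAX_ATTEMPTS = 3; // from the feature list above

interface RetryOutcome<T> {
  value: T;
  attempts: number; // how many tries were needed
}

async function withRetry<T>(
  task: (attempt: number) => Promise<T>,
  onProgress: (attempt: number, previousError?: unknown) => void = () => {},
): Promise<RetryOutcome<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    onProgress(attempt, lastError); // lets the UI show "attempt 2 of 3"
    try {
      return { value: await task(attempt), attempts: attempt };
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // all attempts failed
}
```

A caller would pass the extraction call as `task` and use `onProgress` to drive the progress indicator and attempt counter.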

Ollama Service Implementation

Processes raw receipt text to extract structured item data
Handles validation and error cases
Provides detailed feedback for debugging
Supports both initial processing and manual retries

User Interface Integration

Progress indicators during extraction
Clear success/error notifications
Manual retry option with attempt tracking
Real-time updates of extracted items

Data Flow

Receipt text → Ollama processing → Structured items
Automatic category assignment
Database updates and sync queue integration
UI state management and updates

AI Integration Enhancements

Model Switching

The application now supports switching between different AI models (fast and precise) for receipt processing. This allows users to optimize processing speed and accuracy based on their needs.

Duplicate Prevention

Implemented logic to clear existing items before adding new ones during AI extraction. This ensures that running AI extraction multiple times on the same receipt does not result in duplicate items.
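
A minimal sketch of the clear-then-add idea follows. `ItemStore` is an in-memory stand-in used only for illustration; the application stores items in IndexedDB, and the method names here are hypothetical.

```typescript
interface StoredItem {
  receiptId: number;
  name: string;
  price: number;
}

// In-memory stand-in for the local items table.
class ItemStore {
  private rows: StoredItem[] = [];

  // Replace, rather than append, so repeated extraction stays idempotent.
  replaceItemsForReceipt(
    receiptId: number,
    fresh: Array<{ name: string; price: number }>,
  ): void {
    // Step 1: drop everything previously stored for this receipt.
    this.rows = this.rows.filter(row => row.receiptId !== receiptId);
    // Step 2: add the newly extracted items.
    for (const entry of fresh) {
      this.rows.push({ receiptId, name: entry.name, price: entry.price });
    }
  }

  itemsForReceipt(receiptId: number): StoredItem[] {
    return this.rows.filter(row => row.receiptId === receiptId);
  }
}
```

Against a real database, both steps would ideally run inside one transaction so that a failure between them cannot leave the receipt without items.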

Enhanced Filtering

Improved parser logic to filter out non-relevant data such as location names and receipt metadata (e.g., "Düsseldorf") from AI extraction results. This enhances the accuracy of the extracted items.
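
Such a filter might look like the sketch below. The pattern list, the `KNOWN_LOCATIONS` array and the function names are assumptions for illustration; the project's actual rules are not shown in this document.

```typescript
interface ExtractedItem {
  name: string;
  price: number;
}

// Hypothetical patterns for lines that are receipt metadata, not products.
const NON_ITEM_PATTERNS: RegExp[] = [
  /^(summe|gesamt|total|mwst|ust|bar|rückgeld)\b/i, // totals, tax, payment
  /^\d{5}\s+\S+/,                                   // postal code + city
  /^\d{2}\.\d{2}\.\d{2,4}/,                         // dates such as 01.02.2024
];

// Hypothetical list of place names seen in receipt headers.
const KNOWN_LOCATIONS = ['düsseldorf'];

function isLikelyItemName(rawName: string): boolean {
  const trimmed = rawName.trim();
  if (trimmed === '') return false;
  if (KNOWN_LOCATIONS.indexOf(trimmed.toLowerCase()) !== -1) return false;
  for (const pattern of NON_ITEM_PATTERNS) {
    if (pattern.test(trimmed)) return false;
  }
  return true;
}

function filterExtractedItems(items: ExtractedItem[]): ExtractedItem[] {
  return items.filter(entry => isLikelyItemName(entry.name));
}
```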

Environment Management

Integrated dotenv for managing environment variables, enhancing configurability and security. This allows for secure and flexible configuration of API keys and URLs.

Store Images Implementation

The image handling system has been implemented with the following features:

Image Storage

Dual Image Storage: Each receipt image is stored in two formats:
- Thumbnail (50px wide, 30% quality) for quick loading and preview
- Full-size (1200px wide, 80% quality) for detailed viewing

Database Schema: The ReceiptImage model includes fields for both versions:

```
interface ReceiptImage {
  id?: number;
  receiptId: number;
  thumbnail: Blob;     // Small version for icon
  fullsize: Blob;      // High quality version for viewing
  mimeType: string;
  size: number;
  createdAt: Date;
}
```

Image Service

Processing: The ImageService handles image resizing and compression:

```
interface ImageProcessingOptions {
  maxWidth?: number;
  quality?: number;
}
```

URL Management: Handles Blob URL creation and cleanup
Memory Optimization: Automatic cleanup of unused Blob URLs
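
The resizing step can be illustrated with the dimension calculation alone. In this sketch, `THUMBNAIL_OPTIONS`, `FULLSIZE_OPTIONS` and `targetSize` are hypothetical names. The widths and quality percentages come from the Image Storage section, expressed on a 0 to 1 scale on the assumption that they are passed to a Canvas-style encoder; the choice never to upscale is also an assumption.

```typescript
interface ImageProcessingOptions {
  maxWidth?: number;
  quality?: number; // assumed 0..1, as used by canvas.toBlob
}

// Values from the Image Storage section above.
const THUMBNAIL_OPTIONS: ImageProcessingOptions = { maxWidth: 50, quality: 0.3 };
const FULLSIZE_OPTIONS: ImageProcessingOptions = { maxWidth: 1200, quality: 0.8 };

// Compute output dimensions that respect maxWidth and keep the aspect ratio.
function targetSize(
  width: number,
  height: number,
  options: ImageProcessingOptions,
): { width: number; height: number } {
  const maxWidth = options.maxWidth === undefined ? width : options.maxWidth;
  if (width <= maxWidth) return { width, height }; // never upscale
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.round(height * scale) };
}
```

The resulting dimensions would then be used to size the canvas before drawing and encoding the image.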

UI Components

ReceiptImageThumbnail
- Displays small receipt preview
- Handles loading states
- Memory-efficient Blob URL management
- Click handling for full-size view
ReceiptImageViewer
- Modal dialog for full-size image viewing
- High-quality image display
- Error handling and loading states
- Clean URL management

Integration Points

Upload Process
- Processes both thumbnail and full-size versions
- Stores both versions in IndexedDB
- Provides upload progress feedback
Display Locations
- Receipt details dialog
- Items page header
- Recent scans list

Performance Considerations

Thumbnail size optimized for quick loading
Full-size image loaded only when needed
Automatic cleanup of Blob URLs
Progressive loading with loading states

Error Handling

Graceful fallback for missing images
Loading state indicators
Clear error messages
Backward compatibility with old image format

AI Text Extraction and Processing

Overview

The AI text extraction system uses a combination of OCR and LLM processing to accurately extract and categorize receipt data. The system is designed to handle various receipt formats while maintaining high accuracy and performance.

Core Components

OCR Processing

```
interface OCRResult {
  text: string;
  confidence: number;
  blocks: Array<{
    text: string;
    bbox: BoundingBox;
    confidence: number;
  }>;
}
```

Tesseract.js for text extraction
Block-level confidence scoring
Position information preservation
Multi-language support

LLM Processing

```
interface ProcessedReceipt {
  storeName: string;
  items: Array<{
    name: string;
    category: CategoryName;
    price: number;
    quantity?: number;
    unit?: string;
    pricePerUnit?: number;
    taxRate?: string;
  }>;
  metadata: {
    storeAddress?: string;
    date?: string;
    totalAmount: number;
    taxDetails?: {
      taxRateA: { rate: number, net: number, tax: number, gross: number };
      taxRateB?: { rate: number, net: number, tax: number, gross: number };
    };
  };
}
```

Structured data extraction
Category assignment
Price and quantity parsing
Tax information extraction

Model Selection

```
type ModelType = 'fast' | 'precise';

const MODELS = {
  fast: 'meta-llama-3.2-1b',
  precise: 'qwen2.5-coder-32b-instruct'
};
```

Dual model approach
Performance vs accuracy tradeoff
Automatic fallback mechanisms
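
The fallback mechanism is not specified further in this document. One plausible reading, sketched here as an assumption, is that a failed request on the preferred model is retried once on the other model. `modelAttemptOrder` and `runWithFallback` are hypothetical helpers.

```typescript
type ModelType = 'fast' | 'precise';

const MODELS: Record<ModelType, string> = {
  fast: 'meta-llama-3.2-1b',
  precise: 'qwen2.5-coder-32b-instruct',
};

// Preferred model first, the other one as fallback (assumed policy).
function modelAttemptOrder(preferred: ModelType): string[] {
  const fallback: ModelType = preferred === 'fast' ? 'precise' : 'fast';
  return [MODELS[preferred], MODELS[fallback]];
}

async function runWithFallback<T>(
  preferred: ModelType,
  call: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const model of modelAttemptOrder(preferred)) {
    try {
      return await call(model);
    } catch (error) {
      lastError = error; // try the next model in the list
    }
  }
  throw lastError; // both models failed
}
```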

Processing Pipeline

Image Preprocessing
- Resolution optimization
- Contrast enhancement
- Noise reduction
- Orientation correction
Text Extraction
- OCR processing
- Confidence filtering
- Layout analysis
- Text cleaning
Data Structuring
- Store detection
- Item parsing
- Price extraction
- Category assignment

Design Decisions

Dual Model Strategy
- Decision: Implement both fast and precise models
- Rationale:
  - Balances speed and accuracy
  - Handles varying receipt complexity
  - Optimizes resource usage
  - Provides user choice
Strict Category Enforcement
- Decision: Use predefined category set
- Rationale:
  - Ensures data consistency
  - Improves categorization accuracy
  - Simplifies reporting
  - Better user experience
Structured Response Format
- Decision: Enforce strict JSON schema
- Rationale:
  - Reliable parsing
  - Type safety
  - Error prevention
  - Easy validation
Progressive Processing
- Decision: Multi-stage extraction pipeline
- Rationale:
  - Better error handling
  - Incremental feedback
  - Recovery options
  - Performance optimization
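
The Strict Category Enforcement and Structured Response Format decisions can be combined in a single validation step. The following is a sketch under assumptions: the response envelope (`{ "items": [...] }`), the `validateLlmResponse` function and the policy of mapping unknown categories to `'Other'` while skipping malformed entries are illustrative choices, not the project's confirmed behaviour.

```typescript
// Category names as defined in the Category Management System section.
const CATEGORY_NAMES = [
  'Fruits', 'Vegetables', 'Dairy', 'Meat', 'Bakery', 'Beverages',
  'Snacks', 'Cereals', 'Other', 'Sweets', 'Oils',
] as const;
type CategoryName = typeof CATEGORY_NAMES[number];

interface ValidatedItem {
  name: string;
  category: CategoryName;
  price: number;
}

// Anything outside the fixed set falls back to 'Other'.
function toCategory(value: unknown): CategoryName {
  for (const candidate of CATEGORY_NAMES) {
    if (candidate === value) return candidate;
  }
  return 'Other';
}

function validateLlmResponse(raw: string): ValidatedItem[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error('LLM response is not valid JSON');
  }
  if (typeof data !== 'object' || data === null) {
    throw new Error('Expected a JSON object');
  }
  const items = (data as { items?: unknown }).items;
  if (!Array.isArray(items)) throw new Error('Expected an "items" array');

  const result: ValidatedItem[] = [];
  for (const entry of items) {
    if (typeof entry !== 'object' || entry === null) continue;
    const candidate = entry as { name?: unknown; category?: unknown; price?: unknown };
    // Skip entries without a usable name or numeric price.
    if (typeof candidate.name !== 'string' || candidate.name.trim() === '') continue;
    if (typeof candidate.price !== 'number' || !isFinite(candidate.price)) continue;
    result.push({
      name: candidate.name.trim(),
      category: toCategory(candidate.category),
      price: candidate.price,
    });
  }
  return result;
}
```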

Error Handling

OCR Failures
- Confidence thresholds
- Retry mechanisms
- Alternative processing paths
- User feedback
LLM Processing
- Response validation
- Fallback processing
- Format correction
- Error reporting

Performance Optimization

Processing Strategy
- Parallel processing where possible
- Caching of intermediate results
- Resource usage monitoring
- Background processing
Memory Management
- Efficient data structures
- Stream processing
- Resource cleanup
- Memory limits

Future Enhancements

Model Improvements
- Custom model training
- Receipt-specific fine-tuning
- Multi-language support
- Performance optimization
Feature Additions
- Advanced tax handling
- Currency conversion
- Receipt comparison
- Fraud detection

Testing Strategy

Unit Tests
- OCR accuracy
- Parser reliability
- Category assignment
- Error handling
Integration Tests
- End-to-end processing
- Model switching
- Error recovery
- Performance metrics

Security Considerations

Data Protection
- Personal information handling
- Data retention policies
- Access controls
- Encryption
Model Security
- Input validation
- Output sanitization
- Resource limits
- Version control

Category Management System

Overview

The category management system is designed to provide a flexible and maintainable way to categorize items from receipts, both automatically and manually. The system consists of several key components that work together to provide a seamless categorization experience.

Core Components

Category Data Structure
```
type CategoryName = 
  | 'Fruits' | 'Vegetables' | 'Dairy' | 'Meat'
  | 'Bakery' | 'Beverages' | 'Snacks' | 'Cereals'
  | 'Other' | 'Sweets' | 'Oils';
```
- Fixed set of categories to ensure consistency
- Each category has an associated color for visual identification
- 'Other' category serves as a fallback for uncategorized items
Category Mapping System
- Maintains a database of keyword-to-category mappings
- Supports both manual and AI-generated mappings
- Uses case-insensitive matching for better accuracy
- Mappings are stored in IndexedDB for offline access
AI Integration
- Uses Ollama service for intelligent categorization
- Strict prompt engineering to ensure category consistency
- Fallback mechanisms for handling unknown items
- Real-time processing with user feedback
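
The case-insensitive keyword lookup with an 'Other' fallback can be sketched as follows. `CategoryMappings` is a hypothetical in-memory class (the application keeps mappings in IndexedDB), and preferring the longest matching keyword is an assumption made here to resolve overlapping keywords.

```typescript
type CategoryName =
  | 'Fruits' | 'Vegetables' | 'Dairy' | 'Meat'
  | 'Bakery' | 'Beverages' | 'Snacks' | 'Cereals'
  | 'Other' | 'Sweets' | 'Oils';

class CategoryMappings {
  private entries: Array<{ keyword: string; category: CategoryName }> = [];

  add(keyword: string, category: CategoryName): void {
    const normalized = keyword.trim().toLowerCase();
    // Replace an existing mapping for the same keyword instead of duplicating it.
    this.entries = this.entries.filter(entry => entry.keyword !== normalized);
    this.entries.push({ keyword: normalized, category });
  }

  lookup(itemName: string): CategoryName {
    const haystack = itemName.toLowerCase();
    let bestKeyword = '';
    let bestCategory: CategoryName = 'Other'; // fallback for uncategorized items
    for (const entry of this.entries) {
      // Longest matching keyword wins (assumed tie-breaking rule).
      if (entry.keyword.length > bestKeyword.length && haystack.indexOf(entry.keyword) !== -1) {
        bestKeyword = entry.keyword;
        bestCategory = entry.category;
      }
    }
    return bestCategory;
  }
}
```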

User Interface Design

Category Manager Component
- Collapsible category sections for better organization
- Preview mode showing limited items per category
- "Show More" functionality for detailed viewing
- Immediate feedback for all user actions
- Integrated AI categorization for bulk processing
Settings Integration
- Category management placed in Settings for easy access
- Clear separation from storage management
- Intuitive interface for adding/removing mappings
- Visual feedback through toast notifications

Data Flow

Manual Categorization

User Input → Keyword/Category Selection → Database Update → UI Refresh

AI Categorization

Text Input → Ollama Processing → Mapping Creation → Database Update → UI Refresh

Category Mapping Usage

Receipt Upload → Item Extraction → Category Lookup → Default/AI Assignment

Design Decisions

Fixed Category Set
- Decision: Use a fixed set of categories rather than user-defined categories
- Rationale:
  - Ensures consistency across the application
  - Simplifies AI training and categorization
  - Prevents category proliferation
  - Makes statistics and visualization more meaningful
Two-Tier Categorization
- Decision: Implement both manual and AI-powered categorization
- Rationale:
  - Manual mappings provide precise control
  - AI categorization handles bulk processing
  - Hybrid approach maximizes accuracy and efficiency
Collapsible UI
- Decision: Use collapsible sections with preview mode
- Rationale:
  - Reduces visual clutter
  - Improves navigation in large datasets
  - Maintains access to full information when needed
Local Storage
- Decision: Store category mappings in IndexedDB
- Rationale:
  - Enables offline functionality
  - Provides fast access to mappings
  - Supports large numbers of mappings

Future Considerations

Performance Optimization
- Implement pagination for large mapping sets
- Add caching for frequently used mappings
- Optimize database queries for faster lookups
Feature Enhancements
- Add bulk import/export of mappings
- Implement mapping suggestions based on user patterns
- Add category statistics and insights
- Support for subcategories if needed
AI Improvements
- Fine-tune AI categorization based on user corrections
- Add confidence scores for AI categorizations
- Implement batch processing for large datasets

Validation and Testing

Category Mapping Tests
- Verify case-insensitive matching
- Test duplicate handling
- Validate category constraints
AI Integration Tests
- Test prompt effectiveness
- Verify category consistency
- Measure categorization accuracy
UI Testing
- Verify responsive design
- Test accessibility features
- Validate user interaction flows

Current Application Features

Home Page

Recent Scans
- Displays recently scanned receipts with detailed item information
- Shows category icons and names for each item
- Provides a clean, modern interface for viewing scanned items
Top Categories
- Shows the most frequently occurring categories
- Displays category icons with corresponding colors
- Helps users track their shopping patterns
Upload Functionality
- Allows users to scan and upload receipts
- Processes receipts using OCR and AI extraction
- Automatically categorizes items based on content

Scanned Items Page

All Items View
- Lists all scanned items chronologically
- Displays item name, category (with icon), and price
- Matches the home page styling for consistency
- Provides a comprehensive view of all scanned items

Data Management

Local Database
- Uses Dexie.js for IndexedDB management
- Stores items with categories and metadata
- Enables offline functionality
Category System
- Predefined categories with custom icons
- Color-coded category indicators
- Consistent category display across all views

User Interface

Navigation
- Bottom navigation bar for easy access
- Intuitive icons for different sections
- Responsive design for mobile use
Styling
- Modern, clean interface
- Consistent color scheme
- Backdrop blur effects for visual appeal
- Proper spacing and padding throughout

AI Integration

Receipt Processing
- OCR for text extraction
- AI-powered item categorization
- Smart total validation
Data Extraction
- Intelligent item name parsing
- Price extraction and validation
- Category suggestion based on item content

Recent Decisions and Updates

Receipt Parsing Logic Enhancements

Case-Insensitive String Comparisons: Implemented case-insensitive checks for store names and other string comparisons to improve parsing accuracy.
Partial Success Logic: Added a mechanism to handle partial success in receipt parsing. If the sum of the extracted item prices differs from the total amount, a warning is logged so that the user can review the items for completeness.
Item Extraction Improvements: Enhanced the item extraction logic to better capture item names and prices from various receipt formats. This includes refining regex patterns and handling different formats in the raw text.
Logging Enhancements: Added structured logging for extracted details, including store name, total amount, store address, and items. Each log entry is tagged with [ALDI_RECEIPT] for easier filtering.
Invalid Item Handling: Updated the validation logic to allow for partial successes. Invalid items are logged without failing the entire receipt processing, enabling users to see which items may need correction.
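
The partial success check compares the item sum with the receipt total. A minimal sketch follows, assuming the comparison is done in integer cents to avoid floating-point rounding issues; `checkTotal`, `TotalCheck` and the optional tolerance are hypothetical.

```typescript
interface PricedItem {
  name: string;
  price: number; // euros
}

interface TotalCheck {
  ok: boolean;
  itemSumCents: number;
  totalCents: number;
  warning?: string;
}

function toCents(amount: number): number {
  return Math.round(amount * 100);
}

function checkTotal(
  items: PricedItem[],
  totalAmount: number,
  toleranceCents = 0,
): TotalCheck {
  let itemSumCents = 0;
  for (const entry of items) itemSumCents += toCents(entry.price);
  const totalCents = toCents(totalAmount);
  if (Math.abs(itemSumCents - totalCents) <= toleranceCents) {
    return { ok: true, itemSumCents, totalCents };
  }
  // Partial success: keep the items, but flag the receipt for review.
  return {
    ok: false,
    itemSumCents,
    totalCents,
    warning:
      `Item sum ${(itemSumCents / 100).toFixed(2)} differs from total ` +
      `${(totalCents / 100).toFixed(2)}; please review the items.`,
  };
}
```

Working in cents matters because, for example, `0.1 + 0.2 === 0.3` is false in floating-point arithmetic, which would otherwise trigger spurious warnings.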

Future Considerations

Further Refinement of Regex Patterns: Continue to refine regex patterns used for item extraction to accommodate more variations in receipt formats.
User Interface Updates: Consider adding UI elements to allow users to manually edit or confirm extracted items when discrepancies are detected.

Testing and Validation

Create unit tests for each parser to ensure they handle various receipt formats and edge cases correctly.
Validate output against known correct data to ensure accuracy.

Documentation and Maintenance

Document each parser's functionality, including specific patterns or rules used for extraction.
Keep parsers updated as supermarkets may change their receipt formats over time.

Receipt Image Management

Overview

The receipt image management system provides efficient storage, optimization, and display of receipt images. It uses a multi-version approach to balance performance with quality, ensuring fast loading times while maintaining high-quality originals for detailed viewing.

Core Components

Image Service

```
class ImageService {
  async processImage(file: File, options: ImageProcessingOptions): Promise<ProcessedImage>;
  async createThumbnail(file: File): Promise<ProcessedImage>;
  async storeReceiptImage(receiptId: number, file: File): Promise<number>;
  async getReceiptImage(receiptId: number): Promise<{original: Blob; thumbnail: Blob} | null>;
}
```

Singleton service pattern
Image processing and optimization
Storage management
Memory efficiency

Data Structure

```
interface ReceiptImage {
  id?: number;
  receiptId: number;
  originalImage: Blob;
  thumbnailImage: Blob;
  mimeType: string;
  size: number;
  createdAt: Date;
}
```

Separate storage from receipt data
Multiple image versions
Metadata tracking
Efficient querying

UI Components

```
interface ReceiptImageViewerProps {
  receiptId: number;
  open: boolean;
  onClose: () => void;
}
```

Modal image viewer
Thumbnail previews
Loading states
Error handling

Processing Pipeline

Image Upload
- File validation
- Type checking
- Size verification
- Initial metadata extraction
Image Processing
- Resolution optimization
- Quality adjustment
- Thumbnail generation
- Format conversion
Storage Management
- Blob storage
- IndexedDB integration
- Version control
- Cleanup routines

Design Decisions

Multi-Version Storage
- Decision: Store both original and optimized versions
- Rationale:
  - Fast thumbnail loading
  - High-quality viewing when needed
  - Bandwidth optimization
  - Better mobile experience
Blob Storage
- Decision: Use Blob storage over base64
- Rationale:
  - Avoids the roughly 33% size overhead of base64 encoding
  - Better memory efficiency
  - Native browser handling
  - Improved performance
Separate Storage
- Decision: Dedicated table for images
- Rationale:
  - Better query performance
  - Simplified backup strategy
  - Easier maintenance
  - Future extensibility
Canvas Processing
- Decision: Use Canvas API for image processing
- Rationale:
  - Client-side optimization
  - Real-time preview
  - Quality control
  - Format flexibility
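
The size argument in the Blob Storage decision follows from how base64 works: every 3 bytes of binary data become 4 ASCII characters, with padding up to a multiple of 4. The helper names below are hypothetical; the arithmetic is standard.

```typescript
// Length in characters of the padded base64 encoding of byteCount bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Relative growth of base64 over the raw bytes (about 0.33 for large inputs).
function base64Overhead(byteCount: number): number {
  return base64Length(byteCount) / byteCount - 1;
}

// Example: a 300,000-byte image becomes 400,000 base64 characters.
const exampleEncodedLength = base64Length(300000);
```

Put the other way round, storing the raw Blob takes about 25% less space than storing the same image as a base64 string.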

Performance Optimizations

Loading Strategy
- Lazy loading for thumbnails
- Progressive loading for originals
- Memory management
- Cache utilization
Resource Management
- URL object cleanup
- Memory monitoring
- Batch processing
- Background operations

Error Handling

Upload Validation
- File type verification
- Size constraints
- Format validation
- Corruption detection
Processing Errors
- Fallback strategies
- User notifications
- Recovery options
- Logging and monitoring

User Experience

Viewing Features
- Smooth transitions
- Loading indicators
- Error messages
- Progress feedback
Interaction Design
- Intuitive controls
- Responsive layout
- Touch support
- Accessibility

Future Enhancements

Image Features
- Rotation controls
- Zoom capabilities
- Cropping tools
- Filter options
Storage Options
- Cloud backup
- Compression improvements
- Format optimization
- Archive functionality

Testing Requirements

Unit Tests
- Processing functions
- Storage operations
- Error handling
- UI components
Integration Tests
- Upload workflow
- Display functionality
- Error scenarios
- Performance metrics

Security Considerations

Upload Security
- File validation
- Size limits
- Type restrictions
- Sanitization
Storage Security
- Access control
- Data encryption
- Secure deletion
- Privacy compliance

Architecture Updates

Service Integration

GLHF Service: Transitioned to direct fetch calls for API requests, similar to LMStudio and Local-LM services. This change simplifies the request handling and aligns all services to a consistent pattern.

Environment Configuration

Environment Variables: Updated variable names and structure for clarity. Added comments and example values in .env.example to guide configuration.

UI Changes

Button Labels: Updated labels in UploadButton.tsx and RecentScans.tsx for consistency and clarity, changing 'Try AI Extraction' to 'Try AI'.

Proxy Server Adjustments

GLHF Endpoint: Modified endpoint handling in proxy-server.js to reflect new environment variable names and direct request handling.

These updates enhance the maintainability, clarity, and consistency of the codebase, ensuring a more streamlined development process and a better user experience.

Conclusion

The receipt image management system provides a robust foundation for handling receipt images efficiently while maintaining a balance between performance and quality. The architecture supports future enhancements and ensures a seamless user experience across different devices and network conditions.

Conclusion

By following this strategy, we can systematically implement parsers for all major German supermarkets, enhancing the application's ability to accurately process diverse receipt formats. These updates aim to enhance the robustness and usability of the receipt parsing functionality within Nutri-Scanorama v2, ensuring a better user experience and more accurate data extraction.
