This document outlines the strategy for implementing parsers for various German supermarkets in the receipt processing application. The goal is to extend the current functionality to support multiple supermarket receipt formats, ensuring accurate data extraction and processing.
- REWE
- Aldi (Aldi Nord and Aldi Süd)
- Lidl
- Edeka
- Penny
- Netto
- Kaufland
- Real
- dm (Drogerie Markt)
- Rossmann
- Define Parser Requirements
- Each parser should extract relevant data: store name, address, purchase date, item details (name, quantity, price), total amount, and tax details.
- Analyze sample receipts to identify patterns and common elements for each supermarket.
- Create a Generic Parser Interface
- Define a common interface or base class for all parsers to ensure consistency and easier management.
- Example interface:
```typescript
interface SupermarketParser {
  parseReceipt(text: string, receiptId: number): Promise<ParsedReceipt>;
}
```
- Implement Individual Parsers
- Implement a dedicated parser for each supermarket that adheres to the defined interface, handling the specific receipt format.
- Example functions: `parseAldiReceipt`, `parseLidlReceipt`, etc.
- Modify the Existing Codebase
- Update the upload and parsing logic to accommodate the new parsers. Modify the `handleFileUpload` function to check for keywords or patterns that identify the supermarket and call the appropriate parser.
- Example modification:
```typescript
const isAldi = result.data.text.includes('Aldi');
const isLidl = result.data.text.includes('Lidl');
// Falls back to the REWE parser when neither keyword matches.
const parsedData = await (
  isAldi ? parseAldiReceipt(result.data.text, receiptId)
  : isLidl ? parseLidlReceipt(result.data.text, receiptId)
  : parseReweReceipt(result.data.text, receiptId)
);
```
- The logic for determining the store name based on parsed receipt data will be handled in the `UploadButton` component after the parsing step.
- If the store name is identified as 'Other', the user will be prompted to enter the correct store name.
- This change ensures that the parsers remain focused solely on parsing tasks without incorporating business logic related to store identification.
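The keyword-based dispatch in `handleFileUpload` could also be expressed as a lookup table, which keeps the selection logic in one place as more supermarkets are added. The following is a minimal sketch, not the project's actual code: the `ParsedReceipt` shape, the `stubParser` helper, and the `PARSERS` table are assumptions made for illustration, and the three parsers are placeholders. It uses case-insensitive patterns so that both "Aldi" and "ALDI" match.

```typescript
// Hypothetical result type; the real ParsedReceipt shape is not shown in this document.
interface ParsedReceipt {
  storeName: string;
  items: Array<{ name: string; price: number }>;
}

type ReceiptParser = (text: string, receiptId: number) => Promise<ParsedReceipt>;

// Placeholder parsers standing in for the real store-specific implementations.
const stubParser = (storeName: string): ReceiptParser =>
  async () => ({ storeName, items: [] });

const parseAldiReceipt = stubParser('Aldi');
const parseLidlReceipt = stubParser('Lidl');
const parseReweReceipt = stubParser('REWE');

// Ordered table: the first entry whose pattern matches the receipt text wins.
const PARSERS: Array<{ pattern: RegExp; parse: ReceiptParser }> = [
  { pattern: /aldi/i, parse: parseAldiReceipt },
  { pattern: /lidl/i, parse: parseLidlReceipt },
];

// Mirrors the example above by falling back to the REWE parser.
function selectParser(text: string): ReceiptParser {
  const entry = PARSERS.find(({ pattern }) => pattern.test(text));
  return entry ? entry.parse : parseReweReceipt;
}
```

With this shape, supporting another store means adding one table entry rather than extending a chain of conditionals.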
- Purpose: Created a default receipt parser to handle unknown receipts. This parser prompts the user for missing information, such as the store name and address, if items are found in the receipt.
- Error Handling: If no items are found, the parser throws a `ReceiptValidationError` indicating that no valid items were detected.
- Integration: The default parser is invoked when no recognized store is identified during the parsing process, ensuring that user input is captured for completeness.
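The default parser's behaviour can be sketched roughly as follows. This is an illustration under assumptions: the definition of `ReceiptValidationError`, the `ParsedItem` and `DefaultParseResult` types, and the `finalizeUnknownReceipt` function are hypothetical, since the document only names the error class and describes the behaviour.

```typescript
// Assumed definition; the document names this error but does not show it.
class ReceiptValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ReceiptValidationError';
  }
}

interface ParsedItem {
  name: string;
  price: number;
}

type PromptField = 'storeName' | 'storeAddress';

interface DefaultParseResult {
  items: ParsedItem[];
  storeName?: string;
  storeAddress?: string;
  // Fields the UI should ask the user to fill in.
  promptFor: PromptField[];
}

function finalizeUnknownReceipt(
  items: ParsedItem[],
  storeName?: string,
  storeAddress?: string,
): DefaultParseResult {
  // Without any items there is nothing worth asking the user about.
  if (items.length === 0) {
    throw new ReceiptValidationError('No valid items were detected in the receipt.');
  }
  const promptFor: PromptField[] = [];
  if (!storeName) promptFor.push('storeName');
  if (!storeAddress) promptFor.push('storeAddress');
  return { items, storeName, storeAddress, promptFor };
}
```

The caller can then open a prompt for each entry in `promptFor` and treat a thrown `ReceiptValidationError` as a failed scan.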
The AI integration has been fully implemented with the following features:
- Automatic item extraction from receipt text
- Smart retry system with up to 3 attempts
- Real-time progress tracking and user feedback
- Automatic category assignment
- Integration with sync queue and local database
- Processes raw receipt text to extract structured item data
- Handles validation and error cases
- Provides detailed feedback for debugging
- Supports both initial processing and manual retries
- Progress indicators during extraction
- Clear success/error notifications
- Manual retry option with attempt tracking
- Real-time updates of extracted items
- Receipt text → Ollama processing → Structured items
- Automatic category assignment
- Database updates and sync queue integration
- UI state management and updates
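The retry behaviour listed above (up to 3 attempts with attempt tracking) might look something like the sketch below. The `withRetry` helper, its parameters, and the `RetryResult` type are hypothetical names chosen for illustration; the actual implementation may differ, for example by adding delays between attempts.

```typescript
interface RetryResult<T> {
  value: T;
  attempts: number; // number of attempts that were needed
}

// Runs `task` until it succeeds or `maxAttempts` is reached.
// `onFailure` lets the UI report progress after each failed attempt.
async function withRetry<T>(
  task: (attempt: number) => Promise<T>,
  maxAttempts: number = 3,
  onFailure?: (attempt: number, error: unknown) => void,
): Promise<RetryResult<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const value = await task(attempt);
      return { value, attempts: attempt };
    } catch (error) {
      lastError = error;
      if (onFailure) onFailure(attempt, error);
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

A manual retry from the UI could call the same helper again, passing the attempt number through to the progress indicator.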
The application now supports switching between different AI models (fast and precise) for receipt processing. This allows users to optimize processing speed and accuracy based on their needs.
Implemented logic to clear existing items before adding new ones during AI extraction. This ensures that running AI extraction multiple times on the same receipt does not result in duplicate items.
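A minimal sketch of this clear-then-add step is shown below, using an in-memory array as a stand-in for the real item table. The `ItemStore` class and its method names are assumptions for illustration only; in the application the same two steps would run against the local database.

```typescript
interface Item {
  receiptId: number;
  name: string;
  price: number;
}

// In-memory stand-in for the item table.
class ItemStore {
  private items: Item[] = [];

  // Removes any items previously stored for the receipt, then adds the new ones.
  replaceItemsForReceipt(receiptId: number, extracted: Array<{ name: string; price: number }>): void {
    this.items = this.items.filter((item) => item.receiptId !== receiptId);
    for (const entry of extracted) {
      this.items.push({ receiptId, name: entry.name, price: entry.price });
    }
  }

  itemsFor(receiptId: number): Item[] {
    return this.items.filter((item) => item.receiptId === receiptId);
  }
}
```

Because the delete and insert are keyed on the receipt ID, repeated extraction runs leave exactly one set of items per receipt.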
Improved parser logic to filter out non-relevant data such as location names and receipt metadata (e.g., "Düsseldorf") from AI extraction results. This enhances the accuracy of the extracted items.
Integrated `dotenv` for managing environment variables, allowing secure and flexible configuration of API keys and URLs.
The image handling system has been implemented with the following features:
- Dual Image Storage: Each receipt image is stored in two formats:
- Thumbnail (50px wide, 30% quality) for quick loading and preview
- Full-size (1200px wide, 80% quality) for detailed viewing
- Database Schema: The `ReceiptImage` model includes fields for both versions:
```typescript
interface ReceiptImage {
  id?: number;
  receiptId: number;
  thumbnail: Blob; // Small version for icon
  fullsize: Blob; // High quality version for viewing
  mimeType: string;
  size: number;
  createdAt: Date;
}
```
- Processing: The `ImageService` handles image resizing and compression:
```typescript
interface ImageProcessingOptions {
  maxWidth?: number;
  quality?: number;
}
```
- URL Management: Handles Blob URL creation and cleanup
- Memory Optimization: Automatic cleanup of unused Blob URLs
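The two size presets above can be captured as `ImageProcessingOptions` values plus a small helper that computes the output dimensions. This is a sketch under assumptions: the `THUMBNAIL` and `FULL_SIZE` constants and the `targetSize` function are hypothetical names, quality is expressed as a 0 to 1 fraction (the range that the Canvas `toBlob` method accepts), and the choice never to upscale small images is an assumption rather than documented behaviour.

```typescript
interface ImageProcessingOptions {
  maxWidth?: number;
  quality?: number;
}

// Presets taken from the values listed above (50px at 30%, 1200px at 80%).
const THUMBNAIL: Required<ImageProcessingOptions> = { maxWidth: 50, quality: 0.3 };
const FULL_SIZE: Required<ImageProcessingOptions> = { maxWidth: 1200, quality: 0.8 };

// Scales the image down to maxWidth while preserving the aspect ratio.
function targetSize(
  width: number,
  height: number,
  maxWidth: number,
): { width: number; height: number } {
  if (width <= maxWidth) {
    return { width, height }; // assumption: images are never upscaled
  }
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.round(height * scale) };
}
```

For a 2400 by 3200 pixel photo, `targetSize(2400, 3200, FULL_SIZE.maxWidth)` yields 1200 by 1600, and the thumbnail preset yields 50 by 67.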
- `ReceiptImageThumbnail`
- Displays small receipt preview
- Handles loading states
- Memory-efficient Blob URL management
- Click handling for full-size view
- `ReceiptImageViewer`
- Modal dialog for full-size image viewing
- High-quality image display
- Error handling and loading states
- Clean URL management
- Upload Process
- Processes both thumbnail and full-size versions
- Stores both versions in IndexedDB
- Provides upload progress feedback
- Display Locations
- Receipt details dialog
- Items page header
- Recent scans list
- Thumbnail size optimized for quick loading
- Full-size image loaded only when needed
- Automatic cleanup of Blob URLs
- Progressive loading with loading states
- Graceful fallback for missing images
- Loading state indicators
- Clear error messages
- Backward compatibility with old image format
The AI text extraction system uses a combination of OCR and LLM processing to accurately extract and categorize receipt data. The system is designed to handle various receipt formats while maintaining high accuracy and performance.
- OCR Processing
```typescript
interface OCRResult {
  text: string;
  confidence: number;
  blocks: Array<{
    text: string;
    bbox: BoundingBox;
    confidence: number;
  }>;
}
```
- Tesseract.js for text extraction
- Block-level confidence scoring
- Position information preservation
- Multi-language support
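Block-level confidence scores make it possible to drop unreliable text before it reaches the LLM. The sketch below shows one way this could work. The `BoundingBox` shape, the `filterByConfidence` function, and the threshold of 60 are assumptions for illustration, and confidence is assumed to be on a 0 to 100 scale.

```typescript
// Assumed shape; the document references BoundingBox without defining it.
interface BoundingBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

interface OCRBlock {
  text: string;
  bbox: BoundingBox;
  confidence: number;
}

interface OCRResult {
  text: string;
  confidence: number;
  blocks: OCRBlock[];
}

// Keeps only blocks at or above the threshold and rebuilds the text
// and the overall confidence from the blocks that remain.
function filterByConfidence(result: OCRResult, minConfidence: number = 60): OCRResult {
  const blocks = result.blocks.filter((block) => block.confidence >= minConfidence);
  const total = blocks.reduce((sum, block) => sum + block.confidence, 0);
  return {
    text: blocks.map((block) => block.text).join('\n'),
    confidence: blocks.length > 0 ? total / blocks.length : 0,
    blocks,
  };
}
```

Position information is preserved because each retained block keeps its original `bbox`.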
- LLM Processing
```typescript
interface ProcessedReceipt {
  storeName: string;
  items: Array<{
    name: string;
    category: CategoryName;
    price: number;
    quantity?: number;
    unit?: string;
    pricePerUnit?: number;
    taxRate?: string;
  }>;
  metadata: {
    storeAddress?: string;
    date?: string;
    totalAmount: number;
    taxDetails?: {
      taxRateA: { rate: number, net: number, tax: number, gross: number };
      taxRateB?: { rate: number, net: number, tax: number, gross: number };
    };
  };
}
```
- Structured data extraction
- Category assignment
- Price and quantity parsing
- Tax information extraction
- Model Selection
```typescript
type ModelType = 'fast' | 'precise';

const MODELS = {
  fast: 'meta-llama-3.2-1b',
  precise: 'qwen2.5-coder-32b-instruct'
};
```
- Dual model approach
- Performance vs accuracy tradeoff
- Automatic fallback mechanisms
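One possible reading of the automatic fallback is that, when the preferred model fails, the other model is tried before giving up. The sketch below illustrates that idea. The `extractWithFallback` function and the `Extractor` type are hypothetical, and the actual call to the model service is passed in as a parameter rather than shown.

```typescript
type ModelType = 'fast' | 'precise';

const MODELS: Record<ModelType, string> = {
  fast: 'meta-llama-3.2-1b',
  precise: 'qwen2.5-coder-32b-instruct',
};

// The function that actually talks to the model service is injected by the caller.
type Extractor<T> = (modelName: string, text: string) => Promise<T>;

async function extractWithFallback<T>(
  text: string,
  preferred: ModelType,
  extract: Extractor<T>,
): Promise<{ result: T; usedModel: ModelType }> {
  // Try the preferred model first, then the other one.
  const order: ModelType[] = preferred === 'fast' ? ['fast', 'precise'] : ['precise', 'fast'];
  let lastError: unknown;
  for (const model of order) {
    try {
      const result = await extract(MODELS[model], text);
      return { result, usedModel: model };
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```

Returning `usedModel` lets the UI tell the user which model produced the result.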
- Image Preprocessing
- Resolution optimization
- Contrast enhancement
- Noise reduction
- Orientation correction
- Text Extraction
- OCR processing
- Confidence filtering
- Layout analysis
- Text cleaning
- Data Structuring
- Store detection
- Item parsing
- Price extraction
- Category assignment
- Dual Model Strategy
- Decision: Implement both fast and precise models
- Rationale:
- Balances speed and accuracy
- Handles varying receipt complexity
- Optimizes resource usage
- Provides user choice
- Strict Category Enforcement
- Decision: Use predefined category set
- Rationale:
- Ensures data consistency
- Improves categorization accuracy
- Simplifies reporting
- Better user experience
- Structured Response Format
- Decision: Enforce strict JSON schema
- Rationale:
- Reliable parsing
- Type safety
- Error prevention
- Easy validation
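The strict category set and the structured response format can be enforced together when the LLM output is parsed. The following is a hedged sketch rather than the project's validator: `parseItemsResponse`, `isRecord`, and the reduced `ExtractedItem` type are hypothetical, and mapping unknown categories to 'Other' is an assumption based on that category's stated fallback role.

```typescript
const CATEGORIES = [
  'Fruits', 'Vegetables', 'Dairy', 'Meat', 'Bakery', 'Beverages',
  'Snacks', 'Cereals', 'Other', 'Sweets', 'Oils',
] as const;
type CategoryName = (typeof CATEGORIES)[number];

interface ExtractedItem {
  name: string;
  category: CategoryName;
  price: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// Parses the raw LLM response and rejects anything that does not match the expected shape.
function parseItemsResponse(raw: string): ExtractedItem[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    throw new Error('LLM response is not valid JSON');
  }
  const items = isRecord(data) ? data.items : undefined;
  if (!Array.isArray(items)) {
    throw new Error('LLM response has no "items" array');
  }
  return items.map((item: unknown, index: number): ExtractedItem => {
    if (!isRecord(item)) throw new Error(`Item ${index} is not an object`);
    const { name, price, category } = item;
    if (typeof name !== 'string' || typeof price !== 'number' || !Number.isFinite(price)) {
      throw new Error(`Item ${index} has an invalid name or price`);
    }
    // Categories outside the predefined set fall back to 'Other'.
    const known = CATEGORIES.find((candidate) => candidate === category);
    return { name, price, category: known ?? 'Other' };
  });
}
```

A failure here is a natural trigger for the format-correction or retry paths described elsewhere in this document.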
- Progressive Processing
- Decision: Multi-stage extraction pipeline
- Rationale:
- Better error handling
- Incremental feedback
- Recovery options
- Performance optimization
- OCR Failures
- Confidence thresholds
- Retry mechanisms
- Alternative processing paths
- User feedback
- LLM Processing
- Response validation
- Fallback processing
- Format correction
- Error reporting
- Processing Strategy
- Parallel processing where possible
- Caching of intermediate results
- Resource usage monitoring
- Background processing
- Memory Management
- Efficient data structures
- Stream processing
- Resource cleanup
- Memory limits
- Model Improvements
- Custom model training
- Receipt-specific fine-tuning
- Multi-language support
- Performance optimization
- Feature Additions
- Advanced tax handling
- Currency conversion
- Receipt comparison
- Fraud detection
- Unit Tests
- OCR accuracy
- Parser reliability
- Category assignment
- Error handling
- Integration Tests
- End-to-end processing
- Model switching
- Error recovery
- Performance metrics
- Data Protection
- Personal information handling
- Data retention policies
- Access controls
- Encryption
- Model Security
- Input validation
- Output sanitization
- Resource limits
- Version control
The category management system is designed to provide a flexible and maintainable way to categorize items from receipts, both automatically and manually. The system consists of several key components that work together to provide a seamless categorization experience.
- Category Data Structure
```typescript
type CategoryName =
  | 'Fruits'
  | 'Vegetables'
  | 'Dairy'
  | 'Meat'
  | 'Bakery'
  | 'Beverages'
  | 'Snacks'
  | 'Cereals'
  | 'Other'
  | 'Sweets'
  | 'Oils';
```
- Fixed set of categories to ensure consistency
- Each category has an associated color for visual identification
- 'Other' category serves as a fallback for uncategorized items
- Category Mapping System
- Maintains a database of keyword-to-category mappings
- Supports both manual and AI-generated mappings
- Uses case-insensitive matching for better accuracy
- Mappings are stored in IndexedDB for offline access
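The keyword lookup with case-insensitive matching and an 'Other' fallback can be sketched as below. The `CategoryMappings` class is a hypothetical in-memory stand-in for the IndexedDB-backed mappings, and substring matching is an assumption about how keywords are compared against item names.

```typescript
type CategoryName =
  | 'Fruits' | 'Vegetables' | 'Dairy' | 'Meat' | 'Bakery' | 'Beverages'
  | 'Snacks' | 'Cereals' | 'Other' | 'Sweets' | 'Oils';

class CategoryMappings {
  // Keywords are stored lower-cased so lookups are case-insensitive.
  private byKeyword = new Map<string, CategoryName>();

  add(keyword: string, category: CategoryName): void {
    const normalized = keyword.trim().toLowerCase();
    if (normalized.length === 0) return; // an empty keyword would match every item
    this.byKeyword.set(normalized, category);
  }

  // Returns the category of the first keyword contained in the item name.
  lookup(itemName: string): CategoryName {
    const name = itemName.toLowerCase();
    let found: CategoryName = 'Other';
    let matched = false;
    this.byKeyword.forEach((category, keyword) => {
      if (!matched && name.includes(keyword)) {
        found = category;
        matched = true;
      }
    });
    return found;
  }
}
```

Adding the same keyword twice simply overwrites the earlier mapping, which is one simple way to handle duplicates.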
- AI Integration
- Uses Ollama service for intelligent categorization
- Strict prompt engineering to ensure category consistency
- Fallback mechanisms for handling unknown items
- Real-time processing with user feedback
- Category Manager Component
- Collapsible category sections for better organization
- Preview mode showing limited items per category
- "Show More" functionality for detailed viewing
- Immediate feedback for all user actions
- Integrated AI categorization for bulk processing
- Settings Integration
- Category management placed in Settings for easy access
- Clear separation from storage management
- Intuitive interface for adding/removing mappings
- Visual feedback through toast notifications
- Manual Categorization
User Input → Keyword/Category Selection → Database Update → UI Refresh
- AI Categorization
Text Input → Ollama Processing → Mapping Creation → Database Update → UI Refresh
- Category Mapping Usage
Receipt Upload → Item Extraction → Category Lookup → Default/AI Assignment
- Fixed Category Set
- Decision: Use a fixed set of categories rather than user-defined categories
- Rationale:
- Ensures consistency across the application
- Simplifies AI training and categorization
- Prevents category proliferation
- Makes statistics and visualization more meaningful
- Two-Tier Categorization
- Decision: Implement both manual and AI-powered categorization
- Rationale:
- Manual mappings provide precise control
- AI categorization handles bulk processing
- Hybrid approach maximizes accuracy and efficiency
- Collapsible UI
- Decision: Use collapsible sections with preview mode
- Rationale:
- Reduces visual clutter
- Improves navigation in large datasets
- Maintains access to full information when needed
- Local Storage
- Decision: Store category mappings in IndexedDB
- Rationale:
- Enables offline functionality
- Provides fast access to mappings
- Supports large numbers of mappings
- Performance Optimization
- Implement pagination for large mapping sets
- Add caching for frequently used mappings
- Optimize database queries for faster lookups
- Feature Enhancements
- Add bulk import/export of mappings
- Implement mapping suggestions based on user patterns
- Add category statistics and insights
- Support for subcategories if needed
- AI Improvements
- Fine-tune AI categorization based on user corrections
- Add confidence scores for AI categorizations
- Implement batch processing for large datasets
- Category Mapping Tests
- Verify case-insensitive matching
- Test duplicate handling
- Validate category constraints
- AI Integration Tests
- Test prompt effectiveness
- Verify category consistency
- Measure categorization accuracy
- UI Testing
- Verify responsive design
- Test accessibility features
- Validate user interaction flows
- Recent Scans
- Displays recently scanned receipts with detailed item information
- Shows category icons and names for each item
- Provides a clean, modern interface for viewing scanned items
- Top Categories
- Shows the most frequently occurring categories
- Displays category icons with corresponding colors
- Helps users track their shopping patterns
- Upload Functionality
- Allows users to scan and upload receipts
- Processes receipts using OCR and AI extraction
- Automatically categorizes items based on content
- All Items View
- Lists all scanned items chronologically
- Displays item name, category (with icon), and price
- Matches the home page styling for consistency
- Provides a comprehensive view of all scanned items
- Local Database
- Uses Dexie.js for IndexedDB management
- Stores items with categories and metadata
- Enables offline functionality
- Category System
- Predefined categories with custom icons
- Color-coded category indicators
- Consistent category display across all views
- Navigation
- Bottom navigation bar for easy access
- Intuitive icons for different sections
- Responsive design for mobile use
- Styling
- Modern, clean interface
- Consistent color scheme
- Backdrop blur effects for visual appeal
- Proper spacing and padding throughout
- Receipt Processing
- OCR for text extraction
- AI-powered item categorization
- Smart total validation
- Data Extraction
- Intelligent item name parsing
- Price extraction and validation
- Category suggestion based on item content
- Case-Insensitive String Comparisons: Implemented case-insensitive checks for store names and other string comparisons to improve parsing accuracy.
- Partial Success Logic: Added a mechanism to handle partial success in receipt parsing. If the sum of extracted item prices differs from the total amount, a warning is logged prompting the user to review the items for completeness.
- Item Extraction Improvements: Enhanced the item extraction logic to better capture item names and prices from various receipt formats. This includes refining regex patterns and handling different formats in the raw text.
- Logging Enhancements: Added structured logging for extracted details, including store name, total amount, store address, and items. Each log entry is tagged with `[ALDI_RECEIPT]` for easier filtering.
- Invalid Item Handling: Updated the validation logic to allow for partial successes. Invalid items are logged without failing the entire receipt processing, enabling users to see which items may need correction.
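The partial-success and invalid-item handling described above can be sketched together as a single validation step. This is an illustration under assumptions: `validateReceipt`, `RawItem`, and `ValidationReport` are hypothetical names, the warning texts are invented, and prices are compared in whole cents to avoid floating-point rounding differences.

```typescript
interface RawItem {
  name: string;
  price: number;
}

interface ValidationReport {
  validItems: RawItem[];
  invalidItems: RawItem[];
  warnings: string[];
}

// Converts a euro amount to whole cents so that sums can be compared exactly.
const toCents = (amount: number): number => Math.round(amount * 100);

function validateReceipt(items: RawItem[], totalAmount: number): ValidationReport {
  const validItems: RawItem[] = [];
  const invalidItems: RawItem[] = [];
  for (const item of items) {
    const isValid = item.name.trim().length > 0 && Number.isFinite(item.price);
    (isValid ? validItems : invalidItems).push(item);
  }

  const warnings: string[] = [];
  // Invalid items are reported but do not fail the whole receipt.
  if (invalidItems.length > 0) {
    warnings.push(`[ALDI_RECEIPT] ${invalidItems.length} invalid item(s) need review`);
  }
  const sumCents = validItems.reduce((sum, item) => sum + toCents(item.price), 0);
  if (sumCents !== toCents(totalAmount)) {
    warnings.push(
      `[ALDI_RECEIPT] Item sum ${(sumCents / 100).toFixed(2)} differs from total ${totalAmount.toFixed(2)}`,
    );
  }
  return { validItems, invalidItems, warnings };
}
```

An empty `warnings` array means the receipt parsed cleanly. Otherwise the UI can show the warnings and let the user correct the affected items.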
- Further Refinement of Regex Patterns: Continue to refine regex patterns used for item extraction to accommodate more variations in receipt formats.
- User Interface Updates: Consider adding UI elements to allow users to manually edit or confirm extracted items when discrepancies are detected.
- Create unit tests for each parser to ensure they handle various receipt formats and edge cases correctly.
- Validate output against known correct data to ensure accuracy.
- Document each parser's functionality, including specific patterns or rules used for extraction.
- Keep parsers updated as supermarkets may change their receipt formats over time.
The receipt image management system provides efficient storage, optimization, and display of receipt images. It uses a multi-version approach to balance performance with quality, ensuring fast loading times while maintaining high-quality originals for detailed viewing.
- Image Service
```typescript
// Public surface of the image service; method bodies are omitted here.
declare class ImageService {
  processImage(file: File, options: ImageProcessingOptions): Promise<ProcessedImage>;
  createThumbnail(file: File): Promise<ProcessedImage>;
  storeReceiptImage(receiptId: number, file: File): Promise<number>;
  getReceiptImage(receiptId: number): Promise<{ original: Blob; thumbnail: Blob } | null>;
}
```
- Singleton service pattern
- Image processing and optimization
- Storage management
- Memory efficiency
- Data Structure
```typescript
interface ReceiptImage {
  id?: number;
  receiptId: number;
  originalImage: Blob;
  thumbnailImage: Blob;
  mimeType: string;
  size: number;
  createdAt: Date;
}
```
- Separate storage from receipt data
- Multiple image versions
- Metadata tracking
- Efficient querying
- UI Components
```typescript
interface ReceiptImageViewerProps {
  receiptId: number;
  open: boolean;
  onClose: () => void;
}
```
- Modal image viewer
- Thumbnail previews
- Loading states
- Error handling
- Image Upload
- File validation
- Type checking
- Size verification
- Initial metadata extraction
- Image Processing
- Resolution optimization
- Quality adjustment
- Thumbnail generation
- Format conversion
- Storage Management
- Blob storage
- IndexedDB integration
- Version control
- Cleanup routines
- Multi-Version Storage
- Decision: Store both original and optimized versions
- Rationale:
- Fast thumbnail loading
- High-quality viewing when needed
- Bandwidth optimization
- Better mobile experience
- Blob Storage
- Decision: Use Blob storage over base64
- Rationale:
- Roughly 25% smaller storage footprint (base64 inflates binary data by about 33%)
- Better memory efficiency
- Native browser handling
- Improved performance
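The storage saving from avoiding base64 follows from simple arithmetic: base64 encodes every 3 bytes of binary data as 4 characters, with padding to a multiple of 4. The helper names below are hypothetical and exist only to make the calculation concrete.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Relative size increase of base64 over the raw binary data (about 0.33).
function base64Overhead(byteLength: number): number {
  return base64Length(byteLength) / byteLength - 1;
}

// Relative saving of raw binary storage compared with base64 (about 0.25).
function blobSaving(byteLength: number): number {
  return 1 - byteLength / base64Length(byteLength);
}
```

For a 300,000-byte image, `base64Length` returns 400,000 characters, so the encoded form is about a third larger and the raw Blob is a quarter smaller.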
- Separate Storage
- Decision: Dedicated table for images
- Rationale:
- Better query performance
- Simplified backup strategy
- Easier maintenance
- Future extensibility
- Canvas Processing
- Decision: Use Canvas API for image processing
- Rationale:
- Client-side optimization
- Real-time preview
- Quality control
- Format flexibility
- Loading Strategy
- Lazy loading for thumbnails
- Progressive loading for originals
- Memory management
- Cache utilization
- Resource Management
- URL object cleanup
- Memory monitoring
- Batch processing
- Background operations
- Upload Validation
- File type verification
- Size constraints
- Format validation
- Corruption detection
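The type and size checks in this list could be implemented along the following lines. The allowed MIME types and the 10 MB limit are placeholder values chosen for the example, not limits stated in this document, and `UploadCandidate` is a hypothetical subset of the browser `File` interface so the check can run outside a browser. Corruption detection is not covered here, since it needs the image to be decoded.

```typescript
// Subset of the File properties needed for validation.
interface UploadCandidate {
  name: string;
  type: string; // MIME type reported by the browser
  size: number; // size in bytes
}

// Placeholder limits for illustration only.
const ALLOWED_TYPES: string[] = ['image/jpeg', 'image/png', 'image/webp'];
const MAX_SIZE_BYTES = 10 * 1024 * 1024;

// Returns a list of human-readable problems; an empty list means the file is acceptable.
function validateUpload(file: UploadCandidate): string[] {
  const errors: string[] = [];
  if (ALLOWED_TYPES.indexOf(file.type) === -1) {
    errors.push(`Unsupported file type: ${file.type || 'unknown'}`);
  }
  if (file.size === 0) {
    errors.push('File is empty');
  } else if (file.size > MAX_SIZE_BYTES) {
    errors.push(`File is larger than ${MAX_SIZE_BYTES} bytes`);
  }
  return errors;
}
```

Returning all problems at once, rather than throwing on the first, lets the UI show a single complete error message.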
- Processing Errors
- Fallback strategies
- User notifications
- Recovery options
- Logging and monitoring
- Viewing Features
- Smooth transitions
- Loading indicators
- Error messages
- Progress feedback
- Interaction Design
- Intuitive controls
- Responsive layout
- Touch support
- Accessibility
- Image Features
- Rotation controls
- Zoom capabilities
- Cropping tools
- Filter options
- Storage Options
- Cloud backup
- Compression improvements
- Format optimization
- Archive functionality
- Unit Tests
- Processing functions
- Storage operations
- Error handling
- UI components
- Integration Tests
- Upload workflow
- Display functionality
- Error scenarios
- Performance metrics
- Upload Security
- File validation
- Size limits
- Type restrictions
- Sanitization
- Storage Security
- Access control
- Data encryption
- Secure deletion
- Privacy compliance
- GLHF Service: Transitioned to direct fetch calls for API requests, similar to LMStudio and Local-LM services. This change simplifies the request handling and aligns all services to a consistent pattern.
- Environment Variables: Updated variable names and structure for clarity. Added comments and example values in `.env.example` to guide configuration.
- Button Labels: Updated labels in `UploadButton.tsx` and `RecentScans.tsx` for consistency and clarity, changing 'Try AI Extraction' to 'Try AI'.
- GLHF Endpoint: Modified endpoint handling in `proxy-server.js` to reflect new environment variable names and direct request handling.
These updates enhance the maintainability, clarity, and consistency of the codebase, ensuring a more streamlined development process and a better user experience.
The receipt image management system provides a robust foundation for handling receipt images efficiently while maintaining a balance between performance and quality. The architecture supports future enhancements and ensures a seamless user experience across different devices and network conditions.
By following this strategy, we can systematically implement parsers for all major German supermarkets, enhancing the application's ability to accurately process diverse receipt formats. These updates aim to enhance the robustness and usability of the receipt parsing functionality within Nutri-Scanorama v2, ensuring a better user experience and more accurate data extraction.