Demonstration

https://devpost.com/software/kitchenvision

Screenshots: App Homepage, Sample Receipt, Decipher

Inspiration

In the hustle and bustle of modern life, managing groceries and meal planning often feels like a daunting task. We wanted to create a solution that would transform these everyday challenges into a streamlined, enjoyable experience. Inspired by the growing need for smart kitchen management, we envisioned KitchenVision as a way to merge advanced technology with practical use. Our aim was to make it easier for people to track their grocery inventory, discover new recipes, and manage their meals—all from the convenience of their smartphones.

What it does

KitchenVision is an Android mobile app designed to revolutionize kitchen management with the following features:

  • Receipt Scanning: Utilize your phone’s camera to scan and digitize grocery receipts through the CEIR (Capturing Effective Information from Receipt) computer vision model. This converts physical receipts into actionable text files, making it easy to keep track of purchases.
  • Inventory Management: Automatically updates your local inventory database with the extracted data, ensuring accurate tracking of your grocery items.
  • Recipe Exploration: Browse a comprehensive database of over 30,000 recipes. The app provides detailed cooking instructions and sample images generated by Cohere’s LLM, making recipe discovery and preparation simpler than ever.
  • Custom Recipes: Allows users to add and manage their own recipes alongside the pre-loaded ones, offering personalized meal planning options.

How we built it

  • Receipt Scanning:

    • Technology: Developed in Java for Android, with the CEIR deep learning model implemented on a Python server and accessed via HTTP.
    • Process: Integrated camera functionality with the CEIR system, which uses two neural networks for Optical Character Recognition (OCR): CTPN to locate text regions on the receipt and CRNN to recognize the text inside them. Feedback mechanisms ensure data accuracy, and a manual entry option is available when OCR encounters challenges.
  • Backend Processing:

    • Technology: Apache server with PHP and Python, using HTTP and SQLite3.
    • Process: Receipt data is sent to our server through HTTP requests. Python scripts process this data using Cohere’s LLM, which categorizes and analyzes the information based on a local recipe and ingredient database. Databricks enhances data management for scalable processing.
  • Recipe and Inventory Management:

    • Technology: Databricks database integrated with the Android app.
    • Process: Users can track inventory, search for recipes, and add personal recipes, all seamlessly managed through the app’s updated features.
  • AI-Enhanced Recipe Suggestions:

    • Technology: Cohere’s LLM.
    • Process: Provides detailed cooking instructions and sample images for recipes, offering personalized guidance based on user preferences and inventory. For recommendations, we build hash tables from the recipes and ingredients, iterate over the ingredients to compute a completion percentage for each recipe, and sort the recipes by that percentage.
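
The completion-based ranking in the last step can be sketched roughly as follows. This is a minimal illustration in Python, not the app's actual code: the function name `rank_recipes`, the dict-of-lists recipe format, and the case-insensitive exact-match rule are all assumptions made for the example.

```python
from collections import defaultdict


def rank_recipes(recipes, inventory):
    """Return (recipe_name, percent_complete) pairs, best match first.

    recipes:   dict mapping a recipe name to a list of ingredient names
               (hypothetical format, assumed for this sketch).
    inventory: iterable of ingredient names currently on hand.
    """
    # Hash table: ingredient -> set of recipes that need it.
    ingredient_to_recipes = defaultdict(set)
    for name, ingredients in recipes.items():
        for ingredient in ingredients:
            ingredient_to_recipes[ingredient.lower()].add(name)

    # Go over the inventory once and count matched ingredients per recipe.
    matched = defaultdict(int)
    for item in {i.lower() for i in inventory}:
        for name in ingredient_to_recipes.get(item, ()):
            matched[name] += 1

    # Convert counts to a completion percentage for every recipe.
    scored = []
    for name, ingredients in recipes.items():
        total = len({i.lower() for i in ingredients})
        percent = 100.0 * matched[name] / total if total else 0.0
        scored.append((name, percent))

    # Highest completion first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

For example, with egg, butter, and milk in the inventory, a pancake recipe that needs egg, flour, milk, and butter scores 75% and would rank above a two-ingredient recipe with only one ingredient on hand (50%). Indexing recipes by ingredient means each inventory item touches only the recipes that use it, rather than every recipe being scanned in full.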

Challenges we ran into

  • Receipt Variability: Different receipt formats and print qualities impacted OCR accuracy. We fine-tuned the pre-trained CEIR model with grocery receipt samples to enhance recognition capabilities.
  • Image Quality: Factors like rotation, tilt, or lighting affected OCR accuracy. We applied image processing techniques such as GaussianBlur and morphological operations to enhance text feature extraction.
  • Data Integration: Combining Cohere’s LLM with our extensive recipe and ingredient database required optimization for efficient data processing and fast response times.
  • User Experience: Balancing complex backend processes with a user-friendly interface involved several iterations of design and testing.
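
To illustrate the kind of preprocessing mentioned under Image Quality, here is a small pure-Python sketch of a 3x3 Gaussian blur followed by a morphological closing, applied to a grayscale image stored as nested lists. It is a stand-in for illustration only. In practice these steps would more likely go through OpenCV (`cv2.GaussianBlur`, `cv2.morphologyEx`), and the kernel size, the choice of closing, the edge handling, and all function names below are assumptions rather than the project's actual settings.

```python
def _clamp(value, low, high):
    """Keep an index inside [low, high] so edge pixels are replicated."""
    return min(max(value, low), high)


def gaussian_blur_3x3(img):
    """Blur with the integer kernel [[1,2,1],[2,4,2],[1,2,1]] / 16."""
    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = _clamp(y + dy, 0, height - 1)
                    xx = _clamp(x + dx, 0, width - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc // 16  # kernel weights sum to 16
    return out


def _morph(img, pick):
    """Apply `pick` (max or min) over each pixel's 3x3 neighbourhood."""
    height, width = len(img), len(img[0])
    return [
        [
            pick(
                img[_clamp(y + dy, 0, height - 1)][_clamp(x + dx, 0, width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


def dilate(img):
    """Grow bright regions by one pixel."""
    return _morph(img, max)


def erode(img):
    """Shrink bright regions by one pixel."""
    return _morph(img, min)


def close_gaps(img):
    """Morphological closing: dilation followed by erosion.

    Assumes text pixels are bright (255) on a dark background.
    """
    return erode(dilate(img))
```

The blur suppresses print noise before thresholding, while the closing fills one-pixel breaks inside character strokes, which is the sort of damage faded thermal receipts tend to show.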

Accomplishments that we’re proud of

  • Accurate OCR Processing: Successfully implemented the CEIR model for high-accuracy text extraction from varied receipt types.
  • Seamless AI Integration: Effectively utilized Cohere’s LLM to provide personalized recipe instructions and recommendations.
  • Comprehensive App Functionality: Developed a robust app that integrates inventory management, recipe browsing, and cooking guidance.
  • Efficient Backend System: Built a scalable backend system ensuring smooth communication and data processing between the app and server.

What we learned

Developing KitchenVision deepened our understanding of integrating diverse technologies into a cohesive system. We learned to manage complex data processing, handle AI-driven insights, and create a user-centric interface. The project underscored the importance of optimizing AI models and backend infrastructure for a seamless user experience.

What’s next for KitchenVision

  • Smart Pantry Suggestions: Real-time grocery shopping suggestions based on current inventory and frequently cooked recipes.
  • Voice Command Integration: Adding voice commands for improved accessibility and user interaction.
  • Meal Planning & Nutrition Tracking: Features to help users plan weekly meals and track nutritional information.
  • Cross-Platform Expansion: Developing versions for iOS and web to broaden accessibility and functionality.

KitchenVision is set to redefine kitchen management with future improvements, making meal planning and grocery management more intuitive and enjoyable.


Sponsor-Specific Highlights

Use of Databricks

KitchenVision leverages Databricks to process and organize our extensive recipe dataset. We converted about 3,000 complex categorical vectors from around 30,000 samples into a streamlined two-column dataset using PySpark modules and the Databricks Community platform. This integration optimizes data management and enhances the efficiency and scalability of our backend processing.
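
As a rough idea of what such a conversion looks like, the sketch below collapses a wide table into two columns. It is written in plain Python as a stand-in for the PySpark job, which is not shown here. It assumes the categorical vectors are 0/1 indicator columns, one per ingredient, and that the two output columns are the recipe title and a comma-separated ingredient list. The function name `to_two_columns` and the table layout are hypothetical.

```python
def to_two_columns(header, rows):
    """Collapse wide indicator rows into (title, ingredient_list) pairs.

    header: column names, with the recipe title column first
            (assumed layout, for illustration only).
    rows:   one list per recipe: the title, then a 0/1 flag per ingredient.
    """
    ingredient_names = header[1:]
    result = []
    for row in rows:
        title, flags = row[0], row[1:]
        # Keep only the ingredients whose indicator is set.
        present = [
            name
            for name, flag in zip(ingredient_names, flags)
            if int(flag) == 1
        ]
        result.append((title, ", ".join(present)))
    return result
```

A row such as `["Omelette", 1, 0, 1]` under the header `["title", "egg", "flour", "milk"]` becomes `("Omelette", "egg, milk")`, so each recipe carries only the ingredients it actually uses instead of thousands of mostly-zero columns.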

Use of Cohere and Codegen

We use Cohere’s LLM to process the text extracted by the CEIR model. Because the OCR step returns all recognizable text on the receipt, we apply dynamic attention to different components of the text for recipe recommendations and personalized cooking instructions. Cohere’s LLM selects grocery-item-related information from the text files for future inference.

Additionally, we applied Cohere’s LLM to process our recipe database, which contains about 3,000 example ingredients. A recipe recommendation system is constructed based on whether the items from the receipt can be used to prepare specific dishes. Some items may be operationally similar to ingredients in the dataset and serve as substitutes. To address this, Cohere’s LLM categorizes tokenized ingredients and relates grocery items to existing categories for generating recipe recommendations.

For Codegen, Cohere’s LLM enables advanced text extraction and culinary semantic alignment between tokenized text systems. This interconnected processing system allows our project to perform more sophisticated tasks such as recipe recommendation.

Working in an Air-Gapped Environment

Our app is designed to function efficiently even when disconnected from the internet. The recipe search and personalized cooking instructions are localized on the phone, allowing the app to operate independently. If the mobile device has enough computational power, the deep learning OCR model can also run on the device itself instead of on the server, ensuring full functionality in an air-gapped environment.