AI/ML Engineer • GenAI Developer • Computer Vision & Deep Learning Specialist
I design and develop intelligent systems that learn, reason, and solve real-world problems.
My work spans real-time computer vision, multimodal LLM systems, OCR automation, and 3D deep-learning pipelines.
I enjoy converting raw data into deployable, production-ready AI applications.
- Focused on building scalable AI systems with real-time inference capability.
- Currently exploring advanced MLOps workflows, multimodal AI, and RAG-based architectures.
- Passionate about applying ML, CV, and LLMs to practical use cases with measurable impact.
- Open to collaborations in AI/ML research, computer vision, and GenAI product development.
- Multimodal AI systems (text–image–video pipelines)
- 3D spatial deep learning models
- OCR-based workflow automation
- LLM-driven content intelligence and retrieval systems
- MLOps and model lifecycle management
- LangChain and RAG pipelines
- LLM optimization and quantization
- Scalable deployment architectures (cloud + containers)
- GenAI and LLM integration
- Computer vision models and pipelines
- OCR and document intelligence
- Deep learning architectures
- ML workflows and data engineering
- Python
- C++
- JavaScript
- TensorFlow, PyTorch, Keras
- scikit-learn, NumPy, Pandas
- OpenCV
- CNN, RNN, LSTM, GRU
- Graph Attention Networks (GAT)
- Google Gemini
- LLaMA
- Ollama
- Transformers
- FAISS
- NLP pipelines
- PDFMiner
- BeautifulSoup
- YouTube Transcript API
- Streamlit
- Babylon.js
- HTML / CSS / JavaScript
- Linux
- AWS (basics)
- Jupyter
A gesture classification system that achieves over 92% accuracy in real time using CNN and computer vision pipelines.
Predicts room centroids and generates editable 3D layouts, rendered in real time with Babylon.js.
A multimodal intelligence engine for extracting and analyzing content from YouTube, PDFs, and websites using Gemini, LLaMA, and Ollama.
OCR-driven form extraction, field detection, and automated filling using OpenCV, PyTesseract, and coordinate mapping.
A classification pipeline using VGG16 and custom CNN models, achieving up to 99% accuracy on ADNI datasets.
Time-series forecasting with LSTM and GRU models, achieving top performance when evaluated on mean absolute error (MAE).
Email: prasad.mitnapure01@gmail.com
I turn messy, unstructured real-world data into clean, automated AI workflows.


