ScaleDP is an open-source extension of Apache Spark for document processing.
nlp pdf machine-learning ocr spark data-extraction nlp-machine-learning vlm ocr-recognition pdf-document-processor ocr-python easyocr huggingface-models llm llm-inference suryaocr doctrocr
Updated Feb 22, 2025 - Python