A one-stop, open-source, high-quality data extraction tool that supports extraction from PDFs, webpages, and multi-format e-books.
python
pdf
parser
ocr
pdf-converter
extract-data
document-analysis
pdf-parser
layout-analysis
ai4science
pdf-extractor-rag
pdf-extractor-llm
pdf-extractor-pretrain
Updated Nov 10, 2024 - Python