CatchTheTornado/text-extract-api

Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

Python | Stars: 2951 | Forks: 250 | Watchers: 2951 | Open issues: 47 | License: MIT License
Details
Repository information
Owner: CatchTheTornado
Last pushed: 2025-12-08
Last updated: 2025-12-14
