Repository Issues

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

Stars: 14,711
Forks: 1,232
Indexed issues: 1
Open beginner issues: 1
Latest indexed: Jun 13, 2026
Last GitHub push: May 13, 2026
Contributing guide
Code of conduct
Dominant language: HTML
PR merge metrics: avg merge 21h 46m, 7 merged PRs in 30d
Beginner labels: good first issue

Issues

1 open indexed issue

Open
feat/clean_newline
enhancement, good first issue

Unstructured-IO/unstructured #2513 opened Feb 6, 2024 · HTML · 14,711 stars

Why recommended: No assignee yet · Marked good first issue · Contributing guide available
4 comments · 0 reactions · 0 assignees