docling-project/docling

tables in json format inside md or txt file

Open

#3082 opened on Mar 7, 2026

Labels: docling-document, enhancement, good first issue

Description

Everything works fine so far.

However, there is the image count problem I mentioned, and something similar applies to tables.

Yes, you can convert tables to Markdown directly. But if you export them as JSON instead (much better for embedding), it is hard to find the right place to copy them back into the Markdown.

From the conversion result, I built the tables in JSON format like this:

        # Stand-in for the clean_text helper, which is not shown in this snippet
        # (an assumption): collapse newlines and repeated whitespace into spaces.
        def clean_text(text):
            return " ".join(str(text).split())

        tables_data = []
        for table_ix, table in enumerate(conv_res.document.tables):
            if not hasattr(table, 'export_to_dataframe'):
                _log.warning(f"Table {table_ix} has no export method.")
                continue

            try:
                table_df = table.export_to_dataframe()
                num_rows = len(table_df)
                columns_list = list(table_df.columns)
                # One dict per row, with every cell converted to cleaned text
                records = [
                    {col: clean_text(str(val)) for col, val in row.items()}
                    for row in table_df.to_dict(orient="records")
                ]

                table_info = {
                    "table_index": table_ix + 1,  # 1-based index
                    "num_rows": num_rows,
                    "num_columns": len(columns_list),
                    "columns": columns_list,
                    "data": records,
                }
                tables_data.append(table_info)
            except Exception as e:
                _log.error(f"Error processing table {table_ix}: {e}")

        # Now doc_filename is safe to use here
        json_filename = output_dir / f"{doc_filename}-tables.json"
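What I would like to end up with is the JSON for each table sitting at the right position in the text, not in a separate file. The following is only a minimal sketch of that idea, using the standard library alone. It assumes the Markdown text contains one marker per table; the `<!-- table N -->` marker and the helper names `table_to_info` and `embed_tables` are hypothetical and not part of docling.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences

# Hypothetical marker assumed to stand where table N belongs in the text.
PLACEHOLDER = re.compile(r"<!-- table (\d+) -->")


def clean_text(value):
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(str(value).split())


def table_to_info(table_index, columns, rows):
    """Build a table_info dict (same keys as above) from plain lists."""
    records = [
        {col: clean_text(val) for col, val in zip(columns, row)}
        for row in rows
    ]
    return {
        "table_index": table_index,
        "num_rows": len(records),
        "num_columns": len(columns),
        "columns": list(columns),
        "data": records,
    }


def embed_tables(markdown_text, tables_data):
    """Replace each marker with a fenced JSON block for the matching table."""
    by_index = {info["table_index"]: info for info in tables_data}

    def render(match):
        info = by_index.get(int(match.group(1)))
        if info is None:
            return match.group(0)  # no data for this marker: leave it as-is
        body = json.dumps(info, ensure_ascii=False, indent=2)
        return f"{FENCE}json\n{body}\n{FENCE}"

    return PLACEHOLDER.sub(render, markdown_text)


if __name__ == "__main__":
    md = "Intro text\n\n<!-- table 1 -->\n\nClosing text"
    tables = [table_to_info(1, ["name", "qty"], [["bolt\nM4", 10], ["nut", 25]])]
    print(embed_tables(md, tables))
```

This keeps the plain text and the structured table data in one file, but it only works if such markers can be placed reliably during export, which is exactly the part I am unsure about.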

Is there an easier way to save the tables in JSON format together with the plain text?
