docling-project/docling

Markdown serialization of headings in rich table cells

Open

#2722, created on December 4, 2025

3 comments, 0 reactions, 1 assignee, Python, 59,751 stars, 4,140 forks
Labels: bug, good first issue, markdown

Description

Bug

In docling-core, the MarkdownTableSerializer serializes DoclingDocument headings and titles inside rich table cells as Markdown headings (i.e., text preceded by # symbols). According to the Markdown specs, table cells cannot contain headings, blockquotes, lists, horizontal rules, images, or most HTML tags, so most applications will not render those headings properly.
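One possible direction, shown here only as a minimal sketch and not as docling-core's actual implementation, is to demote heading lines to inline bold text before the cell content is written into the table row. The helper name demote_heading_for_cell is hypothetical, and the choice of bold as the replacement is an assumption; it does not use or modify MarkdownTableSerializer.

```python
import re

# An ATX heading line: up to 3 leading spaces, 1-6 '#', whitespace, then text.
_ATX_HEADING = re.compile(r"^[ ]{0,3}#{1,6}[ \t]+(.+?)\s*$")


def demote_heading_for_cell(text: str) -> str:
    """Hypothetical helper: make cell text safe for a Markdown table row.

    Heading lines become inline bold text, the remaining non-empty lines
    are joined with spaces, and literal pipes are escaped so they do not
    split the cell.
    """
    parts = []
    for line in text.splitlines():
        match = _ATX_HEADING.match(line)
        if match:
            # Replace the block-level heading with inline emphasis.
            parts.append(f"**{match.group(1).strip()}**")
        elif line.strip():
            parts.append(line.strip())
    return " ".join(parts).replace("|", "\\|")


print(demote_heading_for_cell("## Totals\nsum of rows"))
```

Text such as "#hashtag" is left untouched because a heading marker must be followed by whitespace.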

Steps to reproduce

Convert Docling's test file table_with_heading_02.html to Markdown, or check its ground truth file table_with_heading_02.html.md.
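When inspecting the exported Markdown or the ground truth file, a small check like the following may help locate the affected cells. It is a rough sketch that assumes pipe-delimited table rows and ATX-style headings; the function name cells_with_heading_markers and the sample string are made up for illustration and are not taken from the test file.

```python
import re

# A cell whose content starts like an ATX heading: 1-6 '#', whitespace, text.
_HEADING_IN_CELL = re.compile(r"^#{1,6}\s+\S")


def cells_with_heading_markers(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, cell text) for table cells that start with '#'."""
    hits = []
    for line_no, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("|"):
            continue  # not a table row, so real headings are ignored
        # Split the row on pipes that are not backslash-escaped.
        for cell in re.split(r"(?<!\\)\|", stripped.strip("|")):
            cell = cell.strip()
            if _HEADING_IN_CELL.match(cell):
                hits.append((line_no, cell))
    return hits


# Illustrative input only, not the contents of the Docling test file.
sample = "| Name | Notes |\n| --- | --- |\n| A | # Heading in cell |\n\n# Real heading\n"
print(cells_with_heading_markers(sample))
```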

Docling version

Docling version: 2.64.0
Docling Core version: 2.51.1
Docling IBM Models version: 3.10.2
Docling Parse version: 4.7.1
Python: cpython-313 (3.13.5)
Platform: macOS-14.7.1-arm64-arm-64bit-Mach-O

Python version

Python 3.13.5
