docling-project/docling

Improved Footnote Serialization in `MarkdownDocSerializer`

Open

#3128 opened on March 14, 2026

13 comments · 0 reactions · 1 assignee · Python · 59,751 stars · 4,140 forks
Labels: docling-document, enhancement, good first issue

Description

Requested feature

Currently, `MarkdownDocSerializer` serializes footnotes more or less as-is, as plain text lines:

Serialized as:

5 https://github.com/tesseract-ocr/tesseract

6 https://github.com/VikParuchuri/surya

7 https://github.com/lukas-blecher/LaTeX-OCR

Alternatives

For downstream LLM-based applications, it would be helpful if footnotes were serialized using actual Markdown footnote syntax, so that the LLM can identify them as footnotes (and not, for example, as a numbered list):

^[5 https://github.com/tesseract-ocr/tesseract]

^[6 https://github.com/VikParuchuri/surya]

^[7 https://github.com/lukas-blecher/LaTeX-OCR]
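To make the requested output concrete, here is a minimal sketch of the text transformation only. It does not use or represent docling's serializer API; the helper names `to_inline_footnote` and `to_reference_footnote` are hypothetical and chosen for illustration. The first produces the inline form shown above (a Pandoc-style inline note), and the second is a possible variant, assuming the footnote text starts with a numeric marker, that emits a reference-style definition instead.

```python
import re

# Hypothetical helpers for illustration; not part of docling's API.
# Matches a leading numeric marker followed by the footnote body.
_MARKER = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def to_inline_footnote(text: str) -> str:
    """Wrap raw footnote text in inline footnote syntax: ^[...]."""
    return f"^[{text.strip()}]"


def to_reference_footnote(text: str) -> str:
    """Emit a reference-style definition ([^N]: body) when the text
    starts with a numeric marker; otherwise fall back to inline form."""
    match = _MARKER.match(text)
    if match is None:
        return to_inline_footnote(text)
    marker, body = match.groups()
    return f"[^{marker}]: {body}"


if __name__ == "__main__":
    raw = "5 https://github.com/tesseract-ocr/tesseract"
    print(to_inline_footnote(raw))
    print(to_reference_footnote(raw))
```

Whether the inline or the reference-style form is preferable depends on the Markdown flavor the downstream consumer expects; the inline form matches the example in this issue.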
