docling-project/docling

Improved Footnote Serialization in `MarkdownDocSerializer`

Open

#3128 opened on March 14, 2026

13 comments · 0 reactions · 1 assignee · Python · 59,751 stars · 4,140 forks
Labels: docling-document, enhancement, good first issue

Description

Requested feature

Currently, `MarkdownDocSerializer` serializes footnotes more or less as-is:

Serialized as:

5 https://github.com/tesseract-ocr/tesseract

6 https://github.com/VikParuchuri/surya

7 https://github.com/lukas-blecher/LaTeX-OCR

Alternatives

For downstream LLM-based applications, it would be helpful if footnotes were serialized as actual footnotes in Markdown syntax, so that the LLM can identify them as footnotes (and not as a numbered list, for example).

^[5 https://github.com/tesseract-ocr/tesseract]

^[6 https://github.com/VikParuchuri/surya]

^[7 https://github.com/lukas-blecher/LaTeX-OCR]
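
To make the requested output concrete, here is a minimal sketch of the transformation on plain strings. It does not use or modify docling's serializer; `to_inline_footnote` is a hypothetical helper name, and the `^[...]` form is assumed to be the target syntax (it matches Pandoc-style inline notes).

```python
def to_inline_footnote(text: str) -> str:
    """Wrap footnote text in inline-footnote syntax: ^[...].

    Hypothetical helper for illustration only; not part of docling.
    """
    # Collapse runs of whitespace so the note stays on a single line.
    body = " ".join(text.split())
    return f"^[{body}]"


# Footnote texts as they appear in the current serialized output.
footnotes = [
    "5 https://github.com/tesseract-ocr/tesseract",
    "6 https://github.com/VikParuchuri/surya",
    "7 https://github.com/lukas-blecher/LaTeX-OCR",
]

# Join with blank lines, mirroring the layout shown in this issue.
serialized = "\n\n".join(to_inline_footnote(f) for f in footnotes)
print(serialized)
```

In an actual implementation, the wrapping would presumably happen wherever the serializer emits items labeled as footnotes, ideally behind an option so the current behavior stays available.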
