docling-project/docling
When converting a docx document, inexplicable blank images appear, and a line of text disappears.
Open
#3315 opened on Apr 16, 2026
Labels: bug, docx, good first issue
Description
Bug
When converting a docx document to Markdown with docling, unexplained blank images appear in the output and some lines of text are missing.
Steps to reproduce
- Download the zip from https://www.3gpp.org/ftp/tsg_ran/WG1_RL1/TSGR1_124b/Docs/R1-2601816.zip and unzip it to get a file named "R1-2601816 Discussion on other aspects of CSI acquisition and report for 6GR.docx".
- Download the zip from https://www.3gpp.org/ftp/tsg_ran/WG1_RL1/TSGR1_124b/Docs/R1-2601793.zip and unzip it to get a file named "R1-2601793.docx".
- Use docling to convert the docx to a Markdown file:
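    import os
    import subprocess

    # doc_path (input .docx) and md_path (expected .md output) are set elsewhere in the script
    cmd = [
        "docling",
        "--from", "docx",
        "--to", "md",
        "--output", os.path.dirname(md_path),
        doc_path,
    ]
    print(f"run command: {' '.join(cmd)}")
    # check=True raises CalledProcessError if docling exits with a non-zero status
    result = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        check=True,
    )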
- I obtained the Markdown file and carefully compared its contents with the docx file (I use Typora to read the Markdown file).
- I found the following errors:
  - "Agenda Item : 10.5.3.3" disappears
  - "3GPP TSG RAN WG1 Meeting #124bis R1-2601793" disappears
  - inexplicable blank images appear
Docling version
Docling version: 2.88.0
Docling Core version: 2.72.0
Docling IBM Models version: 3.13.0
Docling Parse version: 5.8.0
Python: cpython-312 (3.12.12)
Platform: Linux-6.8.0-90-generic-x86_64-with-glibc2.39
Python version
Python 3.12.12
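
The manual comparison described under "Steps to reproduce" could be scripted so the symptoms are easy to re-check after a fix. The following is only a minimal sketch using the Python standard library. The helper name `check_markdown`, the list `EXPECTED_FRAGMENTS`, and the image marker patterns are assumptions made for illustration, not part of docling; the fragments are taken from the missing lines reported above, split up so that differences in spacing around the colon do not matter.

```python
import re

# Text fragments the report says are missing from the Markdown output.
# (Hypothetical list, derived from the lines quoted in this issue.)
EXPECTED_FRAGMENTS = [
    "Agenda Item",
    "10.5.3.3",
    "3GPP TSG RAN WG1 Meeting #124bis",
    "R1-2601793",
]

# Assumed image markers: an HTML comment placeholder or a Markdown image tag.
IMAGE_PATTERN = re.compile(r"<!--\s*image\s*-->|!\[[^\]]*\]\([^)]*\)")


def check_markdown(markdown_text, expected=EXPECTED_FRAGMENTS):
    """Return (missing_fragments, image_count) for a Markdown string."""
    # Collapse runs of whitespace so line wrapping does not hide a match.
    normalized = " ".join(markdown_text.split())
    missing = [fragment for fragment in expected if fragment not in normalized]
    image_count = len(IMAGE_PATTERN.findall(markdown_text))
    return missing, image_count
```

To use it on the converted file, read the Markdown text first, for example `check_markdown(open(md_path, encoding="utf-8").read())`, where `md_path` is the output path from the script above. A non-empty list of missing fragments, or an image count higher than the number of figures in the docx, would match the behavior reported here.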