docling-project/docling

Issues with Installing and Running Docling - Help Needed with Docling Execution

Open

#1,120 created on March 5, 2025

Labels: help wanted, question

Description

Dear Docling Community,

I’m trying to use Docling for document processing, but I’ve encountered a series of issues during installation and execution. I could really use some guidance for a Windows environment.

I followed the instructions to set up Docling with an embedded Python installation, but I’ve run into multiple issues that I can’t resolve. The embedded Python setup was the first obstacle.

Here’s what I’ve tried regarding Python so far (before I think I finally succeeded?):

Installation:

  • I downloaded the embedded zip file and extracted it, but I wasn’t sure what to do with the python3.x.x.zip file. I ended up unzipping it in the same folder, but the process was unclear.
  • After this, I attempted to install Docling using pip install docling. This failed because pip wasn’t recognized.
  • I followed instructions to manually install pip using get-pip.py. That part seemed to work, but when I tried running pip install docling again, it still didn’t recognize pip.
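
A common cause of "pip is not recognized" is that pip.exe sits in the Scripts folder, which is not on PATH, even though the pip package itself is installed. The sketch below is a minimal illustration rather than an official procedure: it checks whether the current interpreter can import pip and builds the equivalent `python -m pip` command, which does not depend on PATH. The helper names `pip_is_importable` and `pip_command` are made up for this example.

```python
import importlib.util
import sys


def pip_is_importable():
    """Return True if the running interpreter can find the pip package."""
    return importlib.util.find_spec("pip") is not None


def pip_command(*args):
    """Build a pip invocation that goes through the running interpreter.

    Using "<python> -m pip" avoids relying on pip.exe being on PATH.
    """
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    print("pip importable:", pip_is_importable())
    print("command:", " ".join(pip_command("install", "docling")))
```

Run with the embedded python.exe, this prints whether pip is visible to that interpreter and the full command line to use instead of a bare `pip install docling`.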

Path and Environment Issues:

  • I modified the Python environment settings as suggested in online guides (editing the pythonxx._pth file and creating a sitecustomize.py file), but I couldn’t get pip to work without navigating directly to the Scripts folder.
  • Eventually, I was able to install Docling by running the command from the Python\Scripts folder.
  • Finally, I installed the Docling library in C:\Docling\Main with Python in C:\Docling\Python.
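
The `._pth` edit mentioned above usually comes down to one line. As far as I understand the embeddable Windows package, its `._pth` file ships with `#import site` commented out, which keeps site-packages (and therefore pip-installed libraries) from being loaded. The sketch below works on that assumption: it takes the text of such a file and returns it with `import site` enabled. The function names are hypothetical, and the sample text in the usage note only imitates what such a file typically looks like.

```python
def site_enabled(pth_text):
    """Return True if the text contains an uncommented 'import site' line."""
    return any(line.strip() == "import site" for line in pth_text.splitlines())


def enable_site(pth_text):
    """Return the ._pth text with 'import site' enabled.

    A commented '#import site' line is uncommented in place; if no such
    line exists, 'import site' is appended at the end.
    """
    if site_enabled(pth_text):
        return pth_text
    lines = pth_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lstrip("#").strip() == "import site":
            lines[i] = "import site"
            break
    else:
        lines.append("import site")
    return "\n".join(lines) + "\n"
```

To apply it, read the `._pth` file next to python.exe as text, pass it through `enable_site`, and write the result back. Keep a backup copy of the original file first.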

Execution Problems: To my understanding, to run the following code:

from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869"  # Document as local path or URL
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())  # Output: "## Docling Technical Report[...]"

I need to create a Python script file, so I made the following convert_pdf.py file:

from docling.document_converter import DocumentConverter
source = r"C:\Users\username\Desktop\my_test_pdf.pdf"
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())

However, when I try to run it with docling.exe C:\Users\username\Desktop\convert_pdf.py, it does not work. :-(
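
One possible reading of this failure is that docling.exe is a command-line converter that expects a document (a PDF path or URL) as its argument, while a .py script is meant to be run by the Python interpreter. The sketch below works on that assumption: it picks a command based on the file extension. `command_for` is a hypothetical helper, and the bare `docling <source>` form is the only CLI usage assumed here.

```python
import sys
from pathlib import Path


def command_for(path):
    """Pick a command line for the given file.

    .py files are run with the current Python interpreter; anything else
    is treated as a document and handed to the docling command-line tool.
    """
    p = Path(path)
    if p.suffix.lower() == ".py":
        return [sys.executable, str(p)]
    return ["docling", str(p)]


if __name__ == "__main__":
    for name in ("convert_pdf.py", "my_test_pdf.pdf"):
        print(name, "->", " ".join(command_for(name)))
```

Under this reading, convert_pdf.py would be started with python.exe, and docling.exe would be given the PDF itself rather than the script.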

Further Attempts:

  • I’ve checked the documentation, but it wasn’t clear enough for me to understand how to resolve this issue.
  • I’ve tried various solutions, such as pointing to the Scripts folder and ensuring all files were in the correct locations, but I continue to run into errors.

My Request: Could anyone provide detailed steps on how to properly execute Docling in an embedded Python environment? Specifically, I need guidance on:

  • How to execute Docling properly.
  • How to make sure Docling is installed in the right directory and is accessible.
  • The correct steps to install pip and ensure it’s recognized.

Any help or additional documentation would be greatly appreciated!

Thanks in advance!

PS: I may appear simple to many people, and my mistakes may seem like basic misunderstandings. However, I would really appreciate your help! I’m just a user (not an IT expert or programmer) learning AI and LLMs, with thanks to LM Studio and Text Generation WebUI.
