pypa/pip

Performance Issue: Too many hashes in RequirementPreparer

Open

#12589, created on March 23, 2024

(1 comment) (0 reactions) (0 assignees) Python (8,952 stars) (3,032 forks)
Labels: help wanted, type: performance

Description

RequirementPreparer hashes the same file more often than necessary. For large wheels (in the GB size range), each redundant pass rereads the whole file, which noticeably hurts performance.

A call to prepare_linked_requirement calls down to _check_download_dir. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L501

If the file already exists in download_dir, it is hashed to verify it, and the file is then marked as downloaded. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L516

Execution then continues into _prepare_linked_requirement, which eventually hashes the same file a second time. https://github.com/pypa/pip/blob/main/src/pip/_internal/operations/prepare.py#L612
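The flow described above can be modeled with a small standalone sketch. This is not pip's code: sha256_of, check_download_dir, prepare, and prepare_linked are hypothetical stand-ins that only mirror the order of the two hash checks, with a counter showing that the file is read twice.

```python
import hashlib
import os
import tempfile

hash_calls = 0  # counts how many times a file is read and hashed


def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks and bump the global counter."""
    global hash_calls
    hash_calls += 1
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download_dir(path: str, expected: str) -> bool:
    """Stand-in for the first check: is a valid file already present?"""
    return os.path.exists(path) and sha256_of(path) == expected


def prepare(path: str, expected: str) -> None:
    """Stand-in for the later preparation step, which verifies again."""
    if sha256_of(path) != expected:
        raise ValueError("hash mismatch")


def prepare_linked(path: str, expected: str) -> int:
    """Run both steps and return the number of hash passes performed."""
    global hash_calls
    hash_calls = 0
    if check_download_dir(path, expected):
        prepare(path, expected)
    return hash_calls


def demo() -> int:
    """Create a small fake wheel and count the hash passes over it."""
    with tempfile.TemporaryDirectory() as tmp:
        wheel = os.path.join(tmp, "example-1.0-py3-none-any.whl")
        data = b"\0" * 4096
        with open(wheel, "wb") as f:
            f.write(data)
        expected = hashlib.sha256(data).hexdigest()
        return prepare_linked(wheel, expected)


print(demo())  # 2: the same file was hashed twice
```

For a multi-GB wheel, each of those passes means reading the whole file from disk again.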

Potential Fix

Files which have passed the hash check can be marked as such, to prevent rehashing.
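A minimal sketch of this idea, assuming a hypothetical VerifiedFiles helper that is not part of pip. It remembers which files have passed a check against a given expected SHA-256 digest, keyed by path, size, and modification time so that a changed file is hashed again.

```python
import hashlib
import os
import tempfile


class VerifiedFiles:
    """Remembers files that already passed a hash check (hypothetical helper)."""

    def __init__(self) -> None:
        self._verified: set[tuple[str, int, int, str]] = set()
        self.hash_passes = 0  # how many times a file was actually hashed

    def _key(self, path: str, expected: str) -> tuple[str, int, int, str]:
        st = os.stat(path)
        # Size and mtime are included so a modified file is not trusted.
        return (os.path.abspath(path), st.st_size, st.st_mtime_ns, expected)

    def check(self, path: str, expected: str) -> bool:
        """Return True if the file matches `expected`, hashing only if needed."""
        key = self._key(path, expected)
        if key in self._verified:
            return True  # already verified, skip the expensive read
        self.hash_passes += 1
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected:
            return False
        self._verified.add(key)
        return True


def demo() -> tuple[bool, bool, int]:
    """Check the same fake wheel twice and report how often it was hashed."""
    with tempfile.TemporaryDirectory() as tmp:
        wheel = os.path.join(tmp, "example-1.0-py3-none-any.whl")
        data = b"\0" * 4096
        with open(wheel, "wb") as f:
            f.write(data)
        expected = hashlib.sha256(data).hexdigest()
        verified = VerifiedFiles()
        first = verified.check(wheel, expected)
        second = verified.check(wheel, expected)
        return first, second, verified.hash_passes


print(demo())  # (True, True, 1)
```

In pip itself the marker could just as well live on the requirement or link object rather than in a separate cache; the sketch only shows that a second check can be skipped safely once the first has passed.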

Expected behavior

RequirementPreparer hashes each file at most once.

pip version

24.0

Python version

3.11

OS

Windows 10

How to Reproduce

Construct a RequirementPreparer supplied with a download_dir. Run prepare_linked_requirement() for a link available as a wheel in the download_dir.

Output

No response

Code of Conduct
