pypa/pip

Performance Issue: Too many hashes in RequirementPreparer

Open

#12589 opened on Mar 23, 2024

Labels: help wanted, type: performance


Description

RequirementPreparer hashes files more often than necessary: a wheel that is already present in download_dir gets hashed twice. For large wheels in the GB size range, the redundant pass leads to poor performance.

A call to prepare_linked_requirement calls down to _check_download_dir. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L501

If the file already exists in the download_dir, it is hashed to verify it and then marked as downloaded. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L516

Execution then continues into _prepare_linked_requirement, which eventually hashes the same file a second time. https://github.com/pypa/pip/blob/main/src/pip/_internal/operations/prepare.py#L612
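To make the call pattern concrete, here is a toy model of the flow described above. It is not pip's code: `sha256_of`, `check_download_dir`, `prepare`, and `demo` are hypothetical stand-ins, and the only point is that one file ends up being read in full twice.

```python
import hashlib
import os
import tempfile
from typing import Optional

hash_calls = 0  # number of full hashing passes over a file


def sha256_of(path: str) -> str:
    """Hash the file in 1 MiB chunks, counting each full pass."""
    global hash_calls
    hash_calls += 1
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download_dir(path: str, expected: str) -> Optional[str]:
    """Hypothetical stand-in for the download_dir lookup: verify an existing file."""
    if os.path.exists(path) and sha256_of(path) == expected:
        return path
    return None


def prepare(path: str, expected: str) -> str:
    """Hypothetical stand-in for the later preparation step, which verifies again."""
    found = check_download_dir(path, expected)  # first pass
    if found is None:
        raise FileNotFoundError(path)
    if sha256_of(found) != expected:  # second pass over the same bytes
        raise ValueError("hash mismatch")
    return found


def demo() -> int:
    """Run the toy flow on a small temporary file and return the pass count."""
    global hash_calls
    hash_calls = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example-1.0-py3-none-any.whl")
        data = b"x" * 1024
        with open(path, "wb") as f:
            f.write(data)
        prepare(path, hashlib.sha256(data).hexdigest())
    return hash_calls
```

In this model `demo()` reports two hashing passes for a single file. With a multi-GB wheel, each pass means reading the whole file from disk again.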

Potential Fix

Files that have already passed the hash check could be marked as verified, so that later steps do not rehash them.
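A minimal sketch of that idea, assuming the file is not modified between the two checks. The `VerifiedFiles` class and its `verify` method are hypothetical and not part of pip; an actual fix would need to fit into pip's existing hash-checking code.

```python
import hashlib
import os
import tempfile


class VerifiedFiles:
    """Hypothetical helper: remembers which (path, digest) pairs already passed."""

    def __init__(self):
        self._verified = set()  # (absolute path, expected sha256) pairs
        self.hash_passes = 0    # number of times a file was actually hashed

    def verify(self, path: str, expected_sha256: str) -> bool:
        key = (os.path.abspath(path), expected_sha256)
        if key in self._verified:
            return True  # already checked, skip the expensive pass
        self.hash_passes += 1
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        ok = digest.hexdigest() == expected_sha256
        if ok:
            # Only successful checks are remembered; failures are rechecked.
            self._verified.add(key)
        return ok


def demo() -> int:
    """Verify the same temporary file twice and return the number of hash passes."""
    cache = VerifiedFiles()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example-1.0-py3-none-any.whl")
        data = b"x" * 1024
        with open(path, "wb") as f:
            f.write(data)
        expected = hashlib.sha256(data).hexdigest()
        assert cache.verify(path, expected)  # hashes the file
        assert cache.verify(path, expected)  # served from the verified set
    return cache.hash_passes
```

Here the second `verify` call returns without reading the file. The trade-off is that a file changed on disk after the first check would not be detected; including the file size and modification time in the key would be one way to narrow that gap.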

Expected behavior

RequirementPreparer hashes each file at most once.

pip version

24.0

Python version

3.11

OS

Windows 10

How to Reproduce

Construct a RequirementPreparer with a download_dir, then call prepare_linked_requirement() for a requirement whose link is already available as a wheel in that download_dir.

Output

No response
