pypa/pip

Performance Issue: Too many hashes in RequirementPreparer

Open

#12589 opened on March 23, 2024

Labels: help wanted, type: performance


Description

RequirementPreparer hashes files more often than necessary. This causes poor performance for large wheels (in the GB size range).

A call to prepare_linked_requirement calls down to _check_download_dir. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L501

If the file exists in the download_dir, hashing is triggered. The file is marked as downloaded. https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/operations/prepare.py#L516

Then we head into _prepare_linked_requirement and eventually hash again. https://github.com/pypa/pip/blob/main/src/pip/_internal/operations/prepare.py#L612

Potential Fix

Files which have passed the hash check can be marked as such, to prevent rehashing.
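
A minimal sketch of that idea, not pip's actual code. `VerifiedFileCache` and `demo` are hypothetical names made up for illustration, and keying on path, size and modification time is an assumption about how a "verified" mark could be invalidated if the file changes on disk:

```python
import hashlib
import os
import tempfile


class VerifiedFileCache:
    """Hypothetical helper: remembers files that already passed a hash check."""

    def __init__(self):
        self._verified = set()
        self.hash_count = 0  # number of full-file hashes actually performed

    def _key(self, path, expected_sha256):
        # Include size and mtime so a modified file is hashed again.
        st = os.stat(path)
        return (os.path.abspath(path), st.st_size, st.st_mtime_ns, expected_sha256)

    def check(self, path, expected_sha256):
        key = self._key(path, expected_sha256)
        if key in self._verified:
            return True  # already verified, skip the expensive rehash
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in 1 MiB chunks so large wheels are not loaded into memory.
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        self.hash_count += 1
        if digest.hexdigest() != expected_sha256:
            return False
        self._verified.add(key)
        return True


def demo():
    """Check the same file twice and report how often it was hashed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.whl")
        payload = b"x" * 1024
        with open(path, "wb") as f:
            f.write(payload)
        expected = hashlib.sha256(payload).hexdigest()
        cache = VerifiedFileCache()
        first = cache.check(path, expected)
        second = cache.check(path, expected)
        return first, second, cache.hash_count
```

In this sketch the second `check` call returns without reading the file, so `hash_count` stays at 1 after two checks. In pip the mark could equally live on the requirement or link object. Where it belongs is an open design question here.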

Expected behavior

RequirementPreparer hashes each file at most once.

pip version

24.0

Python version

3.11

OS

Windows 10

How to Reproduce

Construct a RequirementPreparer supplied with a download_dir. Run prepare_linked_requirement() for a link available as a wheel in the download_dir.

Output

No response
