Sort dependencies in requirements.txt/environment.yml to increase cache hits
#749 opened on Jul 18, 2019
Description
Proposed change
Sort the contents of environment.yml and/or requirements.txt before copying them into the image. The goal is for two repositories that both require numpy and pandas, but list them in a different order in one of these files, to share a cache layer.
This comes from https://github.com/jupyter/repo2docker/pull/743#discussion_r304020569 and the follow-up comment about doing a survey first.
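To make the idea concrete, here is a minimal sketch of what sorting a requirements.txt could look like. It is not repo2docker code: `normalize_requirements` is a hypothetical helper, and it assumes a simple file with one requirement per line. It does not handle inline comments, line continuations, or option lines such as `-r` or `--index-url`.

```python
def normalize_requirements(text: str) -> str:
    """Return requirements text with entries sorted.

    Hypothetical helper for illustration only. Blank lines and
    full-line comments are dropped so they cannot change the result.
    """
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        entries.append(stripped)
    # Sort case-insensitively, breaking ties on the exact string so the
    # output does not depend on the input order.
    entries.sort(key=lambda entry: (entry.lower(), entry))
    return "\n".join(entries) + "\n"


# Two repositories with the same dependencies in a different order.
repo_a = "numpy\npandas\n"
repo_b = "pandas\n\n# data wrangling\nnumpy\n"

same_after_sorting = normalize_requirements(repo_a) == normalize_requirements(repo_b)
```

With both files reduced to the same bytes, the step that copies the file into the image would see identical content for both repositories, which is the condition for sharing a cache layer.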
Alternative options
Do nothing.
Who would use this feature?
People who build repositories that use a (very) common set of dependencies.
How much effort will adding it take?
The change itself would be easy to implement. The main effort lies in determining whether it would benefit anyone; if a benefit is very unlikely, we should not implement it.
Who can do this work?
People with scripting skills who can write a script that uses https://archive.analytics.mybinder.org/ to find example repositories and then compares the environment.yml/requirements.txt files in those repositories.
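The comparison step of such a survey could be sketched roughly as follows. This assumes the requirements.txt contents have already been collected into a mapping from repository name to file text. Fetching them from the archive is not shown, and the names `canonical`, `count_distinct`, and the sample data are made up for illustration.

```python
def canonical(text: str) -> tuple:
    """Order-independent form of a simple requirements file (illustrative)."""
    lines = (line.strip() for line in text.splitlines())
    return tuple(sorted(l for l in lines if l and not l.startswith("#")))


def count_distinct(files: dict) -> tuple:
    """Return (distinct raw files, distinct files after sorting).

    `files` maps a repository name to the raw text of its requirements.txt.
    """
    raw = set(files.values())
    sorted_forms = {canonical(text) for text in files.values()}
    return len(raw), len(sorted_forms)


# Made-up sample data standing in for real survey results.
sample = {
    "repo-a": "numpy\npandas\n",
    "repo-b": "pandas\nnumpy\n",
    "repo-c": "scipy\n",
}

distinct_raw, distinct_sorted = count_distinct(sample)
```

The gap between the two counts is a rough upper bound on how many extra cache hits sorting could produce. If the two numbers are nearly equal on real data, the change is probably not worth making.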