Description
New feature
Hey! There is a CLI argument to clone repositories at a specified depth in a nextflow run call:
-d, -deep
Create a shallow clone of the specified depth
However, this depth does not seem to propagate to any git submodules. In addition, the per-submodule shallow = true option in a .gitmodules file does not seem to be honored.
I propose that the specified depth also apply to all git submodules in the cloned repository. When no depth is given as an argument, the clone should instead honor the shallow option set in the .gitmodules file for each submodule.
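For reference, this is roughly what the per-submodule setting looks like. The submodule name, path, and URL are made-up placeholders, not our real ones; only the shallow key is the point here. As far as I understand git's documentation, shallow = true asks for that submodule to be cloned with a history depth of 1 unless a non-shallow clone is explicitly requested.

```
# .gitmodules (hypothetical submodule name, path, and URL)
[submodule "shared-test-data"]
	path = tests/shared-test-data
	url = https://example.com/org/shared-test-data.git
	# Recommend a depth-1 clone for this submodule
	shallow = true
```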
Use case
The pipeline we are developing has multiple submodules, some with a large git history (27k commits!). We also have a submodule that we use to share common test cases and associated data across multiple repositories (this one has 13k commits). We'd like to avoid cloning the entirety of the commit history for these repositories, since the only thing that matters in the context of a given nextflow run call is the specific commit being pointed to for each submodule.
Our current workaround is to clone the repository ahead of time with --shallow-submodules and then point to that path, but it would be nice to have it all configurable on the nextflow CLI.
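A minimal sketch of that workaround, in case it helps anyone else. The repository URL, the directory name, and the shallow_clone_pipeline helper are placeholders for illustration; the git flags are the standard ones, and using --depth 1 on the parent repository is an assumption about what you would want.

```shell
# Hypothetical values: replace with your own repository URL and target directory.
PIPELINE_URL="https://example.com/org/pipeline.git"
PIPELINE_DIR="pipeline"

# Hypothetical helper: shallow-clone a pipeline repository and its submodules.
#   $1 = repository URL, $2 = target directory
shallow_clone_pipeline() {
  # --depth 1               truncate the history of the parent repository
  # --recurse-submodules    initialize and clone submodules as part of the clone
  # --shallow-submodules    clone each submodule with a depth of 1
  git clone --depth 1 --recurse-submodules --shallow-submodules "$1" "$2"
}

# Usage (commented out because it needs network access and a real repository):
# shallow_clone_pipeline "$PIPELINE_URL" "$PIPELINE_DIR"
# nextflow run "./$PIPELINE_DIR"
```

The downside is that the clone step lives outside of nextflow, so every caller has to remember to do it before launching the run.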
Suggested implementation
I have looked through the source code here related to cloning git repositories, as well as through JGit, which seems to be the library used to orchestrate cloning. My Java/Groovy skills are not the greatest, so correct me if I am wrong, but it seems to me that JGit does not support shallow cloning of submodules or setting submodule depths. I'm not sure what the best path forward is with that in mind.