Description
Description of the bug:
There appears to be a bug where globs, and the implicit watches they register in a repository, always trigger repository invalidation. It seems related to symlinks and to change detection of the files implementing a repository rule.
Given a repo rule that looks roughly like this:
def _impl(rctx):
    # glob() takes a list of patterns
    rctx.file("BUILD", "glob(['foo/**'])")
    rctx.file("cache/marker", "")
    # Creates foo/cache as a symlink pointing at cache/
    rctx.symlink("cache/", "foo/cache")
    print("hit")
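To make the layout concrete, here is a minimal Python sketch that recreates the same directory structure outside of Bazel. It is only an illustration of why a file touched under cache/ is also visible under foo/, where the glob looks. It says nothing about how Bazel itself registers watches, and the helper names (build_layout, files_under, demo_layout) are made up for this example.

```python
import os
import tempfile


def build_layout(root):
    """Recreate the directory layout the repo rule above produces."""
    os.makedirs(os.path.join(root, "cache"))
    os.makedirs(os.path.join(root, "foo"))
    # Mirrors rctx.file("cache/marker", "")
    open(os.path.join(root, "cache", "marker"), "w").close()
    # Mirrors rctx.symlink("cache/", "foo/cache"): foo/cache -> cache
    os.symlink(os.path.join(root, "cache"), os.path.join(root, "foo", "cache"))


def files_under(root, subdir):
    """List files below root/subdir, following directory symlinks."""
    found = []
    base = os.path.join(root, subdir)
    for dirpath, _dirs, filenames in os.walk(base, followlinks=True):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


def demo_layout():
    with tempfile.TemporaryDirectory() as root:
        build_layout(root)
        before = files_under(root, "foo")
        # Equivalent of `touch <repo>/cache/x.txt`
        open(os.path.join(root, "cache", "x.txt"), "w").close()
        after = files_under(root, "foo")
        return before, after


if __name__ == "__main__":
    print(demo_layout())
```

The new file only ever gets written to cache/, but anything that walks foo/ and follows the symlink sees it as foo/cache/x.txt.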
Expected behavior:
bazel clean
bazel build # prints hit
touch <repo>/cache/x.txt
bazel build # doesn't print hit
<modify repo rule bzl file>
bazel build # prints hit
touch <repo>/cache/x.txt
bazel build # doesn't print hit
Actual behavior:
bazel clean
bazel build # prints hit
touch <repo>/cache/x.txt
bazel build # doesn't print hit
<modify repo rule bzl file>
bazel build # prints hit
touch <repo>/cache/x.txt
bazel build # prints hit
Note how, in the expected behavior, it properly caches the repo after one invocation. In the actual behavior, it has ceased caching entirely. In the actual output, when it warns that the repo is re-running due to external modification, it prints a message like warning: external/<repo name>/cache/x.txt changed. This feels notable because usually the file path it prints doesn't include external/<repo name>. Gives me "something is confused by a symlink" vibes.
Why is the "touch file in repo" step there? That may seem a bit strange, but it's how Python behaves. Given lib/python3.11/foo.py, it creates lib/python3.11/__pycache__/foo.pyc. In the actual code I ran into this bug with, the symlink() call is linking e.g. lib/python3.11/__pycache__ -> unwatched_dir/lib/python3.11/__pycache__, and there is a glob(lib/**).
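For anyone unfamiliar with the Python side of this, a small sketch of the behavior: byte-compiling a source file drops a __pycache__ directory next to it. The function name demo_pycache is hypothetical, and the sketch assumes sys.pycache_prefix is unset (the default). Note the real file name carries an interpreter tag, e.g. foo.cpython-311.pyc, rather than plain foo.pyc.

```python
import os
import py_compile
import tempfile


def demo_pycache():
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "lib", "python3.11")
        os.makedirs(pkg)
        src = os.path.join(pkg, "foo.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        before = sorted(os.listdir(pkg))
        # With no explicit cfile, the .pyc goes into __pycache__ beside the
        # source (assuming sys.pycache_prefix is not set).
        pyc_path = py_compile.compile(src, doraise=True)
        after = sorted(os.listdir(pkg))
        return before, after, os.path.relpath(pyc_path, pkg)


if __name__ == "__main__":
    print(demo_pycache())
```

The same thing happens implicitly the first time the interpreter imports foo.py, which is why files appear inside the repository after the repo rule has finished.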
This seems to only happen with the combination of a glob, symlinks, and modifying the bzl file implementing a repo rule.
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Use this commit: https://github.com/rickeylev/rules_python/commit/9c077f2fbff1143f0b3a6d6dd7696e67f06e95d5
Then run tests/repro/repro.sh. Modify the python/private/python_repository.bzl file, then run repro.sh again. On the second and subsequent runs, the repo will refuse to stay cached.
git clone https://github.com/rickeylev/rules_python/
cd rules_python
git checkout 9c077f2fbff1143f0b3a6d6dd7696e67f06e95d5
tests/repro/repro.sh
<modify python/private/python_repository.bzl; modify the print statement>
tests/repro/repro.sh
Which operating system are you running Bazel on?
linux
What is the output of bazel info release?
9.0
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD ?
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response