apache/airflow

DAG-Bundle: Git connections ignore credentials for public repositories, causing anonymous rate limit issues

Open

#54829 opened on Aug 22, 2025

area:providers, good first issue, kind:bug, provider:git

Description

Hi everyone,


I recently ran into a problem: the git fetch requests issued by the DAG bundle hit the rate limit of the Git server. It turns out that Airflow cannot benefit from the higher rate limit for authenticated Git requests when the repository is public, because the credentials are never sent to the server.

Airflow could, however, support forcing authentication for a Git connection by passing a custom http.extraHeader as an additional argument to the git command. This would allow authenticated Git operations even for public repositories.

Use case/motivation

Git does not authenticate when syncing public repositories, even if login and password are set. This is due to Git’s HTTPS behavior: it always tries anonymous access first and only sends credentials if the server responds with 401 Unauthorized. Public repositories typically do not return a 401, so the credentials are never used.

This is problematic because GitLab/GitHub apply different rate limits for authenticated vs anonymous requests. As a result, Airflow tasks can hit anonymous rate limits even when valid credentials are provided.

A known workaround is to use Git’s http.extraHeader option to force authentication:

B64=$(printf 'x:%s' "<PROJECT_ACCESS_TOKEN>" | base64 | tr -d '\n')  # strip line wraps that base64 may add for long tokens
git -c http.extraHeader="Authorization: Basic $B64" clone https://gitlab.com/<group>/<repo>.git

This ensures that every request is authenticated from the start, avoiding anonymous rate limits.
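
For illustration, the same workaround could be expressed in Python roughly as follows. This is only a sketch, not the provider's actual code: the helper names build_basic_auth_header and git_clone_command, the default username "x", and the target directory are assumptions made up for this example.

```python
import base64


def build_basic_auth_header(token: str, username: str = "x") -> str:
    """Return an HTTP Basic Authorization header line for git's http.extraHeader.

    Hypothetical helper: the name and the default username are assumptions.
    """
    raw = f"{username}:{token}".encode("utf-8")
    # b64encode does not insert line breaks, so long tokens stay on one line.
    encoded = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {encoded}"


def git_clone_command(repo_url: str, header: str, target_dir: str) -> list[str]:
    """Build the argv for a clone that sends `header` with every HTTP request."""
    return ["git", "-c", f"http.extraHeader={header}", "clone", repo_url, target_dir]


if __name__ == "__main__":
    header = build_basic_auth_header("<PROJECT_ACCESS_TOKEN>")
    # Printing only; pass the list to subprocess.run(...) to actually clone.
    print(git_clone_command("https://gitlab.com/<group>/<repo>.git", header, "dags-repo"))
```

Note that with -c the header value is part of the command line and may be visible in process listings on the host.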

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
  • I am still thinking about a possible solution ...

This functionality could probably be added here 🤔: https://github.com/apache/airflow/blob/main/providers/git/src/airflow/providers/git/hooks/git.py (perhaps via additional extra arguments).

This would allow passing the authentication header to git the same way you would in the shell:

export MY_HEADER="Authorization: Basic <BASE64>"
git --config-env=http.extraheader=MY_HEADER clone <repo>

This would be a very general approach that could also be used for other Git features, but it is of course less intuitive.
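
A rough Python equivalent of the shell variant above might look like the sketch below. It is only an assumption about how a hook could do this: the function name git_command_with_env_header and the environment variable name AIRFLOW_GIT_EXTRA_HEADER are invented for this example, and --config-env needs a reasonably recent Git (2.31 or newer, as far as I know).

```python
import os


def git_command_with_env_header(git_args, header, env_var="AIRFLOW_GIT_EXTRA_HEADER"):
    """Return (argv, env) for a git call that reads http.extraHeader from env_var.

    Hypothetical helper: the function name and the variable name are assumptions.
    """
    # Only the variable *name* appears on the command line, not the secret itself.
    cmd = ["git", f"--config-env=http.extraHeader={env_var}", *git_args]
    # Copy the current environment and add the header for the child process only.
    env = {**os.environ, env_var: header}
    return cmd, env


if __name__ == "__main__":
    cmd, env = git_command_with_env_header(
        ["clone", "<repo>"], "Authorization: Basic <BASE64>"
    )
    # To run it for real: subprocess.run(cmd, env=env, check=True)
    print(cmd)
```

Compared with passing the header via -c, this keeps the encoded credentials out of the argument list.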
