Constraint enforcer counts completed ReplicatedJob tasks against node memory
Opened on May 19, 2026
Description
Swarm mode constraint enforcer rejects running tasks because completed replicated-job tasks are included in the node's reservation sum. Over time, accumulated Completed job tasks push the enforcer's view of "reserved memory" beyond node capacity, and deployments fail with tasks getting rejected:
assigned node no longer meets constraints
The node has plenty of actual free memory; the enforcer's sum is wrong.
Root cause
If I understand correctly, docker swarm mode wraps the "moby/swarmkit" project. In rejectNoncompliantTasks (https://github.com/moby/swarmkit/blob/12ce3490ef26cbca6ef9b243cb013fffbfe6a6cb/manager/orchestrator/constraintenforcer/constraint_enforcer.go#L117), the task loop filters only on DesiredState:
    for _, t := range tasks {
        if t.DesiredState < api.TaskStateAssigned || t.DesiredState > api.TaskStateCompleted {
            continue
        }
        ...
        available.MemoryBytes -= t.Spec.Resources.Reservations.MemoryBytes
        available.NanoCPUs -= t.Spec.Resources.Reservations.NanoCPUs
    }
For services with mode: replicated-job, finished tasks remain in the store with DesiredState = Completed (this is the terminal/normal state for a job task, not pruned by TaskHistoryRetentionLimit). They satisfy the filter and have their reservations subtracted from available, even though nothing is actually running.
Our observation
We had a stack with a "post_deploy" job that runs data migrations and similar one-off work, on a swarm of 4 nodes (Docker 29.4.1). The worst-affected node had:
| Source | Memory reserved (enforcer view) |
|---|---|
| Live running tasks | 21 GB |
| Completed *_post_deploy job tasks (20 of them across 3 services) | 10 GB |
| Total | 31 GB |
| Node capacity | 32 GB |
Every stack deploy that included services with heavy memory reservations (over 1 GB) and happened to get scheduled to a node with many completed post-deploy jobs would fail with "assigned node no longer meets constraints". Another node in the same cluster is close to capacity; that one "only" has occasional rejections. The two remaining nodes, which have no accumulated jobs, are fine (or we haven't noticed any issues). The rejections appeared to correlate with accumulated Completed job tasks.
Removing and re-creating the job-mode services (clearing the Completed task history) immediately stopped the rejections.
Versions
- Docker 29.4.1 on all nodes (managers and workers).
Reproduce
- Create a ReplicatedJob service with non-trivial Resources.Reservations.MemoryBytes (e.g. 512 MB).
- Run it to completion many times (e.g. as a post-deploy hook in CI) so the node accumulates Completed job tasks with DesiredState = Completed.
- Run unrelated replicated services on the same node such that sum(reservations of running tasks) + sum(reservations of Completed job tasks) exceeds the node's total memory.
- Trigger any node update (label change, heartbeat, restart). Watch live tasks get rejected with "assigned node no longer meets constraints", even though free -m shows plenty of memory.
Expected behavior
No response
docker version
Client: Docker Engine - Community
 Version: 29.4.1
 API version: 1.54
 Go version: go1.26.2
 Git commit: 055a478
 Built: Mon Apr 20 16:32:37 2026
 OS/Arch: linux/amd64
 Context: default

Server: Docker Engine - Community
 Engine:
  Version: 29.4.1
  API version: 1.54 (minimum version 1.40)
  Go version: go1.26.2
  Git commit: 6c91b92
  Built: Mon Apr 20 16:32:37 2026
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: v2.2.3
  GitCommit: 77c84241c7cbdd9b4eca2591793e3d4f4317c590
 runc:
  Version: 1.3.5
  GitCommit: v1.3.5-0-g488fc13e
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
docker info
Client: Docker Engine - Community
 Version: 29.4.1
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version: v0.33.0
    Path: /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version: v5.1.3
    Path: /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 192
  Running: 52
  Paused: 0
  Stopped: 140
 Images: 198
 Server Version: 29.4.1
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: active
  NodeID: 5t209n1f9ddy46jt14fcoxs3m
  Is Manager: false
  Node Address: -
  Manager Addresses: -
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 77c84241c7cbdd9b4eca2591793e3d4f4317c590
 runc version: v1.3.5-0-g488fc13e
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-110-generic
 Operating System: Ubuntu 24.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.34GiB
 Name: swarm4
 ID: 89a641b4-1d45-47d6-b41c-f5f006557916
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Firewall Backend: iptables
Additional Info
No response