eliminate use of `systemctl try-restart`
#1,711 opened on Aug 14, 2021
Description
Today we use `systemctl try-restart` to attempt a service restart after applying settings. Partly this is because we process settings early in boot, when the affected services haven't been started yet and aren't intended to start.
However, this causes trouble when changing settings at runtime: if the service isn't running, the command does nothing.
Services might not be running for a few reasons:
- they failed to start after bad settings were previously applied
- they are starting after new settings are applied, but aren't yet started all the way
In a host container running at boot, @vignesh-goutham discovered the following race:
- host container queries `systemd` for the status of `kubelet`
- waits for it to finish activating (`ActiveState=active` and `SubState=running`)
- issues `apiclient set` commands to reconfigure `kubelet`
- `apiserver` executes restart commands
- `systemctl try-restart` does nothing
- `systemd` logs the first `Started Kubelet` around 1 second later
From this we can infer that two calls to `apiclient set kubernetes.<blah>` in quick succession will not always result in two `kubelet` restarts, leaving that service in an undefined state.
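To make the inference concrete, here is a toy model in Python. It is only a sketch: the `Unit` class, its state names, and `finish_activating` are hypothetical stand-ins, not systemd's real state machine. The one behavior it assumes is the one described above, that `try-restart` is a no-op unless the unit is fully running.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Toy stand-in for a systemd unit (hypothetical, for illustration only)."""
    state: str = "inactive"  # "inactive", "activating", "active", or "failed"
    restarts: int = 0

    def try_restart(self) -> bool:
        """Restart only if the unit is fully active; otherwise do nothing."""
        if self.state != "active":
            return False
        self.state = "activating"  # the restart is now in flight
        self.restarts += 1
        return True

    def finish_activating(self) -> None:
        """Simulate the start job completing."""
        if self.state == "activating":
            self.state = "active"


kubelet = Unit(state="active")

# Two settings transactions arrive back to back.
first = kubelet.try_restart()   # unit is active, so it restarts
second = kubelet.try_restart()  # unit is still activating, so this is a no-op
kubelet.finish_activating()

print(first, second, kubelet.restarts)  # True False 1
```

In this model the second transaction's settings are written but never picked up by a restart, which is the "undefined state" concern.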
For changing settings at runtime, we really need something more like force-stop and force-start to ensure that the restart commands are fully enacted for each transaction.
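A rough sketch of what that could look like, assuming a plain `systemctl stop` followed by `systemctl start` is acceptable for the runtime case. The function names `stop_then_start` and `run_checked`, the injectable `run` parameter, and the `kubelet.service` unit name are illustrative, not existing Bottlerocket code. This does not address the early-boot case, where services aren't intended to start yet.

```python
import subprocess
from typing import Callable, List, Sequence


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def stop_then_start(
    unit: str,
    run: Callable[[Sequence[str]], None] = run_checked,
) -> List[List[str]]:
    """Stop the unit, then start it, regardless of its current state.

    Unlike try-restart, the start is issued even if the unit was not
    running beforehand. Returns the commands that were passed to `run`.
    """
    commands = [
        ["systemctl", "stop", unit],
        ["systemctl", "start", unit],
    ]
    for cmd in commands:
        run(cmd)
    return commands


# Dry run: record the commands instead of executing them.
recorded: List[Sequence[str]] = []
stop_then_start("kubelet.service", run=recorded.append)
print(recorded)
```

Note that `systemctl restart` also starts a unit that isn't running, so it may be enough on its own; the explicit stop and start here just mirrors the wording above.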