mirror of
https://github.com/dehydrated-io/dehydrated.git
synced 2026-01-11 22:30:44 +01:00
Stale lock file prevents dehydrated from running #524
Originally created by @jomat on GitHub (Apr 8, 2021).
5c1551e946/dehydrated (L539-L541)
dehydrated sometimes doesn't start because of stale lock files. I haven't investigated further, but I assume it happens when a server is restarted while dehydrated is running. It can be reproduced with a SIGKILL.
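The reproduction can be sketched roughly as follows. This is a stand-alone illustration, not dehydrated's own code: it assumes a lock taken with noclobber and removed by an EXIT trap, and the paths and variable names (lock, worker_pid, stale, second_run) are made up for the example. SIGKILL cannot be trapped, so the cleanup never runs and the next attempt to take the lock is refused.

```shell
# Sketch only: a noclobber lock file with an EXIT trap, killed with SIGKILL.
# All names and paths here are hypothetical, not taken from dehydrated.
lock="$(mktemp -d)/lock"

# Worker: take the lock, register cleanup on exit, then pretend to be busy.
bash -c '
  ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null || exit 1
  trap "rm -f \"$1\"" EXIT
  sleep 5
' worker "$lock" </dev/null >/dev/null 2>&1 &
worker_pid=$!

# Wait (up to about 5 seconds) until the worker has created the lock.
for i in $(seq 1 50); do
  [ -e "$lock" ] && break
  sleep 0.1
done

# SIGKILL cannot be caught, so the EXIT trap never gets a chance to run.
kill -KILL "$worker_pid"
wait "$worker_pid" 2>/dev/null || true

if [ -e "$lock" ]; then stale=yes; else stale=no; fi
echo "lock left behind after SIGKILL: $stale"

# A second run now fails to take the lock, although nothing is running.
if ( set -o noclobber; echo "$$" > "$lock" ) 2>/dev/null; then
  second_run=started
else
  second_run=blocked
fi
echo "second run: $second_run"
```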
@kousu commented on GitHub (Apr 24, 2021):
Locking is really crufty, and lockfiles are perhaps the best option, unfortunately: https://apenwarr.ca/log/20101213
I haven't run into this yet, but can you adjust your server's update/reboot/whatever cycles to be opposite dehydrated's? And/or could you add a boot script that deletes stale lock files?
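A cleanup script along those lines might look like the sketch below. It rests on an assumption that should be checked against the actual script: that the lock file contains the PID of the process that created it. clear_stale_lock is a hypothetical helper, not part of dehydrated.

```shell
# clear_stale_lock is a hypothetical helper, not part of dehydrated.
# Assumption: the lock file holds the PID of the process that created it.
# Returns 0 if no lock remains afterwards, 1 if the lock was left in place.
clear_stale_lock() {
  lockfile="$1"
  [ -e "$lockfile" ] || return 0          # no lock, nothing to do

  pid="$(cat "$lockfile" 2>/dev/null)"
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;              # not a plain PID: leave it alone
  esac

  # kill -0 sends no signal; it only checks whether the PID can be signalled.
  # Run this as the lock owner or as root, otherwise a live process owned by
  # another user would look dead.
  if kill -0 "$pid" 2>/dev/null; then
    return 1                              # owner still running: keep the lock
  fi

  rm -f "$lockfile"                       # owner is gone: the lock is stale
}
```

Called as `clear_stale_lock /path/to/lock` before the cron job starts. The check is not race-free (another run could start between the check and the rm), and a recycled PID would make a stale lock look live, so it is a workaround rather than a fix.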
@jomat commented on GitHub (Apr 25, 2021):
There are several hundred servers with a four-digit number of domains, so it takes some time for dehydrated to finish. The cron job is spread across the servers throughout the day, and there are no planned reboots (keyword: ksplice), so no, I can't adjust that.
The problem isn't that big, as I'm also monitoring certificate expiry and we get a notification 29 days in advance.
A reboot script would be a workaround I don't want to use. Currently I've deployed the mentioned fork/PR, as our servers are quite homogeneous and the lock file isn't on NFS.
@kousu commented on GitHub (Apr 25, 2021):
That's cool, I hope it works out for you.
I don't have that many servers under dehydrated yet, so maybe I'll have to keep an eye out for this as I expand.
@lukas2511 commented on GitHub (Apr 25, 2021):
If this is a problem you only ever have on reboots, you might want to configure dehydrated to put the lockfile into a directory that's mounted in memory (e.g. /dev/shm or /run); that way it can't persist over a reboot. Alternatively, you could try running dehydrated using systemd services and timers; that way systemd should be able to wait for dehydrated to finish, or at least stop it in a way that would trigger the exit trap.
I'm leaving your pull-request #814 open for now. This is something I really really need to test on lots of platforms before I can merge or implement something similar to it. Having a simple lockfile is just one of the easiest solutions that I'm quite sure will work on older and embedded Linux systems, weird WSL things, BSD systems, etc.
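The in-memory lockfile suggestion could look roughly like this in dehydrated's config file. It is a sketch under assumptions: that LOCKFILE is the setting that controls the lock path, that /run is a tmpfs on the system in question, and that /run/dehydrated (a hypothetical directory) is recreated after every boot.

```shell
# dehydrated config fragment (the config file is sourced as shell).
# Assumption: /run is a tmpfs here, and /run/dehydrated is a hypothetical
# directory that has to be recreated after every boot and be writable by
# the user running dehydrated.
LOCKFILE="/run/dehydrated/lock"
```

This only covers the reboot case. A lock left behind by a SIGKILL on a machine that keeps running would still be there.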
@jomat commented on GitHub (Jun 29, 2021):
It also happens in low memory situations:
IMHO it'd be better to close my PR if it's not suitable and leave this issue open?