Stale lock file prevents dehydrated from running #524

Closed
opened 2025-12-29 01:26:45 +01:00 by adam · 5 comments
Owner

Originally created by @jomat on GitHub (Apr 8, 2021).

https://github.com/dehydrated-io/dehydrated/blob/5c1551e946456f534cf46b6ebabe4353bf0b0530/dehydrated#L539-L541

dehydrated sometimes doesn't start because of stale lock files. I haven't investigated further, but I assume it happens when a server is restarted while dehydrated is running. It can be reproduced with a SIGKILL.
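The SIGKILL reproduction can be sketched without dehydrated itself. The worker below is a hypothetical stand-in (not dehydrated's code) that creates a lock file and registers an exit trap to remove it, which is the general pattern the thread describes. Because SIGKILL cannot be caught, the trap never runs and the lock file is left behind.

```shell
#!/usr/bin/env bash
# Sketch: a lock file cleaned up by an EXIT trap survives SIGKILL.
# The worker is a stand-in for illustration, not dehydrated's actual code.
set -u
tmpdir="$(mktemp -d)"
lock="${tmpdir}/lock"

# Hypothetical worker: create the lock, register cleanup, then "work" forever.
(
  : > "$lock"
  trap 'rm -f "$lock"' EXIT
  while :; do sleep 0.2; done
) &
worker=$!

# Wait until the worker has created the lock file.
while [ ! -e "$lock" ]; do sleep 0.1; done

# SIGKILL cannot be trapped, so the EXIT trap never gets a chance to run.
kill -KILL "$worker"
wait "$worker" 2>/dev/null || true

if [ -e "$lock" ]; then result="stale"; else result="cleaned"; fi
echo "lock file after SIGKILL: ${result}"
rm -rf "$tmpdir"
```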
adam closed this issue 2025-12-29 01:26:45 +01:00

@kousu commented on GitHub (Apr 24, 2021):

Locking is really crufty, and lockfiles are perhaps the best option, unfortunately: https://apenwarr.ca/log/20101213

I haven't run into this yet, but can you adjust your server's update/reboot/whatever cycles to be opposite dehydrated's? And/or could you add a boot script that deletes stale lock files?
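A boot-time cleanup along those lines might look like the sketch below. It assumes GNU/Linux (`/proc/uptime`, GNU `stat`), and the function names `boot_epoch` and `remove_stale_lock` are made up for illustration. As a safety check it only deletes a lock file whose modification time predates the current boot, so a lock taken by a run that started after boot is left alone.

```shell
#!/usr/bin/env bash
# Sketch of a boot-time cleanup for a stale lock file (GNU/Linux assumed).
# boot_epoch and remove_stale_lock are hypothetical helpers, not part of dehydrated.

# Print the boot time as seconds since the epoch, derived from /proc/uptime.
boot_epoch() {
  local now up
  now="$(date +%s)"
  up="$(cut -d. -f1 /proc/uptime)"
  echo "$(( now - up ))"
}

# Remove the lock file ($1) only if it was last modified before the boot time ($2).
# Returns 0 if a stale lock was removed, 1 otherwise.
remove_stale_lock() {
  local lock="$1" boot="$2" mtime
  [ -e "$lock" ] || return 1
  mtime="$(stat -c %Y -- "$lock")" || return 1
  if [ "$mtime" -lt "$boot" ]; then
    rm -f -- "$lock"
    return 0
  fi
  return 1
}

# Example call from a boot script (the lock path is only an example):
# remove_stale_lock /opt/dehydrated/lock "$(boot_epoch)"
```

This only helps after a reboot; it does nothing for a process that was killed while the machine stayed up.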


@jomat commented on GitHub (Apr 25, 2021):

> but can you adjust your server's update/reboot/whatever cycles to be opposite dehydrated's

There are several hundred servers with domains in the four-digit range, so it takes some time for dehydrated to finish, and the cron job is distributed on the servers throughout the day, and there are no planned reboots (keyword ksplice), so, no, I can't adjust that.
The problem isn't that big, as I'm also monitoring certificate expiry and we get a notification 29 days in advance.

A reboot script would be a workaround I don't want to use. Currently I've deployed the mentioned fork/PR, as our servers are quite homogeneous and the lock file isn't on NFS.


@kousu commented on GitHub (Apr 25, 2021):

That's cool, I hope it works out for you.

I don't have that many servers under dehydrated yet, so maybe I'll have to keep my eye out for this as I expand.


@lukas2511 commented on GitHub (Apr 25, 2021):

If this is a problem you only ever have on reboots, you might want to configure dehydrated to put the lockfile into a directory that's mounted in memory (e.g. `/dev/shm` or `/run`); that way it can't persist over a reboot. Alternatively, you could try running dehydrated using systemd services and timers; that way systemd should be able to wait for dehydrated to finish, or at least stop it in a way that would trigger the exit trap.
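As a sketch of the first suggestion: dehydrated's config file is sourced as shell, and `LOCKFILE` is the config option that controls the lock location. The exact path below is an assumption chosen for illustration; any file under a memory-backed directory would do.

```shell
# Fragment for dehydrated's config file (e.g. /opt/dehydrated/config).
# LOCKFILE is dehydrated's own option; the path is an assumed example.
# /run is typically a tmpfs on Linux, so the file cannot survive a reboot.
LOCKFILE="/run/dehydrated.lock"
```

Note that this only covers the reboot case. A lock left behind by a killed process on a machine that keeps running would still be there.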

I'm leaving your pull-request #814 open for now. This is something I really really need to test on lots of platforms before I can merge or implement something similar to it. Having a simple lockfile is just one of the easiest solutions that I'm quite sure will work on older and embedded Linux systems, weird WSL things, BSD systems, etc.
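For comparison, here is a sketch of kernel-managed locking with the `flock` utility from util-linux. This is an alternative technique shown for illustration only; it is not necessarily what the pull request does, and `flock` is not available on every platform, which is the portability concern raised above. The lock belongs to an open file descriptor, so the kernel releases it when the holder dies, even on SIGKILL. The `try_lock` helper and the file names are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: an flock-based lock is released by the kernel when its holder is killed.
# Requires flock(1) from util-linux; try_lock and the file names are illustrative.
set -u
lock="$(mktemp)"

# Holder: take an exclusive lock on fd 9, signal readiness, then idle.
# "9>&-" keeps the sleep child from inheriting the locked descriptor.
(
  exec 9>"$lock"
  flock -n 9 || exit 1
  : > "${lock}.ready"
  while :; do sleep 0.2 9>&-; done
) &
holder=$!
while [ ! -e "${lock}.ready" ]; do sleep 0.1; done

# Try to take the lock without blocking, using a fresh file descriptor.
try_lock() { ( exec 8>"$lock" && flock -n 8 ); }

if try_lock; then while_running="acquired"; else while_running="busy"; fi

# Kill the holder uncatchably; its descriptors close and the lock goes with them.
kill -KILL "$holder"
wait "$holder" 2>/dev/null || true

if try_lock; then after_kill="acquired"; else after_kill="busy"; fi

echo "while holder runs: ${while_running}; after SIGKILL: ${after_kill}"
rm -f "$lock" "${lock}.ready"
```

The lock file itself may remain on disk afterwards, but its presence no longer means anything; only a held lock blocks the next run.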


@jomat commented on GitHub (Jun 29, 2021):

It also happens in low memory situations:

 + Checking domain name(s) of existing cert... unchanged.
 + Checking expire date of existing cert...
 + Valid till Jul  6 14:00:42 2021 GMT (Less than 30 days). Renewing!
 + Signing domains...
 + Generating private key...
 + Generating signing request...
 + Requesting new certificate order from CA...
/opt/dehydrated/dehydrated: line 964: /usr/bin/tr: Cannot allocate memory
/opt/dehydrated/dehydrated: fork: Cannot allocate memory
/opt/dehydrated/dehydrated: fork: Cannot allocate memory
/opt/dehydrated/dehydrated: fork: Cannot allocate memory
/opt/dehydrated/dehydrated: fork: Cannot allocate memory
/opt/dehydrated/dehydrated -c -g  23.13s user 3.86s system 9% cpu 4:39.88 total
254 root@server ~ # /opt/dehydrated/dehydrated -c -g
# INFO: Using main config file /opt/dehydrated/config
ERROR: Lock file '/opt/dehydrated/lock' present, aborting.

IMHO it'd be better to close my PR if it's not suitable and leave this issue open?


Reference: starred/dehydrated#524