Crash due to number of open files #42

Closed
opened 2025-12-29 01:20:53 +01:00 by adam · 7 comments

Originally created by @diseq on GitHub (Oct 1, 2021).

headscale serve seems to accumulate open file descriptors over time and eventually crashes.
How many open files should be expected?

btw. headscale is awesome work!!

Version: 0.9.2

```
Oct 01 13:18:50 server-1 headscale[8588]: 2021-10-01T13:18:50Z ERR Error accessing db error="unable to open database file: too many open files"
Oct 01 13:18:50 server-1 headscale[8588]: 2021-10-01T13:18:50Z ERR Cannot fetch peers error="unable to open database file: too many open files" func=getMapResponse
```

```
for pid in /proc/[0-9]*; do printf "PID %6d has %4d FDs\n" $(basename $pid) $(ls $pid/fd | wc -l); done

PID   8588 has 1024 FDs
```

```
ps -ef |grep 8588

headsca+    8588       1 13 Sep30 ?        03:02:09 /usr/sbin/headscale serve
```
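
1024 open FDs matches the common default soft `nofile` limit on Linux, which suggests the process is exhausting its per-process limit rather than a system-wide one. A minimal sketch to confirm that and to see what kind of descriptors are piling up (PID 8588 is taken from the output above; everything else is standard `/proc` tooling, adjust as needed):

```sh
# Soft/hard open-file limit as seen by the running process
grep 'Max open files' /proc/8588/limits

# Current number of open descriptors
ls /proc/8588/fd | wc -l

# Rough breakdown of what the descriptors point at (sockets, the sqlite db, etc.)
ls -l /proc/8588/fd | awk '{print $NF}' | sort | uniq -c | sort -rn | head
```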

Config

```
{
    "server_url": "https://server.domain.com",
    "listen_addr": "0.0.0.0:8080",
    "private_key_path": "/etc/headscale/private.key",
    "derp_map_path": "/etc/headscale/derp.yaml",
    "ephemeral_node_inactivity_timeout": "30m",
    "db_type": "sqlite3",
    "db_path": "/mnt/data/headscale/db.sqlite",
    "tls_letsencrypt_hostname": "",
    "tls_letsencrypt_listen": ":http",
    "tls_letsencrypt_cache_dir": ".cache",
    "tls_letsencrypt_challenge_type": "HTTP-01",
    "tls_cert_path": "",
    "tls_key_path": "",
    "acl_policy_path": "/mnt/data/headscale/policy.hujson",
    "dns_config": {
        "nameservers": [
            "1.1.1.1"
        ]
    }
}
```
adam closed this issue 2025-12-29 01:20:54 +01:00

@qbit commented on GitHub (Oct 1, 2021):

I hit this as well. Mine is fronted by nginx. I meant to check whether nginx or headscale is misbehaving, but I haven't had time to track it down.
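
One way to tell whether the leaked descriptors are proxied connections or something internal to headscale is to look at what the FDs actually are. A hedged sketch (run as root; PID 8588 is from the report above, substitute your own):

```sh
# If most descriptors are sockets, the leak is likely held-open connections
ls -l /proc/8588/fd | grep -c 'socket:'

# Established TCP connections owned by headscale and by nginx
ss -tnp | grep -E 'headscale|nginx'
```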


@diseq commented on GitHub (Oct 1, 2021):

Same here, nginx in front.
It seems to happen after ephemeral nodes are added, though I'm not sure whether that's related; it might be a coincidence.

The same loop, run repeatedly:

```
for pid in /proc/[0-9]*; do printf "PID %6d has %4d FDs\n" $(basename $pid) $(ls $pid/fd | wc -l); done |grep 78809

PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   10 FDs
PID  78809 has   14 FDs
PID  78809 has   15 FDs
PID  78809 has   16 FDs
PID  78809 has   19 FDs
```
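
Rather than rerunning the loop by hand, the growth can be tracked continuously. A minimal sketch using `watch` (PID 78809 from the samples above; the interval is arbitrary):

```sh
# Print the FD count for PID 78809 every 30 seconds
watch -n 30 'ls /proc/78809/fd | wc -l'
```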

@juanfont commented on GitHub (Oct 1, 2021):

@qbit is it also happening for you with ephemeral nodes?


@qbit commented on GitHub (Oct 1, 2021):

I am not using pre-auth keys, so I don't think it is.


@qbit commented on GitHub (Oct 4, 2021):

I switched to a non-nginx configuration and things seem happy. I wonder if diddling nginx timeouts would resolve this?

Maybe `proxy_connect_timeout 300;` or something in the location block?

(I'll try to test the above, but it might be a bit before I can)
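
For anyone who wants to try the timeout route before dropping nginx: a hedged sketch of a reverse-proxy config with generous timeouts for headscale's long-lived connections. The file path, certificate paths, and the `127.0.0.1:8080` upstream are assumptions (the upstream and server name are taken from the `listen_addr` and `server_url` in the config above); the directives themselves are standard nginx ones, not something headscale mandates.

```sh
# Assumed path and values; adjust to your setup, then test and reload.
cat > /etc/nginx/conf.d/headscale.conf <<'EOF'
server {
    listen 443 ssl;
    server_name server.domain.com;               # from server_url above
    ssl_certificate     /etc/ssl/example.crt;    # hypothetical cert paths
    ssl_certificate_key /etc/ssl/example.key;

    location / {
        proxy_pass http://127.0.0.1:8080;        # headscale listen_addr
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # Long-polling map requests stay open a long time; keep timeouts generous
        proxy_connect_timeout 300;
        proxy_read_timeout 300;
        proxy_send_timeout 300;
    }
}
EOF

nginx -t && systemctl reload nginx
```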


@diseq commented on GitHub (Oct 7, 2021):

0.9.3 seems to have resolved the accumulating open fds.
I'm testing on the same configuration, so I can rule out other changes.

I'll leave it running for some time.


@diseq commented on GitHub (Oct 8, 2021):

The issue is gone. Thanks!
