0.22.1 uses way, way more memory than 0.21? #491
Closed
opened 2025-12-29 02:19:02 +01:00 by adam · 13 comments
Originally created by @linsomniac on GitHub (Apr 28, 2023).
I was running 0.21 on an instance with 2GB of RAM. I upgraded to 0.22.1 and it immediately thrashed itself to death. I upgraded the instance to 4GB and it still starts thrashing fairly quickly. At the moment AWS won't let me upgrade it to 16GB. I have ~100 nodes in my headscale.
Is this known and expected?
@loprima-l commented on GitHub (Apr 28, 2023):
As far as I know there have only been CPU issues on large installations, and that sounds "normal" since Headscale isn't optimized for large installations yet.
Are you sure the problem came from RAM?
@linsomniac commented on GitHub (Apr 28, 2023):
Yep, I'm sure the problem was RAM. I was getting OOM messages on the console.
I've regularly run into memory issues: I was originally running on a 1GB machine, but started having both CPU and RAM issues when I added ~100 nodes, so I upped it. I do have fairly high disk I/O; I had to lower the update frequency, I think from a 10s to a 30s interval.
vmstat during this shows free memory (free+buff+cache) going down to ~100MB, and after I kill headscale it goes back up to 3.6GB free. During that time there was very heavy block-in ("bi") activity and I/O wait CPU time was ~80%, so heavy disk activity, heavy reads, heavy memory use.
Reminder: I was running 0.21 in 2GB on this system, installed 0.22.1 and restarted headscale, and started getting OOMs. I doubled the RAM and was still getting OOMs. Switched back to 0.21, and it has now been running for several hours with 3GB free, 470M in buff/cache, and 396MB "used".
Seems to point to 0.22.1 having some dramatically higher memory use.
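For reference, the numbers above can be reproduced by sampling vmstat while restarting headscale (a minimal sketch, assuming the procps vmstat; the interval is arbitrary):

    # sample memory and I/O every 5 seconds, in MB; watch free/buff/cache plus the "bi" and "wa" columns
    vmstat -S M 5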
@loprima-l commented on GitHub (Apr 28, 2023):
Have you successfully rolled your system back to the previous version? I think that's the best option for now.
I think your issue is related to another issue that we have chosen not to fix yet.
Fixing those performance issues matters a lot to me, since big environments make it easier to find bugs, but it can't be our priority. I'm going to look into it as soon as possible.
@loprima-l commented on GitHub (Apr 28, 2023):
Also, can you tell us a bit more about your Headscale instance: why are you using Headscale, and who are your users? Is it a prod environment? Etc.
I'm interested to know what kinds of large infrastructures are running Headscale.
@loprima-l commented on GitHub (Apr 29, 2023):
Hi, I think you should give #1377 a try if you have a bunch of ACLs; with 100+ machines you must have a lot of ACLs.
@linsomniac commented on GitHub (Apr 29, 2023):
Yes, I have successfully returned to 0.21, I just had to wait for the OOM killer to make the system responsive enough to get a window to stop headscale and revert.
Why am I using headscale? I couldn't get buy-in to purchase Tailscale.
Size of ACLs: I have 3 groups, 5 subnets, 22 ACL rules, my entire acls.yaml is ~170 lines.
"headscale node list | wc" is 115 lines.
My environment is dev, staging, and production: mostly virtual machines and some AWS EC2 instances, mostly Linux. I deployed tailscale to all the dev/stg instances and a handful of production instances (mostly administrative things, plus the firewalls as subnet routers). The users are primarily me and one of the other operations people; I'm still in proof-of-concept mode. The longer-term plan would be to bring on the ~8 developers and maybe a couple of QA people, so maybe up to 10 more.
@linsomniac commented on GitHub (Apr 29, 2023):
I've switched my EC2 instance to a t3a.xlarge with 16GB of RAM, and restarted headscale with 0.22.1, and watched as the free memory dipped down to 2GB, then it gradually returned to 14GB. Here's a sampling of vmstat output during this run:
Looks like it does that every time I restart it (I was wondering if it was some one-time housekeeping).
Maybe it's some combination of 110-ish hosts and 20-ish ACLs? But something changed between 0.21, which I've been able to successfully run in 2GB of RAM, and 0.22.1, which is requiring ~14GB.
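One way to confirm that the growth is in the headscale process itself rather than in page cache is to watch its resident set across a restart (a sketch, assuming a single headscale process):

    # resident and virtual size of the headscale process, refreshed every 5 seconds
    watch -n 5 'ps -o rss=,vsz= -p "$(pidof headscale)"'
    # or check the peak resident size recorded by the kernel
    grep VmHWM /proc/"$(pidof headscale)"/status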
@loprima-l commented on GitHub (Apr 30, 2023):
Thanks for your reply. Have you tried the patch in #1337?
@linsomniac commented on GitHub (May 1, 2023):
I haven't; I will probably give it a try this evening.
@linsomniac commented on GitHub (May 2, 2023):
It looks like #1377 has been merged into main, so I grabbed that and built it, and it does indeed seem to have solved the memory issue.
@loprima-l commented on GitHub (May 2, 2023):
Super! Is the performance better or worse than on 0.21?
@linsomniac commented on GitHub (May 2, 2023):
I only ran it a little bit, but performance seemed similar to 0.21; I really didn't do much testing of it. I had kind of a janky build, built against libraries in /nix, and I decided to go back to running 0.21 for the moment until the next release comes out. I couldn't quite get the build to work, or at least couldn't find the resulting binary: when I did "go build", it wasn't writing to ~/go/bin like I was expecting.
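For reference, that last part is standard Go toolchain behaviour rather than anything headscale-specific: "go build" writes the binary to the current directory (or to whatever -o points at), while "go install" is what places it under $GOBIN, which defaults to ~/go/bin. Assuming a checkout of the headscale repository:

    # builds ./headscale in the current working directory
    go build -o headscale ./cmd/headscale
    # installs into $GOBIN (default ~/go/bin), which is where the binary was expected above
    go install ./cmd/headscale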
@kradalby commented on GitHub (May 10, 2023):
We will publish a release with #1377 in a bit; please test that and reopen if it is still an issue.