mirror of
https://github.com/juanfont/headscale.git
synced 2026-01-11 20:00:28 +01:00
Closed
opened 2025-12-29 02:20:19 +01:00 by adam
·
30 comments
Reference: starred/headscale#559
Originally created by @vsychov on GitHub (Sep 26, 2023).
Bug description
Hello there,
After #1492 was merged, I noticed the following two issues, which may be related, so I'm posting this as a single problem:
The `offers exit node` flag takes quite some time to appear after route approvals, and node status is displayed as "offline" (this is the main problem). I tested this with the following configuration:
- `tailscale up --auth-key XXXX --advertise-exit-node --login-server=https://headscale-test.example.com/ --advertise-routes=10.0.0.0/8 --accept-routes --accept-dns --ssh --shields-up=false` (one user and one reusable auth key)
- Tailscale `1.50.0` on all clients; the same problem is present on `1.48.2`
I connected all clients, confirmed all routes, and at `2023-09-26T10:32:26Z` got the following result of the `headscale route list` command:
At `2023-09-26T10:32:45Z`, I ran the command `/Applications/Tailscale.app/Contents/MacOS/Tailscale status` on macOS and saw the following result (`tmp-tailscale-fra1-01` already displayed as offline):
At `2023-09-26T10:33:15Z`, the node `tmp-tailscale-fra1-02` was displayed as offline, and still no node was offering an exit node:
At `2023-09-26T10:34:39Z`, the node `tmp-tailscale-fra1-03` started to be displayed as offering an exit node (and I noticed that its hostname had changed):
The situation remained the same until `2023-09-26T10:39:50Z`, when I stopped the test. Here is the result of `headscale nodes list` at that time:
All 3 Linux clients are on the same network and have the same connectivity to headscale (the command works the same on all machines; example with `tmp-tailscale-fra1-03`):
With version 0.22.3 it works well.
There are also headscale logs showing that communication with the nodes displayed as "offline" was still going on.
Environment
Headscale: `main` (01b85e5232)
Tailscale: `1.50.0`
@vsychov commented on GitHub (Sep 26, 2023):
Headscale logs
@vsychov commented on GitHub (Sep 26, 2023):
@kradalby
@kradalby commented on GitHub (Sep 26, 2023):
@vsychov Thanks for the awesome and comprehensive writeup, I truly appreciate it.
I will try to get to this shortly, I have some theories, but need some time to sit down with it.
@kradalby commented on GitHub (Dec 9, 2023):
I've merged a bunch of fixes in #1564, please give it a go and come back to me.
@kradalby commented on GitHub (Dec 10, 2023):
0.23.0-alpha2 addresses a series of issues with node synchronisation, online status and subnet routers. Please test this release and report back if the issue still persists.
@kradalby commented on GitHub (Feb 15, 2024):
Please give https://github.com/juanfont/headscale/releases/tag/v0.23.0-alpha4 a swing
@almereyda commented on GitHub (Feb 15, 2024):
To note, the tag of the container image moved from `0.23.0-alpha3` to `v0.23.0-alpha4` by introducing the `v` at the beginning.
We also had to change the command that is run in the container from `headscale serve` to `serve`.
Also, the syntax of the database configuration in `config.yaml` changed slightly in 94b30abf56 (#1700).
This also comes with a new parameter, `automatically_add_embedded_derp_region: true`, that one wants to copy. I'm regularly running `diff config/config.yaml config-example.yaml` on a fresh download of the example to find eventual, sic, differences.
The image also possibly lacks a directory at `/var/run/headscale/`, since the daemon cannot bind the gRPC socket to the expected location due to `no such file or directory`. We are solving this by mapping an empty directory at the location.
All of these are probably worth noting as breaking changes, as they will possibly prohibit a seamless upgrade path.
@TotoTheDragon commented on GitHub (Feb 15, 2024):
Made an issue regarding config migrations for you https://github.com/juanfont/headscale/issues/1758
Regarding the docker part, that is of course not officially supported and not well documented. If you feel it is a big issue as of now, feel free to create an issue (and possibly PR).
@kradalby commented on GitHub (Feb 19, 2024):
Could you please test if this is still the case with https://github.com/juanfont/headscale/releases/tag/v0.23.0-alpha5 ?
@almereyda commented on GitHub (Feb 19, 2024):
Thank you, very well.
Upgrading `alpha4` to `alpha5` (in a container) has Headscale end up in a startup crash loop:
This is alleviated by changing the name of the configuration key `ip_prefixes` to `prefixes` and changing it into a dictionary with separate keys for `v6` and `v4`, as the diff tells.
It will be good to include this information in the migration guide as well.
If there is interest, I can take a look at https://github.com/juanfont/headscale/blob/main/docs/running-headscale-container.md again and prepare a PR including the current changes.
And because we fixed the broken migration manually, we cannot run the new migration, as the column to be deleted does not exist right now (#1748).
Running `alter table nodes add column last_successful_update timestamp;` allowed the migration to be applied in our special case, and `alpha5` is now running as expected.
@kradalby commented on GitHub (Apr 17, 2024):
Could you please try the newest alpha (https://github.com/juanfont/headscale/releases/tag/v0.23.0-alpha6) and report back?
@vsychov commented on GitHub (Apr 18, 2024):
Thanks @kradalby, I'll run tests today or tomorrow.
@vsychov commented on GitHub (Apr 20, 2024):
Hello @kradalby, I tried testing it on the `alpha-8` revision but couldn't manage it due to an issue with ACL. Here's the ACL file:
I created two nodes:
- `tmp-tailscale-1-ams3-do`
- `tmp-tailscale-2-ams3-do`
Then, I connected them to headscale using a pre-auth key for the user `user.example.com`:
Node details:
However, both nodes are invisible to each other:
Additionally, I noticed that there are no IP addresses in `headscale node ls`, which might also be incorrect.
@kradalby commented on GitHub (Apr 29, 2024):
Yes, the missing IPs are strange; I suspect they might be the cause. Is this SQLite or Postgres? Do you have the config?
Edit: the config might not have the new IP prefix syntax?
@vsychov commented on GitHub (Apr 29, 2024):
@kradalby, you are right! The `prefixes` section was missing from the configuration; that is probably also a validation bug. I'll try testing with the new config.
@kradalby commented on GitHub (Apr 29, 2024):
Yea, we need to throw an error if there are no prefixes!
@kradalby commented on GitHub (Apr 30, 2024):
https://github.com/juanfont/headscale/releases/tag/v0.23.0-alpha10 was also released to address a couple of regressions in the ACL.
@kradalby commented on GitHub (Apr 30, 2024):
and addressed no prefix issue in https://github.com/juanfont/headscale/pull/1918
@kradalby commented on GitHub (May 1, 2024):
@vsychov let me know when you have had time to give this a go. If these issues are resolved, I will tag a beta release after resolving one other issue.
@vsychov commented on GitHub (May 2, 2024):
@kradalby, I was just about to write that everything was fine, but it seems something went wrong. I deployed a test environment consisting of 3 machines and one headscale control server.
I left it running for about 3 days, and for the first 24 hours, it worked super stably. Subnet routers failed over very quickly, there were no problems with nodes going offline, and so on. Everything seemed just fine. But literally just before writing here, I decided to do a retest and found that both subnet routers became `Primary == false`.
Consequently, from the clients, the subnet 10.0.0.0/8 became inaccessible.
Testing was performed on version 0.23.0-alpha8.
If I can provide any additional information that would help identify the cause of this behavior, let me know, and I'll try to get it (perhaps logs would be useful or something else).
@kradalby commented on GitHub (May 2, 2024):
@vsychov do you have the log of the machine? that would be helpful.
If you cannot share it, it would be useful to know whether you see a lot of log lines with `rejected` in them.
@vsychov commented on GitHub (May 2, 2024):
@kradalby, which machine's log? The one where headscale was run, or a machine with tailscale?
@kradalby commented on GitHub (May 2, 2024):
Headscale please
@vsychov commented on GitHub (May 2, 2024):
I noticed that it was at the 'INFO' level, so there are no lines with `rejected`. I'll switch it to `trace` level and rerun the tests. Hopefully, I can reproduce it again. However, there's a full log available; perhaps it contains something useful. Additionally, would it make sense to downgrade logs like 'Sending Changed MapResponse' or 'received stream update' to debug level from info?
headscale.log
@vsychov commented on GitHub (May 2, 2024):
I've switched the logs to trace mode, and I suggest waiting a few more days in an attempt to reproduce the issue. Meanwhile, as far as I can see, headscale doesn't seem to perform route reselection upon restart, or it seems better to do it once in a while, for example, every 10 seconds (the time could be configurable), by going through all routes and finding those without any 'Primary' nodes, and forcibly performing reselections. This way, it seems possible to protect against some failures when routes are chosen based on events (as I understand it, that's how it's currently done).
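The periodic check suggested above could be sketched roughly as follows. This is a minimal illustration under assumptions, not headscale's actual route handling: the `Route` and `RouteTable` types and the `Reselect` and `Watch` methods are hypothetical, and the real data model and failover rules may differ.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Route is a hypothetical, simplified stand-in for an advertised subnet route.
// It is not headscale's real data model.
type Route struct {
	Prefix  string // e.g. "10.0.0.0/8"
	NodeID  int
	Enabled bool // route has been approved
	Online  bool // advertising node is currently connected
	Primary bool
}

// RouteTable guards a set of routes with a mutex.
type RouteTable struct {
	mu     sync.Mutex
	routes []*Route
}

// Reselect promotes one enabled, online route to primary for every prefix
// that currently has no usable primary. It returns the prefixes it repaired.
func (t *RouteTable) Reselect() []string {
	t.mu.Lock()
	defer t.mu.Unlock()

	// Record which prefixes already have a primary that is approved and online.
	hasPrimary := map[string]bool{}
	for _, r := range t.routes {
		if r.Primary && r.Enabled && r.Online {
			hasPrimary[r.Prefix] = true
		}
	}

	var repaired []string
	for _, r := range t.routes {
		if hasPrimary[r.Prefix] || !r.Enabled || !r.Online {
			continue
		}
		// Demote any stale primary for this prefix before promoting r.
		for _, other := range t.routes {
			if other.Prefix == r.Prefix {
				other.Primary = false
			}
		}
		r.Primary = true
		hasPrimary[r.Prefix] = true
		repaired = append(repaired, r.Prefix)
	}
	return repaired
}

// Watch runs Reselect on every tick until stop is closed.
func (t *RouteTable) Watch(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			t.Reselect()
		case <-stop:
			return
		}
	}
}

func main() {
	// Two approved, online subnet routers, neither marked primary.
	table := &RouteTable{routes: []*Route{
		{Prefix: "10.0.0.0/8", NodeID: 1, Enabled: true, Online: true},
		{Prefix: "10.0.0.0/8", NodeID: 2, Enabled: true, Online: true},
	}}
	fmt.Println("repaired prefixes:", table.Reselect())
}
```

In a server, `Watch` would be started in its own goroutine with a configurable interval (for example `go table.Watch(10*time.Second, stop)`), so that a prefix left without a primary by a missed event is repaired on the next tick.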
@vsychov commented on GitHub (May 8, 2024):
I'm writing an update on the testing results. Since I started running `headscale` with logs in `trace` mode, I still haven't been able to reproduce this issue. It seems to be a very rare case, and everything else looks very stable at the moment.
I'll continue to keep the `headscale` test instance running in the hope that I can reproduce this situation and gather more logs. Perhaps I'll increase the size of the test infrastructure, add more clients, and try to reproduce the problem in that scenario. However, I still believe that a better solution would be to write a background goroutine that monitors all routes and, in case a route doesn't have a `Primary` node, selects one. I can try to submit a PR with this approach if you approve, @kradalby.
@kradalby commented on GitHub (May 24, 2024):
I've fixed up some of the things that could cause a deadlock in https://github.com/juanfont/headscale/releases/tag/v0.23.0-alpha12, I will leave this open for you to test, and hopefully we can close this.
@almereyda commented on GitHub (May 24, 2024):
Extremely thorough progress, plus attentive and reactive intervention. The way this came out gives a lot of confidence with regards to #1072.
@juanfont commented on GitHub (Jul 22, 2024):
We reckon this is now fixed, after a significant redesign of the state machine.
Can you open a new issue should this be present in the new release?
@vsychov commented on GitHub (Jul 22, 2024):
I completed all my tests, and am not able to reproduce this issue anymore. Thanks for your great job!