[BUG] Lost traffic after updating to 0.23.0 (MTU-related) #861

Closed
opened 2025-12-29 02:24:57 +01:00 by adam · 5 comments
Owner

Originally created by @pstvasko on GitHub (Nov 21, 2024).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

The tunnel breaks on gate1.

Expected Behavior

Traffic should not be lost.

Steps To Reproduce

Hi. After updating to version 0.23, there is an issue in the Tailscale network.
I have a complex network connecting two Tailscale installations:
100.64.0.0 - headscale1 - gate1 - gate2 - headscale2 - 100.80.0.0

When I transfer data between 100.64.0.0 and 100.80.0.0, the speed peaks at about 2 Gbps and problems begin: packets stop flowing on the segment 100.64.0.0 - headscale1 (although pings work if I reduce the MTU to 932).

There are about 500 clients in the network. Could you advise in which direction I should investigate?
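The fact that an MTU of 932 restores connectivity points at encapsulation overhead on some hop of the gate1/gate2 path. As a rough illustration (not taken from the report), here is the standard arithmetic for plain WireGuard-over-UDP over IPv4; the constants are the well-known header sizes, and the 1280-byte figure is Tailscale's default tunnel MTU:

```python
# Sketch of WireGuard-over-UDP encapsulation overhead (IPv4 only).
# DERP-relayed or IPv6 paths add different overhead, so treat these
# numbers as an approximation, not a diagnosis of this network.

IPV4_HEADER = 20   # outer IPv4 header
UDP_HEADER = 8     # outer UDP header
WG_OVERHEAD = 32   # WireGuard data-message header (16 B) + Poly1305 tag (16 B)

def max_inner_packet(path_mtu: int) -> int:
    """Largest inner IP packet that fits in one outer packet on this path."""
    return path_mtu - IPV4_HEADER - UDP_HEADER - WG_OVERHEAD

print(max_inner_packet(1500))  # 1440
print(max_inner_packet(1340))  # 1280 -- Tailscale's default tunnel MTU
```

So a tunnel MTU of 1280 only works end to end if every underlying hop carries at least 1340 bytes; a hop with a smaller effective MTU (or one that drops ICMP "fragmentation needed") produces exactly this kind of blackhole under load.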

Environment

- OS: AlmaLinux8
- Headscale version: 0.23.0
- Tailscale version: 1.76.6

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Anything else?

Headscale

2024-11-21T21:31:33Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:35Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:37Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:38Z ERR Failed to fetch node from the database with node key: nodekey:10715a5defd407c11146b436449e3fdc771d8e4adc68b8dac0077e5e3d64d370 handler=NoisePollNetMap
2024-11-21T21:31:39Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:40Z INF home/runner/work/headscale/headscale/hscontrol/auth_noise.go:44 > unsupported client connected client_version=58 min_version=61
2024-11-21T21:31:41Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:41Z INF home/runner/work/headscale/headscale/hscontrol/auth_noise.go:44 > unsupported client connected client_version=58 min_version=61
2024-11-21T21:31:42Z INF home/runner/work/headscale/headscale/hscontrol/auth.go:28 > Successfully sent auth url: https://headscale.*****/oidc/register/mkey:ad30ca2d2f62ca426624930d6455211e40554add598c1a99420ffc8e6a2d8c0c expiry=-62135596800 followup=https://headscale.*****/oidc/register/mkey:ad30ca2d2f62ca426624930d6455211e40554add598c1a99420ffc8e6a2d8c0c machine_key=[rTDKL] node=vm-po4 node_key=[QxxVm] node_key_old=[bYjMr]
2024-11-21T21:31:43Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771
2024-11-21T21:31:45Z ERR update not sent, context cancelled error="context deadline exceeded" node.id=771

tailscale:

22 00:31:10 tailscaled[1378041]: wgengine: idle peer [Jpvng] now active, reconfiguring WireGuard 
22 00:31:10 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 70/459 peers) 
22 00:31:20 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 69/459 peers) 
22 00:31:35 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:44438 => 100.80.0.2:9188) got RST by peer 
22 00:31:38 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:55990 => 100.80.0.2:9187) got RST by peer 
22 00:31:38 tailscaled[1378041]: control: NetInfo: NetInfo{varies=false hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link="" firewallmode="ipt-default"} 
22 00:31:53 tailscaled[1378041]: wgengine: idle peer [Qif6f] now active, reconfiguring WireGuard 
22 00:31:53 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 69/459 peers) 
22 00:32:05 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:36450 => 100.80.0.2:9188) got RST by peer 
22 00:32:08 tailscaled[1378041]: open-conn-track: flow TCP (TCP 10.10.0.105:43332 => 100.80.0.2:9187) got RST by peer 
22 00:32:10 tailscaled[1378041]: wgengine: idle peer [qoqFH] now active, reconfiguring WireGuard 
22 00:32:10 tailscaled[1378041]: wgengine: Reconfig: configuring userspace WireGuard config (with 70/459 peers)
adam added the stalebug labels 2025-12-29 02:24:57 +01:00
adam closed this issue 2025-12-29 02:24:57 +01:00

@kradalby commented on GitHub (Nov 22, 2024):

Have you changed the Tailscale version on these nodes recently? I am not ruling out that some parameter changed in Headscale, but I would be surprised if that change had any real impact on the client side; changes on the client would be more expected, if I had to guess.

Can you try with multiple Tailscale versions vs multiple Headscale versions?


@pstvasko commented on GitHub (Nov 26, 2024):

I tested all possible client versions supported by 0.23.
The connection is also restored after reconnecting to the network.

I'm having issues with Tailscale. Specifically, after reaching a speed of 2 Gbps, the connection drops after about a minute and doesn't recover until I restart the tunnel itself.

At the same time, I can ping the node like this:
ping -s 900 100.64.0.1

But not like this:
ping -s 1400 100.64.0.1

I have around 500 clients and a complex network between data centers.
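The symptom above (900-byte payloads pass, 1400-byte payloads vanish) is the classic MTU-blackhole pattern, and the cutoff can be pinned down by bisecting over ping payload sizes. A minimal sketch, assuming a hypothetical `probe(size)` callback that wraps something like `ping -c 1 -M do -s <size> 100.64.0.1` (DF set so the kernel never fragments the probe):

```python
def find_max_payload(probe, lo=0, hi=1472):
    """Binary-search the largest payload size for which probe(size) succeeds.

    `probe` is a hypothetical callback returning True if a single
    don't-fragment ping with that payload size gets a reply.
    """
    best = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid      # this size passed; try larger
            lo = mid + 1
        else:
            hi = mid - 1    # this size was dropped; try smaller
    return best

# Simulated path that drops anything above a 904-byte ICMP payload
# (904 data + 8 ICMP + 20 IP = a 932-byte packet, matching the report):
print(find_max_payload(lambda size: size <= 904))  # 904
```

Note that `ping -s 1400` sends a 1428-byte inner packet (1400 data + 8 ICMP header + 20 IP header), while `-s 900` sends 928 bytes, which is consistent with the working MTU of 932 mentioned in the original report.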

(screenshot attached in the original issue)


@github-actions[bot] commented on GitHub (Feb 25, 2025):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Mar 4, 2025):

This issue was closed because it has been inactive for 14 days since being marked as stale.


@andreyrd commented on GitHub (Jun 12, 2025):

I'm seeing something like this. As soon as the "context deadline exceeded" errors start appearing in my Headscale logs, I'm almost guaranteed to start having connectivity issues between nodes, except I see "timeout" instead of "RST".

Reference: starred/headscale#861