[Bug] Headscale serving subnet from offline node. #1092

Closed
opened 2025-12-29 02:28:13 +01:00 by adam · 7 comments

Originally created by @pupaxxo on GitHub (Aug 22, 2025).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Headscale is serving routes from an offline node:

The nodes list command output is:

ID | Hostname   | Name       | MachineKey | NodeKey | User     | IP addresses                    | Ephemeral | Last seen           | Expiration          | Connected | Expired
3  | <redacted> | <redacted> | [0eCEd]    | [/eEPF] | internal | 100.65.0.4, fd7a:115c:a1e1::4   | false     | 2025-08-22 10:04:40 | N/A                 | online    | no
4  | <redacted> | <redacted> | [1T/pb]    | [n9Srg] | cassa    | 100.65.0.5, fd7a:115c:a1e1::5   | false     | 2025-08-22 09:45:02 | N/A                 | online    | no
6  | <redacted> | <redacted> | [lrXNR]    | [yhjhi] | cassa    | 100.65.0.7, fd7a:115c:a1e1::7   | false     | 2025-07-04 08:17:20 | 1970-01-01 00:02:03 | offline   | yes
7  | <redacted> | <redacted> | [Cpdpz]    | [f6fqC] | cassa    | 100.65.0.1, fd7a:115c:a1e1::1   | false     | 2025-08-22 10:05:40 | N/A                 | offline   | no
8  | <redacted> | <redacted> | [vqbGX]    | [N+zcw] | cassa    | 100.65.0.6, fd7a:115c:a1e1::6   | false     | 2025-08-22 10:03:27 | N/A                 | offline   | no
9  | <redacted> | <redacted> | [MNbbI]    | [88f3J] | cassa    | 100.65.0.8, fd7a:115c:a1e1::8   | false     | 2025-08-22 10:12:26 | N/A                 | online    | no
10 | <redacted> | <redacted> | [WwX6Q]    | [WEaUd] | cassa    | 100.65.0.9, fd7a:115c:a1e1::9   | false     | 2025-08-22 10:08:11 | N/A                 | online    | no
11 | <redacted> | <redacted> | [xkeLX]    | [5XTVi] | cassa    | 100.65.0.11, fd7a:115c:a1e1::b  | false     | 2025-08-22 10:10:01 | N/A                 | online    | no
12 | <redacted> | <redacted> | [Cv2I7]    | [g5cTw] | cassa    | 100.65.0.12, fd7a:115c:a1e1::c  | false     | 2025-08-22 10:13:49 | N/A                 | offline   | no
13 | <redacted> | <redacted> | [sbrlf]    | [YQdPd] | cassa    | 100.65.0.13, fd7a:115c:a1e1::d  | false     | 2025-08-22 10:13:05 | N/A                 | offline   | no
14 | <redacted> | <redacted> | [elQmK]    | [2sMcI] | cassa    | 100.65.0.16, fd7a:115c:a1e1::10 | false     | 2025-08-22 09:24:10 | N/A                 | offline   | no

The routes list response is:

ID | Hostname   | Approved                                                           | Available                                                          | Serving (Primary)
4  | <redacted> | 192.168.200.80/32                                                  | 192.168.200.80/32                                                  | 192.168.200.80/32
6  | <redacted> | 192.168.2.82/32                                                    | 192.168.2.82/32                                                    |
7  | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
8  | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
9  | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
10 | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
11 | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
12 | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
13 | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 |
14 | <redacted> | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32 | 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32, 192.168.3.68/32

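As you can see, node 14 is serving the primary routes for 192.168.3.44/32, 192.168.3.60/32, 192.168.3.63/32 and 192.168.3.68/32, even though it is marked as "offline". Nodes 9, 10 and 11 have the same routes approved and are online.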
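The cross-check between the two tables can be automated. The following is a minimal sketch, assuming the CLI output is the pipe-separated text pasted in this issue; `parse_table` and `offline_primaries` are hypothetical helper names, not part of Headscale, and the sample data is a trimmed excerpt of the tables above.

```python
def parse_table(text):
    """Parse pipe-separated CLI output into a list of dicts keyed by header."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [c.strip() for c in ln.split("|")]
        # Pad short rows so that empty trailing columns are still present.
        cells += [""] * (len(header) - len(cells))
        rows.append(dict(zip(header, cells)))
    return rows


def offline_primaries(nodes_text, routes_text):
    """Return (node ID, served routes) for nodes that serve routes while not online."""
    connected = {row["ID"]: row["Connected"] for row in parse_table(nodes_text)}
    stale = []
    for row in parse_table(routes_text):
        serving = row["Serving (Primary)"]
        if serving and connected.get(row["ID"]) != "online":
            stale.append((row["ID"], serving))
    return stale


# Trimmed excerpt of the output shown in this issue (assumed format).
NODES = """
ID | Hostname   | Connected | Expired
4  | <redacted> | online    | no
9  | <redacted> | online    | no
14 | <redacted> | offline   | no
"""

ROUTES = """
ID | Hostname   | Approved          | Available         | Serving (Primary)
4  | <redacted> | 192.168.200.80/32 | 192.168.200.80/32 | 192.168.200.80/32
9  | <redacted> | 192.168.3.44/32   | 192.168.3.44/32   |
14 | <redacted> | 192.168.3.44/32   | 192.168.3.44/32   | 192.168.3.44/32
"""

if __name__ == "__main__":
    for node_id, routes in offline_primaries(NODES, ROUTES):
        print(f"node {node_id} is not online but serves: {routes}")
```

With the excerpt above, only node 14 is reported, which matches the behavior described in this issue.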

Expected Behavior

The serving (primary) route switches to an online node that has the same route approved.
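To make the expectation concrete, here is a minimal sketch of the failover rule I would expect. It is not Headscale's actual implementation; `elect_primary` and its parameters are hypothetical, and the preference order (lowest node ID first) is an assumption made only for illustration.

```python
def elect_primary(current, advertisers, online):
    """Pick the node that should serve a route.

    current:     ID of the node serving the route now, or None
    advertisers: IDs of nodes with the route approved and available,
                 in preference order (assumed: lowest ID first)
    online:      set of IDs of nodes that are currently connected
    """
    # Keep a healthy primary in place to avoid needless switching.
    if current in online and current in advertisers:
        return current
    # Otherwise fail over to the first advertiser that is online.
    for node_id in advertisers:
        if node_id in online:
            return node_id
    # Nobody can serve the route right now.
    return None


if __name__ == "__main__":
    # State taken from the tables in this issue.
    advertisers = [7, 8, 9, 10, 11, 12, 13, 14]
    online = {3, 4, 9, 10, 11}
    print(elect_primary(14, advertisers, online))
```

With the state from this issue (node 14 currently serving, nodes 9, 10 and 11 online), this rule would move the routes to node 9 instead of leaving them on node 14.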

Steps To Reproduce

  1. Approve the same route on multiple nodes.
  2. Take the node that is serving the route offline.

Environment

- OS: Debian
- Headscale version: 0.26.1
- Tailscale version: Latest

Runtime environment

  • [x] Headscale is behind a (reverse) proxy
  • [ ] Headscale runs in a container

Debug information

The policy allows all traffic. Headscale has the default configuration.

adam added the question, bug, routes labels 2025-12-29 02:28:13 +01:00
adam closed this issue 2025-12-29 02:28:13 +01:00

@nblock commented on GitHub (Aug 22, 2025):

> The serving routes switches to an online node.

How did node 14 go offline (OS shutdown, stopping tailscaled, power-off, …)? Do you see anything related in the logs for node 14? Does the switch to another node work if you "properly" stop tailscaled on node 14?

You might be hitting this: https://headscale.net/development/ref/routes/#high-availability


@pupaxxo commented on GitHub (Aug 22, 2025):

Hi,

The node was not properly shut down, but according to the headscale nodes list command output, Headscale does detect the node as offline. The other node that was not properly shut down needed a few minutes before it was detected as offline.


@nblock commented on GitHub (Aug 23, 2025):

> the node was not properly shut down, but, from the headscale nodes list command output, headscale seems to detect the node as offline. The other node that was not properly shut down required a few minutes to start being detected as offline.

Can you check whether node switching is fast when the primary node is properly shut down? It should be fairly quick.

It seems to be related to: #2129


@kradalby commented on GitHub (Sep 9, 2025):

Do you have the opportunity to test the main branch? The logic has changed a bit and might have improved with the latest changes.


@nblock commented on GitHub (Oct 19, 2025):

@pupaxxo It'd be great if you could test with 0.27.0-beta.1 (https://github.com/juanfont/headscale/releases/tag/v0.27.0-beta.1).


@pupaxxo commented on GitHub (Oct 20, 2025):

Hi! Sorry for the delayed response. The system is in production use, so it's not easy to test. I'll try to schedule a live test with the customer in the next few days.


@nblock commented on GitHub (Nov 12, 2025):

It should work in 0.27.1. Let us know if you still find this issue in your environment.

Reference: starred/headscale#1092