[Bug] Sometimes the advertised route is not set as primary route #1147

Closed
opened 2025-12-29 02:28:34 +01:00 by adam · 7 comments

Originally created by @YouSysAdmin on GitHub (Nov 15, 2025).

Is this a support request?

- [x] This is not a support request

Is there an existing issue for this?

- [x] I have searched the existing issues

Current Behavior

Sometimes, for a new node with an automatically approved route, the route is not set as primary, which breaks access for users to the network behind that node.
This is fixed by restarting Headscale; after the restart, the routes are set correctly.

| ID | Hostname | Approved | Available | Serving (Primary) |
|----|----------|----------|-----------|--------------------|
| 81 | bastion | 10.4.0.0/16 | 10.4.0.0/16 | |
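
For context, these tables look like the output of the headscale CLI's route listing. A minimal sketch of the check, assuming headscale v0.26 or later, where route status moved under the `nodes` subcommand:

```bash
# Run on the Headscale host. Assumption: headscale >= 0.26, where the
# old "headscale routes" command was replaced by "nodes list-routes".
headscale nodes list-routes
# A healthy subnet router shows the prefix in all three route columns:
# Approved, Available, and Serving (Primary).
```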

This looks like a regression in v0.27.*; I haven't seen this behavior before.

This is a fairly rare occurrence; I've seen it happen a few times while testing installation and client logout via EC2 User-Data.
Roughly 1 in 10 attempts.
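
For reproduction, a hedged sketch of a loop that exercises the roughly 1-in-10 failure; `$AUTH_KEY`, `$HS_HOST`, the route, and the sleep interval are placeholders, not values from this report:

```bash
# Hypothetical repro loop, run on the subnet-router node ("bastion").
# Assumes a reusable pre-auth key in $AUTH_KEY and a host $HS_HOST that
# can run the headscale CLI (for example the Headscale server over ssh).
for i in $(seq 1 10); do
  tailscale up --advertise-routes="10.4.0.0/16" \
    --auth-key="$AUTH_KEY" --login-server="https://example.com"
  sleep 10  # give Headscale time to elect a primary route
  ssh "$HS_HOST" headscale nodes list-routes | grep bastion
  tailscale logout
done
```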

P.S. I haven't checked what's in the database at that moment; I'll try to check that.

Expected Behavior

| ID | Hostname | Approved | Available | Serving (Primary) |
|----|----------|----------|-----------|--------------------|
| 81 | bastion | 10.4.0.0/16 | 10.4.0.0/16 | 10.4.0.0/16 |

Steps To Reproduce

Add a new node with an advertised route
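
The commands quoted later in this thread suggest a minimal reproduction along these lines (the route, auth key, and login server are placeholders):

```bash
# On the prospective subnet router; values are placeholders.
tailscale up --advertise-routes="10.4.0.0/16" \
  --auth-key="secret" --login-server="https://example.com"
# With route auto-approval configured, the route should then appear
# under Serving (Primary) on the server, but occasionally does not.
```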

Environment

- OS:
- Headscale version:
- Tailscale version:

Runtime environment

- [ ] Headscale is behind a (reverse) proxy
- [ ] Headscale runs in a container

Debug information

The same for both cases.

adam added the bug and regression labels 2025-12-29 02:28:34 +01:00
adam closed this issue 2025-12-29 02:28:34 +01:00

@tobi-dub commented on GitHub (Nov 17, 2025):

Same behavior on my network.
When a node with an already approved route restarts, the route is not served again.

Restarting headscale solves the issue. It takes a couple of seconds until the route gets served automatically.
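
For anyone verifying this, one way to watch the route status around a restart; this assumes Headscale runs under systemd, so adjust for containers:

```bash
sudo systemctl restart headscale
# The route should reappear as Serving (Primary) after a few seconds.
watch -n 2 headscale nodes list-routes
```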


@kradalby commented on GitHub (Nov 25, 2025):

@tobi-dub I've pushed some more changes to https://github.com/juanfont/headscale/pull/2890, can you try again?


@YouSysAdmin commented on GitHub (Nov 26, 2025):

Hi @kradalby
I built headscale from your branch and tested it:

```
u@ip:~/headscale$ git branch
* kradalby/2888-oidc-pol
```

Looks like the behavior persists.

I got the same result: the route was applied normally once, but not the second time.

```
tailscale up --advertise-routes="10.4.0.0/16" --auth-key="secret" --login-server="example.com"
...
tailscale logout
```

Attempt 1:

| ID | Hostname | Approved | Available | Serving (Primary) |
|----|----------|----------|-----------|--------------------|
| 98 | bastion | 10.4.0.0/16 | 10.4.0.0/16 | 10.4.0.0/16 |

Attempt 2:

| ID | Hostname | Approved | Available | Serving (Primary) |
|----|----------|----------|-----------|--------------------|
| 99 | bastion | 10.4.0.0/16 | 10.4.0.0/16 | |

I'll try to provide anonymized logs later; unfortunately, I don't have time to deal with logs right now.


@kradalby commented on GitHub (Nov 30, 2025):

I've made an [rc.1 release for 0.27.2](https://github.com/juanfont/headscale/releases/tag/v0.27.2-rc.1) with fixes; it would be great if you can test this and then close this issue (or give feedback so I can).


@tobi-dub commented on GitHub (Nov 30, 2025):

> I've made an [rc.1 release for 0.27.2](https://github.com/juanfont/headscale/releases/tag/v0.27.2-rc.1) with fixes; it would be great if you can test this and then close this issue (or give feedback so I can).

I tested the rc.1 in my setup successfully. The subnet router is switched automatically to the backup node when the primary one is unavailable.

I don't know why, but it now also worked for v0.27.1. @YouSysAdmin, can you also confirm this?


@kradalby commented on GitHub (Dec 1, 2025):

> I don't know why, but it now also worked for v0.27.1. @YouSysAdmin, can you also confirm this?

Hmm, sounds like there might be something flaky somewhere. But happy to hear it works.


@YouSysAdmin commented on GitHub (Dec 9, 2025):

Hi @kradalby
I tested v0.27.2-rc.1; new nodes work fine, but there are problems with ephemeral nodes.
(Perhaps this is not specific to ephemeral nodes and also applies to ordinary ones; sorry, I haven't checked and won't be able to in the near future.)

  1. Connect a new "ephemeral" client.
  2. Check that all routes are automatically approved.
  3. Disconnect a client (logout).
  4. Verify that a node and route have been deleted.
  5. Connect a new client with the same routes.
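
A minimal sketch of these steps, assuming an ephemeral, reusable pre-auth key; `$HS_HOST`, the user name, and the login server are placeholders, and depending on the headscale version `--user` may expect a numeric ID:

```bash
# 1-2. Connect an ephemeral client and check route approval.
KEY=$(ssh "$HS_HOST" headscale preauthkeys create \
  --user bastion-user --ephemeral --reusable)
tailscale up --advertise-routes="10.4.0.0/16" \
  --auth-key="$KEY" --login-server="https://example.com"
ssh "$HS_HOST" headscale nodes list-routes

# 3-4. Log out; the ephemeral node and its route should be deleted.
tailscale logout
ssh "$HS_HOST" headscale nodes list

# 5. Reconnect with the same routes; this is where Serving (Primary)
#    stays empty until Headscale is restarted.
tailscale up --advertise-routes="10.4.0.0/16" \
  --auth-key="$KEY" --login-server="https://example.com"
```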

After client logout, the log shows this error:

{"level":"error","error":"generating map response for node 102: generating map response for nodeID 102: multiple errors:\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found","worker.id":1,"node.id":0,"change":"Full","time":1765269026,"message":"failed to apply change"}

After reconnecting a client, the behavior is the same: Serving (Primary) is not set until Headscale is restarted.
If I restart Headscale before connecting a client, everything works as expected.

{"level":"info","node.id":102,"node.name":"bastion","time":1765269025,"message":"Deleting ephemeral node during logout"}
{"level":"error","error":"generating map response for node 102: generating map response for nodeID 102: multiple errors:\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found\n\tnode not found","worker.id":1,"node.id":0,"change":"Full","time":1765269026,"message":"failed to apply change"}
{"level":"error","caller":"/home/runner/work/headscale/headscale/hscontrol/poll.go:401","omitPeers":false,"stream":true,"node.id":102,"node.name":"bastion","error":"node not found: 102","time":1765269034,"message":"Failed to disconnect node bastion"}
{"level":"info","caller":"/home/runner/work/headscale/headscale/hscontrol/poll.go:383","omitPeers":false,"stream":true,"node.id":102,"node.name":"bastion","time":1765269034,"message":"node has disconnected, mapSession: 0xc0003a4f00, chan: 0xc0003f4770"}
{"level":"error","error":"generating map response for node 102: generating map response for nodeID 102: multiple errors:\n\tnode not found\n\tnode not found","worker.id":1,"node.id":24,"change":"NodeNewOrUpdate","time":1765269049,"message":"failed to apply change"}