[Bug] panic: public key moved between nodeIDs (dup node entry) #701

Closed
opened 2025-12-29 02:22:27 +01:00 by adam · 3 comments

Originally created by @raffis on GitHub (May 3, 2024).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Our Tailscale deployment crashed this morning: all of our Tailscale subnet routers/proxies went down with the following panic.

Tailscale crash log:

```
2024/05/03 07:05:36 active login: subnetrouter-eu-west-1
panic: public key moved between nodeIDs (old=nodeid:29 new=nodeid:2a, key=nodekey:6fe699260a1c5811e29651e22d63bc9c7b4a232d9b6f0c0f79ef1976453ed506)

goroutine 311 [running]:
tailscale.com/wgengine/magicsock.devPanicf({0x11222b9, 0x38}, {0xc000ddbf00, 0x3, 0x3})
	tailscale.com/wgengine/magicsock/magicsock.go:2061 +0x74
tailscale.com/wgengine/magicsock.(*Conn).SetNetworkMap(0xc000170808, 0xc0030f5688)
	tailscale.com/wgengine/magicsock/magicsock.go:1982 +0x8e5
tailscale.com/wgengine.(*userspaceEngine).SetNetworkMap(0xc0001d6608, 0xc0030f5688)
	tailscale.com/wgengine/userspace.go:1212 +0x28
tailscale.com/wgengine.(*watchdogEngine).SetNetworkMap.func1()
	tailscale.com/wgengine/watchdog.go:142 +0x23
tailscale.com/wgengine.(*watchdogEngine).SetNetworkMap.(*watchdogEngine).watchdog.func2()
	tailscale.com/wgengine/watchdog.go:118 +0x13
tailscale.com/wgengine.(*watchdogEngine).watchdogErr.func2()
	tailscale.com/wgengine/watchdog.go:84 +0x23
created by tailscale.com/wgengine.(*watchdogEngine).watchdogErr in goroutine 179
	tailscale.com/wgengine/watchdog.go:83 +0x25e
boot: 2024/05/03 07:05:41 failed to watch tailscaled for updates: Failed to connect to local Tailscale daemon for /localapi/v0/watch-ipn-bus; not running? Error: dial unix /tmp/tailscaled.sock: connect: connection refused
```

After wrongly assuming the clients themselves were at fault, I realized this was triggered by a new user login: Headscale somehow created two node entries with different public keys within the same second (the nanosecond parts of the timestamps differ). This looks like a race condition to me.
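
For concreteness, here is a minimal Go sketch of the kind of check-then-insert pattern that can produce exactly this symptom. Everything below is hypothetical (an assumed GORM-backed store, invented `Node` fields, and an invented `registerNode` helper), not Headscale's actual code:

```go
package sketch

import (
	"errors"

	"gorm.io/gorm"
)

// Node is a hypothetical stand-in for Headscale's node record;
// the real schema differs.
type Node struct {
	ID         uint `gorm:"primaryKey"`
	MachineKey string
	NodeKey    string
}

// registerNode shows a check-then-insert race: two concurrent
// registrations for the same machine can both observe "not found"
// before either insert commits, producing two rows (and thus two
// node IDs) for what should be a single node.
func registerNode(db *gorm.DB, machineKey, nodeKey string) (*Node, error) {
	var node Node
	err := db.Where("machine_key = ?", machineKey).First(&node).Error
	switch {
	case errors.Is(err, gorm.ErrRecordNotFound):
		// Race window: a second goroutine can reach this point with
		// the same machineKey before this Create commits.
		node = Node{MachineKey: machineKey, NodeKey: nodeKey}
		return &node, db.Create(&node).Error
	case err != nil:
		return nil, err
	default:
		// Existing node: rotate its key in place instead of
		// creating a second row.
		node.NodeKey = nodeKey
		return &node, db.Save(&node).Error
	}
}
```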

Expected Behavior

Ideally the Tailscale clients would not crash at all, but that is out of scope here. The expected behavior is that Headscale never creates duplicate node entries, under any circumstances.
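
One way to enforce that, continuing the hypothetical sketch above (one possible guard, not a claim about Headscale's actual fix), is to push the uniqueness invariant into the database so the loser of a racing insert gets a constraint error instead of a duplicate row:

```go
// Hardened variant of the hypothetical model: the unique index makes
// a racing duplicate insert fail loudly instead of silently creating
// a second node row.
type Node struct {
	ID         uint   `gorm:"primaryKey"`
	MachineKey string `gorm:"uniqueIndex"`
	NodeKey    string
}

// registerNodeSafely finds the existing row for this machine or
// creates one, updating the node key either way. With the unique
// index in place, concurrent callers cannot both insert; the loser
// gets a constraint violation and can simply retry the lookup.
func registerNodeSafely(db *gorm.DB, machineKey, nodeKey string) (*Node, error) {
	var node Node
	err := db.Where(Node{MachineKey: machineKey}).
		Assign(Node{NodeKey: nodeKey}).
		FirstOrCreate(&node).Error
	return &node, err
}
```

Serializing registration in a transaction or behind a per-machine lock would work as well; the constraint is just the cheapest backstop.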

Steps To Reproduce

This appears to be a race condition, so it is probably hard to reproduce deterministically.
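
For whoever picks this up: a crude concurrency test against the hypothetical `registerNode` sketch above might look like the following (`openTestDB` is an assumed helper returning a fresh `*gorm.DB`). Against the racy version it should fail at least occasionally; against the hardened variant it should always pass:

```go
package sketch

import (
	"sync"
	"testing"
)

// TestConcurrentRegistration hammers registerNode from several
// goroutines and asserts that only one node row survives.
func TestConcurrentRegistration(t *testing.T) {
	db := openTestDB(t) // assumed helper returning a fresh *gorm.DB

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Errors are ignored on purpose; we only care how many
			// rows exist afterwards.
			_, _ = registerNode(db, "mkey:race", "nkey:race")
		}()
	}
	wg.Wait()

	var count int64
	db.Model(&Node{}).Where("machine_key = ?", "mkey:race").Count(&count)
	if count != 1 {
		t.Fatalf("expected exactly one node entry, got %d", count)
	}
}
```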

Environment

- OS: Linux
- Headscale version: v0.22.3
- Tailscale version: 1.63.0

Runtime environment

  • Headscale is not behind a (reverse) proxy
  • Headscale runs in a container

Anything else?

I'm not certain whether this is fixed in the latest alpha; I didn't find any related issues.
If not, I'm happy to dig through the code and submit a fix.

adam added the stale and bug labels 2025-12-29 02:22:27 +01:00
adam closed this issue 2025-12-29 02:22:28 +01:00

@kradalby commented on GitHub (May 27, 2024):

Hmm, I would be curious to know whether you see this, or are able to reproduce it, in an alpha. I'm not sure it was fixed explicitly, but a lot has changed. If it is still happening, an integration test case to reproduce it would be great.


@github-actions[bot] commented on GitHub (Aug 26, 2024):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Sep 2, 2024):

This issue was closed because it has been inactive for 14 days since being marked as stale.
