[PR #2781] [MERGED] stability and race conditions in auth and node store #2859

Closed
opened 2025-12-29 04:19:23 +01:00 by adam · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/juanfont/headscale/pull/2781
Author: @kradalby
Created: 9/17/2025
Status: Merged
Merged: 10/16/2025
Merged by: @kradalby

Base: mainHead: kradalby/relogin


📝 Commits (10+)

  • 5e77373 integration: consistency changes to Relogin tests
  • 9308159 state: remove mutex
  • c6b0928 nodestore: return node when applied
  • 30da9e2 nodestore: tests
  • 68147d1 state: reorganise auth key path
  • 1f38f1f integration: rename sameuser test
  • e3a415b state: standardise and make less branches in auth
  • dde2628 nodestore: machinekey is not unique
  • 8fee833 integration: more tsic debug
  • 9913d2e integration: add missing auth cases

📊 Changes

34 files changed (+7390 additions, -1858 deletions)

View changed files

📝 .github/workflows/test-integration.yaml (+3 -1)
📝 hscontrol/auth.go (+210 -74)
hscontrol/auth_test.go (+3006 -0)
📝 hscontrol/grpcv1.go (+1 -1)
📝 hscontrol/poll.go (+6 -5)
📝 hscontrol/state/debug.go (+0 -6)
📝 hscontrol/state/debug_test.go (+2 -2)
hscontrol/state/ephemeral_test.go (+460 -0)
📝 hscontrol/state/maprequest.go (+2 -2)
📝 hscontrol/state/maprequest_test.go (+29 -3)
📝 hscontrol/state/node_store.go (+108 -33)
📝 hscontrol/state/node_store_test.go (+647 -8)
📝 hscontrol/state/state.go (+466 -520)
📝 hscontrol/types/node.go (+5 -0)
📝 hscontrol/types/users.go (+14 -6)
📝 hscontrol/util/util.go (+57 -0)
📝 hscontrol/util/util_test.go (+393 -0)
📝 integration/auth_key_test.go (+88 -111)
📝 integration/auth_oidc_test.go (+457 -113)
📝 integration/auth_web_flow_test.go (+205 -24)

...and 14 more files

📄 Description

This PR addresses some consistency issues that was introduced or discovered with the nodestore.

nodestore:
Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write.

auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of.

integration:
added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state)

This PR was assisted, particularly tests, by claude code.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/juanfont/headscale/pull/2781 **Author:** [@kradalby](https://github.com/kradalby) **Created:** 9/17/2025 **Status:** ✅ Merged **Merged:** 10/16/2025 **Merged by:** [@kradalby](https://github.com/kradalby) **Base:** `main` ← **Head:** `kradalby/relogin` --- ### 📝 Commits (10+) - [`5e77373`](https://github.com/juanfont/headscale/commit/5e77373765138cf381bf27831344f09c3359ca58) integration: consistency changes to *Relogin* tests - [`9308159`](https://github.com/juanfont/headscale/commit/930815980f3332fa0bb2745a63586255e23d1895) state: remove mutex - [`c6b0928`](https://github.com/juanfont/headscale/commit/c6b09289988f34398eb3157e31ba092eb8721a9f) nodestore: return node when applied - [`30da9e2`](https://github.com/juanfont/headscale/commit/30da9e2cffd87cf2aade6896bfed82d248064a48) nodestore: tests - [`68147d1`](https://github.com/juanfont/headscale/commit/68147d1e40f42502b541fb1194d5277139054cea) state: reorganise auth key path - [`1f38f1f`](https://github.com/juanfont/headscale/commit/1f38f1faf3f049ea96e59aa91679b764fec02ad2) integration: rename sameuser test - [`e3a415b`](https://github.com/juanfont/headscale/commit/e3a415bd4810b8bc9ae0ff6b9007079d7d6170de) state: standardise and make less branches in auth - [`dde2628`](https://github.com/juanfont/headscale/commit/dde26286a5772fea753026b057f35240094e1e12) nodestore: machinekey is not unique - [`8fee833`](https://github.com/juanfont/headscale/commit/8fee83320b0e7d8f77adb2b1e3aa7f86042d7546) integration: more tsic debug - [`9913d2e`](https://github.com/juanfont/headscale/commit/9913d2e15aac3f3e59934c02cdcb4f54f0322850) integration: add missing auth cases ### 📊 Changes **34 files changed** (+7390 additions, -1858 deletions) <details> <summary>View changed files</summary> 📝 `.github/workflows/test-integration.yaml` (+3 -1) 📝 `hscontrol/auth.go` (+210 -74) ➕ `hscontrol/auth_test.go` (+3006 -0) 📝 `hscontrol/grpcv1.go` (+1 -1) 📝 `hscontrol/poll.go` (+6 -5) 📝 `hscontrol/state/debug.go` (+0 -6) 📝 `hscontrol/state/debug_test.go` (+2 -2) ➕ `hscontrol/state/ephemeral_test.go` (+460 -0) 📝 `hscontrol/state/maprequest.go` (+2 -2) 📝 `hscontrol/state/maprequest_test.go` (+29 -3) 📝 `hscontrol/state/node_store.go` (+108 -33) 📝 `hscontrol/state/node_store_test.go` (+647 -8) 📝 `hscontrol/state/state.go` (+466 -520) 📝 `hscontrol/types/node.go` (+5 -0) 📝 `hscontrol/types/users.go` (+14 -6) 📝 `hscontrol/util/util.go` (+57 -0) 📝 `hscontrol/util/util_test.go` (+393 -0) 📝 `integration/auth_key_test.go` (+88 -111) 📝 `integration/auth_oidc_test.go` (+457 -113) 📝 `integration/auth_web_flow_test.go` (+205 -24) _...and 14 more files_ </details> ### 📄 Description This PR addresses some consistency issues that was introduced or discovered with the nodestore. **nodestore**: Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write. **auth**: Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work. A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of. **integration**: added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state) This PR was assisted, particularly tests, by claude code. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
adam added the pull-request label 2025-12-29 04:19:23 +01:00
adam closed this issue 2025-12-29 04:19:23 +01:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/headscale#2859