Move per-node pending changes from a shared xsync.Map on the batcher
into multiChannelNodeConn, protected by a dedicated mutex. The new
appendPending/drainPending methods provide atomic append and drain
operations, eliminating data races in addToBatch and
processBatchedChanges.
Add sync.Once to multiChannelNodeConn.close() to make it idempotent,
preventing panics from concurrent close calls on the same channel.
Add started atomic.Bool to guard Start() against being called
multiple times, preventing orphaned goroutines.
Add comprehensive concurrency tests validating these changes.
Generalise the registration pipeline to a more general auth pipeline
supporting both node registrations and SSH check auth requests.
Rename RegistrationID to AuthID, unexport AuthRequest fields, and
introduce AuthVerdict to unify the auth finish API.
Add the urlParam generic helper for extracting typed URL parameters
from chi routes, used by the new auth request handler.
Updates #1850
Fix issues found by the upgraded golangci-lint:
- wsl_v5: add required whitespace in CLI files
- staticcheck SA4006: replace new(var.Field) with &localVar
pattern since staticcheck does not recognize Go 1.26
new(value) as a use of the variable
- staticcheck SA5011: use t.Fatal instead of t.Error for
nil guard checks so execution stops
- unused: remove dead ptrTo helper function
This commit upgrades the codebase from Go 1.25.5 to Go 1.26rc2 and
adopts new language features.
Toolchain updates:
- go.mod: go 1.25.5 → go 1.26rc2
- flake.nix: buildGo125Module → buildGo126Module, go_1_25 → go_1_26
- flake.nix: build golangci-lint from source with Go 1.26
- Dockerfile.integration: golang:1.25-trixie → golang:1.26rc2-trixie
- Dockerfile.tailscale-HEAD: golang:1.25-alpine → golang:1.26rc2-alpine
- Dockerfile.derper: golang:alpine → golang:1.26rc2-alpine
- .goreleaser.yml: go mod tidy -compat=1.25 → -compat=1.26
- cmd/hi/run.go: fallback Go version 1.25 → 1.26rc2
- .pre-commit-config.yaml: simplify golangci-lint hook entry
Code modernization using Go 1.26 features:
- Replace tsaddr.SortPrefixes with slices.SortFunc + netip.Prefix.Compare
- Replace ptr.To(x) with new(x) syntax
- Replace errors.As with errors.AsType[T]
Lint rule updates:
- Add forbidigo rules to prevent regression to old patterns
Errors should not start capitalised and they should not contain the word error
or state that they "failed" as we already know it is an error
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
Add hscontrol/util/zlog package with:
- zf subpackage: field name constants for compile-time safety
- SafeHostinfo: wrapper that redacts device fingerprinting data
- SafeMapRequest: wrapper that redacts client endpoints
The zf (zerolog fields) subpackage provides short constant names
(e.g., zf.NodeID instead of inline "node.id" strings) ensuring
consistent field naming across all log statements.
Security considerations:
- SafeHostinfo never logs: OSVersion, DeviceModel, DistroName
- SafeMapRequest only logs endpoint counts, not actual IPs
This PR addresses some consistency issues that was introduced or discovered with the nodestore.
nodestore:
Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write.
auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of.
integration:
added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state)
This PR was assisted, particularly tests, by claude code.
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.
It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.
Writes will block until commited, while reads are never
blocked.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* utility iterator for ipset
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy -> policy and v1
This commit split out the common policy logic and policy implementation
into separate packages.
policy contains functions that are independent of the policy implementation,
this typically means logic that works on tailcfg types and generic formats.
In addition, it defines the PolicyManager interface which the v1 implements.
v1 is a subpackage which implements the PolicyManager using the "original"
policy implementation.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use polivyv1 definitions in integration tests
These can be marshalled back into JSON, which the
new format might not be able to.
Also, just dont change it all to JSON strings for now.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* formatter: breaks lines
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove compareprefix, use tsaddr version
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove getacl test, add back autoapprover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use policy manager tag handling
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* rename display helper for user
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* introduce policy v2 package
policy v2 is built from the ground up to be stricter
and follow the same pattern for all types of resolvers.
TODO introduce
aliass
resolver
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* wire up policyv2 in integration testing
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy v2 tests into seperate workflow to work around github limit
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* add policy manager output to /debug
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* update changelog
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
expand user, add claims to user
This commit expands the user table with additional fields that
can be retrieved from OIDC providers (and other places) and
uses this data in various tailscale response objects if it is
available.
This is the beginning of implementing
https://docs.google.com/document/d/1X85PMxIaVWDF6T_UPji3OeeUqVBcGj_uHRM5CI-AwlY/edit
trying to make OIDC more coherant and maintainable in addition
to giving the user a better experience and integration with a
provider.
remove usernames in magic dns, normalisation of emails
this commit removes the option to have usernames as part of MagicDNS
domains and headscale will now align with Tailscale, where there is a
root domain, and the machine name.
In addition, the various normalisation functions for dns names has been
made lighter not caring about username and special character that wont
occur.
Email are no longer normalised as part of the policy processing.
untagle oidc and regcache, use typed cache
This commits stops reusing the registration cache for oidc
purposes and switches the cache to be types and not use any
allowing the removal of a bunch of casting.
try to make reauth/register branches clearer in oidc
Currently there was a function that did a bunch of stuff,
finding the machine key, trying to find the node, reauthing
the node, returning some status, and it was called validate
which was very confusing.
This commit tries to split this into what to do if the node
exists, if it needs to register etc.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* handle control protocol through websocket
The necessary behaviour is already in place,
but the wasm build only issued GETs, and the handler was not invoked.
* get DERP-over-websocket working for wasm clients
* Prepare for testing builtin websocket-over-DERP
Still needs some way to assert that clients are connected through websockets,
rather than the TCP hijacking version of DERP.
* integration tests: properly differentiate between DERP transports
* do not touch unrelated code
* linter fixes
* integration testing: unexport common implementation of derp server scenario
* fixup! integration testing: unexport common implementation of derp server scenario
* dockertestutil/logs: remove unhelpful comment
* update changelog
---------
Co-authored-by: Csaba Sarkadi <sarkadicsa@tutanota.de>