Commit Graph

67 Commits

Author SHA1 Message Date
Kristoffer Dalby
7bab8da366 state, policy, noise: implement SSH check period auto-approval
Add SSH check period tracking so that recently authenticated users
are auto-approved without requiring manual intervention each time.

Introduce SSHCheckPeriod type with validation (min 1m, max 168h,
"always" for every request) and encode the compiled check period
as URL query parameters in the HoldAndDelegate URL.

The SSHActionHandler checks recorded auth times before creating a
new HoldAndDelegate flow. Auth timestamps are stored in-memory:
- Default period (no explicit checkPeriod): auth covers any
  destination, keyed by source node with Dst=0 sentinel
- Explicit period: auth covers only that specific destination,
  keyed by (source, destination) pair

Auth times are cleared on policy changes.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
107c2f2f70 policy, noise: implement SSH check action
Implement the SSH "check" action which requires additional
verification before allowing SSH access. The policy compiler generates
a HoldAndDelegate URL that the Tailscale client calls back to
headscale. The SSHActionHandler creates an auth session and waits for
approval via the generalised auth flow.

Sort check (HoldAndDelegate) rules before accept rules to match
Tailscale's first-match-wins evaluation order.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
cb3b6949ea auth: generalise auth flow and introduce AuthVerdict
Generalise the registration pipeline to a more general auth pipeline
supporting both node registrations and SSH check auth requests.
Rename RegistrationID to AuthID, unexport AuthRequest fields, and
introduce AuthVerdict to unify the auth finish API.

Add the urlParam generic helper for extracting typed URL parameters
from chi routes, used by the new auth request handler.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
8048f10d13 hscontrol/state: extract findExistingNodeForPAK to reduce complexity
Extract the existing-node lookup logic from HandleNodeFromPreAuthKey
into a separate method. This reduces the cyclomatic complexity from
32 to 28, below the gocyclo limit of 30.

Updates #3077
2026-02-20 21:51:00 +01:00
Kristoffer Dalby
75e56df9e4 hscontrol: enforce that tagged nodes never have user_id
Tagged nodes are owned by their tags, not a user. Enforce this
invariant at every write path:

- createAndSaveNewNode: do not set UserID for tagged PreAuthKey
  registration; clear UserID when advertise-tags are applied
  during OIDC/CLI registration
- SetNodeTags: clear UserID/User when tags are assigned
- processReauthTags: clear UserID/User when tags are applied
  during re-authentication
- validateNodeOwnership: reject tagged nodes with non-nil UserID
- NodeStore: skip nodesByUser indexing for tagged nodes since
  they have no owning user
- HandleNodeFromPreAuthKey: add fallback lookup for tagged PAK
  re-registration (tagged nodes indexed under UserID(0)); guard
  against nil User deref for tagged nodes in different-user check

Since tagged nodes now have user_id = NULL, ListNodesByUser
will not return them and DestroyUser naturally allows deleting
users whose nodes have all been tagged. The ON DELETE CASCADE
FK cannot reach tagged nodes through a NULL foreign key.

Also tone down shouty comments throughout state.go.

Fixes #3077
2026-02-20 21:51:00 +01:00
Kristoffer Dalby
f20bd0cf08 node: implement disable key expiry via CLI and API
Add --disable flag to "headscale nodes expire" CLI command and
disable_expiry field handling in the gRPC API to allow disabling
key expiry for nodes. When disabled, the node's expiry is set to
NULL and IsExpired() returns false.

The CLI follows the new grpcRunE/RunE/printOutput patterns
introduced in the recent CLI refactor.

Also fix NodeSetExpiry to persist directly to the database instead
of going through persistNodeToDB which omits the expiry field.

Fixes #2681

Co-authored-by: Marco Santos <me@marcopsantos.com>
2026-02-20 21:49:55 +01:00
Kristoffer Dalby
43afeedde2 all: apply golangci-lint 2.9.0 fixes
Fix issues found by the upgraded golangci-lint:
- wsl_v5: add required whitespace in CLI files
- staticcheck SA4006: replace new(var.Field) with &localVar
  pattern since staticcheck does not recognize Go 1.26
  new(value) as a use of the variable
- staticcheck SA5011: use t.Fatal instead of t.Error for
  nil guard checks so execution stops
- unused: remove dead ptrTo helper function
2026-02-19 08:21:23 +01:00
Kristoffer Dalby
0f6d312ada all: upgrade to Go 1.26rc2 and modernize codebase
This commit upgrades the codebase from Go 1.25.5 to Go 1.26rc2 and
adopts new language features.

Toolchain updates:
- go.mod: go 1.25.5 → go 1.26rc2
- flake.nix: buildGo125Module → buildGo126Module, go_1_25 → go_1_26
- flake.nix: build golangci-lint from source with Go 1.26
- Dockerfile.integration: golang:1.25-trixie → golang:1.26rc2-trixie
- Dockerfile.tailscale-HEAD: golang:1.25-alpine → golang:1.26rc2-alpine
- Dockerfile.derper: golang:alpine → golang:1.26rc2-alpine
- .goreleaser.yml: go mod tidy -compat=1.25 → -compat=1.26
- cmd/hi/run.go: fallback Go version 1.25 → 1.26rc2
- .pre-commit-config.yaml: simplify golangci-lint hook entry

Code modernization using Go 1.26 features:
- Replace tsaddr.SortPrefixes with slices.SortFunc + netip.Prefix.Compare
- Replace ptr.To(x) with new(x) syntax
- Replace errors.As with errors.AsType[T]

Lint rule updates:
- Add forbidigo rules to prevent regression to old patterns
2026-02-08 12:35:23 +01:00
Kristoffer Dalby
ce580f8245 all: fix golangci-lint issues (#3064) 2026-02-06 21:45:32 +01:00
Kristoffer Dalby
3acce2da87 errors: rewrite errors to follow go best practices
Errors should not start capitalised and they should not contain the word error
or state that they "failed" as we already know it is an error

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
4a9a329339 all: use lowercase log messages
Go style recommends that log messages and error strings should not be
capitalized (unless beginning with proper nouns or acronyms) and should
not end with punctuation.

This change normalizes all zerolog .Msg() and .Msgf() calls to start
with lowercase letters, following Go conventions and making logs more
consistent across the codebase.
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
dd16567c52 hscontrol/state,db: use zf constants for logging
Replace raw string field names with zf constants in state.go and
db/node.go for consistent, type-safe logging.

state.go changes:
- User creation, hostinfo validation, node registration
- Tag processing during reauth (processReauthTags)
- Auth path and PreAuthKey handling
- Route auto-approval and MapRequest processing

db/node.go changes:
- RegisterNodeForTest logging
- Invalid hostname replacement logging
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
91730e2a1d hscontrol: use EmbedObject for node logging
Replace manual Uint64("node.id")/Str("node.name") field patterns with
EmbedObject(node) which automatically includes all standard node fields
(id, name, machine key, node key, online status, tags, user).

This reduces code repetition and ensures consistent logging across:
- state.go: Connect/Disconnect, persistNodeToDB, AutoApproveRoutes
- auth.go: handleLogout, handleRegisterWithAuthKey
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
ce7c256d1e state: set User pointer during tagged→user-owned conversion
processReauthTags sets UserID when converting a tagged node to
user-owned, but does not set the User pointer. When the node was
registered with a tags-only PreAuthKey (User: nil), the in-memory
NodeStore cache holds a node with User=nil. The mapper's
generateUserProfiles then calls node.Owner().Model().ID, which
dereferences the nil pointer and panics.

Set node.User alongside node.UserID in processReauthTags. Also add
defensive nil checks in generateUserProfiles to gracefully handle
nodes with invalid owners rather than panicking.

Fixes #3038
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
4912ceaaf5 state: inline reauthExistingNode and convertTaggedNodeToUser
These were thin wrappers around applyAuthNodeUpdate that only added
logging. Move the logging into applyAuthNodeUpdate and call it directly
from HandleNodeFromAuthPath.

This simplifies the code structure without changing behavior.

Updates #3038
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
d7f7f2c85e state: validate tags before UpdateNode to ensure consistency
Move tag validation before the UpdateNode callback in applyAuthNodeUpdate.
Previously, tag validation happened inside the callback, and the error
check occurred after UpdateNode had already committed changes to the
NodeStore. This left the NodeStore in an inconsistent state when tags
were rejected.

Now validation happens first, and UpdateNode is only called when we know
the operation will succeed. This follows the principle that UpdateNode
should only be called when we have all information and are ready to commit.

Also extract validateRequestTags as a reusable function and use it in
createAndSaveNewNode to deduplicate the tag validation logic.

Updates #3038
Updates #3048
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
df184e5276 state: fix expiry handling during node tag conversion
Previously, expiry handling ran BEFORE processReauthTags(), using the
old tagged status to determine whether to set/clear expiry. This caused:

- Personal → Tagged: Expiry remained set (should be cleared to nil)
- Tagged → Personal: Expiry remained nil (should be set from client)

Move expiry handling after tag processing and handle all four transition
cases based on the new tagged status:

- Tagged → Personal: Set expiry from client request
- Personal → Tagged: Clear expiry (tagged nodes don't expire)
- Personal → Personal: Update expiry from client
- Tagged → Tagged: Keep existing nil expiry

Fixes #3048
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
0630fd32e5 state: refactor HandleNodeFromAuthPath for clarity
Reorganize HandleNodeFromAuthPath (~300 lines) into a cleaner structure
with named conditions and extracted helper functions.

Changes:
- Add authNodeUpdateParams struct for shared update logic
- Extract applyAuthNodeUpdate for common reauth/convert operations
- Extract reauthExistingNode and convertTaggedNodeToUser handlers
- Extract createNewNodeFromAuth for new node creation
- Use named boolean conditions (nodeExistsForSameUser, existingNodeIsTagged,
  existingNodeOwnedByOtherUser) instead of compound if conditions
- Create logger with common fields (registration_id, user.name, machine.key,
  method) to reduce log statement verbosity

Updates #3038
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
306aabbbce state: fix nil pointer panic when re-registering tagged node without user
When a node was registered with a tags-only PreAuthKey (no user
associated), the node had User=nil and UserID=nil. When attempting to
re-register this node to a different user via HandleNodeFromAuthPath,
two issues occurred:

1. The code called oldUser.Name() without checking if oldUser was valid,
   causing a nil pointer dereference panic.

2. The existing node lookup logic didn't find the tagged node because it
   searched by (machineKey, userID), but tagged nodes have no userID.
   This caused a new node to be created instead of updating the existing
   tagged node.

Fix this by restructuring HandleNodeFromAuthPath to:
1. First check if a node exists for the same user (existing behavior)
2. If not found, check if an existing TAGGED node exists with the same
   machine key (regardless of userID)
3. If a tagged node exists, UPDATE it to convert from tagged to
   user-owned (preserving the node ID)
4. Only create a new node if the existing node is user-owned by a
   different user

This ensures consistent behavior between:
- personal → tagged → personal (same node, same owner)
- tagged (no user) → personal (same node, new owner)

Add a test that reproduces the panic and conversion scenario by:
1. Creating a tags-only PreAuthKey (no user)
2. Registering a node with that key
3. Re-registering the same machine to a different user
4. Verifying the node ID stays the same (conversion, not creation)

Fixes #3038
2026-02-04 15:44:55 +01:00
Kristoffer Dalby
46daa659e2 state: omit AuthKeyID/AuthKey in node Updates to prevent FK errors
When a PreAuthKey is deleted, the database correctly sets auth_key_id
to NULL on referencing nodes via ON DELETE SET NULL. However, the
NodeStore (in-memory cache) retains the old AuthKeyID value.

When nodes send MapRequests (e.g., after tailscaled restart), GORM's
Updates() tries to persist the stale AuthKeyID, causing a foreign key
constraint error when trying to reference a deleted PreAuthKey.

Fix this by adding AuthKeyID and AuthKey to the Omit() call in all
three places where nodes are updated via GORM's Updates():
- persistNodeToDB (MapRequest processing)
- HandleNodeFromAuthPath (re-auth via web/OIDC)
- HandleNodeFromPreAuthKey (re-registration with preauth key)

This tells GORM to never touch the auth_key_id column or AuthKey
association during node updates, letting the database handle the
foreign key relationship correctly.

Added TestDeletedPreAuthKeyNotRecreatedOnNodeUpdate to verify that
deleted PreAuthKeys are not recreated when nodes send MapRequests.
2026-01-26 12:12:11 +00:00
Florian Preinstorfer
49b70db7f2 Conversion from personal to tagged node is reversible 2026-01-24 17:18:59 +01:00
Shourya Gautam
4e1834adaf db: use PolicyManager for RequestTags migration
Refactor the RequestTags migration (202601121700-migrate-hostinfo-request-tags)
to use PolicyManager.NodeCanHaveTag() instead of reimplementing tag validation.

Changes:
- NewHeadscaleDatabase now accepts *types.Config to allow migrations
  access to policy configuration
- Add loadPolicyBytes helper to load policy from file or DB based on config
- Add standalone GetPolicy(tx *gorm.DB) for use during migrations
- Replace custom tag validation logic with PolicyManager

Benefits:
- Full HuJSON parsing support (not just JSON)
- Proper group expansion via PolicyManager
- Support for nested tags and autogroups
- Works with both file and database policy modes
- Single source of truth for tag validation


Co-Authored-By: Shourya Gautam <shouryamgautam@gmail.com>
2026-01-21 15:10:29 +01:00
Kristoffer Dalby
42bd9cd058 state: add GetAPIKeyByID method
Add GetAPIKeyByID method to the state layer, delegating to the existing
database layer function. This enables API key lookup by ID in addition
to the existing prefix-based lookup.

Updates #2986
2026-01-20 17:13:38 +01:00
Kristoffer Dalby
4be13baf3f state: update policy manager when deleting users
Make DeleteUser call updatePolicyManagerUsers() to refresh the policy
manager's cached user list after user deletion. This ensures consistency
with CreateUser, UpdateUser, and RenameUser which all update the policy
manager.

Previously, DeleteUser only removed the user from the database without
updating the policy manager. This could leave stale user references in
the cached user list, potentially causing issues when policy is
re-evaluated.

The gRPC handler now uses the change returned from DeleteUser instead of
manually constructing change.UserRemoved().

Fixes #2967
2026-01-20 15:41:19 +01:00
Kristoffer Dalby
4ab06930a2 hscontrol: handle tags-only PreAuthKeys in registration
HandleNodeFromPreAuthKey assumed pak.User was always set, but
tags-only PreAuthKeys have nil User. This caused nil pointer
dereference when registering nodes with tags-only keys.

Also updates integration tests to use GetTags() instead of the
removed GetValidTags() method.

Updates #2977
2026-01-20 12:53:20 +01:00
Kristoffer Dalby
1325fd8b27 cli,hscontrol: use ID-based preauthkey operations 2026-01-20 12:53:20 +01:00
Kristoffer Dalby
3b4b9a4436 hscontrol: fix tag updates not propagating to node self view
When SetNodeTags changed a node's tags, the node's self view wasn't
updated. The bug manifested as: the first SetNodeTags call updates
the server but the client's self view doesn't update until a second
call with the same tag.

Root cause: Three issues combined to prevent self-updates:

1. SetNodeTags returned PolicyChange which doesn't set OriginNode,
   so the mapper's self-update check failed.

2. The Change.Merge function didn't preserve OriginNode, so when
   changes were batched together, OriginNode was lost.

3. generateMapResponse checked OriginNode only in buildFromChange(),
   but PolicyChange uses RequiresRuntimePeerComputation which
   bypasses that code path entirely and calls policyChangeResponse()
   instead.

The fix addresses all three:
- state.go: Set OriginNode on the returned change
- change.go: Preserve OriginNode (and TargetNode) during merge
- batcher.go: Pass isSelfUpdate to policyChangeResponse so the
  origin node gets both self info AND packet filters
- mapper.go: Add includeSelf parameter to policyChangeResponse

Fixes #2978
2026-01-20 10:13:47 +01:00
Kristoffer Dalby
0451dd4718 state: allow untagging nodes via reauth with empty RequestTags
When a node re-authenticates via OIDC/web auth with empty RequestTags
(from `tailscale up --advertise-tags= --force-reauth`), remove all tags
and return ownership to the authenticating user.

This allows nodes to transition from any tagged state (including nodes
originally registered with a tagged pre-auth key) back to user-owned.

Fixes #2979
2026-01-17 10:13:24 +01:00
Kristoffer Dalby
00f22a8443 state: disable key expiry for nodes with approved advertise-tags
Extends #2971 fix to also cover nodes that authenticate as users but
become tagged immediately via --advertise-tags. When RequestTags are
approved by policy, the node's expiry is now disabled, consistent with
nodes registered via tagged PreAuthKeys.
2026-01-16 17:05:59 +01:00
Kristoffer Dalby
1d9900273e state: disable key expiry for tagged nodes
Nodes registered with tagged PreAuthKeys now have key expiry disabled,
matching Tailscale's behavior. User-owned nodes continue to use the
client-requested expiry.

On re-authentication, tagged nodes preserve their disabled expiry while
user-owned nodes can update their expiry from the client request.

Fixes #2971
2026-01-16 17:05:59 +01:00
Kristoffer Dalby
72fcb93ef3 cli: ensure tagged-devices is included in profile list (#2991) 2026-01-09 16:31:23 +01:00
Kristoffer Dalby
f3767dddf8 batcher: ensure removal from batcher
Fixes #2924

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-17 13:19:26 +01:00
Kristoffer Dalby
5767ca5085 change: smarter change notifications
This commit replaces the ChangeSet with a simpler bool based
change model that can be directly used in the map builder to
build the appropriate map response based on the change that
has occured. Previously, we fell back to sending full maps
for a lot of changes as that was consider "the safe" thing to
do to ensure no updates were missed.

This was slightly problematic as a node that already has a list
of peers will only do full replacement of the peers if the list
is non-empty, meaning that it was not possible to remove all
nodes (if for example policy changed).

Now we will keep track of last seen nodes, so we can send remove
ids, but also we are much smarter on how we send smaller, partial
maps when needed.

Fixes #2389

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-16 10:12:36 +01:00
Kristoffer Dalby
506bd8c8eb policy: more accurate node change
This commit changes so that node changes to the policy is
calculated if any of the nodes has changed in a way that might
affect the policy.

Previously we just checked if the number of nodes had changed,
which meant that if a node was added and removed, we would be
in a bad state.

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-16 10:12:36 +01:00
Kristoffer Dalby
87bd67318b golangci-lint: use forbidigo to block time.Sleep (#2946) 2025-12-10 16:45:59 +00:00
Kristoffer Dalby
22ee2bfc9c tags: process tags on registration, simplify policy (#2931)
This PR investigates, adds tests and aims to correctly implement Tailscale's model for how Tags should be accepted, assigned and used to identify nodes in the Tailscale access and ownership model.

When evaluating in Headscale's policy, Tags are now only checked against a nodes "tags" list, which defines the source of truth for all tags for a given node. This simplifies the code for dealing with tags greatly, and should help us have less access bugs related to nodes belonging to tags or users.

A node can either be owned by a user, or a tag.

Next, to ensure the tags list on the node is correctly implemented, we first add tests for every registration scenario and combination of user, pre auth key and pre auth key with tags with the same registration expectation as observed by trying them all with the Tailscale control server. This should ensure that we implement the correct behaviour and that it does not change or break over time.

Lastly, the missing parts of the auth has been added, or changed in the cases where it was wrong. This has in large parts allowed us to delete and simplify a lot of code.
Now, tags can only be changed when a node authenticates or if set via the CLI/API. Tags can only be fully overwritten/replaced and any use of either auth or CLI will replace the current set if different.

A user owned device can be converted to a tagged device, but it cannot be changed back. A tagged device can never remove the last tag either, it has to have a minimum of one.
2025-12-08 18:51:07 +01:00
Kristoffer Dalby
eb788cd007 make tags first class node owner (#2885)
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.

There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.

Updates #2417
Closes #2619
2025-12-02 12:01:25 +01:00
Kristoffer Dalby
cb4d5b1906 hscontrol/oidc: fix ACL policy not applied to new OIDC nodes (#2890)
Fixes #2888
Fixes #2896
2025-12-02 12:01:02 +01:00
Kristoffer Dalby
3cf2d7195a auth: ensure machines are allowed in when pak change (#2917) 2025-12-02 12:01:02 +01:00
Kristoffer Dalby
16d811b306 cli: remove node move command (#2922) 2025-12-01 21:43:31 +01:00
Kristoffer Dalby
eec196d200 modernize: run gopls modernize to bring up to 1.25 (#2920) 2025-12-01 19:40:25 +01:00
Kristoffer Dalby
db293e0698 hscontrol/state: make NodeStore batch configuration tunable (#2886) 2025-11-28 16:38:29 +01:00
Kristoffer Dalby
7fb0f9a501 batcher: send endpoint and derp only updates. (#2856) 2025-11-13 20:38:49 +01:00
Kristoffer Dalby
da9018a0eb types: make pre auth key use bcrypt (#2853) 2025-11-12 16:36:36 +01:00
Andrey Bobelev
299cef4e99 fix: free ips from usedIps ipset on DeleteNode 2025-11-11 17:27:00 -06:00
Kristoffer Dalby
ddd31ba774 hscontrol: use Updates() instead of Save() for partial updates
Changed UpdateUser and re-registration flows to use Updates() which only
writes modified fields, preventing unintended overwrites of unchanged fields.

Also updated UsePreAuthKey to use Model().Update() for single field updates
and removed unused NodeSave wrapper.
2025-11-11 12:47:48 -06:00
Kristoffer Dalby
4a8dc2d445 hscontrol/state,db: preserve node expiry on MapRequest updates
Fixes a regression introduced in v0.27.0 where node expiry times were
being reset to zero when tailscaled restarts and sends a MapRequest.

The issue was caused by using GORM's Save() method in persistNodeToDB(),
which overwrites ALL fields including zero values. When a MapRequest
updates a node (without including expiry information), Save() would
overwrite the database expiry field with a zero value.

Changed to use Updates() which only updates non-zero values, preserving
existing database values when struct pointer fields are nil.

In BackfillNodeIPs, we need to explicitly update IPv4/IPv6 fields even
when nil (to remove IPs), so we use Select() to specify those fields.

Added regression test that validates expiry is preserved after MapRequest.

Fixes #2862
2025-11-11 12:47:48 -06:00
Kristoffer Dalby
4728a2ba9e hscontrol/state: allow expired auth keys for node re-registration
Skip auth key validation for existing nodes re-registering with the same
NodeKey. Pre-auth keys are only required for initial authentication.

NodeKey rotation still requires a valid auth key as it is a security-sensitive
operation that changes the node's cryptographic identity.

Fixes #2830
2025-11-11 05:12:59 -06:00
Kristoffer Dalby
d7a43a7cf1 state: use AllApprovedRoutes instead of SubnetRoutes
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-02 13:19:59 +01:00
Kristoffer Dalby
2bf1200483 policy: fix autogroup:self propagation and optimize cache invalidation (#2807) 2025-10-23 17:57:41 +02:00