Commit Graph

165 Commits

Author SHA1 Message Date
Kristoffer Dalby
3e0a96ec3a all: fix test flakiness and improve test infrastructure
Buffer the AuthRequest verdict channel to prevent a race where the
sender blocks indefinitely if the receiver has already timed out, and
increase the auth followup test timeout from 100ms to 5s to prevent
spurious failures under load.

Skip postgres-backed tests when the postgres server is unavailable
instead of calling t.Fatal, which was preventing the rest of the test
suite from running.

Add TestMain to db, types, and policy/v2 packages to chdir to the
source directory before running tests. This ensures relative testdata/
paths resolve correctly when the test binary is executed from an
arbitrary working directory (e.g., via "go tool stress").
2026-03-14 02:52:28 -07:00
Kristoffer Dalby
cb3b6949ea auth: generalise auth flow and introduce AuthVerdict
Generalise the registration pipeline to a more general auth pipeline
supporting both node registrations and SSH check auth requests.
Rename RegistrationID to AuthID, unexport AuthRequest fields, and
introduce AuthVerdict to unify the auth finish API.

Add the urlParam generic helper for extracting typed URL parameters
from chi routes, used by the new auth request handler.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
1e4fc3f179 hscontrol: add tests for deleting users with tagged nodes
Test the tagged-node-survives-user-deletion scenario at two layers:

DB layer (users_test.go):
- success_user_only_has_tagged_nodes: tagged nodes with nil
  user_id do not block user deletion and survive it
- error_user_has_tagged_and_owned_nodes: user-owned nodes
  still block deletion even when tagged nodes coexist

App layer (grpcv1_test.go):
- TestDeleteUser_TaggedNodeSurvives: full registration flow
  with tagged PreAuthKey verifies nil UserID after registration,
  absence from nodesByUser index, user deletion succeeds, and
  tagged node remains in global node list

Also update auth_tags_test.go assertions to expect nil UserID
on tagged nodes, consistent with the new invariant.

Updates #3077
2026-02-20 21:51:00 +01:00
Kristoffer Dalby
75e56df9e4 hscontrol: enforce that tagged nodes never have user_id
Tagged nodes are owned by their tags, not a user. Enforce this
invariant at every write path:

- createAndSaveNewNode: do not set UserID for tagged PreAuthKey
  registration; clear UserID when advertise-tags are applied
  during OIDC/CLI registration
- SetNodeTags: clear UserID/User when tags are assigned
- processReauthTags: clear UserID/User when tags are applied
  during re-authentication
- validateNodeOwnership: reject tagged nodes with non-nil UserID
- NodeStore: skip nodesByUser indexing for tagged nodes since
  they have no owning user
- HandleNodeFromPreAuthKey: add fallback lookup for tagged PAK
  re-registration (tagged nodes indexed under UserID(0)); guard
  against nil User deref for tagged nodes in different-user check

Since tagged nodes now have user_id = NULL, ListNodesByUser
will not return them and DestroyUser naturally allows deleting
users whose nodes have all been tagged. The ON DELETE CASCADE
FK cannot reach tagged nodes through a NULL foreign key.

Also tone down shouty comments throughout state.go.

Fixes #3077
2026-02-20 21:51:00 +01:00
Kristoffer Dalby
52d454d0c8 hscontrol/db: add migration to clear user_id on tagged nodes
Tagged nodes are owned by their tags, not a user. Previously
user_id was kept as "created by" tracking, but this prevents
deleting users whose nodes have all been tagged, and the
ON DELETE CASCADE FK would destroy the tagged nodes.

Add a migration that sets user_id = NULL on all existing tagged
nodes. Subsequent commits enforce this invariant at write time.

Updates #3077
2026-02-20 21:51:00 +01:00
Kristoffer Dalby
f20bd0cf08 node: implement disable key expiry via CLI and API
Add --disable flag to "headscale nodes expire" CLI command and
disable_expiry field handling in the gRPC API to allow disabling
key expiry for nodes. When disabled, the node's expiry is set to
NULL and IsExpired() returns false.

The CLI follows the new grpcRunE/RunE/printOutput patterns
introduced in the recent CLI refactor.

Also fix NodeSetExpiry to persist directly to the database instead
of going through persistNodeToDB which omits the expiry field.

Fixes #2681

Co-authored-by: Marco Santos <me@marcopsantos.com>
2026-02-20 21:49:55 +01:00
Kristoffer Dalby
43afeedde2 all: apply golangci-lint 2.9.0 fixes
Fix issues found by the upgraded golangci-lint:
- wsl_v5: add required whitespace in CLI files
- staticcheck SA4006: replace new(var.Field) with &localVar
  pattern since staticcheck does not recognize Go 1.26
  new(value) as a use of the variable
- staticcheck SA5011: use t.Fatal instead of t.Error for
  nil guard checks so execution stops
- unused: remove dead ptrTo helper function
2026-02-19 08:21:23 +01:00
Kristoffer Dalby
73613d7f53 db: fix database_versions table creation for PostgreSQL
Use GORM AutoMigrate instead of raw SQL to create the
database_versions table, since PostgreSQL does not support the
datetime type used in the raw SQL (it requires timestamp).
2026-02-19 08:21:23 +01:00
Kristoffer Dalby
82958835ce db: enforce strict version upgrade path
Add a version check that runs before database migrations to ensure
users do not skip minor versions or downgrade. This protects database
migrations and allows future cleanup of old migration code.

Rules enforced:
- Same minor version: always allowed (patch changes either way)
- Single minor upgrade (e.g. 0.27 -> 0.28): allowed
- Multi-minor upgrade (e.g. 0.25 -> 0.28): blocked with guidance
- Any minor downgrade: blocked
- Major version change: blocked
- Dev builds: warn but allow, preserve stored version

The version is stored in a purpose-built database_versions table
after migrations succeed. The table is created with raw SQL before
gormigrate runs to avoid circular dependencies.

Updates #3058
2026-02-19 08:21:23 +01:00
Kristoffer Dalby
0f6d312ada all: upgrade to Go 1.26rc2 and modernize codebase
This commit upgrades the codebase from Go 1.25.5 to Go 1.26rc2 and
adopts new language features.

Toolchain updates:
- go.mod: go 1.25.5 → go 1.26rc2
- flake.nix: buildGo125Module → buildGo126Module, go_1_25 → go_1_26
- flake.nix: build golangci-lint from source with Go 1.26
- Dockerfile.integration: golang:1.25-trixie → golang:1.26rc2-trixie
- Dockerfile.tailscale-HEAD: golang:1.25-alpine → golang:1.26rc2-alpine
- Dockerfile.derper: golang:alpine → golang:1.26rc2-alpine
- .goreleaser.yml: go mod tidy -compat=1.25 → -compat=1.26
- cmd/hi/run.go: fallback Go version 1.25 → 1.26rc2
- .pre-commit-config.yaml: simplify golangci-lint hook entry

Code modernization using Go 1.26 features:
- Replace tsaddr.SortPrefixes with slices.SortFunc + netip.Prefix.Compare
- Replace ptr.To(x) with new(x) syntax
- Replace errors.As with errors.AsType[T]

Lint rule updates:
- Add forbidigo rules to prevent regression to old patterns
2026-02-08 12:35:23 +01:00
Kristoffer Dalby
ce580f8245 all: fix golangci-lint issues (#3064) 2026-02-06 21:45:32 +01:00
Kristoffer Dalby
3acce2da87 errors: rewrite errors to follow go best practices
Errors should not start capitalised and they should not contain the word error
or state that they "failed" as we already know it is an error

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
4a9a329339 all: use lowercase log messages
Go style recommends that log messages and error strings should not be
capitalized (unless beginning with proper nouns or acronyms) and should
not end with punctuation.

This change normalizes all zerolog .Msg() and .Msgf() calls to start
with lowercase letters, following Go conventions and making logs more
consistent across the codebase.
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
dd16567c52 hscontrol/state,db: use zf constants for logging
Replace raw string field names with zf constants in state.go and
db/node.go for consistent, type-safe logging.

state.go changes:
- User creation, hostinfo validation, node registration
- Tag processing during reauth (processReauthTags)
- Auth path and PreAuthKey handling
- Route auto-approval and MapRequest processing

db/node.go changes:
- RegisterNodeForTest logging
- Invalid hostname replacement logging
2026-02-06 07:40:29 +01:00
Kristoffer Dalby
91730e2a1d hscontrol: use EmbedObject for node logging
Replace manual Uint64("node.id")/Str("node.name") field patterns with
EmbedObject(node) which automatically includes all standard node fields
(id, name, machine key, node key, online status, tags, user).

This reduces code repetition and ensures consistent logging across:
- state.go: Connect/Disconnect, persistNodeToDB, AutoApproveRoutes
- auth.go: handleLogout, handleRegisterWithAuthKey
2026-02-06 07:40:29 +01:00
Shourya Gautam
4e1834adaf db: use PolicyManager for RequestTags migration
Refactor the RequestTags migration (202601121700-migrate-hostinfo-request-tags)
to use PolicyManager.NodeCanHaveTag() instead of reimplementing tag validation.

Changes:
- NewHeadscaleDatabase now accepts *types.Config to allow migrations
  access to policy configuration
- Add loadPolicyBytes helper to load policy from file or DB based on config
- Add standalone GetPolicy(tx *gorm.DB) for use during migrations
- Replace custom tag validation logic with PolicyManager

Benefits:
- Full HuJSON parsing support (not just JSON)
- Proper group expansion via PolicyManager
- Support for nested tags and autogroups
- Works with both file and database policy modes
- Single source of truth for tag validation


Co-Authored-By: Shourya Gautam <shouryamgautam@gmail.com>
2026-01-21 15:10:29 +01:00
Kristoffer Dalby
a194712c34 grpc: support expire/delete API keys by ID
Update ExpireApiKey and DeleteApiKey handlers to accept either ID or
prefix for identifying the API key. Returns InvalidArgument error if
neither or both are provided.

Add tests for:
- Expire by ID
- Expire by prefix (backwards compatibility)
- Delete by ID
- Delete by prefix (backwards compatibility)
- Error when neither ID nor prefix provided
- Error when both ID and prefix provided

Updates #2986
2026-01-20 17:13:38 +01:00
Kristoffer Dalby
424e26d636 db: migrate tests from check.v1 to testify
Migrate all database tests from gopkg.in/check.v1 Suite-based testing
to standard Go tests with testify assert/require.

Changes:
- Remove empty Suite files (hscontrol/suite_test.go, hscontrol/mapper/suite_test.go)
- Convert hscontrol/db/suite_test.go to modern helpers only
- Convert 6 Suite test methods in node_test.go to standalone tests
- Convert 5 Suite test methods in api_key_test.go to standalone tests
- Fix stale global variable reference in db_test.go

The legacy TestListPeers Suite method was renamed to TestListPeersManyNodes
to avoid conflict with the existing modern TestListPeers function, as they
test different aspects (basic peer listing vs ID filtering).
2026-01-20 15:41:33 +01:00
Kristoffer Dalby
c8c3c9d4a0 hscontrol: allow CreatePreAuthKey without user when tags provided
Handle case where user is 0 in gRPC layer to support tags-only
auth keys.
2026-01-20 12:53:20 +01:00
Kristoffer Dalby
1325fd8b27 cli,hscontrol: use ID-based preauthkey operations 2026-01-20 12:53:20 +01:00
Kristoffer Dalby
5103b35f3c sqliteconfig: add config opt for tx locking
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-22 14:01:40 +01:00
Kristoffer Dalby
87bd67318b golangci-lint: use forbidigo to block time.Sleep (#2946) 2025-12-10 16:45:59 +00:00
Kristoffer Dalby
eb788cd007 make tags first class node owner (#2885)
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.

There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.

Updates #2417
Closes #2619
2025-12-02 12:01:25 +01:00
Kristoffer Dalby
3cf2d7195a auth: ensure machines are allowed in when pak change (#2917) 2025-12-02 12:01:02 +01:00
Kristoffer Dalby
16d811b306 cli: remove node move command (#2922) 2025-12-01 21:43:31 +01:00
Kristoffer Dalby
eec196d200 modernize: run gopls modernize to bring up to 1.25 (#2920) 2025-12-01 19:40:25 +01:00
Kristoffer Dalby
db293e0698 hscontrol/state: make NodeStore batch configuration tunable (#2886) 2025-11-28 16:38:29 +01:00
Kristoffer Dalby
4b25976288 db: add comment to always check errors in migration
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-13 09:46:40 -06:00
Kristoffer Dalby
1c146f70e9 db: remove _schema from migration tests
Previously we tested migrations on schemas and dumps
of old databases.

The problems with testing migrations against the schemas
is that the migration table is empty, so we try to run
migrations that are already ran on that schema, which might
blow up.

This commit removes the schema approach and just leaves all
the dumps, which include the migration table.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-13 09:46:40 -06:00
Kristoffer Dalby
75247f82b8 hscontrol/db: add init schema, drop pre-0.25 support (#2883) 2025-11-13 04:44:10 -06:00
Kristoffer Dalby
da9018a0eb types: make pre auth key use bcrypt (#2853) 2025-11-12 16:36:36 +01:00
Andrey Bobelev
299cef4e99 fix: free ips from usedIps ipset on DeleteNode 2025-11-11 17:27:00 -06:00
Kristoffer Dalby
3455d1cb59 hscontrol/db: fix RenameUser to use Updates()
RenameUser only modifies Name field, should use Updates() not Save().
2025-11-11 12:47:48 -06:00
Kristoffer Dalby
ddd31ba774 hscontrol: use Updates() instead of Save() for partial updates
Changed UpdateUser and re-registration flows to use Updates() which only
writes modified fields, preventing unintended overwrites of unchanged fields.

Also updated UsePreAuthKey to use Model().Update() for single field updates
and removed unused NodeSave wrapper.
2025-11-11 12:47:48 -06:00
Kristoffer Dalby
4a8dc2d445 hscontrol/state,db: preserve node expiry on MapRequest updates
Fixes a regression introduced in v0.27.0 where node expiry times were
being reset to zero when tailscaled restarts and sends a MapRequest.

The issue was caused by using GORM's Save() method in persistNodeToDB(),
which overwrites ALL fields including zero values. When a MapRequest
updates a node (without including expiry information), Save() would
overwrite the database expiry field with a zero value.

Changed to use Updates() which only updates non-zero values, preserving
existing database values when struct pointer fields are nil.

In BackfillNodeIPs, we need to explicitly update IPv4/IPv6 fields even
when nil (to remove IPs), so we use Select() to specify those fields.

Added regression test that validates expiry is preserved after MapRequest.

Fixes #2862
2025-11-11 12:47:48 -06:00
Kristoffer Dalby
28faf8cd71 db: add defensive removal of old indicies
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-10 20:07:29 +01:00
Kristoffer Dalby
5a2ee0c391 db: add comment about removing migrations
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-10 17:32:39 +01:00
Kristoffer Dalby
d7a43a7cf1 state: use AllApprovedRoutes instead of SubnetRoutes
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-02 13:19:59 +01:00
Andrey
f9bb88ad24 expire nodes with a custom timestamp (#2828) 2025-11-01 08:09:13 +01:00
Kristoffer Dalby
456a5d5cce db: ignore _litestream tables when validating (#2843) 2025-11-01 07:08:22 +00:00
Kristoffer Dalby
ddbd3e14ba db: remove all old, unused tables (#2844) 2025-11-01 08:03:37 +01:00
Kristoffer Dalby
1cdea7ed9b stricter hostname validation and replace (#2383) 2025-10-22 13:50:39 +02:00
Kristoffer Dalby
ed3a9c8d6d mapper: send change instead of full update (#2775) 2025-09-17 14:23:21 +02:00
Kristoffer Dalby
233dffc186 lint and leftover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
9d236571f4 state/nodestore: in memory representation of nodes
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.

It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.

Writes will block until commited, while reads are never
blocked.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
cuiweixie
a2a6d20218 Refactor to use reflect.TypeFor 2025-08-23 20:43:49 +02:00
Kristoffer Dalby
a058bf3cd3 mapper: produce map before poll (#2628) 2025-07-28 11:15:53 +02:00
Kian-Meng Ang
3123d5286b Fix typos
Found via `codespell -L shs,hastable,userr`
2025-07-21 12:06:07 +02:00
Kristoffer Dalby
c6d7b512bd integration: replace time.Sleep with assert.EventuallyWithT (#2680) 2025-07-10 23:38:55 +02:00
Kristoffer Dalby
73023c2ec3 all: use immutable node view in read path
This commit changes most of our (*)types.Node to
types.NodeView, which is a readonly version of the
underlying node ensuring that there is no mutations
happening in the read path.

Based on the migration, there didnt seem to be any, but the
idea here is to prevent it in the future and simplify other
new implementations.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-07-07 21:28:59 +01:00