When a node disconnects, serveLongPoll defers a cleanup that starts a
grace period goroutine. This goroutine polls batcher.IsConnected() and,
if the node has not reconnected within ~10 seconds, calls
state.Disconnect() to mark it offline. A TOCTOU race exists: the node
can reconnect (calling Connect()) between the IsConnected check and
the Disconnect() call, causing the stale Disconnect() to overwrite
the new session's online status.
Fix with a monotonic per-node generation counter:
- State.Connect() increments the counter and returns the current
generation alongside the change list.
- State.Disconnect() accepts the generation from the caller and
rejects the call if a newer generation exists, making stale
disconnects from old sessions a no-op.
- serveLongPoll captures the generation at Connect() time and passes
it to Disconnect() in the deferred cleanup.
- RemoveNode's return value is now checked: if another session already
owns the batcher slot (reconnect happened), the old session skips
the grace period entirely.
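The generation-counter scheme can be sketched in a few lines; `nodeState` and its methods here are illustrative stand-ins, not the actual Headscale `State` API:

```go
package main

import (
	"fmt"
	"sync"
)

// nodeState is a hypothetical sketch of the per-node generation
// counter; the real State tracks much more than online flags.
type nodeState struct {
	mu         sync.Mutex
	generation map[int64]uint64 // nodeID -> latest connect generation
	online     map[int64]bool
}

func newNodeState() *nodeState {
	return &nodeState{
		generation: make(map[int64]uint64),
		online:     make(map[int64]bool),
	}
}

// Connect marks the node online and returns the generation the caller
// must later pass back to Disconnect.
func (s *nodeState) Connect(nodeID int64) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.generation[nodeID]++
	s.online[nodeID] = true
	return s.generation[nodeID]
}

// Disconnect is a no-op when a newer session has connected since gen
// was issued, which makes the stale deferred cleanup harmless.
func (s *nodeState) Disconnect(nodeID int64, gen uint64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.generation[nodeID] != gen {
		return false // stale session: a reconnect happened after us
	}
	s.online[nodeID] = false
	return true
}

func (s *nodeState) IsOnline(nodeID int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.online[nodeID]
}

func main() {
	s := newNodeState()
	gen1 := s.Connect(1) // first session
	_ = s.Connect(1)     // reconnect within the grace period
	// The old session's deferred cleanup fires with the stale gen1.
	applied := s.Disconnect(1, gen1)
	fmt.Println(applied, s.IsOnline(1)) // false true
}
```

The stale disconnect returns false and the node stays online, which is exactly the property the failing tests assert.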
Update batcher_test.go to track per-node connect generations and
pass them through to Disconnect(), matching production behavior.
Fixes the following test failures:
- server_state_online_after_reconnect_within_grace
- update_history_no_false_offline
- nodestore_correct_after_rapid_reconnect
- rapid_reconnect_peer_never_sees_offline
Add three test files designed to stress the control plane under
concurrent and adversarial conditions:
- race_test.go: 14 tests exercising concurrent mutations, session
replacement, batcher contention, NodeStore access, and map response
delivery during disconnect. All pass the Go race detector.
- poll_race_test.go: 8 tests targeting the poll.go grace period
interleaving. These confirm a logical TOCTOU race: when a node
disconnects and reconnects within the grace period, the old
session's deferred Disconnect() can overwrite the new session's
Connect(), leaving IsOnline=false despite an active poll session.
- stress_test.go: sustained churn, rapid mutations, rolling
replacement, data integrity checks under load, and verification
that rapid reconnects do not leak false-offline notifications.
Known failing tests (grace period TOCTOU race):
- server_state_online_after_reconnect_within_grace
- update_history_no_false_offline
- rapid_reconnect_peer_never_sees_offline
Split TestIssues into 7 focused test functions to stay under cyclomatic
complexity limits while testing more aggressively.
Issues surfaced (4 failing tests):
1. initial_map_should_include_peer_online_status: Initial MapResponse
has Online=nil for peers. Online status only arrives later via
PeersChangedPatch.
2. disco_key_should_propagate_to_peers: DiscoPublicKey set by client
is not visible to peers. Peers see zero disco key.
3. approved_route_without_announcement_is_visible: Server-side route
approval without client-side announcement silently produces empty
SubnetRoutes (intersection of empty announced + approved = empty).
4. nodestore_correct_after_rapid_reconnect: After 5 rapid reconnect
cycles, NodeStore reports node as offline despite having an active
poll session. The connect/disconnect grace period interleaving
leaves IsOnline in an incorrect state.
Passing tests (20) verify:
- IP uniqueness across 10 nodes
- IP stability across reconnect
- New peers have addresses immediately
- Node rename propagates to peers
- Node delete removes from all peer lists
- Hostinfo changes (OS field) propagate
- NodeStore/DB consistency after route mutations
- Grace period timing (8-20s window)
- Ephemeral node deletion (not just offline)
- 10-node simultaneous connect convergence
- Rapid sequential node additions
- Reconnect produces complete map
- Cross-user visibility with default policy
- Same-user multiple nodes get distinct IDs
- Same-hostname nodes get unique GivenNames
- Policy change during connect still converges
- DERP region references are valid
- User profiles present for self and peers
- Self-update arrives after route approval
- Route advertisement stored as AnnouncedRoutes
Extend the servertest harness with:
- TestClient.Direct() accessor for advanced operations
- TestClient.WaitForPeerCount and WaitForCondition helpers
- TestHarness.ChangePolicy for ACL policy testing
- AssertDERPMapPresent and AssertSelfHasAddresses
New test suites:
- content_test.go: self node, DERP map, peer properties, user profiles,
update history monotonicity, and endpoint update propagation
- policy_test.go: default allow-all, explicit policy, policy triggers
updates on all nodes, multiple policy changes, multi-user mesh
- ephemeral_test.go: ephemeral connect, cleanup after disconnect,
mixed ephemeral/regular, reconnect prevents cleanup
- routes_test.go: addresses in AllowedIPs, route advertise and approve,
advertised routes via hostinfo, CGNAT range validation
Also fix node_departs test to use WaitForCondition instead of
assert.Eventually, and convert concurrent_join_and_leave to
interleaved_join_and_leave with grace-period-tolerant assertions.
Add three test files exercising the servertest harness:
- lifecycle_test.go: connection, disconnection, reconnection, session
replacement, and mesh formation at various sizes.
- consistency_test.go: symmetric visibility, consistent peer state,
address presence, concurrent join/leave convergence.
- weather_test.go: rapid reconnects, flapping stability, reconnect
with various delays, concurrent reconnects, and scale tests.
All tests use table-driven patterns with subtests.
Add a new hscontrol/servertest package that provides a test harness
for exercising the full Headscale control protocol in-process, using
Tailscale's controlclient.Direct as the client.
The harness consists of:
- TestServer: wraps a Headscale instance with an httptest.Server
- TestClient: wraps controlclient.Direct with NetworkMap tracking
- TestHarness: orchestrates N clients against a single server
- Assertion helpers for mesh completeness, visibility, and consistency
Export minimal accessor methods on Headscale (HTTPHandler, NoisePublicKey,
GetState, SetServerURL, StartBatcher, StartEphemeralGC) so the servertest
package can construct a working server from outside the hscontrol package.
This enables fast, deterministic tests of connection lifecycle, update
propagation, and network weather scenarios without Docker.
processBatchedChanges queued each pending change for a node as a
separate work item. Since multiple workers pull from the same channel,
two changes for the same node could be processed concurrently by
different workers. This caused two problems:
1. MapResponses delivered out of order — a later change could finish
generating before an earlier one, so the client sees stale state.
2. updateSentPeers and computePeerDiff race against each other —
updateSentPeers does Clear() + Store() which is not atomic relative
to a concurrent Range() in computePeerDiff.
Bundle all pending changes for a node into a single work item so one
worker processes them sequentially. Add a per-node workMu that
serializes processing across consecutive batch ticks, preventing a
second worker from starting tick N+1 while tick N is still in progress.
Fixes #3140
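The bundling and per-node serialization can be sketched as follows; the type and its fields are hypothetical, not the real batcher internals:

```go
package main

import (
	"fmt"
	"sync"
)

// nodeWork is an illustrative per-node queue: pending changes are
// drained as one bundled work item, and workMu guarantees a second
// worker cannot start tick N+1 while tick N is still in progress.
type nodeWork struct {
	workMu  sync.Mutex // serializes processing across batch ticks
	mu      sync.Mutex // protects pending
	pending []string
}

func (n *nodeWork) appendPending(change string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.pending = append(n.pending, change)
}

// drainPending atomically takes all pending changes as one work item.
func (n *nodeWork) drainPending() []string {
	n.mu.Lock()
	defer n.mu.Unlock()
	out := n.pending
	n.pending = nil
	return out
}

// process handles one bundled work item under workMu, so all changes
// for the node are applied sequentially and in order.
func (n *nodeWork) process(apply func(change string)) {
	n.workMu.Lock()
	defer n.workMu.Unlock()
	for _, c := range n.drainPending() {
		apply(c)
	}
}

func main() {
	n := &nodeWork{}
	n.appendPending("a")
	n.appendPending("b")
	var got []string
	n.process(func(c string) { got = append(got, c) })
	fmt.Println(got) // changes applied in order: [a b]
}
```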
Replace inline WithDockerEntrypoint shell scripts in
TestACLTagPropagation and TestACLTagPropagationPortSpecific with
the standard WithPackages and WithWebserver options.
The custom entrypoints used fragile fixed sleeps and lacked the
robust network/cert readiness waits that buildEntrypoint provides.
Updates #3139
Make embedded DERP server and TLS the default configuration for all
integration tests, replacing the per-test opt-in model that led to
inconsistent and flaky test behavior.
Infrastructure changes:
- DefaultConfigEnv() includes embedded DERP server settings
- New() auto-generates a proper CA + server TLS certificate pair
- CA cert is installed into container trust stores and returned by
GetCert() so clients and internal tools (curl) trust the server
- CreateCertificate() now returns (caCert, cert, key) instead of
discarding the CA certificate
- Add WithPublicDERP() and WithoutTLS() opt-out options
- Remove WithTLS(), WithEmbeddedDERPServerOnly(), and WithDERPAsIP()
since all their behavior is now the default or unnecessary
Test cleanup:
- Remove all redundant WithTLS/WithEmbeddedDERPServerOnly/WithDERPAsIP
calls from test files
- Give every test a unique WithTestName by parameterizing aclScenario,
sshScenario, and derpServerScenario helpers
- Add WithTestName to tests that were missing it
- Document all non-standard options with inline comments explaining
why each is needed
Updates #3139
The Batcher's connected field (*xsync.Map[types.NodeID, *time.Time])
encoded three states via pointer semantics:
- nil value: node is connected
- non-nil time: node disconnected at that timestamp
- key missing: node was never seen
This was error-prone (nil meaning 'connected' inverts Go idioms),
redundant with b.nodes + hasActiveConnections(), and required keeping
two parallel maps in sync. It also contained a bug in RemoveNode where
new(time.Time) was used instead of &now, producing a zero time.
Replace the separate connected map with a disconnectedAt field on
multiChannelNodeConn (atomic.Pointer[time.Time]), tracked directly
on the object that already manages the node's connections.
Changes:
- Add disconnectedAt field and helpers (markConnected, markDisconnected,
isConnected, offlineDuration) to multiChannelNodeConn
- Remove the connected field from Batcher
- Simplify IsConnected from two map lookups to one
- Simplify ConnectedMap and Debug from two-map iteration to one
- Rewrite cleanupOfflineNodes to scan b.nodes directly
- Remove the markDisconnectedIfNoConns helper
- Update all tests and benchmarks
Fixes #3141
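The disconnectedAt tracking can be sketched as follows; the struct is a simplified stand-in for multiChannelNodeConn:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// nodeConn sketches the replacement state: a nil pointer means the
// node is connected, a non-nil pointer records when it went offline.
type nodeConn struct {
	disconnectedAt atomic.Pointer[time.Time]
}

func (c *nodeConn) markConnected() { c.disconnectedAt.Store(nil) }

func (c *nodeConn) markDisconnected() {
	now := time.Now() // &now, not new(time.Time): store the real timestamp
	c.disconnectedAt.Store(&now)
}

func (c *nodeConn) isConnected() bool { return c.disconnectedAt.Load() == nil }

// offlineDuration reports how long the node has been offline, or zero
// while it is still connected.
func (c *nodeConn) offlineDuration() time.Duration {
	ts := c.disconnectedAt.Load()
	if ts == nil {
		return 0
	}
	return time.Since(*ts)
}

func main() {
	c := &nodeConn{}
	c.markConnected()
	fmt.Println(c.isConnected()) // true
	c.markDisconnected()
	fmt.Println(c.isConnected(), c.offlineDuration() >= 0) // false true
}
```

Keeping the state on the object that owns the connections removes the second map entirely, so there is nothing left to fall out of sync.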
The Noise handshake accepts any machine key without checking
registration, so all endpoints behind the Noise router are reachable
without credentials. Three handlers used io.ReadAll without size
limits, allowing an attacker to OOM-kill the server.
Fix:
- Add http.MaxBytesReader middleware (1 MiB) on the Noise router.
- Replace io.ReadAll + json.Unmarshal with json.NewDecoder in
PollNetMapHandler and RegistrationHandler.
- Stop reading the body in NotImplementedHandler entirely.
Remove XTestBatcherChannelClosingRace (~95 lines) and
XTestBatcherScalability (~515 lines). These were disabled by
prefixing with X (making them invisible to go test) and served
as dead code. The functionality they covered is exercised by the
active test suite.
Updates #2545
L8: Rename SCREAMING_SNAKE_CASE test constants to idiomatic Go
camelCase. Remove highLoad* and extremeLoad* constants that were
only referenced by disabled (X-prefixed) tests.
L10: Fix misleading assert message that said "1337" while checking
for region ID 999.
L12: Remove emoji from test log output to avoid encoding issues
in CI environments.
Updates #2545
L1: Replace crypto/rand with an atomic counter for generating
connection IDs. These identifiers are process-local and do not need
cryptographic randomness; a monotonic counter is cheaper and
produces shorter, sortable IDs.
L5: Use getActiveConnectionCount() in Debug() instead of directly
locking the mutex and reading the connections slice. This avoids
bypassing the accessor that already exists for this purpose.
L6: Extract the hardcoded 15*time.Minute cleanup threshold into
the named constant offlineNodeCleanupThreshold.
L7: Inline the trivial addWork wrapper; AddWork now calls addToBatch
directly.
Updates #2545
Move connectionEntry, multiChannelNodeConn, generateConnectionID, and
all their methods from batcher.go into a dedicated file. This reduces
batcher.go from ~1170 lines to ~800 and separates per-node connection
management from batcher orchestration.
Pure move — no logic changes.
Updates #2545
- TestBatcher_CloseBeforeStart_DoesNotHang: verifies Close() before
Start() returns promptly now that done is initialized in NewBatcher.
- TestBatcher_QueueWorkAfterClose_DoesNotHang: verifies queueWork
returns via the done channel after Close(), even without Start().
- TestIsConnected_FalseAfterAddNodeFailure: verifies IsConnected
returns false after AddNode fails and removes the last connection.
- TestRemoveConnectionAtIndex_NilsTrailingSlot: verifies the backing
array slot is nil-ed after removal to avoid retaining pointers.
Updates #2545
M7: Nil out trailing *connectionEntry pointers in the backing array
after slice removal in removeConnectionAtIndexLocked and send().
Without this, the GC cannot collect removed entries until the slice
is reallocated.
M1: Initialize the done channel in NewBatcher instead of Start().
Previously, calling Close() or queueWork before Start() would select
on a nil channel, blocking forever. Moving the make() to the
constructor ensures the channel is always usable.
M2: Move b.connected.Delete and b.totalNodes decrement inside the
Compute callback in cleanupOfflineNodes. Previously these ran after
the Compute returned, allowing a concurrent AddNode to reconnect
between the delete and the bookkeeping update, which would wipe the
fresh connected state.
M3: Call markDisconnectedIfNoConns on AddNode error paths. Previously,
when initial map generation or send timed out, the connection was
removed but b.connected retained its old nil (= connected) value,
making IsConnected return true for a node with zero connections.
Updates #2545
Add four unit tests guarding fixes introduced in recent commits:
- TestConnectionEntry_SendFastPath_TimerStopped: verifies the
time.NewTimer fix (H1) does not leak goroutines after many
fast-path sends on a buffered channel.
- TestBatcher_CloseWaitsForWorkers: verifies Close() blocks until all
worker goroutines exit (H3), preventing sends on torn-down channels.
- TestBatcher_CloseThenStartIsNoop: verifies the one-shot lifecycle
contract; Start() after Close() must not spawn new goroutines.
- TestBatcher_CloseStopsTicker: verifies Close() stops the internal
ticker to prevent resource leaks.
Updates #2545
Remove Caller(), channel pointer formatting (fmt.Sprintf("%p",...)),
and mutex timing from send(), addConnection(), and
removeConnectionByChannel(). Move per-broadcast summary and
no-connection logs from Debug to Trace. Remove per-connection
"attempting"/"succeeded" logs entirely; keep Warn for failures.
These methods run on every MapResponse delivery, so the savings
compound quickly under load.
Updates #2545
Close() previously closed the done channel and returned immediately,
without waiting for worker goroutines to exit. This caused goroutine
leaks in tests and allowed workers to race with connection teardown.
The ticker was also never stopped, leaking its internal goroutine.
Add a sync.WaitGroup to track the doWork goroutine and every worker
it spawns. Close() now calls wg.Wait() after signalling shutdown,
ensuring all goroutines have exited before tearing down connections.
Also stop the ticker to prevent resource leaks.
Document that a Batcher must not be reused after Close().
connectionEntry.send() is on the hot path: called once per connection
per broadcast tick. time.After allocates a timer that sits in the
runtime timer heap until it fires (50 ms), even when the channel send
succeeds immediately. At 1000 connected nodes, every tick leaks 1000
timers into the heap, creating continuous GC pressure.
Replace with time.NewTimer + defer timer.Stop() so the timer is
removed from the heap as soon as the fast-path send completes.
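A minimal sketch of the fixed send path (the channel payload and function shape are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// send attempts a channel send with a 50ms deadline. With time.After,
// the timer would stay in the runtime timer heap for the full 50ms
// even when the fast path wins; NewTimer + Stop releases it as soon
// as the send completes.
func send(ch chan<- string, data string) bool {
	timer := time.NewTimer(50 * time.Millisecond)
	defer timer.Stop() // reclaim the timer immediately on the fast path
	select {
	case ch <- data:
		return true
	case <-timer.C:
		return false // receiver too slow: caller treats the conn as stale
	}
}

func main() {
	ch := make(chan string, 1)           // buffered: fast path succeeds
	fmt.Println(send(ch, "mapresponse")) // true
	fmt.Println(send(ch, "mapresponse")) // buffer full, times out: false
}
```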
Add embedded DERP server, TLS, and netfilter=off to match the
infrastructure configuration used by all other ACL integration tests.
Without these options, the test fails intermittently because traffic
routes through external DERP relays and iptables initialization fails
in Docker containers.
Updates #3139
Remove the Batcher interface since there is only one implementation.
Rename LockFreeBatcher to Batcher and merge batcher_lockfree.go into
batcher.go.
Drop type assertions in debug.go now that mapBatcher is a concrete
*mapper.Batcher pointer.
Rewrite multiChannelNodeConn.send() to use a two-phase approach:
1. RLock: snapshot connections slice (cheap pointer copy)
2. Unlock: send to all connections (50ms timeouts happen here)
3. Lock: remove failed connections by pointer identity
Previously, send() held the write lock for the entire duration of
sending to all connections. With N stale connections each timing out
at 50ms, this blocked addConnection/removeConnection for N*50ms.
The two-phase approach holds the lock only for O(N) pointer
operations, not for N*50ms I/O waits.
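The two-phase approach can be sketched with simplified stand-in types (the real connection entries carry more state):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type conn struct{ ch chan string }

type nodeConn struct {
	mu    sync.RWMutex
	conns []*conn
}

func (n *nodeConn) send(data string) {
	// Phase 1: cheap snapshot of the slice under the read lock.
	n.mu.RLock()
	snapshot := make([]*conn, len(n.conns))
	copy(snapshot, n.conns)
	n.mu.RUnlock()

	// Phase 2: slow sends (50ms timeouts) happen with no lock held,
	// so addConnection/removeConnection are never blocked by I/O.
	var failed []*conn
	for _, c := range snapshot {
		select {
		case c.ch <- data:
		case <-time.After(50 * time.Millisecond):
			failed = append(failed, c)
		}
	}
	if len(failed) == 0 {
		return
	}

	// Phase 3: prune failed connections by pointer identity under the
	// write lock; this is O(N) pointer work, not I/O.
	n.mu.Lock()
	defer n.mu.Unlock()
	kept := n.conns[:0]
	for _, c := range n.conns {
		dead := false
		for _, f := range failed {
			if c == f {
				dead = true
				break
			}
		}
		if !dead {
			kept = append(kept, c)
		}
	}
	n.conns = kept
}

func main() {
	live := &conn{ch: make(chan string, 1)}
	stale := &conn{ch: make(chan string)} // no reader: send times out
	n := &nodeConn{conns: []*conn{live, stale}}
	n.send("update")
	fmt.Println(len(n.conns)) // 1: the stale connection was pruned
}
```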
Replace the two-phase Load-check-Delete in cleanupOfflineNodes with
xsync.Map.Compute() for atomic check-and-delete. This prevents the
TOCTOU race where a node reconnects between the hasActiveConnections
check and the Delete call.
Add nil guards on all b.nodes.Load() and b.nodes.Range() call sites
to prevent nil pointer panics from concurrent cleanup races.
Move per-node pending changes from a shared xsync.Map on the batcher
into multiChannelNodeConn, protected by a dedicated mutex. The new
appendPending/drainPending methods provide atomic append and drain
operations, eliminating data races in addToBatch and
processBatchedChanges.
Add sync.Once to multiChannelNodeConn.close() to make it idempotent,
preventing panics from concurrent close calls on the same channel.
Add started atomic.Bool to guard Start() against being called
multiple times, preventing orphaned goroutines.
Add comprehensive concurrency tests validating these changes.
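The idempotent close can be sketched with sync.Once; a bare close(ch) would panic on the second call, while Do guarantees exactly one execution:

```go
package main

import (
	"fmt"
	"sync"
)

// conn is an illustrative stand-in for multiChannelNodeConn's
// per-connection state.
type conn struct {
	ch        chan string
	closeOnce sync.Once
}

// close is safe to call from any number of goroutines concurrently:
// the channel is closed exactly once.
func (c *conn) close() {
	c.closeOnce.Do(func() { close(c.ch) })
}

func main() {
	c := &conn{ch: make(chan string)}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ { // many goroutines race to close
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.close()
		}()
	}
	wg.Wait()
	_, open := <-c.ch
	fmt.Println(open) // false: channel closed once, no panic
}
```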
Add comprehensive unit tests for the LockFreeBatcher covering
AddNode/RemoveNode lifecycle, addToBatch routing (broadcast, targeted,
full update), processBatchedChanges deduplication, cleanup of offline
nodes, close/shutdown behavior, IsConnected state tracking, and
connected map consistency.
Add benchmarks for connection entry send, multi-channel send and
broadcast, peer diff computation, sentPeers updates, addToBatch at
various scales (10/100/1000 nodes), processBatchedChanges, broadcast
delivery, IsConnected lookups, connected map enumeration, connection
churn, and concurrent send+churn scenarios.
Widen setupBatcherWithTestData to accept testing.TB so benchmarks can
reuse the same database-backed test setup as unit tests.
Add golang.org/x/tools/cmd/stress as a tool dependency for running
tests under repeated stress to surface flaky failures.
Update flake vendorHash for the new go.mod dependencies.
Buffer the AuthRequest verdict channel to prevent a race where the
sender blocks indefinitely if the receiver has already timed out, and
increase the auth followup test timeout from 100ms to 5s to prevent
spurious failures under load.
Skip postgres-backed tests when the postgres server is unavailable
instead of calling t.Fatal, which was preventing the rest of the test
suite from running.
Add TestMain to db, types, and policy/v2 packages to chdir to the
source directory before running tests. This ensures relative testdata/
paths resolve correctly when the test binary is executed from an
arbitrary working directory (e.g., via "go tool stress").
When stale-send cleanup prunes a connection from the batcher, the old
serveLongPoll session needs an explicit stop signal. Pass a stop hook
into AddNode and trigger it when that connection is removed, so the
session exits through its normal cancel path instead of relying on
channel closure from the batcher side.
When the batcher timed out sending to a node, it removed the channel
from multiChannelNodeConn but left the old serveLongPoll goroutine
running on that channel. That left a live stale session behind: it no
longer received new updates, but it could still keep the stream open
and block shutdown.
Close the pruned channel when stale-send cleanup removes it so the old
map session exits after draining any buffered update.
A connection can already be removed from multiChannelNodeConn by the
stale-send cleanup path before serveLongPoll reaches its deferred
RemoveNode call. In that case RemoveNode used to return early on
"channel not found" and never updated the node's connected state.
Drop that early return so RemoveNode still checks whether any active
connections remain and marks the node disconnected when the last one
is gone.
Update mdformat and related packages from python313Packages to
python314Packages. All four packages (mdformat, mdformat-footnote,
mdformat-frontmatter, mdformat-mkdocs) are available in the updated
nixpkgs.
Updates #1261
Explicitly set derp.urls to an empty list in the NixOS VM test,
matching the upstream nixpkgs test. The VMs have no internet
access, so fetching the default Tailscale DERP map would silently
fail and add unnecessary timeout delay to the test run.
Add missing typed options from the upstream nixpkgs module:
- configFile: read-only option exposing the generated config path
for composability with other NixOS modules
- dns.split: split DNS configuration with proper type checking
- dns.extra_records: typed submodule with name/type/value validation
Sync descriptions and assertions with upstream:
- Use Tailscale doc link for override_local_dns description
- Remove redundant requirement note from nameservers.global
- Match upstream assertion message wording and expression style
Update systemd script to reference cfg.configFile instead of a
local let-binding, matching the upstream pattern.
Add end-to-end integration test that validates localpart:*@domain
SSH user mapping with real Tailscale clients. The test sets up an
SSH policy with localpart entries and verifies that users can SSH
into tagged servers using their email local-part as the username.
Updates #3049
Add support for localpart:*@<domain> entries in SSH policy users.
When a user SSHes into a target, their email local-part becomes the
OS username (e.g. alice@example.com → OS user alice).
Type system (types.go):
- SSHUser.IsLocalpart() and ParseLocalpart() for validation
- SSHUsers.LocalpartEntries(), NormalUsers(), ContainsLocalpart()
- Enforces format: localpart:*@<domain> (wildcard-only)
- UserWildcard.Resolve for user:*@domain SSH source aliases
- acceptEnv passthrough for SSH rules
Compilation (filter.go):
- resolveLocalparts: pure function mapping users to local-parts
by email domain. No node walking, easy to test.
- groupSourcesByUser: single walk producing per-user principals
with sorted user IDs, and tagged principals separately.
- ipSetToPrincipals: shared helper replacing 6 inline copies.
- selfPrincipalsForNode: self-access using pre-computed byUser.
The approach separates data gathering from rule assembly. Localpart
rules are interleaved per source user to match Tailscale SaaS
first-match-wins ordering.
Updates #3049
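A hypothetical sketch of the localpart parsing and resolution; parseLocalpart and resolveLocalpart are illustrative helpers, not the real types.go API:

```go
package main

import (
	"fmt"
	"strings"
)

// parseLocalpart validates an SSH-user entry of the wildcard-only
// form "localpart:*@<domain>" and returns the domain.
func parseLocalpart(entry string) (domain string, ok bool) {
	rest, found := strings.CutPrefix(entry, "localpart:")
	if !found {
		return "", false
	}
	domain, found = strings.CutPrefix(rest, "*@")
	if !found || domain == "" {
		return "", false // only the wildcard form is allowed
	}
	return domain, true
}

// resolveLocalpart maps a user's email to the OS username when the
// email's domain matches the policy entry
// (alice@example.com -> OS user alice).
func resolveLocalpart(email, domain string) (string, bool) {
	local, emailDomain, found := strings.Cut(email, "@")
	if !found || emailDomain != domain || local == "" {
		return "", false
	}
	return local, true
}

func main() {
	d, ok := parseLocalpart("localpart:*@example.com")
	fmt.Println(d, ok) // example.com true
	u, ok := resolveLocalpart("alice@example.com", d)
	fmt.Println(u, ok) // alice true
}
```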