headscale

mirror of https://github.com/juanfont/headscale.git synced 2026-04-10 19:17:25 +02:00

Author	SHA1	Message	Date
Kristoffer Dalby	d6feadde88	policy/v2: fix grant-only policies returning FilterAllowAll compileFilterRules checked only pol.ACLs == nil to decide whether to return FilterAllowAll (permit-any). Policies that use only Grants (no ACLs) had nil ACLs, so the function short-circuited before compiling any CapGrant rules. This meant cap/relay, cap/drive, and any other App-based grant capabilities were silently ignored. Check both ACLs and Grants are empty before returning FilterAllowAll. Updates #2180	2026-03-25 15:17:24 +00:00
Kristoffer Dalby	aed573e813	policy/v2,state,mapper: implement per-viewer via route steering Via grants steer routes to specific nodes per viewer. Until now, all clients saw the same routes for each peer because route assembly was viewer-independent. This implements per-viewer route visibility so that via-designated peers serve routes only to matching viewers, while non-designated peers have those routes withdrawn. Add ViaRouteResult type (Include/Exclude prefix lists) and ViaRoutesForPeer to the PolicyManager interface. The v2 implementation iterates via grants, resolves sources against the viewer, matches destinations against the peer's advertised routes (both subnet and exit), and categorizes prefixes by whether the peer has the via tag. Add RoutesForPeer to State which composes global primary election, via Include/Exclude filtering, exit routes, and ACL reduction. When no via grants exist, it falls back to existing behavior. Update the mapper to call RoutesForPeer per-peer instead of using a single route function for all peers. The route function now returns all routes (subnet + exit), and TailNode filters exit routes out of the PrimaryRoutes field for HA tracking. Updates #2180	2026-03-25 15:17:24 +00:00
Kristoffer Dalby	66ac9a26ff	policy/v2: handle autogroup:internet in via grant compilation compileViaGrant only handled Prefix destinations, skipping AutoGroup entirely. This meant via grants with dst=[autogroup:internet] produced no filter rules even when the node was an exit node with approved exit routes. Switch the destination loop from a type assertion to a type switch that handles both Prefix (subnet routes) and AutoGroup (exit routes via autogroup:internet). Also check ExitRoutes() in addition to SubnetRoutes() so the function doesn't bail early when a node only has exit routes. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	1a409424ee	policy/v2: implement autogroup:danger-all support Add autogroup:danger-all as a valid source alias that matches ALL IP addresses including non-Tailscale addresses. When used as a source, it resolves to 0.0.0.0/0 + ::/0 internally but produces SrcIPs: ["*"] in filter rules. When used as a destination, it is rejected with an error matching Tailscale SaaS behavior. Key changes: - Add AutoGroupDangerAll constant and validation - Add sourcesHaveDangerAll() helper and hasDangerAll parameter to srcIPsWithRoutes() across all compilation paths - Add ErrAutogroupDangerAllDst for destination rejection - Remove 3 AUTOGROUP_DANGER_ALL skip entries (K6, K7, K8) Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	12a34f3895	policy/v2: implement grant validation rules matching Tailscale SaaS Add five categories of grant validation that Tailscale enforces: 1. Capability name format: reject URL schemes (://) and restrict tailscale.com domain to an allowlist of user-grantable caps. 2. Grant-specific autogroup:self: reject wildcard (*) sources with autogroup:self destinations (stricter than ACL rules since * includes tags which cannot use autogroup:self). 3. App + autogroup:internet: reject app grants targeting autogroup:internet. 4. Raw default route CIDRs: reject 0.0.0.0/0 and ::/0 as grant destinations, requiring "*" or "autogroup:internet" instead. 5. Via field: non-tag values (e.g. autogroup:tagged) are caught at unmarshal time by Tag.UnmarshalJSON validation. This resolves 23 ERROR_VALIDATION_GAP + 1 via validation test, reducing the grant compat skip list from 28 to 5 remaining tests. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	84210c03bb	policy/v2: accept empty grant sources and destinations Tailscale SaaS accepts grants with empty src=[] or dst=[] arrays, producing no filter rules for any node. Headscale previously rejected these with validation errors. Remove the empty source/destination validation checks and add an early return in compileGrantWithAutogroupSelf when the grant has literally empty Sources or Destinations arrays. This is distinct from sources that resolve to empty (e.g., group:empty) where Tailscale still produces CapGrant rules with empty SrcIPs. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	a3c262206c	policy/v2: implement via route compilation for grants Compile grants with "via" field into FilterRules that are placed only on nodes matching the via tag and actually advertising the destination subnets. Key behavior: - Filter rules go exclusively to via-nodes with matching approved routes - Destination subnets not advertised by the via node are silently dropped - App-only via grants (no ip field) produce no packet filter rules - Via grants are skipped in the global compileFilterRules since they are node-specific Reduces grant compat test skips from 41 to 30 (11 newly passing). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	118bc1fcbf	policy/v2: implement CapGrant compilation with companion capabilities Compile grant app fields into CapGrant FilterRules matching Tailscale SaaS behavior. Key changes: - Generate CapGrant rules in compileFilterRules and compileGrantWithAutogroupSelf, with node-specific /32 and /128 Dsts for autogroup:self grants - Add reversed companion rules for drive→drive-sharer and relay→relay-target capabilities, ordered by original cap name - Narrow broad CapGrant Dsts to node-specific prefixes in ReduceFilterRules via new reduceCapGrantRule helper - Skip merging CapGrant rules in mergeFilterRules to preserve per-capability structure - Remove ip+app mutual exclusivity validation (Tailscale accepts both) - Add semantic JSON comparison for RawMessage types and netip.Prefix comparators in test infrastructure Reduces grant compat test skips from 99 to 41 (58 newly passing). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	e0cb46ca53	policy/v2: preserve non-wildcard source IPs alongside wildcard ranges When an ACL source list contains a wildcard (*) alongside explicit sources (tags, groups, hosts, etc.), Tailscale preserves the individual IPs from non-wildcard sources in SrcIPs alongside the merged wildcard CGNAT ranges. Previously, headscale's IPSetBuilder would merge all sources into a single set, absorbing the explicit IPs into the wildcard range. Track non-wildcard resolved addresses separately during source resolution, then append their individual IP strings to the output when a wildcard is also present. This fixes the remaining 5 ACL compat test failures (K01 and M06 subtests). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	ac4beab6d0	policy/v2: reorder ACL self grants to match Tailscale rule ordering When an ACL has non-autogroup destinations (groups, users, tags, hosts) alongside autogroup:self, emit non-self grants before self grants to match Tailscale's filter rule ordering. ACLs with only autogroup destinations (self + member) preserve the policy-defined order. This fixes ACL-A17, ACL-SF07, and ACL-SF11 compat test failures. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	19b5a39aec	policy/v2: remove resolved SUBNET_ROUTE_FILTER_RULES grant skips Remove 10 grant skip entries for subnet route filter rule generation. These tests now pass after the exit route exclusion fix in ReduceFilterRules, which correctly handles routable IPs overlap for subnet-router nodes. Updates skip count from 207 to 197 (v1) and 109 to 99 (v2), with 10 additional tests now expected to pass. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	044f3fc0ec	policy/policyutil: exclude exit routes from ReduceFilterRules Exit nodes handle traffic via AllowedIPs/routing, not packet filter rules. Skip exit routes (0.0.0.0/0, ::/0) when checking RoutableIPs overlap in ReduceFilterRules, matching Tailscale SaaS behavior where exit nodes do not receive filter rules for destinations that only overlap via exit routes. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	9e0e77c90d	policy/v2: use approved node routes in wildcard SrcIPs Per Tailscale documentation, the wildcard (*) source includes "any approved subnets" — the actually-advertised-and-approved routes from nodes, not the autoApprover policy prefixes. Change Asterix.resolve() to return just the base CGNAT+ULA set, and add approved subnet routes as separate SrcIPs entries in the filter compilation path. This preserves individual route prefixes that would otherwise be merged by IPSet (e.g., 10.0.0.0/8 absorbing 10.33.0.0/16). Also swap rule ordering in compileGrantWithAutogroupSelf() to emit non-self destination rules before autogroup:self rules, matching the Tailscale FilterRule wire format ordering. Remove the unused AutoApproverPolicy.prefixes() method. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	57bc77b98c	policy/v2: add advertised routes to compat test topologies Add routable_ips and approved_routes fields to the node topology definitions in all golden test files. These represent the subnet routes actually advertised by nodes on the Tailscale SaaS network during data capture: Routes topology (92 files, 6 router nodes): big-router: 10.0.0.0/8 subnet-router: 10.33.0.0/16 ha-router1: 192.168.1.0/24 ha-router2: 192.168.1.0/24 multi-router: 172.16.0.0/24 exit-node: 0.0.0.0/0, ::/0 ACL topology (199 files, 1 router node): subnet-router: 10.33.0.0/16 Grants topology (203 files, 1 router node): subnet-router: 10.33.0.0/16 The route assignments were deduced from the golden data by analyzing which router nodes receive FilterRules for which destination CIDRs across all test files, and cross-referenced with the MTS setup script (setup_grant_nodes.sh). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	1235a17e6f	policy/v2: remove resolved AUTOGROUP_SELF_CIDR_FORMAT grant skips Remove 4 entries from grantSkipReasons that are now passing after the autogroup:self DstPorts bare IP fix. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	bc9877ce28	policy/v2: use bare IPs in autogroup:self DstPorts Use ip.String() instead of netip.PrefixFrom(ip, ip.BitLen()).String() when building DstPorts for autogroup:self destinations. This produces bare IPs like "100.90.199.68" instead of CIDR notation like "100.90.199.68/32", matching the Tailscale FilterRule wire format. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	e3ab288351	policy/v2: remove resolved grant skip categories Remove 91 entries from grantSkipReasons that are now passing: - 90 MISSING_IPV6_ADDRS entries (identity aliases now include IPv6) - 1 RAW_IPV6_ADDR_EXPANSION entry (address aliases no longer expand) Move GRANT-P09_12B from the removed MISSING_IPV6_ADDRS category to SUBNET_ROUTE_FILTER_RULES, which is its remaining failure mode. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	1a5ed2c7ca	policy/policyutil: update ReduceFilterRules test expectations for IPv6 Now that AppendToIPSet includes both IPv4 and IPv6, tests with nodes that have IPv6 addresses produce additional entries in SrcIPs and DstPorts. Update the expected values accordingly. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	ccade49742	policy: include IPv6 in identity-based alias resolution AppendToIPSet now adds both IPv4 and IPv6 addresses for nodes, matching Tailscale's FilterRule wire format where identity-based aliases (tags, users, groups, autogroups) resolve to both address families. Address-based aliases (raw IPs, host names) are unchanged: they resolve to exactly the literal prefix. The appendIfNodeHasIP helper that incorrectly expanded address aliases to include the matching node's other IPs is removed, fixing the RAW_IPV6_ADDR_EXPANSION bug where a raw fd7a: IPv6 address would incorrectly include the node's IPv4. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	91aac1ceb2	hscontrol/policy/v2: replace routes golden data with Tailscale SaaS captures Replace the headscale-adapted routes golden files with authoritative captures from Tailscale SaaS using the 12-node topology (8 original grant nodes + 4 new route-specific nodes: ha-router1, ha-router2, big-router, multi-router). The golden data was captured via debug-packet-filter-rules from all 12 nodes. The routes driver now falls back to the standard 3-user setup when topology.users is absent (matching the SaaS capture format) and converts @passkey/@dalby.cc emails to @example.com. 92 test cases captured, all valid JSON, all from Tailscale SaaS. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	162e1dc35b	hscontrol/policy/v2: replace ACL golden data with Tailscale SaaS captures Replace the headscale-adapted ACL golden files with authoritative captures from Tailscale SaaS using the 8-node grant topology. The golden data was captured via debug-packet-filter-rules (FilterRule wire format) from each of the 8 nodes after pushing each ACL policy to the Tailscale API. This gives us the exact format Tailscale sends to clients: - SrcIPs use IP ranges (100.64.0.0-100.115.91.255) not CIDRs - SrcIPs include subnet routes (10.33.0.0/16) for wildcard sources - IPProto is omitted for default all-protocol rules - DstPorts use bare IPs without /32 suffix - Identity aliases include both IPv4 and IPv6 addresses The test driver is updated to use the 8-node topology (3 users, 5 tagged nodes) matching the grant compat tests, with the same email conversion (kratail2tid@passkey -> @example.com). 215 test cases: 199 success + 16 error (captured from API 400s). All captured from Tailscale SaaS, no headscale-adapted values. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	d83697186a	hscontrol/policy/v2: convert routes compat tests to JSON-driven format Replace 8,286 lines of inline Go struct test expectations in tailscale_routes_compat_test.go with 92 JSON golden files in testdata/routes_results/ROUTES-*.json and a ~300-line Go driver in tailscale_routes_data_compat_test.go. Unlike the ACL and grants compat tests which use shared hardcoded node topologies, the routes driver builds nodes from JSON topology data. Each test file embeds its full topology including routable_ips and approved_routes, making test files self-contained. This naturally handles the IPv6 tests which use a different 4-node topology from the standard 9-node setup. Test count is preserved: 92 test cases across 19 original test functions (SubnetBasics, ExitNodes, HARouters, FilterPlacement, RouteCoverage, Overlapping, TagResolution, ProtocolPort, IPv6, EdgeCases, AutoApprover, and additional variants). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	7e71d1b58f	hscontrol/policy/v2: convert ACL compat tests to JSON-driven format Replace 9,937 lines of inline Go struct test expectations in tailscale_acl_compat_test.go with 215 JSON golden files in testdata/acl_results/ACL-*.json and a ~400-line Go driver in tailscale_acl_data_compat_test.go. This matches the pattern used by the grants compat tests (testdata/grant_results/GRANT-*.json + tailscale_grants_compat_test.go) and the SSH compat tests (testdata/ssh_results/SSH-*.json + tailscale_ssh_data_compat_test.go). The JSON golden files contain the same test expectations as the original Go file, preserving the Tailscale SaaS reference data. The expectations are NOT adapted to match headscale current output — they represent the target behavior. Test count is preserved: 215 test cases (203 success + 12 error). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	0562bd85f4	hscontrol/policy/v2: fix test helpers to match production pipeline - TestTagUserMutualExclusivity and TestUserToTagCrossIdentityGrant: add ReduceFilterRules after compileFilterRulesForNode to match the production filter pipeline in filterForNodeLocked. The compilation step produces global rules for all ACLs; ReduceFilterRules strips them down to only rules where the node is a destination. - containsSrcIP/containsIP helpers: use util.ParseIPSet to handle IP range strings like "100.64.0.1-100.64.0.3" produced by ipSetToStrings when contiguous IPs are coalesced. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	5830eabf09	hscontrol/policy: fix test assertions and expectations Fix several test issues exposed by the ResolvedAddresses refactor: - TestTagUserMutualExclusivity: remove incorrect ACL rule that was testing the wrong invariant. The test now correctly validates that without an explicit cross-identity grant, user-owned nodes cannot reach tagged nodes. Add TestUserToTagCrossIdentityGrant to verify that explicit user@ -> tag:X ACL rules produce valid filter rules. - TestResolvePolicy/wildcard-alias: update expected prefixes to match the CGNAT range minus ChromeOS VM range (multiple prefixes instead of the encompassing 100.64.0.0/10). - TestApproveRoutesWithPolicy: fix user Name fields from "testuser@" to "testuser" to match how resolveUser trims the @ suffix before comparing against stored names. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	5f3bddc663	hscontrol/policy/v2: fix nil dereferences in alias resolution Fix three nil dereference issues in the policy resolution code: - newResolvedAddresses: preserve partial IP results when errors occur instead of discarding valid IPSets. Callers already handle errors and nil results independently, so returning both allows partial resolution (e.g. groups with phantom users) to work correctly. - resolveTagOwners: guard against nil ResolvedAddresses before calling Prefixes(), since Resolve may return nil when resolution fails. - Asterix.resolve: guard against nil *Policy pointer, which occurs when resolving wildcards without a policy context (e.g. in tests). Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	0c6ac28b04	hscontrol/policy/v2: recategorize grants skip list from SRCIPS_FORMAT into granular root causes Replace the monolithic SRCIPS_FORMAT skip category (125 tests) with 7 specific subcategories based on analysis of actual test failures: MISSING_IPV6_ADDRS - 90 tests: identity aliases resolve to IPv4 only SUBNET_ROUTE_FILTER_RULES - 10 tests: no rules for subnet-routed CIDRs AUTOGROUP_SELF_CIDR_FORMAT - 4 tests: /32 and /128 suffix on DstPorts IPs USER_PASSKEY_WILDCARD - 2 tests: user:*@passkey unresolvable RAW_IPV6_ADDR_EXPANSION - 2 tests: raw IPv6 expanded to include IPv4 SRCIPS_WILDCARD_NODE_DEDUP - 1 test: wildcard+specific node IP dedup Also reclassify tests that moved between categories after the CGNAT split range fix (4 tests now passing, others recategorized into CAPGRANT_COMPILATION, ERROR_VALIDATION_GAP, VIA_COMPILATION, etc). Total: 207 skipped, 30 passing (was 193 skipped, 19 passing).	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	6f32dcf6f9	maybe only return ipv4? not always? Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	f01052c85f	speculative new datastruct, fix ip range return Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	646a6e8266	hscontrol/policy/v2: add skip entries for 25 v2 gap-filling grant tests Update the grants compatibility test skip list with 23 new entries for the V-series tests (V07 and V24 pass without skipping). New skip categories introduced: - VIA_COMPILATION (3): via routes with specific src identities where there is no SrcIPs format issue (V11, V12, V13) - Additional VIA_COMPILATION_AND_SRCIPS_FORMAT (3): via with wildcard src (V17, V21, V23) - Additional CAPGRANT_COMPILATION (6): app grants on specific tags, drive cap, autogroup:self app (V02, V03, V06, V19, V20, V25) - Additional CAPGRANT_COMPILATION_AND_SRCIPS_FORMAT (2): mixed ip+app on specific tags rejected by headscale (V09, V10) - Additional ERROR_VALIDATION_GAP (9): autogroup:internet + app, raw 0.0.0.0/0 and ::/0 as grant dst (V01, V04, V05, V08, V14-V16, V18, V22) Test totals: 237 total, 21 pass, 216 skip, 0 fail. Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	aa68fbafc0	hscontrol/policy/v2: add 25 v2 gap-filling grant testdata files Add GRANT-V01 through GRANT-V25 JSON files captured from Tailscale SaaS to fill coverage gaps in the grants compatibility test suite. These tests cover: - App grants on specific tags (not just wildcards) - Mixed ip+app grants on specific tags - Via routes with specific src identities (tags, groups, members) - Via with multiple dst subnets and multiple via tags - Drive cap with reverse drive-sharer generation - autogroup:self with app grants - autogroup:internet rejection with app grants - Raw default route CIDR (0.0.0.0/0, ::/0) rejection as grant dst Updates #2180	2026-03-25 15:17:23 +00:00
Kristoffer Dalby	2446158191	hscontrol/policy/v2: add data-driven grants compatibility test Add TestGrantsCompat, a data-driven test that validates headscale's grants implementation against 212 test cases captured from Tailscale SaaS. Each test case loads a GRANT-*.json file from testdata/, applies the policy through headscale's engine, and compares the resulting packet filter rules against Tailscale's actual output. Currently 19 tests pass and 193 are skipped with documented reasons: - SRCIPS_FORMAT (125): IP range formatting differences - CAPGRANT_COMPILATION (41): app capability grants not yet compiled - ERROR_VALIDATION_GAP (14): validation strictness differences - CAPGRANT_AND_SRCIPS_FORMAT (9): combined ip+app grant issues - VIA_AND_SRCIPS_FORMAT (4): via route compilation not implemented - AUTOGROUP_DANGER_ALL (3): autogroup:danger-all not supported - VALIDATION_STRICTNESS (2): empty src/dst array handling Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	f1756f4d12	hscontrol/policy/v2: add grants compatibility testdata (212 JSON files) Add 212 GRANT-*.json test files captured from Tailscale SaaS to testdata/grant_results/. Each file contains a policy with grants, the expected packet_filter_rules for 8 test nodes, and the topology used during capture. These files serve as the ground truth for the data-driven grants compatibility test. Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	ca2081a44f	hscontrol/policy/v2: rename tailscale_compat_test.go to tailscale_acl_compat_test.go Rename the ACL compatibility test file to include 'acl' in the name, making room for the upcoming grants compatibility test file. Also fix a godoclint issue by adding a blank line between the file header comment and the package declaration. Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	90c9555876	hscontrol/policy/v2: add ProtocolPort.MarshalJSON for Grant serialization Implement ProtocolPort.MarshalJSON to produce string format matching UnmarshalJSON expectations (e.g. "tcp:443", "udp:10000-20000", "*"). Add comprehensive TestGrantMarshalJSON with 10 test cases: - IP-based grants with TCP, UDP, ICMP, and wildcard protocols - Single ports, port ranges, and wildcard ports - Capability-based grants using app field - Grants with both ip and app fields - Grants with via field for route filtering - Testing omitempty behavior for ip, app, and via fields - JSON round-trip validation (marshal → unmarshal → compare) Add omitempty tag to Grant.InternetProtocols to avoid marshaling null when field is empty. Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	1c31f04fab	hscontrol/policy/v2: add TestACLToGrants Add test for aclToGrants() function that converts ACL rules to Grant format. Tests conversion of: - Single-port TCP rules - Multiple ACL entries to multiple Grants - Port ranges and multiple ports in a single rule - Wildcard protocols - UDP, ICMP, and other protocol types Ensures backward compatibility by verifying that ACL rules are correctly transformed to the new Grant format. Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	31c0ecbd68	hscontrol/policy/v2: add TestUnmarshalGrants Add comprehensive tests for Grant unmarshaling covering: - Valid grants with ip field (network access) - Valid grants with app field (capabilities) - Wildcard port handling - Port range parsing - Error cases (missing fields, conflicting fields) Updates #2180	2026-03-25 15:17:22 +00:00
Kristoffer Dalby	3ffdb4280a	hscontrol/policy/v2: add Grant policy format support Add support for the Grant policy format as an alternative to ACL format, following Tailscale's policy v2 specification. Grants provide a more structured way to define network access rules with explicit separation of IP-based and capability-based permissions. Key changes: - Add Grant struct with Sources, Destinations, InternetProtocols (ip), and App (capabilities) fields - Add ProtocolPort type for unmarshaling protocol:port strings - Add Grant validation in Policy.validate() to enforce: - Mutual exclusivity of ip and app fields - Required ip or app field presence - Non-empty sources and destinations - Refactor compileFilterRules to support both ACLs and Grants - Convert ACLs to Grants internally via aclToGrants() for unified processing - Extract destinationsToNetPortRange() helper for cleaner code - Rename parseProtocol() to toIANAProtocolNumbers() for clarity - Add ProtocolNumberToName mapping for reverse lookups The Grant format allows policies to be written using either the legacy ACL format or the new Grant format. ACLs are converted to Grants internally, ensuring backward compatibility while enabling the new format's benefits. Updates #2180	2026-03-25 15:17:22 +00:00
Tanayk07	568baf3d02	fix: align banner right-side border to consistent 64-char width	2026-03-19 07:08:35 +01:00
Tanayk07	5105033224	feat: add prominent warning banner for non-standard IP prefixes Add a highly visible ASCII-art warning banner that is printed at startup when the configured IP prefixes fall outside the standard Tailscale CGNAT (100.64.0.0/10) or ULA (fd7a:115c:a1e0::/48) ranges. The warning fires once even if both v4 and v6 are non-standard, and the warnBanner() function is reusable for other critical configuration warnings in the future. Also updates config-example.yaml to clarify that subsets of the default ranges are fine, but ranges outside CGNAT/ULA are not. Closes #3055	2026-03-19 07:08:35 +01:00
Kristoffer Dalby	3d53f97c82	hscontrol/servertest: fix test expectations for eventual consistency Three corrections to issue tests that had wrong assumptions about when data becomes available: 1. initial_map_should_include_peer_online_status: use WaitForCondition instead of checking the initial netmap. Online status is set by Connect() which sends a PeerChange patch after the initial RegisterResponse, so it may not be present immediately. 2. disco_key_should_propagate_to_peers: use WaitForCondition. The DiscoKey is sent in the first MapRequest (not RegisterRequest), so peers may not see it until a subsequent map update. 3. approved_route_without_announcement: invert the test expectation. Tailscale uses a strict advertise-then-approve model -- routes are only distributed when the node advertises them (Hostinfo.RoutableIPs) AND they are approved. An approval without advertisement is a dormant pre-approval. The test now asserts the route does NOT appear in AllowedIPs, matching upstream Tailscale semantics. Also fix TestClient.Reconnect to clear the cached netmap and drain pending updates before re-registering. Without this, WaitForPeers returned immediately based on the old session's stale data.	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	1053fbb16b	hscontrol/state: fix online status reset during re-registration Two fixes to how online status is handled during registration: 1. Re-registration (applyAuthNodeUpdate, HandleNodeFromPreAuthKey) no longer resets IsOnline to false. Online status is managed exclusively by Connect()/Disconnect() in the poll session lifecycle. The reset caused a false offline blip: the auth handler's change notification triggered a map regeneration showing the node as offline to peers, even though Connect() would set it back to true moments later. 2. New node creation (createAndSaveNewNode) now explicitly sets IsOnline=false instead of leaving it nil. This ensures peers always receive a known online status rather than an ambiguous nil/unknown.	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	b09af3846b	hscontrol/poll,state: fix grace period disconnect TOCTOU race When a node disconnects, serveLongPoll defers a cleanup that starts a grace period goroutine. This goroutine polls batcher.IsConnected() and, if the node has not reconnected within ~10 seconds, calls state.Disconnect() to mark it offline. A TOCTOU race exists: the node can reconnect (calling Connect()) between the IsConnected check and the Disconnect() call, causing the stale Disconnect() to overwrite the new session's online status. Fix with a monotonic per-node generation counter: - State.Connect() increments the counter and returns the current generation alongside the change list. - State.Disconnect() accepts the generation from the caller and rejects the call if a newer generation exists, making stale disconnects from old sessions a no-op. - serveLongPoll captures the generation at Connect() time and passes it to Disconnect() in the deferred cleanup. - RemoveNode's return value is now checked: if another session already owns the batcher slot (reconnect happened), the old session skips the grace period entirely. Update batcher_test.go to track per-node connect generations and pass them through to Disconnect(), matching production behavior. Fixes the following test failures: - server_state_online_after_reconnect_within_grace - update_history_no_false_offline - nodestore_correct_after_rapid_reconnect - rapid_reconnect_peer_never_sees_offline	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	00c41b6422	hscontrol/servertest: add race, stress, and poll race tests Add three test files designed to stress the control plane under concurrent and adversarial conditions: - race_test.go: 14 tests exercising concurrent mutations, session replacement, batcher contention, NodeStore access, and map response delivery during disconnect. All pass the Go race detector. - poll_race_test.go: 8 tests targeting the poll.go grace period interleaving. These confirm a logical TOCTOU race: when a node disconnects and reconnects within the grace period, the old session's deferred Disconnect() can overwrite the new session's Connect(), leaving IsOnline=false despite an active poll session. - stress_test.go: sustained churn, rapid mutations, rolling replacement, data integrity checks under load, and verification that rapid reconnects do not leak false-offline notifications. Known failing tests (grace period TOCTOU race): - server_state_online_after_reconnect_within_grace - update_history_no_false_offline - rapid_reconnect_peer_never_sees_offline	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	ab4e205ce7	hscontrol/servertest: expand issue tests to 24 scenarios, surface 4 issues Split TestIssues into 7 focused test functions to stay under cyclomatic complexity limits while testing more aggressively. Issues surfaced (4 failing tests): 1. initial_map_should_include_peer_online_status: Initial MapResponse has Online=nil for peers. Online status only arrives later via PeersChangedPatch. 2. disco_key_should_propagate_to_peers: DiscoPublicKey set by client is not visible to peers. Peers see zero disco key. 3. approved_route_without_announcement_is_visible: Server-side route approval without client-side announcement silently produces empty SubnetRoutes (intersection of empty announced + approved = empty). 4. nodestore_correct_after_rapid_reconnect: After 5 rapid reconnect cycles, NodeStore reports node as offline despite having an active poll session. The connect/disconnect grace period interleaving leaves IsOnline in an incorrect state. Passing tests (20) verify: - IP uniqueness across 10 nodes - IP stability across reconnect - New peers have addresses immediately - Node rename propagates to peers - Node delete removes from all peer lists - Hostinfo changes (OS field) propagate - NodeStore/DB consistency after route mutations - Grace period timing (8-20s window) - Ephemeral node deletion (not just offline) - 10-node simultaneous connect convergence - Rapid sequential node additions - Reconnect produces complete map - Cross-user visibility with default policy - Same-user multiple nodes get distinct IDs - Same-hostname nodes get unique GivenNames - Policy change during connect still converges - DERP region references are valid - User profiles present for self and peers - Self-update arrives after route approval - Route advertisement stored as AnnouncedRoutes	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	f87b08676d	hscontrol/servertest: add policy, route, ephemeral, and content tests Extend the servertest harness with: - TestClient.Direct() accessor for advanced operations - TestClient.WaitForPeerCount and WaitForCondition helpers - TestHarness.ChangePolicy for ACL policy testing - AssertDERPMapPresent and AssertSelfHasAddresses New test suites: - content_test.go: self node, DERP map, peer properties, user profiles, update history monotonicity, and endpoint update propagation - policy_test.go: default allow-all, explicit policy, policy triggers updates on all nodes, multiple policy changes, multi-user mesh - ephemeral_test.go: ephemeral connect, cleanup after disconnect, mixed ephemeral/regular, reconnect prevents cleanup - routes_test.go: addresses in AllowedIPs, route advertise and approve, advertised routes via hostinfo, CGNAT range validation Also fix node_departs test to use WaitForCondition instead of assert.Eventually, and convert concurrent_join_and_leave to interleaved_join_and_leave with grace-period-tolerant assertions.	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	ca7362e9aa	hscontrol/servertest: add control plane lifecycle and consistency tests Add three test files exercising the servertest harness: - lifecycle_test.go: connection, disconnection, reconnection, session replacement, and mesh formation at various sizes. - consistency_test.go: symmetric visibility, consistent peer state, address presence, concurrent join/leave convergence. - weather_test.go: rapid reconnects, flapping stability, reconnect with various delays, concurrent reconnects, and scale tests. All tests use table-driven patterns with subtests.	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	0288614bdf	hscontrol: add servertest harness for in-process control plane testing Add a new hscontrol/servertest package that provides a test harness for exercising the full Headscale control protocol in-process, using Tailscale's controlclient.Direct as the client. The harness consists of: - TestServer: wraps a Headscale instance with an httptest.Server - TestClient: wraps controlclient.Direct with NetworkMap tracking - TestHarness: orchestrates N clients against a single server - Assertion helpers for mesh completeness, visibility, and consistency Export minimal accessor methods on Headscale (HTTPHandler, NoisePublicKey, GetState, SetServerURL, StartBatcher, StartEphemeralGC) so the servertest package can construct a working server from outside the hscontrol package. This enables fast, deterministic tests of connection lifecycle, update propagation, and network weather scenarios without Docker.	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	82c7efccf8	mapper/batcher: serialize per-node work to prevent out-of-order delivery processBatchedChanges queued each pending change for a node as a separate work item. Since multiple workers pull from the same channel, two changes for the same node could be processed concurrently by different workers. This caused two problems: 1. MapResponses delivered out of order — a later change could finish generating before an earlier one, so the client sees stale state. 2. updateSentPeers and computePeerDiff race against each other — updateSentPeers does Clear() + Store() which is not atomic relative to a concurrent Range() in computePeerDiff. Bundle all pending changes for a node into a single work item so one worker processes them sequentially. Add a per-node workMu that serializes processing across consecutive batch ticks, preventing a second worker from starting tick N+1 while tick N is still in progress. Fixes #3140	2026-03-19 07:05:58 +01:00
Kristoffer Dalby	87b8507ac9	mapper/batcher: replace connected map with per-node disconnectedAt The Batcher's connected field (xsync.Map[types.NodeID, time.Time]) encoded three states via pointer semantics: - nil value: node is connected - non-nil time: node disconnected at that timestamp - key missing: node was never seen This was error-prone (nil meaning 'connected' inverts Go idioms), redundant with b.nodes + hasActiveConnections(), and required keeping two parallel maps in sync. It also contained a bug in RemoveNode where new(time.Now()) was used instead of &now, producing a zero time. Replace the separate connected map with a disconnectedAt field on multiChannelNodeConn (atomic.Pointer[time.Time]), tracked directly on the object that already manages the node's connections. Changes: - Add disconnectedAt field and helpers (markConnected, markDisconnected, isConnected, offlineDuration) to multiChannelNodeConn - Remove the connected field from Batcher - Simplify IsConnected from two map lookups to one - Simplify ConnectedMap and Debug from two-map iteration to one - Rewrite cleanupOfflineNodes to scan b.nodes directly - Remove the markDisconnectedIfNoConns helper - Update all tests and benchmarks Fixes #3141	2026-03-16 02:22:56 -07:00
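
Illustration for 87b8507ac9: a minimal Go sketch of the per-node disconnectedAt idea described in that commit message. Only the field and helper names (disconnectedAt, markConnected, markDisconnected, isConnected, offlineDuration) and the atomic.Pointer[time.Time] type come from the message. The nodeConn struct, the method bodies, and the choice that a nil pointer means "connected" are assumptions made for illustration, not the actual headscale code.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// nodeConn is a hypothetical stand-in for multiChannelNodeConn.
type nodeConn struct {
	// Assumption: nil means "currently connected"; a non-nil value
	// holds the time at which the node disconnected.
	disconnectedAt atomic.Pointer[time.Time]
}

// markConnected clears the disconnect timestamp.
func (c *nodeConn) markConnected() { c.disconnectedAt.Store(nil) }

// markDisconnected records the current time. The value is copied into a
// local variable first so the stored pointer refers to a real timestamp.
func (c *nodeConn) markDisconnected() {
	now := time.Now()
	c.disconnectedAt.Store(&now)
}

// isConnected reports whether no disconnect timestamp is set.
func (c *nodeConn) isConnected() bool { return c.disconnectedAt.Load() == nil }

// offlineDuration returns how long the node has been disconnected,
// or zero if it is connected.
func (c *nodeConn) offlineDuration() time.Duration {
	t := c.disconnectedAt.Load()
	if t == nil {
		return 0
	}
	return time.Since(*t)
}

func main() {
	c := &nodeConn{}
	c.markDisconnected()
	fmt.Println("connected:", c.isConnected(), "offline for:", c.offlineDuration())
	c.markConnected()
	fmt.Println("connected:", c.isConnected())
}
```

Keeping the timestamp on the connection object means a cleanup pass can scan one collection and ask each entry for offlineDuration(), rather than reconciling two parallel maps.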


567 Commits