Buffer the AuthRequest verdict channel to prevent a race where the
sender blocks indefinitely if the receiver has already timed out, and
increase the auth followup test timeout from 100ms to 5s to prevent
spurious failures under load.
Skip postgres-backed tests when the postgres server is unavailable
instead of calling t.Fatal, which was preventing the rest of the test
suite from running.
Add TestMain to db, types, and policy/v2 packages to chdir to the
source directory before running tests. This ensures relative testdata/
paths resolve correctly when the test binary is executed from an
arbitrary working directory (e.g., via "go tool stress").
Add support for localpart:*@<domain> entries in SSH policy users.
When a user SSHes into a target, their email local-part becomes the
OS username (e.g. alice@example.com → OS user alice).
Type system (types.go):
- SSHUser.IsLocalpart() and ParseLocalpart() for validation
- SSHUsers.LocalpartEntries(), NormalUsers(), ContainsLocalpart()
- Enforces format: localpart:*@<domain> (wildcard-only)
- UserWildcard.Resolve for user:*@domain SSH source aliases
- acceptEnv passthrough for SSH rules
Compilation (filter.go):
- resolveLocalparts: pure function mapping users to local-parts
by email domain. No node walking, easy to test.
- groupSourcesByUser: single walk producing per-user principals
with sorted user IDs, and tagged principals separately.
- ipSetToPrincipals: shared helper replacing 6 inline copies.
- selfPrincipalsForNode: self-access using pre-computed byUser.
The approach separates data gathering from rule assembly. Localpart
rules are interleaved per source user to match Tailscale SaaS
first-match-wins ordering.
Updates #3049
Add SSH check period tracking so that recently authenticated users
are auto-approved without requiring manual intervention each time.
Introduce SSHCheckPeriod type with validation (min 1m, max 168h,
"always" for every request) and encode the compiled check period
as URL query parameters in the HoldAndDelegate URL.
The SSHActionHandler checks recorded auth times before creating a
new HoldAndDelegate flow. Auth timestamps are stored in-memory:
- Default period (no explicit checkPeriod): auth covers any
destination, keyed by source node with Dst=0 sentinel
- Explicit period: auth covers only that specific destination,
keyed by (source, destination) pair
Auth times are cleared on policy changes.
Updates #1850
Implement the SSH "check" action which requires additional
verification before allowing SSH access. The policy compiler generates
a HoldAndDelegate URL that the Tailscale client calls back to
headscale. The SSHActionHandler creates an auth session and waits for
approval via the generalised auth flow.
Sort check (HoldAndDelegate) rules before accept rules to match
Tailscale's first-match-wins evaluation order.
Updates #1850
Add end-to-end test cases to TestUnmarshalPolicy that verify bracketed
IPv6 addresses are correctly parsed through the full policy pipeline
(JSON unmarshal -> splitDestinationAndPort -> parseAlias -> parsePortRange)
and survive JSON round-trips.
Cover single port, multiple ports, wildcard port, CIDR prefix, port
range, bracketed IPv4, and hostname rejection.
Updates #2754
Headscale rejects IPv6 addresses with square brackets in ACL policy
destinations (e.g. "[fd7a:115c:a1e0::87e1]:80,443"), while Tailscale
SaaS accepts them. The root cause is that splitDestinationAndPort uses
strings.LastIndex(":") which leaves brackets on the destination string,
and netip.ParseAddr does not accept brackets.
Add a bracket-handling branch at the top of splitDestinationAndPort that
uses net.SplitHostPort for RFC 3986 parsing when input starts with "[".
The extracted host is validated with netip.ParseAddr/ParsePrefix to
ensure brackets are only accepted around IP addresses and CIDR prefixes,
not hostnames or other alias types like tags and groups.
Fixes#2754
Use new(users["name"]) instead of extracting to intermediate
variables that staticcheck does not recognise as used with
Go 1.26 new(value) syntax.
Updates #3058
Fix issues found by the upgraded golangci-lint:
- wsl_v5: add required whitespace in CLI files
- staticcheck SA4006: replace new(var.Field) with &localVar
pattern since staticcheck does not recognize Go 1.26
new(value) as a use of the variable
- staticcheck SA5011: use t.Fatal instead of t.Error for
nil guard checks so execution stops
- unused: remove dead ptrTo helper function
This commit upgrades the codebase from Go 1.25.5 to Go 1.26rc2 and
adopts new language features.
Toolchain updates:
- go.mod: go 1.25.5 → go 1.26rc2
- flake.nix: buildGo125Module → buildGo126Module, go_1_25 → go_1_26
- flake.nix: build golangci-lint from source with Go 1.26
- Dockerfile.integration: golang:1.25-trixie → golang:1.26rc2-trixie
- Dockerfile.tailscale-HEAD: golang:1.25-alpine → golang:1.26rc2-alpine
- Dockerfile.derper: golang:alpine → golang:1.26rc2-alpine
- .goreleaser.yml: go mod tidy -compat=1.25 → -compat=1.26
- cmd/hi/run.go: fallback Go version 1.25 → 1.26rc2
- .pre-commit-config.yaml: simplify golangci-lint hook entry
Code modernization using Go 1.26 features:
- Replace tsaddr.SortPrefixes with slices.SortFunc + netip.Prefix.Compare
- Replace ptr.To(x) with new(x) syntax
- Replace errors.As with errors.AsType[T]
Lint rule updates:
- Add forbidigo rules to prevent regression to old patterns
Errors should not start capitalised and they should not contain the word error
or state that they "failed" as we already know it is an error
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
Go style recommends that log messages and error strings should not be
capitalized (unless beginning with proper nouns or acronyms) and should
not end with punctuation.
This change normalizes all zerolog .Msg() and .Msgf() calls to start
with lowercase letters, following Go conventions and making logs more
consistent across the codebase.
According to Tailscale SaaS behavior, autogroup:internet is handled
by exit node routing via AllowedIPs, not by packet filtering. ACL
rules with autogroup:internet as destination should produce no
filter rules for any node.
Previously, Headscale expanded autogroup:internet to public CIDR
ranges and distributed filters to exit nodes (because 0.0.0.0/0
"covers" internet destinations). This was incorrect.
Add detection for AutoGroupInternet in filter compilation to skip
filter generation for this autogroup. Update test expectations
accordingly.
Fix two compatibility issues discovered in Tailscale SaaS testing:
1. Wildcard DstPorts format: Headscale was expanding wildcard
destinations to CGNAT ranges (100.64.0.0/10, fd7a:115c:a1e0::/48)
while Tailscale uses {IP: "*"} directly. Add detection for
wildcard (Asterix) alias type in filter compilation to use the
correct format.
2. proto:icmp handling: The "icmp" protocol name was returning both
ICMPv4 (1) and ICMPv6 (58), but Tailscale only returns ICMPv4.
Users should use "ipv6-icmp" or protocol number 58 explicitly
for IPv6 ICMP.
Update all test expectations accordingly. This significantly reduces
test file line count by replacing duplicated CGNAT range patterns
with single wildcard entries.
Update test expectations across policy tests to expect merged
FilterRule entries instead of separate ones. Tests now expect:
- Single FilterRule with combined DstPorts for same source
- Reduced matcher counts for exit node tests
Updates #3036
Tailscale merges multiple ACL rules into fewer FilterRule entries
when they have identical SrcIPs and IPProto, combining their DstPorts
arrays. This change implements the same behavior in Headscale.
Add mergeFilterRules() which uses O(n) hash map lookup to merge rules
with identical keys. DstPorts are NOT deduplicated to match Tailscale
behavior.
Also fix DestsIsTheInternet() to handle merged filter rules where
TheInternet is combined with other destinations - now uses superset
check instead of equality check.
Updates #3036
Change Asterix.Resolve() to use Tailscale's CGNAT range (100.64.0.0/10)
and ULA range (fd7a:115c:a1e0::/48) instead of all IPs (0.0.0.0/0 and
::/0).
This better matches Tailscale's security model where wildcard (*) means
"any node in the tailnet" rather than literally "any IP address on the
internet".
Updates #3036
Updates #3036
Tailscale validates that autogroup:self destinations in ACL rules can
only be used when ALL sources are users, groups, autogroup:member, or
wildcard (*). Previously, Headscale only performed this validation for
SSH rules.
Add validateACLSrcDstCombination() to enforce that tags, autogroup:tagged,
hosts, and raw IPs cannot be used as sources with autogroup:self
destinations. Invalid policies like `tag:client → autogroup:self:*` are
now rejected at validation time, matching Tailscale behavior.
Wildcard (*) is allowed because autogroup:self evaluation narrows it
per-node to only the node's own IPs.
Updates #3036
When ACL rules don't specify a protocol, Headscale now defaults to
[TCP, UDP, ICMP, ICMPv6] instead of just [TCP, UDP], matching
Tailscale's behavior.
Also export protocol number constants (ProtocolTCP, ProtocolUDP, etc.)
for use in external test packages, renaming the string protocol
constants to ProtoNameTCP, ProtoNameUDP, etc. to avoid conflicts.
This resolves 78 ICMP-related TODOs in the Tailscale compatibility
tests, reducing the total from 165 to 87.
Updates #3036
Add extensive test coverage verifying Headscale's ACL policy behavior
matches Tailscale's coordination server. Tests cover:
- Source/destination resolution for users, groups, tags, hosts, IPs
- autogroup:member, autogroup:tagged, autogroup:self behavior
- Filter rule deduplication and merging semantics
- Multi-rule interaction patterns
- Error case validation
Key behavioral differences documented:
- Headscale creates separate filter entries per ACL rule; Tailscale
merges rules with identical sources
- Headscale deduplicates Dsts within a rule; Tailscale does not
- Headscale does not validate autogroup:self source restrictions for
ACL rules (only SSH rules); Tailscale rejects invalid sources
Tests are based on real Tailscale coordination server responses
captured from a test environment with 5 nodes (1 user-owned, 4 tagged).
Updates #3036
Both compileFilterRules and compileSSHPolicy include .Caller() on
their resolution error log statements, but compileACLWithAutogroupSelf
does not. Add .Caller() to the three log sites (source resolution
error, destination resolution error, nil destination) for consistent
debuggability across all compilation paths.
Updates #2990
In compileSSHPolicy, when resolving other (non-autogroup:self)
destinations, the code discards the entire result on error via
`continue`. If a destination alias (e.g., a tag owned by a group
with a non-existent user) returns a partial IPSet alongside an
error, valid IPs are lost.
Both ACL compilation paths (compileFilterRules and
compileACLWithAutogroupSelf) already handle this correctly by
logging the error and using the IPSet if non-nil.
Remove the `continue` so the SSH path is consistent with the
ACL paths.
Fixes#2990
Three related issues where User().ID() is called on potentially tagged
nodes without first checking IsTagged():
1. compileACLWithAutogroupSelf: the autogroup:self block at line 166
lacks the !node.IsTagged() guard that compileSSHPolicy already has.
If a tagged node is the compilation target, node.User().ID() may
panic. Tagged nodes should never participate in autogroup:self.
2. compileSSHPolicy: the IsTagged() check is on the right side of &&,
so n.User().ID() evaluates first and may panic before short-circuit
can prevent it. Swap to !n.IsTagged() && n.User().ID() == ... to
match the already-correct order in compileACLWithAutogroupSelf.
3. invalidateAutogroupSelfCache: calls User().ID() at ~10 sites
without IsTagged() guards. Tagged nodes don't participate in
autogroup:self, so they should be skipped when collecting affected
users and during cache lookup. Tag status transitions are handled
by using the non-tagged version's user ID.
Fixes#2990
In compileACLWithAutogroupSelf, when a group contains a non-existent
user, Group.Resolve() returns a partial IPSet (with IPs from valid
users) alongside an error. The code was discarding the entire result
via `continue`, losing valid IPs. The non-autogroup-self path
(compileFilterRules) already handles this correctly by logging the
error and using the IPSet if non-empty.
Remove the `continue` on error for both source and destination
resolution, matching the existing behavior in compileFilterRules.
Also reorder the IsTagged check before User().ID() comparison
in the same-user node filter to prevent nil dereference on tagged
nodes that have no User set.
Fixes#2990
Add test reproducing the exact scenario from issue #2990 where:
- One user (user1) in group:admin
- node1: user device (not tagged)
- node2: tagged with tag:admin, same user
The test verifies that peer visibility and packet filters are correct.
Updates #2990
Previously, nodes with empty filter rules (e.g., tagged servers that are
only destinations, never sources) were skipped entirely in BuildPeerMap.
This could cause visibility issues when using autogroup:self with
multiple user groups.
Remove the len(filter) == 0 skip condition so all nodes are included in
nodeMatchers. Empty filters result in empty matchers where CanAccess()
returns false, but the node still needs to be in the map so symmetric
visibility works correctly: if node A can access node B, both should see
each other regardless of B's filter rules.
Add comprehensive tests for:
- Multi-group scenarios where autogroup:self is used by privileged users
- Nodes with empty filters remaining visible to authorized peers
- Combined access rules (autogroup:self + tags in same rule)
Updates #2990
Update unit tests to use valid SSH patterns that conform to Tailscale's
security model:
- Change group->user destinations to group->tag
- Change tag->user destinations to tag->tag
- Update expected error messages for new validation format
- Add proper tagged/untagged node setup in filter tests
Updates #3009
Updates #3010
Add validation for SSH source/destination combinations that enforces
Tailscale's security model:
- Tags/autogroup:tagged cannot SSH to user-owned devices
- autogroup:self destination requires source to contain only users/groups
- Username destinations require source to be that same single user only
- Wildcard (*) is no longer supported as SSH destination; use
autogroup:member or autogroup:tagged instead
The validateSSHSrcDstCombination() function is called during policy
validation to reject invalid configurations at load time.
Fixes#3009Fixes#3010
When autogroup:self was combined with other ACL rules (e.g., group:admin
-> *:*), tagged nodes became invisible to users who should have access.
The BuildPeerMap function had two code paths:
- Global filter path: used symmetric OR logic (if either can access, both
see each other)
- Autogroup:self path: used asymmetric logic (only add peer if that
specific direction has access)
This caused problems with one-way rules like admin -> tagged-server. The
admin could access the server, but since the server couldn't access the
admin, neither was added to the other's peer list.
Fix by using symmetric visibility in the autogroup:self path, matching
the global filter path behavior: if either node can access the other,
both should see each other as peers.
Credit: vdovhanych <vdovhanych@users.noreply.github.com>
Fixes#2990
This commit changes so that node changes to the policy is
calculated if any of the nodes has changed in a way that might
affect the policy.
Previously we just checked if the number of nodes had changed,
which meant that if a node was added and removed, we would be
in a bad state.
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
This PR investigates, adds tests and aims to correctly implement Tailscale's model for how Tags should be accepted, assigned and used to identify nodes in the Tailscale access and ownership model.
When evaluating in Headscale's policy, Tags are now only checked against a nodes "tags" list, which defines the source of truth for all tags for a given node. This simplifies the code for dealing with tags greatly, and should help us have less access bugs related to nodes belonging to tags or users.
A node can either be owned by a user, or a tag.
Next, to ensure the tags list on the node is correctly implemented, we first add tests for every registration scenario and combination of user, pre auth key and pre auth key with tags with the same registration expectation as observed by trying them all with the Tailscale control server. This should ensure that we implement the correct behaviour and that it does not change or break over time.
Lastly, the missing parts of the auth has been added, or changed in the cases where it was wrong. This has in large parts allowed us to delete and simplify a lot of code.
Now, tags can only be changed when a node authenticates or if set via the CLI/API. Tags can only be fully overwritten/replaced and any use of either auth or CLI will replace the current set if different.
A user owned device can be converted to a tagged device, but it cannot be changed back. A tagged device can never remove the last tag either, it has to have a minimum of one.
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.
There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.
Updates #2417Closes#2619
When we encounter a source we cannot resolve, we skipped the whole rule,
even if some of the srcs could be resolved. In this case, if we had one user
that exists and one that does not.
In the regular policy, we log this, and still let a rule be created from what
does exist, while in the SSH policy we did not.
This commit fixes it so the behaviour is the same.
Fixes#2863
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>