Commit Graph

26 Commits

Author SHA1 Message Date
Kristoffer Dalby
8358017dcf policy/v2,state,mapper: implement per-viewer via route steering
Via grants steer routes to specific nodes per viewer. Until now,
all clients saw the same routes for each peer because route
assembly was viewer-independent. This implements per-viewer route
visibility so that via-designated peers serve routes only to
matching viewers, while non-designated peers have those routes
withdrawn.

Add ViaRouteResult type (Include/Exclude prefix lists) and
ViaRoutesForPeer to the PolicyManager interface. The v2
implementation iterates via grants, resolves sources against the
viewer, matches destinations against the peer's advertised routes
(both subnet and exit), and categorizes prefixes by whether the
peer has the via tag.

Add RoutesForPeer to State which composes global primary election,
via Include/Exclude filtering, exit routes, and ACL reduction.
When no via grants exist, it falls back to existing behavior.

Update the mapper to call RoutesForPeer per-peer instead of using
a single route function for all peers. The route function now
returns all routes (subnet + exit), and TailNode filters exit
routes out of the PrimaryRoutes field for HA tracking.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
9f7aa55689 policy/v2: refactor alias resolution to use ResolvedAddresses
Introduce ResolvedAddresses type for structured IP set results. Refactor all Alias.Resolve() methods to return ResolvedAddresses instead of raw IPSets. Restrict identity-based aliases to matching address families, fix nil dereferences in partial resolution paths, and update test expectations for the new IP format (bare IPs, IP ranges instead of CIDR prefixes).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
7bab8da366 state, policy, noise: implement SSH check period auto-approval
Add SSH check period tracking so that recently authenticated users
are auto-approved without requiring manual intervention each time.

Introduce SSHCheckPeriod type with validation (min 1m, max 168h,
"always" for every request) and encode the compiled check period
as URL query parameters in the HoldAndDelegate URL.

The SSHActionHandler checks recorded auth times before creating a
new HoldAndDelegate flow. Auth timestamps are stored in-memory:
- Default period (no explicit checkPeriod): auth covers any
  destination, keyed by source node with Dst=0 sentinel
- Explicit period: auth covers only that specific destination,
  keyed by (source, destination) pair

Auth times are cleared on policy changes.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
107c2f2f70 policy, noise: implement SSH check action
Implement the SSH "check" action which requires additional
verification before allowing SSH access. The policy compiler generates
a HoldAndDelegate URL that the Tailscale client calls back to
headscale. The SSHActionHandler creates an auth session and waits for
approval via the generalised auth flow.

Sort check (HoldAndDelegate) rules before accept rules to match
Tailscale's first-match-wins evaluation order.

Updates #1850
2026-02-25 21:28:05 +01:00
Kristoffer Dalby
ce580f8245 all: fix golangci-lint issues (#3064) 2026-02-06 21:45:32 +01:00
Kristoffer Dalby
1f32c8bf61 policy/v2: add IsTagged() guards to prevent panics on tagged nodes
Three related issues where User().ID() is called on potentially tagged
nodes without first checking IsTagged():

1. compileACLWithAutogroupSelf: the autogroup:self block at line 166
   lacks the !node.IsTagged() guard that compileSSHPolicy already has.
   If a tagged node is the compilation target, node.User().ID() may
   panic. Tagged nodes should never participate in autogroup:self.

2. compileSSHPolicy: the IsTagged() check is on the right side of &&,
   so n.User().ID() evaluates first and may panic before short-circuit
   can prevent it. Swap to !n.IsTagged() && n.User().ID() == ... to
   match the already-correct order in compileACLWithAutogroupSelf.

3. invalidateAutogroupSelfCache: calls User().ID() at ~10 sites
   without IsTagged() guards. Tagged nodes don't participate in
   autogroup:self, so they should be skipped when collecting affected
   users and during cache lookup. Tag status transitions are handled
   by using the non-tagged version's user ID.

Fixes #2990
2026-02-03 16:53:15 +01:00
Kristoffer Dalby
11f0d4cfdd policy/v2: include nodes with empty filters in BuildPeerMap
Previously, nodes with empty filter rules (e.g., tagged servers that are
only destinations, never sources) were skipped entirely in BuildPeerMap.
This could cause visibility issues when using autogroup:self with
multiple user groups.

Remove the len(filter) == 0 skip condition so all nodes are included in
nodeMatchers. Empty filters result in empty matchers where CanAccess()
returns false, but the node still needs to be in the map so symmetric
visibility works correctly: if node A can access node B, both should see
each other regardless of B's filter rules.

Add comprehensive tests for:
- Multi-group scenarios where autogroup:self is used by privileged users
- Nodes with empty filters remaining visible to authorized peers
- Combined access rules (autogroup:self + tags in same rule)

Updates #2990
2026-02-03 16:53:15 +01:00
Kristoffer Dalby
22afb2c61b policy: fix asymmetric peer visibility with autogroup:self
When autogroup:self was combined with other ACL rules (e.g., group:admin
-> *:*), tagged nodes became invisible to users who should have access.

The BuildPeerMap function had two code paths:
- Global filter path: used symmetric OR logic (if either can access, both
  see each other)
- Autogroup:self path: used asymmetric logic (only add peer if that
  specific direction has access)

This caused problems with one-way rules like admin -> tagged-server. The
admin could access the server, but since the server couldn't access the
admin, neither was added to the other's peer list.

Fix by using symmetric visibility in the autogroup:self path, matching
the global filter path behavior: if either node can access the other,
both should see each other as peers.

Credit: vdovhanych <vdovhanych@users.noreply.github.com>

Fixes #2990
2026-01-21 14:35:16 +01:00
Kristoffer Dalby
506bd8c8eb policy: more accurate node change
This commit changes so that node changes to the policy is
calculated if any of the nodes has changed in a way that might
affect the policy.

Previously we just checked if the number of nodes had changed,
which meant that if a node was added and removed, we would be
in a bad state.

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-16 10:12:36 +01:00
Kristoffer Dalby
22ee2bfc9c tags: process tags on registration, simplify policy (#2931)
This PR investigates, adds tests and aims to correctly implement Tailscale's model for how Tags should be accepted, assigned and used to identify nodes in the Tailscale access and ownership model.

When evaluating in Headscale's policy, Tags are now only checked against a nodes "tags" list, which defines the source of truth for all tags for a given node. This simplifies the code for dealing with tags greatly, and should help us have less access bugs related to nodes belonging to tags or users.

A node can either be owned by a user, or a tag.

Next, to ensure the tags list on the node is correctly implemented, we first add tests for every registration scenario and combination of user, pre auth key and pre auth key with tags with the same registration expectation as observed by trying them all with the Tailscale control server. This should ensure that we implement the correct behaviour and that it does not change or break over time.

Lastly, the missing parts of the auth has been added, or changed in the cases where it was wrong. This has in large parts allowed us to delete and simplify a lot of code.
Now, tags can only be changed when a node authenticates or if set via the CLI/API. Tags can only be fully overwritten/replaced and any use of either auth or CLI will replace the current set if different.

A user owned device can be converted to a tagged device, but it cannot be changed back. A tagged device can never remove the last tag either, it has to have a minimum of one.
2025-12-08 18:51:07 +01:00
Kristoffer Dalby
eb788cd007 make tags first class node owner (#2885)
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.

There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.

Updates #2417
Closes #2619
2025-12-02 12:01:25 +01:00
Vitalij Dovhanyc
0078eb7790 chore: fix filterHash to work with autogroup:self in the acls (#2882) 2025-12-02 12:01:02 +01:00
Kristoffer Dalby
2bf1200483 policy: fix autogroup:self propagation and optimize cache invalidation (#2807) 2025-10-23 17:57:41 +02:00
Vitalij Dovhanyc
c2a58a304d feat: add autogroup:self (#2789) 2025-10-16 12:59:52 +02:00
Kristoffer Dalby
ed3a9c8d6d mapper: send change instead of full update (#2775) 2025-09-17 14:23:21 +02:00
Kristoffer Dalby
233dffc186 lint and leftover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
9d236571f4 state/nodestore: in memory representation of nodes
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.

It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.

Writes will block until commited, while reads are never
blocked.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
a058bf3cd3 mapper: produce map before poll (#2628) 2025-07-28 11:15:53 +02:00
Kristoffer Dalby
c6d7b512bd integration: replace time.Sleep with assert.EventuallyWithT (#2680) 2025-07-10 23:38:55 +02:00
Kristoffer Dalby
73023c2ec3 all: use immutable node view in read path
This commit changes most of our (*)types.Node to
types.NodeView, which is a readonly version of the
underlying node ensuring that there is no mutations
happening in the read path.

Based on the migration, there didnt seem to be any, but the
idea here is to prevent it in the future and simplify other
new implementations.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-07-07 21:28:59 +01:00
Kristoffer Dalby
37dc0dad35 policy/v2: separate exit node and 0.0.0.0/0 routes (#2578)
* policy: add tests for route auto approval

Reproduce #2568

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: separate exit node and 0.0.0.0/0 routes

Fixes #2568

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-09 23:20:04 +02:00
Kristoffer Dalby
45e38cb080 policy: reduce routes sent to peers based on packetfilter (#2561)
* notifier: use convenience funcs

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: reduce routes based on policy

Fixes #2365

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* hsic: more helper methods

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: more test cases

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: add route with filter acl integration test

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: correct route reduce test, now failing

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* mapper: compare peer routes against node

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* hs: more output to debug strings

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* types/node: slice.ContainsFunc

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: more reduce route test

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* changelog: add entry for route filter

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-04 21:52:47 +02:00
Kristoffer Dalby
c923f461ab error on undefined host in policy (#2490)
* add testcases

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: add validate to do post marshal validation

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-01 14:30:52 +02:00
aergus-tng
4651d06fa8 Make matchers part of the Policy interface (#2514)
* Make matchers part of the Policy interface

* Prevent race condition between rules and matchers

* Test also matchers in tests for Policy.Filter

* Compute `filterChanged` in v2 policy correctly

* Fix nil vs. empty list issue in v2 policy test

* policy/v2: always clear ssh map

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Co-authored-by: Aras Ergus <aras.ergus@tngtech.com>
Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-01 07:06:30 +02:00
Kristoffer Dalby
f1206328dc fix webauth + autoapprove routes (#2528)
* types/node: add helper funcs for node tags

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* types/node: add DebugString method for node

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: add String func to AutoApprover interface

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: simplify, use slices.Contains

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: debug, use nodes.DebugString

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v1: fix potential nil pointer in NodeCanApproveRoute

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v1: slices.Contains

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration/tsic: fix diff in login commands

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: fix webauth running with wrong scenario

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: move common oidc opts to func

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: require node count, more verbose

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* auth: remove uneffective route approve

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* .github/workflows: fmt

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration/tsic: add id func

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: remove call that might be nil

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: test autoapprovers against web/authkey x group/tag/user

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: unique network id per scenario

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* Revert "integration: move common oidc opts to func"

This reverts commit 7e9d165d4a900c304f1083b665f1a24a26e06e55.

* remove cmd

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: clean docker images between runs in ci

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: run autoapprove test against differnt policy modes

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration/tsic: append, not overrwrite extra login args

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* .github/workflows: remove polv2

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-04-30 07:54:04 +02:00
Kristoffer Dalby
87326f5c4f Experimental implementation of Policy v2 (#2214)
* utility iterator for ipset

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* split policy -> policy and v1

This commit split out the common policy logic and policy implementation
into separate packages.

policy contains functions that are independent of the policy implementation,
this typically means logic that works on tailcfg types and generic formats.
In addition, it defines the PolicyManager interface which the v1 implements.

v1 is a subpackage which implements the PolicyManager using the "original"
policy implementation.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use polivyv1 definitions in integration tests

These can be marshalled back into JSON, which the
new format might not be able to.

Also, just dont change it all to JSON strings for now.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* formatter: breaks lines

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove compareprefix, use tsaddr version

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove getacl test, add back autoapprover

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use policy manager tag handling

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* rename display helper for user

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* introduce policy v2 package

policy v2 is built from the ground up to be stricter
and follow the same pattern for all types of resolvers.

TODO introduce
aliass
resolver

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* wire up policyv2 in integration testing

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* split policy v2 tests into seperate workflow to work around github limit

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add policy manager output to /debug

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-03-10 16:20:29 +01:00