Commit Graph

95 Commits

Author SHA1 Message Date
Kristoffer Dalby
f3767dddf8 batcher: ensure removal from batcher
Fixes #2924

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-17 13:19:26 +01:00
Kristoffer Dalby
5767ca5085 change: smarter change notifications
This commit replaces the ChangeSet with a simpler bool based
change model that can be directly used in the map builder to
build the appropriate map response based on the change that
has occured. Previously, we fell back to sending full maps
for a lot of changes as that was consider "the safe" thing to
do to ensure no updates were missed.

This was slightly problematic as a node that already has a list
of peers will only do full replacement of the peers if the list
is non-empty, meaning that it was not possible to remove all
nodes (if for example policy changed).

Now we will keep track of last seen nodes, so we can send remove
ids, but also we are much smarter on how we send smaller, partial
maps when needed.

Fixes #2389

Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
2025-12-16 10:12:36 +01:00
Kristoffer Dalby
616c0e895d batcher: fix closed panic
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-12-15 16:28:27 +01:00
Kristoffer Dalby
642073f4b8 types: add option to disable taildrop, improve tests (#2955) 2025-12-12 11:35:16 +01:00
Kristoffer Dalby
87bd67318b golangci-lint: use forbidigo to block time.Sleep (#2946) 2025-12-10 16:45:59 +00:00
Kristoffer Dalby
0e1673041c all: remove deadcode (#2952) 2025-12-10 15:55:15 +01:00
Kristoffer Dalby
c8376e44a2 mapper: move tail node conversion to node type (#2950) 2025-12-10 09:16:22 +01:00
Kristoffer Dalby
22ee2bfc9c tags: process tags on registration, simplify policy (#2931)
This PR investigates, adds tests and aims to correctly implement Tailscale's model for how Tags should be accepted, assigned and used to identify nodes in the Tailscale access and ownership model.

When evaluating in Headscale's policy, Tags are now only checked against a nodes "tags" list, which defines the source of truth for all tags for a given node. This simplifies the code for dealing with tags greatly, and should help us have less access bugs related to nodes belonging to tags or users.

A node can either be owned by a user, or a tag.

Next, to ensure the tags list on the node is correctly implemented, we first add tests for every registration scenario and combination of user, pre auth key and pre auth key with tags with the same registration expectation as observed by trying them all with the Tailscale control server. This should ensure that we implement the correct behaviour and that it does not change or break over time.

Lastly, the missing parts of the auth has been added, or changed in the cases where it was wrong. This has in large parts allowed us to delete and simplify a lot of code.
Now, tags can only be changed when a node authenticates or if set via the CLI/API. Tags can only be fully overwritten/replaced and any use of either auth or CLI will replace the current set if different.

A user owned device can be converted to a tagged device, but it cannot be changed back. A tagged device can never remove the last tag either, it has to have a minimum of one.
2025-12-08 18:51:07 +01:00
Kristoffer Dalby
eb788cd007 make tags first class node owner (#2885)
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.

There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.

Updates #2417
Closes #2619
2025-12-02 12:01:25 +01:00
Kristoffer Dalby
eec196d200 modernize: run gopls modernize to bring up to 1.25 (#2920) 2025-12-01 19:40:25 +01:00
Kristoffer Dalby
db293e0698 hscontrol/state: make NodeStore batch configuration tunable (#2886) 2025-11-28 16:38:29 +01:00
Kristoffer Dalby
7fb0f9a501 batcher: send endpoint and derp only updates. (#2856) 2025-11-13 20:38:49 +01:00
Kristoffer Dalby
d7a43a7cf1 state: use AllApprovedRoutes instead of SubnetRoutes
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-11-02 13:19:59 +01:00
Kristoffer Dalby
2bf1200483 policy: fix autogroup:self propagation and optimize cache invalidation (#2807) 2025-10-23 17:57:41 +02:00
Florian Preinstorfer
46477b8021 Downgrade completed broadcast message to debug 2025-10-18 07:56:59 +02:00
Vitalij Dovhanyc
c2a58a304d feat: add autogroup:self (#2789) 2025-10-16 12:59:52 +02:00
Kristoffer Dalby
ed3a9c8d6d mapper: send change instead of full update (#2775) 2025-09-17 14:23:21 +02:00
Kristoffer Dalby
d41fb4d540 app: fix sigint hanging
When the node notifier was replaced with batcher, we removed
its closing, but forgot to add the batchers so it was never
stopping node connections and waiting forever.

Fixes #2751

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-11 11:53:26 +02:00
Kristoffer Dalby
233dffc186 lint and leftover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
9d236571f4 state/nodestore: in memory representation of nodes
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.

It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.

Writes will block until commited, while reads are never
blocked.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
b6d5788231 mapper: produce map before poll
Before this patch, we would send a message to each "node stream"
that there is an update that needs to be turned into a mapresponse
and sent to a node.

Producing the mapresponse is a "costly" afair which means that while
a node was producing one, it might start blocking and creating full
queues from the poller and all the way up to where updates where sent.

This could cause updates to time out and being dropped as a bad node
going away or spending too time processing would cause all the other
nodes to not get any updates.

In addition, it contributed to "uncontrolled parallel processing" by
potentially doing too many expensive operations at the same time:

Each node stream is essentially a channel, meaning that if you have 30
nodes, we will try to process 30 map requests at the same time. If you
have 8 cpu cores, that will saturate all the cores immediately and cause
a lot of wasted switching between the processing.

Now, all the maps are processed by workers in the mapper, and the number
of workers are controlable. These would now be recommended to be a bit
less than number of CPU cores, allowing us to process them as fast as we
can, and then send them to the poll.

When the poll recieved the map, it is only responsible for taking it and
sending it to the node.

This might not directly improve the performance of Headscale, but it will
likely make the performance a lot more consistent. And I would argue the
design is a lot easier to reason about.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-09-09 09:40:00 +02:00
Kristoffer Dalby
8e25f7f9dd bunch of qol (#2748) 2025-08-27 17:09:13 +02:00
Kristoffer Dalby
b87567628a derp: increase update frequency and harden on failures (#2741) 2025-08-22 10:40:38 +02:00
Kristoffer Dalby
a058bf3cd3 mapper: produce map before poll (#2628) 2025-07-28 11:15:53 +02:00
Kristoffer Dalby
c6d7b512bd integration: replace time.Sleep with assert.EventuallyWithT (#2680) 2025-07-10 23:38:55 +02:00
Kristoffer Dalby
73023c2ec3 all: use immutable node view in read path
This commit changes most of our (*)types.Node to
types.NodeView, which is a readonly version of the
underlying node ensuring that there is no mutations
happening in the read path.

Based on the migration, there didnt seem to be any, but the
idea here is to prevent it in the future and simplify other
new implementations.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-07-07 21:28:59 +01:00
Kristoffer Dalby
1553f0ab53 state: introduce state
this commit moves all of the read and write logic, and all different parts
of headscale that manages some sort of persistent and in memory state into
a separate package.

The goal of this is to clearly define the boundry between parts of the app
which accesses and modifies data, and where it happens. Previously, different
state (routes, policy, db and so on) was used directly, and sometime passed to
functions as pointers.

Now all access has to go through state. In the initial implementation,
most of the same functions exists and have just been moved. In the future
centralising this will allow us to optimise bottle necks with the database
(in memory state) and make the different parts talking to eachother do so
in the same way across headscale components.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-06-24 07:58:54 +02:00
Kristoffer Dalby
a52f1df180 policy: remove v1 code (#2600)
* policy: remove v1 code

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* db: update test with v1 removal

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: start moving to v2 policy

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: add ssh unmarshal tests

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* changelog: add entry

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: remove v1 comment

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: remove comment out case

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* cleanup skipv1

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: remove v1 prefix workaround

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: add all node ips if prefix/host is ts ip

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-20 13:57:26 +02:00
Kristoffer Dalby
45e38cb080 policy: reduce routes sent to peers based on packetfilter (#2561)
* notifier: use convenience funcs

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: reduce routes based on policy

Fixes #2365

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* hsic: more helper methods

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: more test cases

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: add route with filter acl integration test

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: correct route reduce test, now failing

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* mapper: compare peer routes against node

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* hs: more output to debug strings

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* types/node: slice.ContainsFunc

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy: more reduce route test

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* changelog: add entry for route filter

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-04 21:52:47 +02:00
aergus-tng
4651d06fa8 Make matchers part of the Policy interface (#2514)
* Make matchers part of the Policy interface

* Prevent race condition between rules and matchers

* Test also matchers in tests for Policy.Filter

* Compute `filterChanged` in v2 policy correctly

* Fix nil vs. empty list issue in v2 policy test

* policy/v2: always clear ssh map

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Co-authored-by: Aras Ergus <aras.ergus@tngtech.com>
Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-05-01 07:06:30 +02:00
Kristoffer Dalby
8f9fbf16f1 types/authkey: include user object in response (#2542)
* types/authkey: include user object, not string

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* make preauthkeys use id

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: wire up user id for auth keys

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-04-30 11:45:08 +02:00
Kristoffer Dalby
2b38f7bef7 policy/v2: make default (#2546)
* policy/v2: make default

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* integration: do not run v1 tests

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* policy/v2: fix potential nil pointers

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* mapper: fix test failures in v2

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-04-29 16:27:41 +02:00
Nick
109989005d ensure final dot on node name (#2503)
* ensure final dot on node name

This ensures that nodes which have a base domain set, will have a dot appended to their FQDN.

Resolves: https://github.com/juanfont/headscale/issues/2501

* improve OIDC TTL expire test

Waiting a bit more than the TTL of the OIDC token seems to remove some flakiness of this test. This furthermore makes use of a go func safe buffer which should avoid race conditions.
2025-04-11 12:39:08 +02:00
Enkelmann
0d3134720b Only read relevant nodes from database in PeerChangedResponse (#2509)
* Only read relevant nodes from database in PeerChangedResponse

* Rework to ensure transactional consistency in PeerChangedResponse again

* An empty nodeIDs list should return an empty nodes list

* Add test to ListNodesSubset

* Link PR in CHANGELOG.md

* combine ListNodes and ListNodesSubset into one function

* query for all nodes in ListNodes if no parameter is given

* also add optional filtering for relevant nodes to ListPeers
2025-04-08 14:56:44 +02:00
Kristoffer Dalby
603f3ad490 Multi network integration tests (#2464) 2025-03-21 11:49:32 +01:00
Kristoffer Dalby
0b5c29e875 remove policy handling for old capver (#2429)
* remove policy handling for old capver

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update tests

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-03-10 18:19:25 +00:00
Kristoffer Dalby
87326f5c4f Experimental implementation of Policy v2 (#2214)
* utility iterator for ipset

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* split policy -> policy and v1

This commit split out the common policy logic and policy implementation
into separate packages.

policy contains functions that are independent of the policy implementation,
this typically means logic that works on tailcfg types and generic formats.
In addition, it defines the PolicyManager interface which the v1 implements.

v1 is a subpackage which implements the PolicyManager using the "original"
policy implementation.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use polivyv1 definitions in integration tests

These can be marshalled back into JSON, which the
new format might not be able to.

Also, just dont change it all to JSON strings for now.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* formatter: breaks lines

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove compareprefix, use tsaddr version

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove getacl test, add back autoapprover

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use policy manager tag handling

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* rename display helper for user

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* introduce policy v2 package

policy v2 is built from the ground up to be stricter
and follow the same pattern for all types of resolvers.

TODO introduce
aliass
resolver

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* wire up policyv2 in integration testing

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* split policy v2 tests into seperate workflow to work around github limit

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add policy manager output to /debug

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-03-10 16:20:29 +01:00
Kristoffer Dalby
7891378f57 Redo route code (#2422)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2025-02-26 16:22:55 +01:00
Kristoffer Dalby
8b92c017ec add 1.80 to capver and update deps (#2394) 2025-02-05 07:17:51 +01:00
Kristoffer Dalby
e4a3dcc3b8 use headscale server url as domain instead of base_domain (#2338) 2025-01-16 18:05:20 +01:00
Kristoffer Dalby
380fcdba17 Add worker reading extra_records_path from file (#2271)
* consolidate scheduled tasks into one goroutine

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* rename Tailcfg dns struct

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* add dns.extra_records_path option

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* prettier lint

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* go-fmt

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-12-13 07:52:40 +00:00
Kristoffer Dalby
f7b0cbbbea wrap policy in policy manager interface (#2255)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-26 15:16:06 +01:00
Kristoffer Dalby
fffd23602b Resolve user to stable unique ID in policy (#2205) 2024-11-24 00:13:27 +01:00
Kristoffer Dalby
e2d5ee0927 cleanup linter warnings (#2206)
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-23 10:45:59 -05:00
Kristoffer Dalby
218138afee Redo OIDC configuration (#2020)
expand user, add claims to user

This commit expands the user table with additional fields that
can be retrieved from OIDC providers (and other places) and
uses this data in various tailscale response objects if it is
available.

This is the beginning of implementing
https://docs.google.com/document/d/1X85PMxIaVWDF6T_UPji3OeeUqVBcGj_uHRM5CI-AwlY/edit
trying to make OIDC more coherant and maintainable in addition
to giving the user a better experience and integration with a
provider.

remove usernames in magic dns, normalisation of emails

this commit removes the option to have usernames as part of MagicDNS
domains and headscale will now align with Tailscale, where there is a
root domain, and the machine name.

In addition, the various normalisation functions for dns names has been
made lighter not caring about username and special character that wont
occur.

Email are no longer normalised as part of the policy processing.

untagle oidc and regcache, use typed cache

This commits stops reusing the registration cache for oidc
purposes and switches the cache to be types and not use any
allowing the removal of a bunch of casting.

try to make reauth/register branches clearer in oidc

Currently there was a function that did a bunch of stuff,
finding the machine key, trying to find the node, reauthing
the node, returning some status, and it was called validate
which was very confusing.

This commit tries to split this into what to do if the node
exists, if it needs to register etc.

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-02 14:50:17 +02:00
Kristoffer Dalby
bc9e83b52e use gorm serialiser instead of custom hooks (#2156)
* add sqlite to debug/test image

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* test using gorm serialiser instead of custom hooks

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-02 11:41:58 +02:00
Kristoffer Dalby
3964dec1c6 use tsaddr library and cleanups (#2150)
* resuse tsaddr code instead of handrolled

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* ensure we dont give out internal tailscale IPs

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* use prefix instead of string for routes

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove old custom compare func

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* trim unused util code

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-10-02 09:06:09 +02:00
Kristoffer Dalby
4f2fb65929 remove versions older than 1.56 (#2149)
* remove versions older than 1.56

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* remove code no longer needed for new clients

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

* update changelog

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>

---------

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-09-24 18:34:20 +02:00
curlwget
99f18f9cd9 chore: fix some comments (#2069) 2024-09-09 14:17:25 +02:00
Mike Poindexter
3101f895a7 Fix 764 (#2093)
* Fix KeyExpiration when a zero time value has a timezone

When a zero time value is loaded from JSON or a DB in a way that
assigns it the local timezone, it does not roudtrip in JSON as a
value for which IsZero returns true. This causes KeyExpiry to be
treated as a far past value instead of a nilish value.

See https://github.com/golang/go/issues/57040

* Fix whitespace

* Ensure that postgresql is used for all tests when env var is set

* Pass through value of HEADSCALE_INTEGRATION_POSTGRES env var

* Add option to set timezone on headscale container

* Add test for registration with auth key in alternate timezone
2024-09-03 09:22:17 +02:00