Policy rework tracking bug #937

New Issue

adam · 2025-12-29T02:26:27+01:00

adam commented

2025-12-29 02:26:27 +01:00

Originally created by @kradalby on GitHub (Feb 7, 2025).

Originally assigned to: @kradalby on GitHub.

This bug is a tracking bug for all work related to reworking the Policy and ACLs that we aim to get into 0.26.0.

Within the scope of 0.26.0:

Making the current code more maintainable
Making it easier to test
Bring the ACLs up to scratch with Tailscale's ACL
Maintain current functionality

Out of scope for 0.26.0:

Add new functionality (grants etc)
Add new autogroups

Originally created by @kradalby on GitHub (Feb 7, 2025). Originally assigned to: @kradalby on GitHub. This bug is a tracking bug for all work related to reworking the Policy and ACLs that we aim to get into 0.26.0. Within the scope of 0.26.0: - Making the current code more maintainable - Making it easier to test - Bring the ACLs up to scratch with Tailscale's ACL - Maintain current functionality Out of scope for 0.26.0: - Add new functionality (`grants` etc) - Add new `autogroups`

adam added the bug policy 📝 labels 2025-12-29 02:26:27 +01:00

adam closed this issue

2025-12-29 02:26:27 +01:00

adam commented

2025-12-29 02:26:27 +01:00

@aergus-tng commented on GitHub (Mar 28, 2025):

Hello, while evaluating the possibility of running a Headscale network in our organization, we have identified a few performance bottlenecks we are interested in addressing. We are aware of #2491 and your general preference of maintainability over performance, but one of the "hot paths" points to this issue regarding a planned change, so we wanted to query the status.

Namely, there is the following in the CanAccess method of *Node.

// TODO(kradalby): Regenerate this every time the filter change, instead of
// every time we use it.
// Part of #2416
matchers := make([]matcher.Match, len(filter))
for i, rule := range filter {
	matchers[i] = matcher.MatchFromFilterRule(rule)
}

(CanAccess itself is called in a loop FilterNodesByACL, which is in the call stack of FullMapResponse and PeerChangedResponse. In the end calls to ParseIPSet further down the call stack of MatchFromFilterRule add up to a significant amount of CPU time for these responses.)

Do you have specific plans for the implementation of this change? We would be interested in submitting a PR if there are no blockers. (If that is the case: How exactly would you like the "cached" matches to be accessed? Would you add a new method to the PolicyManager interface or implement caching somehow only for the version 2 policy manager?)

PS: FWIW, we already tested the attached patch that moves the generation of the matches out of the loop of FilterNodesByACL to appendPeerChanges, which seems to help with performance. (That would be redundant with the aforementioned changes, but if desired, we can nevertheless submit also this patch as a PR.)

@aergus-tng commented on GitHub (Mar 28, 2025): Hello, while evaluating the possibility of running a Headscale network in our organization, we have identified a few performance bottlenecks we are interested in addressing. We are aware of #2491 and your general preference of maintainability over performance, but one of the "hot paths" points to this issue regarding a planned change, so we wanted to query the status. Namely, there is the following in the `CanAccess` method of `*Node`. ```go // TODO(kradalby): Regenerate this every time the filter change, instead of // every time we use it. // Part of #2416 matchers := make([]matcher.Match, len(filter)) for i, rule := range filter { matchers[i] = matcher.MatchFromFilterRule(rule) } ``` (`CanAccess` itself is called in a loop `FilterNodesByACL`, which is in the call stack of `FullMapResponse` and `PeerChangedResponse`. In the end calls to `ParseIPSet` further down the call stack of `MatchFromFilterRule` add up to a significant amount of CPU time for these responses.) Do you have specific plans for the implementation of this change? We would be interested in submitting a PR if there are no blockers. (If that is the case: How exactly would you like the "cached" matches to be accessed? Would you add a new method to the `PolicyManager` interface or implement caching somehow only for the version 2 policy manager?) PS: FWIW, we already tested the [attached patch](https://github.com/user-attachments/files/19501123/0001-Compute-matches-from-ACL-rules-before-a-loop.patch.txt) that moves the generation of the matches out of the loop of `FilterNodesByACL` to `appendPeerChanges`, which seems to help with performance. (That would be redundant with the aforementioned changes, but if desired, we can nevertheless submit also this patch as a PR.)

adam commented

2025-12-29 02:26:28 +01:00

@kradalby commented on GitHub (Mar 31, 2025):

@aergus-tng in general, I am a bit conservative when it comes to changing too much at once, but in this particular case, I do agree that this is pretty low hanging fruit and I did even mark it as such.

If you would be up for doing a PR to implement this I would probably manage to get this into 0.26 or 0.27.

I making a method for returning the precomputed match filter on the Policy interface makes sense,

to not waste time on v1, I suggest that the basic "compute on every call" is implemented for v1, and then a cached "recompute on filter change" for v2.

Let me know if you have any questions, I wrote this up a bit quickly as I'm travelling/conferencing this week.

@kradalby commented on GitHub (Mar 31, 2025): @aergus-tng in general, I am a bit conservative when it comes to changing too much at once, but in this particular case, I do agree that this is pretty low hanging fruit and I did even mark it as such. If you would be up for doing a PR to implement this I would probably manage to get this into 0.26 or 0.27. I making a method for returning the precomputed match filter on the Policy interface makes sense, to not waste time on v1, I suggest that the basic "compute on every call" is implemented for v1, and then a cached "recompute on filter change" for v2. Let me know if you have any questions, I wrote this up a bit quickly as I'm travelling/conferencing this week.

adam commented

2025-12-29 02:26:28 +01:00

@kradalby commented on GitHub (May 14, 2025):

Closing as finished as part of 0.26. Further Policy work will be tracked in new issues.

@kradalby commented on GitHub (May 14, 2025): Closing as finished as part of 0.26. Further Policy work will be tracked in new issues.

adam commented

2025-12-29 02:26:28 +01:00

@almereyda commented on GitHub (Sep 23, 2025):

The wealth of slightly related follow ups can be found via the sister story:

[PR #1950] [MERGED] move debug image to distroless (#2417)

@almereyda commented on GitHub (Sep 23, 2025): The wealth of slightly related follow ups can be found via the sister story: - #2417

adam referenced this issue

2025-12-29 02:31:33 +01:00

[PR #937] [CLOSED] Integration tests: Port DERP, add TLS ability. #1752

Sign in to join this conversation.

Branches Tags

main

gh-pages

update_flake_lock_action

kradalby/3038-reg-panic

kradalby/release-v0.27.2

dependabot/go_modules/golang.org/x/crypto-0.45.0

dependabot/go_modules/github.com/opencontainers/runc-1.3.3

copilot/investigate-headscale-issue-2788

copilot/investigate-visibility-issue-2788

copilot/investigate-issue-2833

copilot/debug-issue-2846

copilot/fix-issue-2847

dependabot/go_modules/github.com/go-viper/mapstructure/v2-2.4.0

dependabot/go_modules/github.com/docker/docker-28.3.3incompatible

kradalby/cli-experiement3

doc/0.26.1

doc/0.25.1

doc/0.25.0

doc/0.24.3

doc/0.24.2

doc/0.24.1

doc/0.24.0

kradalby/build-docker-on-pr

topic/docu-versioning

topic/docker-kos

juanfont/fix-crash-node-id

juanfont/better-disclaimer

update-contributors

topic/prettier

revert-1893-add-test-stage-to-docs

add-test-stage-to-docs

remove-node-check-interval

fix-empty-prefix

fix-ephemeral-reusable

bug_report-debuginfo

autogroups

logs-to-stderr

revert-1414-topic/fix_unix_socket

rename-machine-node

port-embedded-derp-tests-v2

port-derp-tests

duplicate-word-linter

update-tailscale-1.36

warn-against-apache

ko-fi-link

more-acl-tests

fix-typo-standalone

parallel-nolint

tparallel-fix

rerouting

ssh-changelog-docs

oidc-cleanup

web-auth-flow-tests

kradalby-gh-runner

fix-proto-lint

remove-funding-links

go-1.19

enable-1.30-in-tests

0.16.x

cosmetic-changes-integration

tmp-fix-integration-docker

fix-integration-docker

configurable-update-interval

show-nodes-online

hs2021

acl-syntax-fixes

ts2021-implementation

fix-spurious-updates

unstable-integration-tests

mandatory-stun

embedded-derp

prtemplate-fix

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: starred/headscale#937