Policy rework tracking bug #937

Closed
opened 2025-12-29 02:26:27 +01:00 by adam · 4 comments
Owner

Originally created by @kradalby on GitHub (Feb 7, 2025).

Originally assigned to: @kradalby on GitHub.

This bug is a tracking bug for all work related to reworking the Policy and ACLs that we aim to get into 0.26.0.

Within the scope of 0.26.0:

  • Making the current code more maintainable
  • Making it easier to test
  • Bring the ACLs up to scratch with Tailscale's ACL
  • Maintain current functionality

Out of scope for 0.26.0:

  • Add new functionality (grants etc)
  • Add new autogroups
Originally created by @kradalby on GitHub (Feb 7, 2025). Originally assigned to: @kradalby on GitHub. This bug is a tracking bug for all work related to reworking the Policy and ACLs that we aim to get into 0.26.0. Within the scope of 0.26.0: - Making the current code more maintainable - Making it easier to test - Bring the ACLs up to scratch with Tailscale's ACL - Maintain current functionality Out of scope for 0.26.0: - Add new functionality (`grants` etc) - Add new `autogroups`
adam added the bugpolicy 📝 labels 2025-12-29 02:26:27 +01:00
adam closed this issue 2025-12-29 02:26:27 +01:00
Author
Owner

@aergus-tng commented on GitHub (Mar 28, 2025):

Hello, while evaluating the possibility of running a Headscale network in our organization, we have identified a few performance bottlenecks we are interested in addressing. We are aware of #2491 and your general preference of maintainability over performance, but one of the "hot paths" points to this issue regarding a planned change, so we wanted to query the status.

Namely, there is the following in the CanAccess method of *Node.

// TODO(kradalby): Regenerate this every time the filter change, instead of
// every time we use it.
// Part of #2416
matchers := make([]matcher.Match, len(filter))
for i, rule := range filter {
	matchers[i] = matcher.MatchFromFilterRule(rule)
}

(CanAccess itself is called in a loop FilterNodesByACL, which is in the call stack of FullMapResponse and PeerChangedResponse. In the end calls to ParseIPSet further down the call stack of MatchFromFilterRule add up to a significant amount of CPU time for these responses.)

Do you have specific plans for the implementation of this change? We would be interested in submitting a PR if there are no blockers. (If that is the case: How exactly would you like the "cached" matches to be accessed? Would you add a new method to the PolicyManager interface or implement caching somehow only for the version 2 policy manager?)

PS: FWIW, we already tested the attached patch that moves the generation of the matches out of the loop of FilterNodesByACL to appendPeerChanges, which seems to help with performance. (That would be redundant with the aforementioned changes, but if desired, we can nevertheless submit also this patch as a PR.)

@aergus-tng commented on GitHub (Mar 28, 2025): Hello, while evaluating the possibility of running a Headscale network in our organization, we have identified a few performance bottlenecks we are interested in addressing. We are aware of #2491 and your general preference of maintainability over performance, but one of the "hot paths" points to this issue regarding a planned change, so we wanted to query the status. Namely, there is the following in the `CanAccess` method of `*Node`. ```go // TODO(kradalby): Regenerate this every time the filter change, instead of // every time we use it. // Part of #2416 matchers := make([]matcher.Match, len(filter)) for i, rule := range filter { matchers[i] = matcher.MatchFromFilterRule(rule) } ``` (`CanAccess` itself is called in a loop `FilterNodesByACL`, which is in the call stack of `FullMapResponse` and `PeerChangedResponse`. In the end calls to `ParseIPSet` further down the call stack of `MatchFromFilterRule` add up to a significant amount of CPU time for these responses.) Do you have specific plans for the implementation of this change? We would be interested in submitting a PR if there are no blockers. (If that is the case: How exactly would you like the "cached" matches to be accessed? Would you add a new method to the `PolicyManager` interface or implement caching somehow only for the version 2 policy manager?) PS: FWIW, we already tested the [attached patch](https://github.com/user-attachments/files/19501123/0001-Compute-matches-from-ACL-rules-before-a-loop.patch.txt) that moves the generation of the matches out of the loop of `FilterNodesByACL` to `appendPeerChanges`, which seems to help with performance. (That would be redundant with the aforementioned changes, but if desired, we can nevertheless submit also this patch as a PR.)
Author
Owner

@kradalby commented on GitHub (Mar 31, 2025):

@aergus-tng in general, I am a bit conservative when it comes to changing too much at once, but in this particular case, I do agree that this is pretty low hanging fruit and I did even mark it as such.

If you would be up for doing a PR to implement this I would probably manage to get this into 0.26 or 0.27.

I making a method for returning the precomputed match filter on the Policy interface makes sense,

to not waste time on v1, I suggest that the basic "compute on every call" is implemented for v1, and then a cached "recompute on filter change" for v2.

Let me know if you have any questions, I wrote this up a bit quickly as I'm travelling/conferencing this week.

@kradalby commented on GitHub (Mar 31, 2025): @aergus-tng in general, I am a bit conservative when it comes to changing too much at once, but in this particular case, I do agree that this is pretty low hanging fruit and I did even mark it as such. If you would be up for doing a PR to implement this I would probably manage to get this into 0.26 or 0.27. I making a method for returning the precomputed match filter on the Policy interface makes sense, to not waste time on v1, I suggest that the basic "compute on every call" is implemented for v1, and then a cached "recompute on filter change" for v2. Let me know if you have any questions, I wrote this up a bit quickly as I'm travelling/conferencing this week.
Author
Owner

@kradalby commented on GitHub (May 14, 2025):

Closing as finished as part of 0.26. Further Policy work will be tracked in new issues.

@kradalby commented on GitHub (May 14, 2025): Closing as finished as part of 0.26. Further Policy work will be tracked in new issues.
Author
Owner

@almereyda commented on GitHub (Sep 23, 2025):

The wealth of slightly related follow ups can be found via the sister story:

@almereyda commented on GitHub (Sep 23, 2025): The wealth of slightly related follow ups can be found via the sister story: - #2417
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/headscale#937