Headscale does not start with large networks in ACLs #485

Closed
opened 2025-12-29 02:18:56 +01:00 by adam · 6 comments
Owner

Originally created by @pkrivanec on GitHub (Apr 22, 2023).

Hi,

headscale v0.22.1 does not work with large networks (10.0.0.0/8) in ACLs.
The v0.21.0 version had no problem with this configuration.

Bug description

Headscale does not start correctly and uses 100% CPU and memory.

To Reproduce
Use acl.hujson:

{
    "groups": {
        "group:test": [
            "testuser"
        ]
    },
    "acls": [
        {
            "action": "accept",
            "src": [
                "group:test"
            ],
            "dst": [
                "tag:exitnode:*",
                "10.0.0.0/8:*"
            ]
        }
    ]
}

Context info

  • Version of headscale used 0.22.1
  • OS Rocky Linux release 9.1
  • Kernel version 5.14.0-162.23.1.el9_1.x86_64
Originally created by @pkrivanec on GitHub (Apr 22, 2023). Hi, headscale v0.22.1 does not work with large networks (10.0.0.0/8) in ACLs. The v0.21.0 version had no problem with this configuration. **Bug description** Headscale does not start correctly and uses 100% CPU and memory. **To Reproduce** Use acl.hujson: { "groups": { "group:test": [ "testuser" ] }, "acls": [ { "action": "accept", "src": [ "group:test" ], "dst": [ "tag:exitnode:*", "10.0.0.0/8:*" ] } ] } **Context info** - Version of headscale used 0.22.1 - OS Rocky Linux release 9.1 - Kernel version 5.14.0-162.23.1.el9_1.x86_64
adam added the bug label 2025-12-29 02:18:56 +01:00
adam closed this issue 2025-12-29 02:18:57 +01:00
Author
Owner

@Chluz commented on GitHub (Apr 22, 2023):

Hi, Same issue here. I'm actually using large networks in order to allow access through exit nodes to the internet only, so quite problematic.

{
      "action": "accept",
      "src": ["group:external"],
      "dst": ["tag:internet-external:0",
               "0.0.0.0/5:*",
               "8.0.0.0/7:*",
               "11.0.0.0/8:*",
               "12.0.0.0/6:*",
               "16.0.0.0/4:*",
               "32.0.0.0/3:*",
               "64.0.0.0/2:*",
               "128.0.0.0/3:*",
               "160.0.0.0/5:*",
               "168.0.0.0/6:*",
               "172.0.0.0/12:*",
               "172.32.0.0/11:*",
               "172.64.0.0/10:*",
               "172.128.0.0/9:*",
               "173.0.0.0/8:*",
               "174.0.0.0/7:*",
               "176.0.0.0/4:*",
               "192.0.0.0/9:*",
               "192.128.0.0/11:*",
               "192.160.0.0/13:*",
               "192.169.0.0/16:*",
               "192.170.0.0/15:*",
               "192.172.0.0/14:*",
               "192.176.0.0/12:*",
               "192.192.0.0/10:*",
               "193.0.0.0/8:*",
               "194.0.0.0/7:*",
               "196.0.0.0/6:*",
               "200.0.0.0/5:*",
               "208.0.0.0/4:*"
              ]
    },
@Chluz commented on GitHub (Apr 22, 2023): Hi, Same issue here. I'm actually using large networks in order to allow access through exit nodes to the internet only, so quite problematic. ``` { "action": "accept", "src": ["group:external"], "dst": ["tag:internet-external:0", "0.0.0.0/5:*", "8.0.0.0/7:*", "11.0.0.0/8:*", "12.0.0.0/6:*", "16.0.0.0/4:*", "32.0.0.0/3:*", "64.0.0.0/2:*", "128.0.0.0/3:*", "160.0.0.0/5:*", "168.0.0.0/6:*", "172.0.0.0/12:*", "172.32.0.0/11:*", "172.64.0.0/10:*", "172.128.0.0/9:*", "173.0.0.0/8:*", "174.0.0.0/7:*", "176.0.0.0/4:*", "192.0.0.0/9:*", "192.128.0.0/11:*", "192.160.0.0/13:*", "192.169.0.0/16:*", "192.170.0.0/15:*", "192.172.0.0/14:*", "192.176.0.0/12:*", "192.192.0.0/10:*", "193.0.0.0/8:*", "194.0.0.0/7:*", "196.0.0.0/6:*", "200.0.0.0/5:*", "208.0.0.0/4:*" ] }, ```
Author
Owner

@pkrivanec commented on GitHub (Apr 23, 2023):

Hi,

I started debugging and found a possible problem with the generation of the ACLPeerCacheMap.

The generateACLPeerCacheMap function in acls.go creates a struct for each destination IP.
This will not scale for large network ranges.

If I understand the code correctly, the code should only look if the peer is included in the destination prefix.
It might be better to store a list of destination prefixes for each source and use the IPNet.Contains function to check if the IP is included.

@pkrivanec commented on GitHub (Apr 23, 2023): Hi, I started debugging and found a possible problem with the generation of the ACLPeerCacheMap. The generateACLPeerCacheMap function in acls.go creates a struct for each destination IP. This will not scale for large network ranges. If I understand the code correctly, the code should only look if the peer is included in the destination prefix. It might be better to store a list of destination prefixes for each source and use the IPNet.Contains function to check if the IP is included.
Author
Owner

@kradalby commented on GitHub (Apr 24, 2023):

Yes, this is kind of known, the logic was not correct, and the current working version is not attempted optimised, the main goal was for it to be correct. The goal would be to rewrite it later to provide a version that does not do something silly like listing all IPs.

As @pkrivanec the way to do this properly is by doing ip prefixes instead of single IPs, but that required more planning and rewrites, and we needed to get the tests right first, before we can start changing it.

@kradalby commented on GitHub (Apr 24, 2023): Yes, this is kind of known, the logic was not correct, and the current working version is not attempted optimised, the main goal was for it to be correct. The goal would be to rewrite it later to provide a version that does not do something silly like listing all IPs. As @pkrivanec the way to do this properly is by doing ip prefixes instead of single IPs, but that required more planning and rewrites, and we needed to get the tests right first, before we can start changing it.
Author
Owner

@codyro commented on GitHub (Apr 26, 2023):

This unfortunately is a blocker for our organization too.

Edit: I noticed #1377 has a potential fix. I'll give that patch a try and report back in that issue.

@codyro commented on GitHub (Apr 26, 2023): This unfortunately is a blocker for our organization too. Edit: I noticed #1377 has a potential fix. I'll give that patch a try and report back in that issue.
Author
Owner

@kradalby commented on GitHub (Apr 27, 2023):

I have rewritten this whole thing in https://github.com/juanfont/headscale/pull/1381, this is an unoptimised, but "way smarter and cleaner" solution, I have added @Chluz case as a test case so we don't regress.

The initial version was a lot of cladding on a suboptimal temp solution, but this should be closer to something we can live with. I think it should not need optimalisations before it's useful, but I'll look into that as part of ACL refactoring.

The previous few PRs was mostly to get tests that we could be confident in so we now can refactor it without too much worry. I appreciate that it lead to one really bad decision, but now we can write a faster solution without breaking things.

Would be great if someone tests #1381 while I am working on it, and apologies for the inconvenience.

@kradalby commented on GitHub (Apr 27, 2023): I have rewritten this whole thing in https://github.com/juanfont/headscale/pull/1381, this is an unoptimised, but "way smarter and cleaner" solution, I have added @Chluz case as a test case so we don't regress. The initial version was a lot of cladding on a suboptimal temp solution, but this should be closer to something we can live with. I _think_ it should not _need_ optimalisations before it's useful, but I'll look into that as part of ACL refactoring. The previous few PRs was mostly to get tests that we could be confident in so we now can refactor it without too much worry. I appreciate that it lead to one really bad decision, but now we can write a faster solution without breaking things. Would be great if someone tests #1381 while I am working on it, and apologies for the inconvenience.
Author
Owner

@kradalby commented on GitHub (May 10, 2023):

resolved in next release

@kradalby commented on GitHub (May 10, 2023): resolved in next release
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/headscale#485