[Bug] Invalid ACL stored in DB, crash on startup #1036

Closed
opened 2025-12-29 02:27:51 +01:00 by adam · 8 comments

Originally created by @stblassitude on GitHub (May 27, 2025).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I used headscale-admin to update the ACL, then I restarted headscale. On startup, it complained about a host referenced in the ACLs that wasn't defined (I had just deleted it, thinking that it wasn't needed anymore).

On startup, it complained about the missing host:

handshake-headscale-1  | 2025-05-27T09:18:31Z INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
handshake-headscale-1  | 2025-05-27T09:18:31Z FTL home/runner/work/headscale/headscale/cmd/headscale/cli/serve.go:24 > Error initializing error="creating new headscale: loading ACL policy: creating policy manager: parsing policy: Host \"siemens-distillery\" is not defined in the Policy, please define or remove the reference to it"

and then exited.

I patched the database manually using sqlite3 (I basically copied an older version over the current version in the policies table), and that let me start headscale again.
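A minimal sketch of that kind of manual fix, written in Python for illustration. The table layout (`policies` with an `id` and a `data` column) and the idea that the row with the highest `id` is the active policy are assumptions that should be checked against the real database first; the function name `restore_previous_policy` is hypothetical. Stop headscale and back up `db.sqlite` before trying anything like this.

```python
import sqlite3


def restore_previous_policy(conn: sqlite3.Connection) -> int:
    """Overwrite the newest policy row with the data of the row before it.

    Assumption: a table `policies(id, data)` exists and the row with the
    highest id is the one that is loaded on startup.
    """
    rows = conn.execute(
        "SELECT id, data FROM policies ORDER BY id DESC LIMIT 2"
    ).fetchall()
    if len(rows) < 2:
        raise RuntimeError("no older policy row to restore from")
    (newest_id, _broken_data), (_older_id, older_data) = rows
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE policies SET data = ? WHERE id = ?", (older_data, newest_id)
        )
    return newest_id


if __name__ == "__main__":
    # Demo against an in-memory database, not a real headscale db.sqlite.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, data TEXT)")
    demo.executemany(
        "INSERT INTO policies (data) VALUES (?)",
        [('{"acls": []}',), ('{"acls": "broken"}',)],
    )
    restore_previous_policy(demo)
    print(demo.execute("SELECT id, data FROM policies ORDER BY id").fetchall())
```

To use it on a copy of a real database, replace the in-memory connection with `sqlite3.connect("/path/to/copy-of-db.sqlite")` and inspect the rows before writing.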

This is with docker.io/headscale/headscale:v0.26.0 and docker.io/goodieshq/headscale-admin:v0.25.6.

Expected Behavior

Headscale should either refuse to store the broken ACL, or ignore the broken (part of the) ACLs on startup, so users can fix it through the command line or the web interface.
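To illustrate the "refuse to store" option, here is a rough Python sketch of a check that could run before a policy is saved. It is not headscale's parser: the alias rules are simplified (users contain `@`, groups, tags and autogroups are recognized by prefix, IPs and CIDRs are skipped, IPv6 destinations are not handled), and the function name `undefined_hosts` is hypothetical.

```python
import ipaddress


def _is_ip_or_cidr(alias: str) -> bool:
    """True if the alias parses as an IP address or CIDR prefix."""
    try:
        ipaddress.ip_network(alias, strict=False)
        return True
    except ValueError:
        return False


def undefined_hosts(policy: dict) -> list[str]:
    """Return host aliases used in ACL rules but missing from policy["hosts"]."""
    defined = set(policy.get("hosts", {}))
    missing: list[str] = []
    for rule in policy.get("acls", []):
        sources = rule.get("src", [])
        # Destinations look like "alias:ports"; drop the port part.
        destinations = [d.rsplit(":", 1)[0] for d in rule.get("dst", [])]
        for alias in [*sources, *destinations]:
            if alias == "*" or "@" in alias or _is_ip_or_cidr(alias):
                continue
            if alias.startswith(("group:", "tag:", "autogroup:")):
                continue
            # Anything left is treated as a host name that must be defined.
            if alias not in defined and alias not in missing:
                missing.append(alias)
    return missing


if __name__ == "__main__":
    policy = {
        "hosts": {"web": "100.64.0.10"},
        "acls": [
            {"action": "accept", "src": ["alice@"], "dst": ["siemens-distillery:443"]}
        ],
    }
    print(undefined_hosts(policy))
```

A save handler could reject the update whenever the returned list is non-empty, which would have caught the deleted host before it reached the database.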

Steps To Reproduce

  1. Bring up headscale and headscale admin with docker-compose
  2. Add a host to the policies
  3. Add an ACL referencing that host
  4. Save the config
  5. Remove the host
  6. Save the config; note that there is no error
  7. Restart headscale and observe that it won't start.

Environment

- OS: docker compose
- Headscale version: v0.26.0
- Tailscale version: n/a
- Headscale Admin: v0.25.6

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Debug information

see above

adam added the bug, no-stale-bot, policy 📝 labels 2025-12-29 02:27:51 +01:00
adam closed this issue 2025-12-29 02:27:51 +01:00

@dulinux commented on GitHub (Jun 1, 2025):

Similar issue here. Was working before pulling updated image.

[...]
headscale        | 2025-06-01T13:03:49-03:00 INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
headscale        | 2025-06-01T13:03:49-03:00 FTL home/runner/work/headscale/headscale/cmd/headscale/cli/serve.go:24 > Error initializing error="creating new headscale: loading ACL policy: creating policy manager: parsing policy: parsing policy from bytes: Invalid Owner \"duli\". An alias must be one of the following types:\n- user (containing an \"@\")\n- group (starting with \"group:\")\n- tag (starting with \"tag:\")\n\nPlease check the format and try again."
[...]
duli@oc1:/opt/docker/headscale$ docker image list
REPOSITORY                           TAG        IMAGE ID       CREATED        SIZE
headscale/headscale                  stable     d70eeb8fb774   N/A            80.8MB
[...]

@malosaaa commented on GitHub (Jun 3, 2025):

Just open the database in sqlbrowser on Windows. First copy the ACLs out of the last entry in the policy table, then delete that entry. Modify the copied ACLs and insert them again later, for example with the headplane UI. That works.

For me, the last update to the ACLs broke everything, and I got annoyed at having to reconfigure everything. However, I'm still having issues.


@bratorange commented on GitHub (Jun 7, 2025):

+1, my ACLs also got screwed up.


@stblassitude commented on GitHub (Jul 5, 2025):

Another example:

home/runner/work/headscale/headscale/cmd/headscale/cli/serve.go:24 > Error initializing error="creating new headscale: loading ACL policy: creating policy manager: parsing policy: parsing policy from bytes: Username has to contain @, got: \"foobar\"

Again, since headscale fails to start up, the only way out is to either throw away the entire configuration or try to patch the database manually.


@nblock commented on GitHub (Jul 5, 2025):

> … the only way out is to either throw away the entire configuration, or try and patch the database manually.

Have a look at "Migration notes when the policy is stored in the database." in the Changelog.


@kradalby commented on GitHub (Sep 10, 2025):

OK, these are the three options I have come up with. I don't care much for any of them:

Let it start with an allow-all policy -> security risk. Also, if it doesn't fail and yell, how would people know that they are now running with a wide-open network without checking the logs?

Let it start with a deny-all policy -> UX annoyance. People would at least notice, but not easily: they would have to go to the logs and see it yell, after wondering "why doesn't anything work". This would break your current network the moment the nodes reconnect, which I do not think is acceptable.

Add a CLI flag that bypasses gRPC and allows you to pass the database file and policy file directly. This is kind of a "helper" to patch the database yourself. It would be a bit weird since you can't use it remotely; it would only work on the server where headscale runs.


@nblock commented on GitHub (Sep 10, 2025):

Maybe there could be two mutually exclusive configuration options (or CLI flags) to bypass whatever policy is configured and apply "allow all" or "deny all" temporarily. This should log scary warnings when used, to make it clear that it is just a temporary measure to fix up a policy in the GUI. Maybe:

  • --bypass-policy-allow-all
  • --bypass-policy-deny-all
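
As a rough illustration of how two mutually exclusive bypass flags with loud warnings could behave, here is a small Python sketch using `argparse`. It is not headscale code, and the flags are only the ones proposed above, not options that exist in headscale; the program name `serve-sketch` and the function names are hypothetical.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Build a parser in which the two bypass flags cannot be combined."""
    parser = argparse.ArgumentParser(prog="serve-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--bypass-policy-allow-all", action="store_true")
    group.add_argument("--bypass-policy-deny-all", action="store_true")
    return parser


def effective_policy_mode(args: argparse.Namespace) -> str:
    """Return which policy would apply and warn loudly when bypassing."""
    if args.bypass_policy_allow_all:
        logging.warning("POLICY BYPASSED: ALL traffic is allowed. Temporary measure only!")
        return "allow-all"
    if args.bypass_policy_deny_all:
        logging.warning("POLICY BYPASSED: ALL traffic is denied. Temporary measure only!")
        return "deny-all"
    return "stored"


if __name__ == "__main__":
    print(effective_policy_mode(build_parser().parse_args()))
```

Passing both flags makes `argparse` exit with a usage error, so the two modes cannot be combined by accident.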

@kradalby commented on GitHub (Sep 10, 2025):

After discussing this with @nblock, we think the best option is to add a --database-path flag to headscale policy set|get, which bypasses the gRPC connection to the server and allows you to extract and set the policy in the database without the server running.

This will of course only work for SQLite.

Reference: starred/headscale#1036