[Bug] ERR noise upgrade failed error #932

Closed
opened 2025-12-29 02:26:19 +01:00 by adam · 17 comments

Originally created by @SysAdminSmith on GitHub (Jan 31, 2025).

Is this a support request?

- [x] This is not a support request

Is there an existing issue for this?

- [x] I have searched the existing issues

Current Behavior

When attempting to use custom certificates (backed by a trusted CA; purchased through Comodo) I get the following:

```
Jan 31 17:16:59 DD-0125-003VM headscale[12331]: 2025-01-31T17:16:59Z ERR noise upgrade failed error="noise handshake failed: decrypting machine key: chacha20poly1305: message authentication failed"
Jan 31 17:16:59 DD-0125-003VM headscale[12331]: 2025/01/31 17:16:59 http: response.WriteHeader on hijacked connection from github.com/juanfont/headscale/hscontrol.(*Headscale).NoiseUpgradeHandler (noise.go:83)
Jan 31 17:16:59 DD-0125-003VM headscale[12331]: 2025/01/31 17:16:59 http: response.Write on hijacked connection from fmt.Fprintln (print.go:305)
```

Expected Behavior

I expect the cert to work. openssl verifies the chain to include leaf, intermediate, and root certs, and the key matches the pem. The cert and key are chown'd to headscale:headscale with permissions 644 and 600, respectively.
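For reference, those checks can be reproduced with something along these lines (a minimal sketch; `fullchain.pem` and `private.key` are placeholder paths for the actual cert and key files):

```sh
# Confirm the private key matches the certificate (the public key digests must be identical)
openssl x509 -in fullchain.pem -noout -pubkey | openssl sha256
openssl pkey -in private.key -pubout | openssl sha256

# Verify the leaf against the bundled intermediate/root certificates
openssl verify -CAfile fullchain.pem fullchain.pem

# Check ownership and permissions (expected: headscale:headscale, 644 and 600)
ls -l fullchain.pem private.key
```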

Steps To Reproduce

1. Modify the config to use custom certificates (see the config sketch below)
2. Start headscale
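For context, the config change in step 1 usually amounts to pointing headscale at the certificate and key in `config.yaml`. A minimal sketch with placeholder hostname and paths (exact key names may vary slightly between headscale versions):

```yaml
# /etc/headscale/config.yaml (excerpt) -- placeholder values
server_url: https://headscale.example.com:443
listen_addr: 0.0.0.0:443

# Leave the built-in Let's Encrypt/ACME settings empty and use the purchased cert instead
tls_letsencrypt_hostname: ""
tls_cert_path: /etc/headscale/certs/fullchain.pem
tls_key_path: /etc/headscale/certs/private.key
```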

Environment

- OS: Debian GNU/Linux 12 (bookworm)
- Headscale version: v0.24.2
- Tailscale version: latest per client

Runtime environment

- [ ] Headscale is behind a (reverse) proxy
- [ ] Headscale runs in a container

Anything else?

The Headscale server is hosted in Azure.

adam added the stale and bug labels 2025-12-29 02:26:19 +01:00
adam closed this issue 2025-12-29 02:26:19 +01:00

@Cobertos commented on GitHub (Feb 1, 2025):

Is this just an error that shows up with any new headscale server? I'm seeing this in journalctl on my old headscale server (Oct 4 2024) and then just earlier today in journalctl when I started up a new headscale instance on a separate machine.

EDIT: To clarify, it showed up for an hour or so in the logs after first startup and then never showed up again. (As the subsequent comment describes)

EDIT: For additional context, my machines are self-hosted at home. The old one was Debian and the new one is Raspbian. No containers, no reverse proxy (punched a hole right through the router to the machine with headscale). The nodes do work as intended now.


@SysAdminSmith commented on GitHub (Feb 1, 2025):

I have no idea. It literally stopped showing up a few hours ago for me. Makes me itchy not knowing the cause/solution.


@ycsh-w commented on GitHub (Feb 1, 2025):

I am not sure if this is related to custom certificate or not. I am upgrading from 0.23 and just modified my config file and restarted, and I am seeing this now.


@nblock commented on GitHub (Feb 1, 2025):

When exactly does this happen (e.g. when a new node connects for the first time, when an existing node connects, or just random connections from the Internet)?

Do your tailscale clients work as expected?

Is there some "proxy/security" thing in Azure configured, maybe something that messes with Websocket POST?

Probably related: https://github.com/juanfont/headscale/issues/1295
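For anyone hitting this behind a reverse proxy: the Noise handshake runs over a long-lived upgraded HTTP connection, so the proxy has to pass the Upgrade/Connection headers through and must not buffer the stream. A hedged nginx sketch in the spirit of the commonly documented reverse-proxy setups (server name and upstream port are placeholders):

```nginx
# Hypothetical nginx excerpt; adjust server_name, certificates, and upstream to your setup
server {
    listen 443 ssl;
    server_name headscale.example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $http_connection;
        proxy_set_header Host $host;
        proxy_buffering off;
    }
}
```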


@SysAdminSmith commented on GitHub (Feb 1, 2025):

I have been unable to correlate it to any specific event. The nodes seem to work as intended but I haven't had time to implement any ACLs (which were not functioning at all and ultimately led to a reinstall of the server and all nodes back during 0.24.0 and 0.24.1)


@nblock commented on GitHub (Feb 1, 2025):

> I have been unable to correlate it to any specific event.

That makes it hard to narrow it down or reproduce. I checked several instances and have not seen this once. Can you test your setup without Azure? Like on some cheap VPS (hint: not cloudflare)?

> The nodes seem to work as intended but I haven't had time to implement any ACLs

This is probably unrelated to ACLs as the error message indicates problems during connection establishment.

How often does this error occur? Just once in a while or repeatedly?


@mawanasad commented on GitHub (Feb 17, 2025):

Did you delete a machine from the node list and join it again? I think there are some old key exchanges happening on the headscale end that are no longer valid once you have deleted the old node.
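If stale registrations are the suspicion, cleaning up the old node and re-authenticating the client looks roughly like this (the node ID and login server URL are placeholders, and flags may differ slightly between versions):

```sh
# On the headscale server: find and remove the stale node
headscale nodes list
headscale nodes delete --identifier 3

# On the affected client: re-register against the control server
tailscale logout
tailscale up --login-server https://headscale.example.com --force-reauth
```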


@Darcsis commented on GitHub (Feb 23, 2025):

This same issue is occurring on my new install of headscale running behind a traefik RP. I am running the headscale server in a Docker container on a Debian 12 VPS.
At startup the error messages come every few seconds.


@kradalby commented on GitHub (Feb 23, 2025):

@Darcsis can you try without traefik? I would not be surprised if it is the proxy's fault.
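One way to separate headscale from traefik is to query the health endpoint both directly on headscale's listen address and through the proxy, and compare the results (hostname and port are placeholders):

```sh
# Directly against headscale's listen_addr, bypassing traefik
curl -v http://127.0.0.1:8080/health

# Through the reverse proxy / public hostname
curl -v https://headscale.example.com/health
```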


@Darcsis commented on GitHub (Feb 27, 2025):

@kradalby ok, so after some reading on the Discord and GitHub, someone pointed to the dropping of support for client versions older than 1.62 as the reason for the error. Is this something that I need to configure on the headscale server?
(sort of a noob)


@kradalby commented on GitHub (Feb 27, 2025):

Do you have versions older than 1.62?


@Darcsis commented on GitHub (Feb 27, 2025):

On my tailscale clients, no; the only one I have connected to the server currently is running 1.70.0.


@kradalby commented on GitHub (Feb 27, 2025):

Then I would expect it to be unrelated; this is probably something to discuss on Discord rather than GitHub.


@jonny190 commented on GitHub (Mar 12, 2025):

@kradalby I'm having the same issue with the logs getting spammed every second, and I am running an earlier version on pfSense.

A tailscale update returns "already running stable version 1.54.0".


@jonny190 commented on GitHub (Mar 12, 2025):

Ended up updating via https://forum.netgate.com/topic/174525/how-to-update-to-the-latest-tailscale-version/116


@github-actions[bot] commented on GitHub (Jun 11, 2025):

This issue is stale because it has been open for 90 days with no activity.


@github-actions[bot] commented on GitHub (Jun 18, 2025):

This issue was closed because it has been inactive for 14 days since being marked as stale.
