[Bug] CLI could not connect to a server #971

Closed
opened 2025-12-29 02:26:52 +01:00 by adam · 9 comments
Owner

Originally created by @YouSysAdmin on GitHub (Mar 12, 2025).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

headscale nodes list

2025-03-12T13:56:11+02:00 FTL ../../../../../../home/runner/work/headscale/headscale/cmd/headscale/cli/utils.go:124 > Could not connect: context deadline exceeded error="context deadline exceeded"

Expected Behavior

list of nodes

Steps To Reproduce

  1. touch ~/.headscale/config.yaml
  2. export HEADSCALE_CLI_API_KEY=************
     export HEADSCALE_CLI_ADDRESS=access*****:443
  3. execute headscale nodes list
  4. Check that gRPC is working correctly:

grpcurl -H "authorization: Bearer ${HEADSCALE_CLI_API_KEY}" "${HEADSCALE_CLI_ADDRESS}" 'headscale.v1.HeadscaleService.ListNodes'
"nodes": [
    {
      "id": "5",
      "machineKey": "mkey:**********",
      "nodeKey": "nodekey:**********",
      "discoKey": "discokey:**********",
      "ipAddresses": [
        "100.64.0.2",
        "fd7a:115c:a1e0::2"
      ],
.....

Environment

- OS: Server Kubernetes (used official image) / Client MacOS ARM64
- Headscale Server: 0.25.1 / Client 0.25.1
- Tailscale version:

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Anything else?

Only remote CLI is affected, all other functions work correctly.

Update: I tested older versions; the latest working version is 0.23.0, which connects and can set the policy.

adam added the bug label 2025-12-29 02:26:52 +01:00
adam closed this issue 2025-12-29 02:26:52 +01:00

@nblock commented on GitHub (Mar 15, 2025):

I tested the remote-cli with 0.25.1 as described in the docs (without reverse proxy or container) and it works.

local config testing.yml:

cli:
  address: headscale.example.com:50443
  api_key: rS-0soL.8OfdRblablablbablablblazYX5kd

Invocation: ./headscale_0.25.1_linux_amd64 -c testing.yml user list

Please note that the address has to be configured without http:// or https://. Can you please check your configuration again?
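
The note above about the address format can be checked mechanically. The following is a minimal sketch, assuming a POSIX shell; `normalize_cli_address` is a hypothetical helper, not part of headscale, that strips an `http://` or `https://` prefix and any trailing path so that only `host:port` remains.

```shell
# Hypothetical helper (not part of headscale): reduce a URL-like value to
# the plain host:port form expected for the remote CLI address.
normalize_cli_address() {
  addr="$1"
  addr="${addr#http://}"    # drop a leading http:// if present
  addr="${addr#https://}"   # drop a leading https:// if present
  addr="${addr%%/*}"        # drop everything from the first slash onwards
  printf '%s\n' "$addr"
}

# Example: export a cleaned address for the remote CLI.
HEADSCALE_CLI_ADDRESS="$(normalize_cli_address 'https://headscale.example.com:50443/')"
export HEADSCALE_CLI_ADDRESS
```

A value that already has the `host:port` form passes through unchanged.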

Docs referenced above: https://headscale.net/0.25.1/ref/remote-cli/

@YouSysAdmin commented on GitHub (Mar 15, 2025):

Hi @nblock
I get the same result when using a ~/.headscale/config.yaml file.
All config variables are set correctly, judging by the trace output and additional debug output (inside the utils/newHeadscaleCLIWithConfig function).

I have compiled a conditionally working version by downgrading some packages (I haven't tested all the CLI functions):

diff --git a/go.mod b/go.mod
index ecf94318..d75df6da 100644
--- a/go.mod
+++ b/go.mod
@@ -16,7 +16,7 @@ require (
        github.com/google/go-cmp v0.6.0
        github.com/gorilla/mux v1.8.1
        github.com/grpc-ecosystem/go-grpc-middleware v1.4.0
-       github.com/grpc-ecosystem/grpc-gateway/v2 v2.24.0
+       github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0
        github.com/jagottsicher/termcolor v1.0.2
        github.com/klauspost/compress v1.17.11
        github.com/oauth2-proxy/mockoidc v0.0.0-20240214162133-caebfff84d25
@@ -42,8 +42,8 @@ require (
        golang.org/x/net v0.34.0
        golang.org/x/oauth2 v0.25.0
        golang.org/x/sync v0.10.0
-       google.golang.org/genproto/googleapis/api v0.0.0-20241216192217-9240e9c98484
-       google.golang.org/grpc v1.69.0
+       google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1
+       google.golang.org/grpc v1.66.0
        google.golang.org/protobuf v1.36.0
        gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
        gopkg.in/yaml.v3 v3.0.1

routes list output:

> go run cmd/headscale/headscale.go -c ~/.headscale/config.yaml routes list
2025-03-15T14:23:14+02:00 DBG Setting timeout timeout=5000
2025-03-15T14:23:14+02:00 TRC cmd/headscale/cli/utils.go:121 > Connecting via gRPC address=my.server.example:443
ID  | Node                                  | Prefix           | Advertised | Enabled | Primary
209 | vpn-router                       | 10.1.4.0/24    | true       | true    | true
210 | vpn-router                        | 10.1.0.0/16     | true       | true    | true
211 | vpn-router                         | 10.2.0.0/16     | true       | true    | true
212 | vpn-router                        | 10.30.0/16     | true       | true    | true

Topology (mermaid diagram):

graph TD
    A[Traefik 443 headscale.example.com] --> C[8080]
    B[Traefik 443 grpc-headscale.example.com]--> D[50443 h2c]
    C --> E
    D --> E
    E(headscale container HTTP 8080, GRPC 50443)

@nblock commented on GitHub (Mar 15, 2025):

Please try without a reverse proxy in between.


@YouSysAdmin commented on GitHub (Mar 16, 2025):

@nblock
This headscale instance runs inside a Kubernetes cluster, and external connections are possible only via Traefik.

I used kubectl port-forward to forward port 50443 to my local machine and tried again; it does not work with any CLI version
(regardless of whether it is configured via CLI or file).

❯ HEADSCALE_CLI_INSECURE=1 HEADSCALE_CLI_ADDRESS=127.0.0.1:50443 headscale node list
2025-03-16T12:23:58+02:00 FTL ../../../home/runner/work/headscale/headscale/cmd/headscale/cli/utils.go:124 > Could not connect: context deadline exceeded error="context deadline exceeded"

grpcurl works fine:

grpcurl -plaintext -H 'authorization: Bearer TOKEN' 127.0.0.1:50443 headscale.v1.HeadscaleService.GetRoutes

headscale server config:

server_url: https://access.example.com
listen_addr: 0.0.0.0:8080
grpc_listen_addr: 0.0.0.0:50443
grpc_allow_insecure: true

@plittlefield commented on GitHub (May 23, 2025):

Same here, I'm behind Traefik but have Tailscale running on my nodes so CAN use that perfectly.

This is my config ...

$ (mbp-linux) cat .headscale/config.yaml
cli:
    address: headscale.mydomain.uk:443
    api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxLtSX9_hWC-sz

I'm getting this error ...

$ (mbp-linux) headscale nodes list
Cannot get nodes: unexpected HTTP status code received from server: 404 (Not Found); malformed header: missing HTTP content-type

I have tried swapping the URL for a Tailscale IP and port that I can telnet to ...

$ (mbp-linux) telnet 100.64.0.4 8080
Trying 100.64.0.4...
Connected to 100.64.0.4.
Escape character is '^]'.
^]
telnet> quit
Connection closed.

... and put this in my config ...

$ (mbp-linux) cat .headscale/config.yaml
cli:
    address: 100.64.0.4:8080
    api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxLtSX9_hWC-sz

... and this time I get a different error ...

$ (mbp-linux) headscale nodes list
2025-05-23T10:58:58+01:00 FTL ../runner/work/headscale/headscale/cmd/headscale/cli/utils.go:124 > Could not connect: context deadline exceeded error="context deadline exceeded"

Any clue?

FYI I have headscale-admin working fine on Traefik.

Thanks,

Paully


@YouSysAdmin commented on GitHub (May 23, 2025):

Hi @plittlefield
I don't have any solution for it.
I just use v0.23 to set a policy, etc., and it works fine. :)

There may be some dependencies on headers that are not passed through Traefik to headscale, but I haven't had time to conduct this research across the two versions of the GRPC package.

P.S. I haven't tested version 0.26 yet.


@ozhankaraman commented on GitHub (Jun 8, 2025):

I am using a Docker-based Let's Encrypt + HAProxy + headscale (v0.26.1) setup, and the macOS client works fine for me. I am using the standard Tailscale package from their official website (Tailscale-1.84.1-macos.pkg). My HAProxy config for headscale is simple, nothing fancy; the backend config looks like this:

backend headscale_backend
    mode http
    server headscale_server 172.17.0.1:8080 check

@YouSysAdmin commented on GitHub (Jul 29, 2025):

@ozhankaraman this issue is not about the Tailscale client; it is only about Headscale's remote CLI (gRPC) :)

I have no idea what the problem is here.

Headscale and CLI v0.26.1 (get/set policy):

  • Docker compose + Traefik - no problem
  • Kubernetes + Traefik - Could not connect: context deadline exceeded error="context deadline exceeded"
  • GRPCURL - no problem

Headscale v0.26.1 + CLI v0.23.0 (get/set policy):

  • Docker compose + Traefik - no problem
  • Kubernetes + Traefik - no problem
  • GRPCURL - no problem

This definitely started after the gRPC library version was updated, but I still haven't found the cause or a fix.
Interestingly, this gRPC problem only occurs with Headscale, and it's not clear why; I have many other gRPC-based tools.
I need to dig into the changes after google.golang.org/grpc v1.66.0.

# WWW -> AWS NLB -> Kube Node -> Traefik -> SVC -> POD
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  annotations:
    kubernetes.io/ingress.class: traefik
  name: ingress-route
spec:
  routes:
    - kind: Rule
      match: Host(`access.example.co`)
      middlewares:
        - name: headscale-cors-middleware
          namespace: headscale
      priority: 10
      services:
        - kind: Service
          name: headscale-svc
          port: 8080
          scheme: http
    - kind: Rule
      match: Host(`access-grpc.example.co`)
      priority: 10
      services:
        - kind: Service
          name: headscale-svc
          port: 50443
          scheme: h2c
          passHostHeader: true
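
When comparing a working and a failing CLI build, it can help to confirm which grpc-go version each binary was actually compiled against. The sketch below assumes the Go toolchain is installed locally and that the binary carries embedded module information (readable with `go version -m`). `grpc_dep_version` is a hypothetical helper that filters that output.

```shell
# Hypothetical helper: read `go version -m <binary>` output on stdin and
# print the version of the google.golang.org/grpc dependency, if listed.
grpc_dep_version() {
  awk '$1 == "dep" && $2 == "google.golang.org/grpc" { print $3 }'
}

# Usage (the binary path is an assumption; adjust as needed):
#   go version -m "$(command -v headscale)" | grpc_dep_version
```

Running this against both the v0.23.0 and v0.26.1 binaries would show whether they straddle the v1.66.0 boundary mentioned above.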

@YouSysAdmin commented on GitHub (Jul 29, 2025):

Hi @nblock
I found the problem. Falling bombs with five hundred kilograms of TNT clear my brain well :D

This is actually a misconfiguration of the AWS Network Load Balancer.

fix:

  1. In the main navigation panel, under Load Balancing, choose Load Balancers.
  2. Click inside the Filter by tags and attributes or search by keyword box, select Type and choose network to list the Network Load Balancers available in the current AWS region.
  3. Select the Network Load Balancer (NLB) that you want to examine.
  4. Select the Listeners tab from the console bottom panel to access the load balancer listeners.
  5. Select the TLS : 443 listener and choose Edit to access the TLS listener configuration.
  6. In the Listener details section, check the name of the policy selected for ALPN Policy. If there is no TLS ALPN policy configured for the selected listener and the ALPN Policy is set to None, change this option to the HTTP2Preferred.

If using HTTP2Preferred is not possible for you, you can use an additional environment variable for the client.
GRPC_ENFORCE_ALPN_ENABLED=false headscale get policy [etc]
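
To confirm whether the listener negotiates ALPN before and after the change, the TLS handshake can be inspected from the client side. This is a minimal sketch, assuming `openssl` is installed and that its `s_client` output contains either an `ALPN protocol: ...` line or `No ALPN negotiated`. `alpn_from_s_client` is a hypothetical helper, and the hostname and listener ARN in the comments are placeholders.

```shell
# Hypothetical helper: read `openssl s_client` output on stdin and print the
# negotiated ALPN protocol, or "none" if no ALPN protocol was negotiated.
alpn_from_s_client() {
  awk '
    /^ALPN protocol:/ { print $3; found = 1; exit }
    END { if (!found) print "none" }
  '
}

# Usage against a live endpoint (placeholder hostname):
#   openssl s_client -connect access-grpc.example.co:443 -alpn h2 </dev/null 2>/dev/null | alpn_from_s_client
#
# Assumed AWS CLI equivalent of the console steps (placeholder listener ARN):
#   aws elbv2 modify-listener --listener-arn "$LISTENER_ARN" --alpn-policy HTTP2Preferred
```

A result of `h2` suggests the listener offers HTTP/2 via ALPN, which newer grpc-go clients appear to require by default. A result of `none` matches the failing case described above.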

This is related to gRPC enforcing the ALPN protocol: https://github.com/grpc/grpc-go/issues/434
Reference: starred/headscale#971