[Bug] Headscale embedded Derper Server speed slow #1070

Closed
opened 2025-12-29 02:28:06 +01:00 by adam · 3 comments

Originally created by @kocy33 on GitHub (Jul 22, 2025).

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Hello,

I am running Headscale with the embedded Derp Server on a VPS with docker compose.
The iperf3 results from the VPS show fast speeds, and monitoring with htop I only see about 10% CPU utilization.
But I can only get approx. 1-2 MB/s throughput.

I have also tried public DERP servers, but that results in much worse latency and speed (700 KB/s).
I run through 5G with 464XLAT, and my local upload speed is 100 Mbit/s (so approx. 12 MB/s).
Is that the expected speed for Headscale?
Or did I misconfigure something?

My idea was to maybe run a WireGuard tunnel from the 5G WAN at home to the VPS.
(Because I can't open ports, since the 5G WAN sits behind CGNAT.)
Would that be useful?
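A tunnel like that could be sketched as follows. This is a minimal, hypothetical WireGuard config; every key, address, and endpoint here is a placeholder, not taken from this setup:

```ini
# Home side (behind CGNAT): initiates the tunnel outward to the VPS,
# so no inbound ports need to be opened on the 5G WAN.
[Interface]
PrivateKey = <home-private-key>
Address = 10.10.0.2/24

[Peer]
# VPS side, reachable on a public IP.
PublicKey = <vps-public-key>
Endpoint = vps.example.com:51820
AllowedIPs = 10.10.0.0/24
# Keep the CGNAT mapping alive so the VPS can still reach back in.
PersistentKeepalive = 25
```

The `PersistentKeepalive` setting is what makes this work behind CGNAT: the home side keeps the outbound NAT mapping open so traffic can flow both ways.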

My infrastructure currently looks like this:
client announcing IPv4 routes > pfSense > 5G WAN > VPS running the control server with embedded DERP.
(IPv6 is disabled everywhere.)

I also tested upload directly with iperf3 from the client to the VPS, and that shows no issue.

I tried accessing it from multiple outside clients, all with slow speed.

Does the embedded DERP server have bad performance? Is it advised to run a separate Docker container for the DERP server?
If so, could someone recommend one?
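For reference, the standalone relay Tailscale ships is the `derper` binary; a minimal way to build and run it could look like the sketch below. The hostname is a placeholder, and it assumes TCP 443 (plus UDP 3478 for STUN) is reachable on the host:

```shell
# Build the derper binary from the Tailscale repo (requires Go).
go install tailscale.com/cmd/derper@latest

# Run it with automatic Let's Encrypt certificates for the given hostname.
~/go/bin/derper -hostname derp.example.com -certmode letsencrypt
```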

The latency of the embedded DERP server seems to be good, since I can run 6 camera streams in real time.
(On public DERP servers that is not possible.)

iperf3 from VPS result:

[ 5] 5.00-6.00 sec 119 MBytes 997 Mbits/sec

[ 7] 5.00-6.00 sec 132 MBytes 1.11 Gbits/sec

[ 9] 5.00-6.00 sec 101 MBytes 846 Mbits/sec

[ 11] 5.00-6.00 sec 124 MBytes 1.04 Gbits/sec

[SUM] 5.00-6.00 sec 476 MBytes 4.00 Gbits/sec

Expected Behavior

Little overhead, but fast speed.

Steps To Reproduce

Use the embedded DERP server.
Observe slow speeds.

Environment

- OS: Debian / Windows
- Headscale version: 0.26.1
- Tailscale version:

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Debug information

Relay derp used because no direct connection possible through 5gwan

adam added the bug label 2025-12-29 02:28:06 +01:00
adam closed this issue 2025-12-29 02:28:06 +01:00

@kradalby commented on GitHub (Jul 22, 2025):

Does the embedded DERP server have bad performance? Is it advised to run a separate Docker container for the DERP server?
If so, could someone recommend one?

I have no particular expectations about the speed; it shares part of the web server with Headscale. If you're after performance, I would try running the separate `derper` binary from Tailscale.

I've never speed-tested the embedded DERP, nor have we done anything to ensure it's fast. I would view it as purely for convenience.
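If one did switch to a dedicated relay, the Headscale side would roughly amount to disabling the embedded server and serving a custom DERP map instead. A sketch of the relevant `config.yaml` fragment follows; the file path is a placeholder:

```yaml
derp:
  server:
    # Disable the embedded DERP server.
    enabled: false
  # Serve nodes a custom DERP map from a local file
  # that points at the dedicated derper instance.
  paths:
    - /etc/headscale/derp.yaml
```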


@kocy33 commented on GitHub (Jul 23, 2025):

Aye!
I am pretty happy with headscale!
Getting it up and running was smooth, pretty much drop-in. :)
Since this was the first time setting it up, I just wasn't sure what speeds to expect.

Since writing this I have done some more tests.
I set up the sending side with open ports and then tested the other side once relayed and once with open ports.
The directly connected side got full speed, and the relayed side got approx. half of it.
That made sense.

Do you reckon running a separate derper server would generally gain more speed compared to the embedded one?
Would running a Tailscale client on the VPS that also runs the control server / embedded derper be beneficial?

I'll close this, since it's expected behavior rather than a bug, I guess.


@kradalby commented on GitHub (Jul 23, 2025):

The main reason to run tailscale and derper on the same host is that derper supports a flag to verify clients, so that only nodes from your own server can use the DERP, I believe, but I have never used it.

Otherwise I would expect better performance with a dedicated DERP server.
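The flag in question appears to be derper's `-verify-clients` option, which checks connecting clients against a tailscaled instance running on the same machine. A sketch, with the hostname as a placeholder:

```shell
# tailscaled must be running on this host and logged in to the tailnet;
# derper then only relays traffic for nodes that tailscaled knows about.
derper -hostname derp.example.com -verify-clients
```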


Reference: starred/headscale#1070