Performance degradation after updating to NetBox 2.7 #3385

Closed
opened 2025-12-29 18:28:37 +01:00 by adam · 2 comments
Owner

Originally created by @karlism on GitHub (Feb 24, 2020).

Environment

  • Python version: 3.6.8
  • NetBox version: 2.7.7

Steps to Reproduce

  1. Upgrade from NetBox version 2.6.12 to NetBox 2.7.6 (with a subsequent update to 2.7.7)

Expected Behavior

Performance to stay the same or improve.

Observed Behavior

Performance degradation.

Let me show you an example of a device list API request made with `curl`:

$ for i in `jot 10 1`; do curl -H "Authorization: Token XXX" -o /dev/null https://netbox1/api/dcim/devices/; done 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0  20498      0  0:01:15  0:01:15 --:--:--  354k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   122k      0  0:00:12  0:00:12 --:--:--  309k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   123k      0  0:00:12  0:00:12 --:--:--  402k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   126k      0  0:00:11  0:00:11 --:--:--  442k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   129k      0  0:00:11  0:00:11 --:--:--  356k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   129k      0  0:00:11  0:00:11 --:--:--  358k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   117k      0  0:00:12  0:00:12 --:--:--  348k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   129k      0  0:00:11  0:00:11 --:--:--  360k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   127k      0  0:00:11  0:00:11 --:--:--  443k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1506k  100 1506k    0     0   128k      0  0:00:11  0:00:11 --:--:--  347k

As you can see, one of the requests took about a minute longer than the rest. While these API `curl` requests are running, I can see that the `gunicorn` and especially the `redis-server` services are utilising a lot of CPU.

Nginx logs during these requests:

192.168.xx.xx - - [24/Feb/2020:09:39:57 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
172.16.xx.xx - - [24/Feb/2020:09:39:58 +0000] "GET /metrics HTTP/1.1" 200 30540 "-" "Prometheus/2.15.2"
172.16.xx.xx - - [24/Feb/2020:09:40:01 +0000] "GET /metrics HTTP/1.1" 200 30527 "-" "Prometheus/2.15.2"
192.168.xx.xx - - [24/Feb/2020:09:40:09 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:40:21 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:40:33 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:40:44 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:40:56 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
172.16.xx.xx - - [24/Feb/2020:09:40:58 +0000] "GET /metrics HTTP/1.1" 200 30544 "-" "Prometheus/2.15.2"
172.16.xx.xx - - [24/Feb/2020:09:41:01 +0000] "GET /metrics HTTP/1.1" 200 30545 "-" "Prometheus/2.15.2"
192.168.xx.xx - - [24/Feb/2020:09:41:09 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:41:20 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:41:32 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"
192.168.xx.xx - - [24/Feb/2020:09:41:44 +0000] "GET /api/dcim/devices/ HTTP/2.0" 200 1543090 "-" "curl/7.66.0"

I'm quite confident that these requests didn't take more than 5 seconds prior to the update, but I don't have any precise data to back that up. This NetBox instance currently has 1016 devices.
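
For future comparisons, here is a minimal timing sketch (not part of the original report) that records per-request latency in a form that is easier to log than the curl progress meter. It assumes the `requests` library is installed; the URL and token are placeholders taken from the example above.

```python
import time
import requests

NETBOX_URL = "https://netbox1/api/dcim/devices/"  # placeholder hostname from the report
NETBOX_TOKEN = "XXX"                              # placeholder API token

headers = {"Authorization": f"Token {NETBOX_TOKEN}"}

for i in range(10):
    start = time.monotonic()
    response = requests.get(NETBOX_URL, headers=headers)
    elapsed = time.monotonic() - start
    print(f"request {i + 1}: HTTP {response.status_code}, "
          f"{len(response.content)} bytes in {elapsed:.2f}s")
```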

The relevant NetBox Redis configuration is as follows:

REDIS = {
    'webhooks': {
        'HOST': 'localhost',
        'PORT': 6379,
        'PASSWORD': '',
        'DATABASE': 0,
        'DEFAULT_TIMEOUT': 300,
        'SSL': False,
    },
    'caching': {
        'HOST': 'localhost',
        'PORT': 6379,
        'PASSWORD': '',
        'DATABASE': 1,
        'DEFAULT_TIMEOUT': 300,
        'SSL': False,
    }
}
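
As a troubleshooting aside (not in the original report): NetBox 2.6 introduced query caching backed by the `caching` Redis database above, and if that layer is suspected, it can be disabled temporarily for a before/after comparison. A minimal `configuration.py` sketch, assuming the `CACHE_TIMEOUT` parameter behaves as documented for the 2.6/2.7 series:

```python
# configuration.py (sketch): temporarily disable NetBox's query caching to
# compare API timings with and without the cache. A value of 0 disables
# caching; any positive value is the cache TTL in seconds (the default is 900).
CACHE_TIMEOUT = 0
```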

The Redis server configuration is left at the defaults.

# cat /etc/redhat-release 
CentOS Linux release 7.7.1908 (Core)
# redis-server --version            
Redis server v=3.2.12 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=7897e7d0e13773f
# redis-cli INFO
# Server
redis_version:3.2.12
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:7897e7d0e13773f
redis_mode:standalone
os:Linux 3.10.0-1062.12.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:1208
run_id:a347186355f27833610acdc44633ac600400c52c
tcp_port:6379
uptime_in_seconds:334038
uptime_in_days:3
hz:10
lru_clock:5480347
executable:/usr/bin/redis-server
config_file:/etc/redis.conf

# Clients
connected_clients:8
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:1

# Memory
used_memory:3681584
used_memory_human:3.51M
used_memory_rss:10780672
used_memory_rss_human:10.28M
used_memory_peak:8024368
used_memory_peak_human:7.65M
total_system_memory:3954020352
total_system_memory_human:3.68G
used_memory_lua:1289216
used_memory_lua_human:1.23M
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:2.93
mem_allocator:jemalloc-3.6.0

# Persistence
loading:0
rdb_changes_since_last_save:7119
rdb_bgsave_in_progress:0
rdb_last_save_time:1582538475
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok

# Stats
total_connections_received:6203
total_commands_processed:6752359701
instantaneous_ops_per_sec:0
total_net_input_bytes:53371521425
total_net_output_bytes:7092431153
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:1112469
evicted_keys:0
keyspace_hits:2271519586
keyspace_misses:1072865
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:472
migrate_cached_sockets:0

# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:1716.82
used_cpu_user:24816.67
used_cpu_sys_children:6.27
used_cpu_user_children:27.66

# Cluster
cluster_enabled:0

# Keyspace
db0:keys=2,expires=2,avg_ttl=327385
db1:keys=1316,expires=1253,avg_ttl=164100
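
To narrow down where the time is going on the Redis side, the slow log and the caching database can be inspected while one of the slow requests is in flight. This is a sketch (not from the original report) assuming the `redis` Python package is installed and the localhost setup shown in the configuration above:

```python
import redis

# db 1 is the NetBox 'caching' database in the configuration above
r = redis.Redis(host="localhost", port=6379, db=1)

print("keys in caching db:", r.dbsize())
print("instantaneous ops/sec:", r.info("stats")["instantaneous_ops_per_sec"])

# Commands slower than the slowlog threshold (10 ms by default)
for entry in r.slowlog_get(10):
    print(entry["start_time"], entry["duration"], entry["command"])
```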

We have a second instance of NetBox running on a different VM in a different datacenter (in maintenance mode, on a read-only replicated PostgreSQL database), and it shows the same performance degradation after the upgrade.

adam closed this issue 2025-12-29 18:28:37 +01:00
Author
Owner

@karlism commented on GitHub (Feb 24, 2020):

The following NetBox Prometheus metric query also shows severe performance degradation after the update:

`django_http_requests_latency_seconds_by_view_method_sum{view="dcim-api:device-list"} / django_http_requests_latency_seconds_by_view_method_count`

![image](https://user-images.githubusercontent.com/19568035/75144538-6b941880-56f7-11ea-9dd6-359932284da7.png)
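
For anyone reproducing this without the graph above, the same expression can be evaluated directly against the Prometheus HTTP API. A sketch (not from the original report) using the `requests` library; the Prometheus address is a placeholder, and the selector is repeated on both sides of the division for clarity:

```python
import requests

PROM_URL = "http://prometheus:9090"  # placeholder address
QUERY = (
    'django_http_requests_latency_seconds_by_view_method_sum'
    '{view="dcim-api:device-list"}'
    ' / '
    'django_http_requests_latency_seconds_by_view_method_count'
    '{view="dcim-api:device-list"}'
)

resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    # series["value"] is a [unix_timestamp, "value-as-string"] pair
    print(series["metric"].get("method"), series["value"])
```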

Author
Owner

@jeremystretch commented on GitHub (Feb 24, 2020):

Thank you for your interest in NetBox. GitHub issues are intended for reporting reproducible bugs and requesting features, and must be submitted using one of the templates provided [here](https://github.com/netbox-community/netbox/issues/new/choose). For general discussion, questions, or assistance with installation issues, please post to our [mailing list](https://groups.google.com/forum/#!forum/netbox-discuss) instead.


Reference: starred/netbox#3385