Poor performances on create/edit tenant field with high number of tenants #8961

Closed
opened 2025-12-29 20:43:23 +01:00 by adam · 1 comment
Owner

Originally created by @lchabert on GitHub (Dec 13, 2023).

Originally assigned to: @jeremystretch on GitHub.

Deployment Type

Self-hosted

NetBox Version

v3.6.6

Python Version

3.11

Steps to Reproduce

  • Create around 85k tenant in a single tenant group
  • Create around 150k prefixes linked with differents tenants
  • Create some sites, linked with differents tenants
  • Then edit one of the prefix, and click on tenant selector.

Expected Behavior

Maybe one second or less should be a good request timer
Or maybe the searchable select should be async to allow user to enter characters while the first request is running in background.

Observed Behavior

With a postgres 16 database, and 16GB of ram and 4 cpus, It will take 3.5 seconds to respond and give the user the opportunity to enter some search characters.

As far i can go through the debug process, i can see theses kinds of SQL request with long running timer ( >3s):

SELECT 
  DISTINCT "tenancy_tenant"."id", 
  "tenancy_tenant"."created", 
  "tenancy_tenant"."last_updated", 
  "tenancy_tenant"."custom_field_data", 
  "tenancy_tenant"."description", 
  "tenancy_tenant"."comments", 
  "tenancy_tenant"."name", 
  "tenancy_tenant"."slug", 
  "tenancy_tenant"."group_id", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "circuits_circuit" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "circuit_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_device" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "device_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_ipaddress" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "ipaddress_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_prefix" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "prefix_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_rack" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "rack_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "dcim_site" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "site_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "virtualization_virtualmachine" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "virtualmachine_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_vlan" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "vlan_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "ipam_vrf" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "vrf_count", 
  COALESCE(
    (
      SELECT 
        COUNT(*) AS "c" 
      FROM 
        "virtualization_cluster" U0 
      WHERE 
        U0."tenant_id" = ("tenancy_tenant"."id") 
      GROUP BY 
        U0."tenant_id"
    ), 
    0
  ) AS "cluster_count" 
FROM 
  "tenancy_tenant" 
WHERE 
  "tenancy_tenant"."group_id" IN (
    SELECT 
      U0."id" 
    FROM 
      "tenancy_tenantgroup" U0 
    WHERE 
      U0."id" = 4
  ) 
ORDER BY 
  "tenancy_tenant"."name" ASC 
LIMIT 
  50

Theses request, for each kind of related objects making a count of them. So with 85k tenant, each count is executed 85k times.
From my point of view, to display a dynamic selector with customer display field, it's not necessary to count every objects.

To be more specific on the pgsql server, the request is executed on RAM buffered datas. No hit on system SSD. The cache usage is around 30% and hit objects.

We already have a discussion about it on slack: https://netdev-community.slack.com/archives/C01P0FRSXRV/p1702476390699559

Originally created by @lchabert on GitHub (Dec 13, 2023). Originally assigned to: @jeremystretch on GitHub. ### Deployment Type Self-hosted ### NetBox Version v3.6.6 ### Python Version 3.11 ### Steps to Reproduce * Create around 85k tenant in a single tenant group * Create around 150k prefixes linked with differents tenants * Create some sites, linked with differents tenants * Then edit one of the prefix, and click on tenant selector. ### Expected Behavior Maybe one second or less should be a good request timer Or maybe the searchable select should be async to allow user to enter characters while the first request is running in background. ### Observed Behavior With a postgres 16 database, and 16GB of ram and 4 cpus, It will take 3.5 seconds to respond and give the user the opportunity to enter some search characters. As far i can go through the debug process, i can see theses kinds of SQL request with long running timer ( >3s): ```sql SELECT DISTINCT "tenancy_tenant"."id", "tenancy_tenant"."created", "tenancy_tenant"."last_updated", "tenancy_tenant"."custom_field_data", "tenancy_tenant"."description", "tenancy_tenant"."comments", "tenancy_tenant"."name", "tenancy_tenant"."slug", "tenancy_tenant"."group_id", COALESCE( ( SELECT COUNT(*) AS "c" FROM "circuits_circuit" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "circuit_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "dcim_device" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "device_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "ipam_ipaddress" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "ipaddress_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "ipam_prefix" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "prefix_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "dcim_rack" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "rack_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "dcim_site" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "site_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "virtualization_virtualmachine" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "virtualmachine_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "ipam_vlan" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "vlan_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "ipam_vrf" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "vrf_count", COALESCE( ( SELECT COUNT(*) AS "c" FROM "virtualization_cluster" U0 WHERE U0."tenant_id" = ("tenancy_tenant"."id") GROUP BY U0."tenant_id" ), 0 ) AS "cluster_count" FROM "tenancy_tenant" WHERE "tenancy_tenant"."group_id" IN ( SELECT U0."id" FROM "tenancy_tenantgroup" U0 WHERE U0."id" = 4 ) ORDER BY "tenancy_tenant"."name" ASC LIMIT 50 ``` Theses request, for each kind of related objects making a count of them. So with 85k tenant, each count is executed 85k times. From my point of view, to display a dynamic selector with customer display field, it's not necessary to count every objects. To be more specific on the pgsql server, the request is executed on RAM buffered datas. No hit on system SSD. The cache usage is around 30% and hit objects. We already have a discussion about it on slack: https://netdev-community.slack.com/archives/C01P0FRSXRV/p1702476390699559
adam added the type: bugstatus: acceptedseverity: low labels 2025-12-29 20:43:23 +01:00
adam closed this issue 2025-12-29 20:43:23 +01:00
Author
Owner

@DanSheps commented on GitHub (Dec 13, 2023):

We had a discussion on Slack regarding this, this morning:

TLDR; is that this increase on the select fields is being caused by api calls which have annotations for counts of related items. Since we are performing brief API calls, this shouldn't be required.

Preliminary investigation suggests we could clear the annotations before resolving the query by calling self.queryset.query.annotations.clear()

We could also be more selective about it, depending on the outcome of #14302 is (only keeping relevant annotations instead)

Removing the annotations in testing resulted in a 50% increase in performance.

@DanSheps commented on GitHub (Dec 13, 2023): We had a discussion on Slack regarding this, this morning: TLDR; is that this increase on the select fields is being caused by api calls which have annotations for counts of related items. Since we are performing brief API calls, this shouldn't be required. Preliminary investigation suggests we could clear the annotations before resolving the query by calling `self.queryset.query.annotations.clear()` We could also be more selective about it, depending on the outcome of #14302 is (only keeping relevant annotations instead) Removing the annotations in testing resulted in a 50% increase in performance.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/netbox#8961