API requests that fetch a larger number of devices are very slow when using config contexts #11599

Closed
opened 2025-12-29 21:47:29 +01:00 by adam · 3 comments

Originally created by @fabi125 on GitHub (Sep 11, 2025).

Originally assigned to: @jnovinger on GitHub.

NetBox Edition

NetBox Community

NetBox Version

v4.2.9

Python Version

3.12

Steps to Reproduce

  1. Have a large number of devices (10k+)
  2. Have a decent number of config contexts (100+) with a variety of rules for how they get assigned to devices
  3. Run a query like http://netbox/api/dcim/devices/?limit=1

Expected Behavior

The query should return reasonably fast (<1s).

Observed Behavior

The query takes over a minute to complete.

Running the same query with `&exclude=config_context` returns very fast as expected.
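
For anyone trying to reproduce the comparison, here is a minimal timing sketch using the `requests` library; the instance URL and API token are placeholders, not values from this report.

```python
# Minimal sketch for comparing the two requests against a NetBox instance.
import time

import requests

NETBOX_URL = "http://netbox"                            # replace with your instance URL
HEADERS = {"Authorization": "Token <your-api-token>"}   # replace with a real API token


def timed_get(params):
    """Issue a GET against the device list endpoint and return the elapsed seconds."""
    start = time.monotonic()
    response = requests.get(f"{NETBOX_URL}/api/dcim/devices/", headers=HEADERS, params=params)
    response.raise_for_status()
    return time.monotonic() - start


# Slow in the reported setup: config context is rendered as part of the query.
print("with config_context:    %.2fs" % timed_get({"limit": 1}))
# Fast: config context rendering is skipped entirely.
print("without config_context: %.2fs" % timed_get({"limit": 1, "exclude": "config_context"}))
```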

Using debug mode, we can observe that the following query is being constructed:

SELECT DISTINCT "dcim_device"."id",
       "dcim_device"."created",
       "dcim_device"."last_updated",
       "dcim_device"."custom_field_data",
       "dcim_device"."description",
       "dcim_device"."comments",
       "dcim_device"."local_context_data",
       "dcim_device"."config_template_id",
       "dcim_device"."device_type_id",
       "dcim_device"."role_id",
       "dcim_device"."tenant_id",
       "dcim_device"."platform_id",
       "dcim_device"."name",
       "dcim_device"."serial",
       "dcim_device"."asset_tag",
       "dcim_device"."site_id",
       "dcim_device"."location_id",
       "dcim_device"."rack_id",
       "dcim_device"."position",
       "dcim_device"."face",
       "dcim_device"."status",
       "dcim_device"."airflow",
       "dcim_device"."primary_ip4_id",
       "dcim_device"."primary_ip6_id",
       "dcim_device"."oob_ip_id",
       "dcim_device"."cluster_id",
       "dcim_device"."virtual_chassis_id",
       "dcim_device"."vc_position",
       "dcim_device"."vc_priority",
       "dcim_device"."latitude",
       "dcim_device"."longitude",
       "dcim_device"."console_port_count",
       "dcim_device"."console_server_port_count",
       "dcim_device"."power_port_count",
       "dcim_device"."power_outlet_count",
       "dcim_device"."interface_count",
       "dcim_device"."front_port_count",
       "dcim_device"."rear_port_count",
       "dcim_device"."device_bay_count",
       "dcim_device"."module_bay_count",
       "dcim_device"."inventory_item_count",
       (
        SELECT JSONB_AGG(V0."data" ORDER BY V0."weight", V0."name") AS "_data"
          FROM "extras_configcontext" V0
          LEFT OUTER JOIN "extras_configcontext_platforms" V1
            ON (V0."id" = V1."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_cluster_types" V3
            ON (V0."id" = V3."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_cluster_groups" V5
            ON (V0."id" = V5."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_clusters" V7
            ON (V0."id" = V7."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_tenant_groups" V9
            ON (V0."id" = V9."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_tenants" V11
            ON (V0."id" = V11."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_tags" V13
            ON (V0."id" = V13."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_locations" V15
            ON (V0."id" = V15."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_device_types" V17
            ON (V0."id" = V17."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_roles" V19
            ON (V0."id" = V19."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_sites" V21
            ON (V0."id" = V21."configcontext_id")
          LEFT OUTER JOIN "extras_configcontext_regions" V23
            ON (V0."id" = V23."configcontext_id")
          LEFT OUTER JOIN "dcim_region" V24
            ON (V23."region_id" = V24."id")
          LEFT OUTER JOIN "extras_configcontext_site_groups" V25
            ON (V0."id" = V25."configcontext_id")
          LEFT OUTER JOIN "dcim_sitegroup" V26
            ON (V25."sitegroup_id" = V26."id")
         WHERE ((V1."platform_id" = ("dcim_device"."platform_id") OR V1."platform_id" IS NULL)
            AND (V3."clustertype_id" = ("virtualization_cluster"."type_id") OR V3."clustertype_id" IS NULL)
            AND (V5."clustergroup_id" = ("virtualization_cluster"."group_id") OR V5."clustergroup_id" IS NULL)
            AND (V7."cluster_id" = ("dcim_device"."cluster_id") OR V7."cluster_id" IS NULL)
            AND (V9."tenantgroup_id" = ("tenancy_tenant"."group_id") OR V9."tenantgroup_id" IS NULL)
            AND (V11."tenant_id" = ("dcim_device"."tenant_id") OR V11."tenant_id" IS NULL)
            AND (V13."tag_id" IN (SELECT U0."tag_id" FROM "extras_taggeditem" U0 INNER JOIN "django_content_type" U1 ON (U0."content_type_id" = U1."id") WHERE (U1."app_label" = 'dcim' AND U1."model" = 'device' AND U0."object_id" = ("dcim_device"."id"))) OR V13."tag_id" IS NULL)
            AND V0."is_active"
            AND (V15."location_id" = ("dcim_device"."location_id") OR V15."location_id" IS NULL)
            AND (V17."devicetype_id" = ("dcim_device"."device_type_id") OR V17."devicetype_id" IS NULL)
            AND (V19."devicerole_id" = ("dcim_device"."role_id") OR V19."devicerole_id" IS NULL)
            AND (V21."site_id" = ("dcim_device"."site_id") OR V21."site_id" IS NULL)
            AND ((V24."level" <= ("dcim_region"."level") AND V24."lft" <= ("dcim_region"."lft") AND V24."rght" >= ("dcim_region"."rght") AND V24."tree_id" = ("dcim_region"."tree_id")) OR V23."region_id" IS NULL)
            AND ((V26."level" <= ("dcim_sitegroup"."level") AND V26."lft" <= ("dcim_sitegroup"."lft") AND V26."rght" >= ("dcim_sitegroup"."rght") AND V26."tree_id" = ("dcim_sitegroup"."tree_id")) OR V25."sitegroup_id" IS NULL))
       ) AS "config_context_data"
  FROM "dcim_device"
  LEFT OUTER JOIN "virtualization_cluster"
    ON ("dcim_device"."cluster_id" = "virtualization_cluster"."id")
  LEFT OUTER JOIN "tenancy_tenant"
    ON ("dcim_device"."tenant_id" = "tenancy_tenant"."id")
 INNER JOIN "dcim_site"
    ON ("dcim_device"."site_id" = "dcim_site"."id")
  LEFT OUTER JOIN "dcim_region"
    ON ("dcim_site"."region_id" = "dcim_region"."id")
  LEFT OUTER JOIN "dcim_sitegroup"
    ON ("dcim_site"."group_id" = "dcim_sitegroup"."id")
 ORDER BY "dcim_device"."name" ASC,
          "dcim_device"."id" ASC
 LIMIT 10

This unfortunately seems to mean that the database calculates the config context data for all devices in the result set, not just the current page of 10 devices.
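
One way to confirm where the time goes (a sketch, assuming a Django shell with NetBox loaded, a PostgreSQL backend, and that `Device.objects` exposes NetBox's `ConfigContextModelQuerySet`):

```python
# Sketch only: run inside `./manage.py nbshell` (or a regular Django shell).
from dcim.models import Device

# Annotate config context data and slice to a page of 10, mirroring the API query.
qs = Device.objects.annotate_config_context_data()[:10]

# QuerySet.explain(analyze=True) maps to EXPLAIN ANALYZE on PostgreSQL; the plan
# shows how many times the correlated SubPlan for the JSONB_AGG aggregate executes,
# i.e. whether it runs per page row or per device in the full result set.
print(qs.explain(analyze=True))
```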

I'm not very good at Django, but with a bit of digging the only halfway reasonable solution I could come up with is the following:

diff --git a/netbox/extras/api/mixins.py b/netbox/extras/api/mixins.py
index aafdf32d4..a591862a8 100644
--- a/netbox/extras/api/mixins.py
+++ b/netbox/extras/api/mixins.py
@@ -29,12 +29,32 @@ class ConfigContextQuerySetMixin:
 
         Else, return the queryset annotated with config context data
         """
+        self.return_config_context = True
         queryset = super().get_queryset()
         request = self.get_serializer_context()['request']
         if self.brief or 'config_context' in request.query_params.get('exclude', []):
-            return queryset
-        return queryset.annotate_config_context_data()
+            self.return_config_context = False
+        if self.return_config_context and self.action != "list":
+            return queryset.annotate_config_context_data()
+        return queryset
 
+    def list(self, request, *args, **kwargs):
+        """
+        Override the default list action so we can annotate just the current page with
+        the config context data.
+        """
+        queryset = self.filter_queryset(self.get_queryset())
+
+        page = self.paginate_queryset(queryset)
+        if page is not None:
+            if self.return_config_context:
+                pks = [obj.pk for obj in page]
+                page = list(self.get_queryset().filter(pk__in=pks).annotate_config_context_data())
+            serializer = self.get_serializer(page, many=True)
+            return self.get_paginated_response(serializer.data)
+
+        serializer = self.get_serializer(queryset, many=True)
+        return Response(serializer.data)
 
 class ConfigTemplateRenderMixin:
     """

This will essentially run the query twice for API list requests: once without calculating the config context (which should be very fast), and then again with the calculation, but limited to the primary keys that are being returned.
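
For illustration, the same two-step idea expressed directly at the queryset level (a sketch, assuming `Device.objects` returns NetBox's `ConfigContextModelQuerySet` with its `annotate_config_context_data()` method):

```python
# Sketch of the two-query pattern outside of DRF, e.g. in a Django shell.
from dcim.models import Device

# Step 1: resolve only the page of primary keys -- no config context subquery,
# so this should stay cheap even with 10k+ devices.
page_pks = list(
    Device.objects.order_by('name', 'pk').values_list('pk', flat=True)[:10]
)

# Step 2: annotate config context data for just those ten devices.
page = Device.objects.filter(pk__in=page_pks).annotate_config_context_data()
for device in page:
    print(device.name, device.config_context_data)
```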

To me this feels kinda hacky, so I'm looking for some feedback here. In my testing, this made the requests as fast as expected.

adam added the type: bug, status: accepted, severity: low labels 2025-12-29 21:47:29 +01:00
adam closed this issue 2025-12-29 21:47:29 +01:00

@jnovinger commented on GitHub (Sep 12, 2025):

I was able to script the insertion of a similar number of devices and config contexts and confirm that there is definitely a performance gap in these circumstances. On a local development laptop, making the API request as described above took ~1375ms (of which 1018ms was spent in SQL queries, with 985ms in one particular query), while making the same request with `exclude=config_context` as a GET param took only 222ms, with 41ms of that spent in SQL queries.

The offending query I saw was the same as the one mentioned above. From what I can see, I concur with @fabi125 that the root cause is the aggregate being run on the full queryset before it is paginated. I'm not sure what the best approach to fix this is.


@fabi125 commented on GitHub (Sep 15, 2025):

May I request to keep this issue open for now? I think #20346 is a great start but even with it I'm still seeing those slow requests.


@jnovinger commented on GitHub (Sep 15, 2025):

> May I request to keep this issue open for now? I think #20346 is a great start but even with it I'm still seeing those slow requests.

My goal with #20346 was to get something into `main` quickly so that users could see some improvement with this behavior in the v4.4.1 release that is scheduled for tomorrow.

I think the right way to handle this is for us to file a follow-up issue for additional improvement. I was planning to do that shortly, but please feel free to do so if you would like.
