Provide a more efficient way to query CustomFieldValues across a query set #2555

Closed
opened 2025-12-29 18:19:54 +01:00 by adam · 2 comments
Owner

Originally created by @chaomodus on GitHub (Apr 26, 2019).

Environment

  • Python version: 3.5.3
  • NetBox version: 2.5.8

Proposed Functionality

Provide a way to query custom fields efficiently which apply to a query set, similar in functionality to CustomFieldFilter.

Use Case

In developing several reports we have had to query custom fields for each item in a query set. For example, checking the purchase date of a device, across all devices (even if a purchase date isn't specified for a particular device). These reports operate over a few thousand devices, so the real time to query runs into seconds.

The naive way of using .cf(), or querying .custom_field_values on the object produces thousands of extra queries, and even .prefetch_related("custom_field_values") (and variations thereof) still cause a great number of extra queries.

In exploring workarounds a particular way of doing it stands out: doing a single query against CustomFieldValues filtering based on the object IDs, types and field names, present in the "parent" device query, and then retrieving those values to operate on in the loop over the parent device query. This technique in our case reduces the number of queries from nearly 4000 to 8 (with only a few being particular to the operation of the code) and the query time from 1400 ms to 43 ms.

Example code:

        devquery = (Device.objects
                            .exclude(status__in=(DEVICE_STATUS_INVENTORY, 
                                                 DEVICE_STATUS_OFFLINE))
                            .exclude(device_role__slug__in=EXCLUDE_ROLES))
        cf_ids = devquery.values_list('custom_field_values__pk', flat=True)
        cfs = {
            cf.pk: cf.value
            for cf in CustomFieldValue.objects.prefetch_related('field')
            .filter(field__name='purchase_date')
            .filter(pk__in=cf_ids)
        }

       for device in devquery:
           try:
               purchase_date = cfs[device.pk]
          except KeyError:
               purchase_date = None

          # do something with purchase_date ...

The above is roughly equivalent to querying .cfs() on each device in the loop, except as stated above, results in a much more rapid operation.

Database Changes

N/A

External Dependencies

N/A

Originally created by @chaomodus on GitHub (Apr 26, 2019). <!-- NOTE: This form is only for proposing specific new features or enhancements. If you have a general idea or question, please post to our mailing list instead of opening an issue: https://groups.google.com/forum/#!forum/netbox-discuss NOTE: Due to an excessive backlog of feature requests, we are not currently accepting any proposals which significantly extend NetBox's feature scope. Please describe the environment in which you are running NetBox. Be sure that you are running an unmodified instance of the latest stable release before submitting a bug report. --> ### Environment * Python version: 3.5.3 * NetBox version: 2.5.8 <!-- Describe in detail the new functionality you are proposing. Include any specific changes to work flows, data models, or the user interface. --> ### Proposed Functionality Provide a way to query custom fields efficiently which apply to a query set, similar in functionality to CustomFieldFilter. <!-- Convey an example use case for your proposed feature. Write from the perspective of a NetBox user who would benefit from the proposed functionality and describe how. ---> ### Use Case In developing several reports we have had to query custom fields for each item in a query set. For example, checking the purchase date of a device, across all devices (even if a purchase date isn't specified for a particular device). These reports operate over a few thousand devices, so the real time to query runs into seconds. The naive way of using `.cf()`, or querying `.custom_field_values` on the object produces thousands of extra queries, and even `.prefetch_related("custom_field_values")` (and variations thereof) still cause a great number of extra queries. In exploring workarounds a particular way of doing it stands out: doing a single query against CustomFieldValues filtering based on the object IDs, types and field names, present in the "parent" device query, and then retrieving those values to operate on in the loop over the parent device query. This technique in our case reduces the number of queries from nearly 4000 to 8 (with only a few being particular to the operation of the code) and the query time from 1400 ms to 43 ms. Example code: ``` devquery = (Device.objects .exclude(status__in=(DEVICE_STATUS_INVENTORY, DEVICE_STATUS_OFFLINE)) .exclude(device_role__slug__in=EXCLUDE_ROLES)) cf_ids = devquery.values_list('custom_field_values__pk', flat=True) cfs = { cf.pk: cf.value for cf in CustomFieldValue.objects.prefetch_related('field') .filter(field__name='purchase_date') .filter(pk__in=cf_ids) } for device in devquery: try: purchase_date = cfs[device.pk] except KeyError: purchase_date = None # do something with purchase_date ... ``` The above is roughly equivalent to querying .cfs() on each device in the loop, except as stated above, results in a much more rapid operation. <!-- Note any changes to the database schema necessary to support the new feature. For example, does the proposal require adding a new model or field? (Not all new features require database changes.) ---> ### Database Changes N/A <!-- List any new dependencies on external libraries or services that this new feature would introduce. For example, does the proposal require the installation of a new Python package? (Not all new features introduce new dependencies.) --> ### External Dependencies N/A
adam closed this issue 2025-12-29 18:19:54 +01:00
Author
Owner

@jeremystretch commented on GitHub (Apr 26, 2019):

Provide a way to query custom fields efficiently which apply to a query set, similar in functionality to CustomFieldFilter.

Please expand on this. What is the specific feature/change you are proposing?

@jeremystretch commented on GitHub (Apr 26, 2019): > Provide a way to query custom fields efficiently which apply to a query set, similar in functionality to CustomFieldFilter. Please expand on this. What is the specific feature/change you are proposing?
Author
Owner

@chaomodus commented on GitHub (May 2, 2019):

I suppose the approach I would take is similar to the way CustomFieldFilter is available as a part of the extras library. I feel like some sort of CustomFieldFetch(queryset, fieldname) which would do approximately the aforementioned code, and return the values (or populate them in the query set, if I recall that is a possibility). I may have an opportunity to create a PR with this functionality if that is desired.

Thanks!

@chaomodus commented on GitHub (May 2, 2019): I suppose the approach I would take is similar to the way CustomFieldFilter is available as a part of the extras library. I feel like some sort of CustomFieldFetch(queryset, fieldname) which would do approximately the aforementioned code, and return the values (or populate them in the query set, if I recall that is a possibility). I may have an opportunity to create a PR with this functionality if that is desired. Thanks!
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/netbox#2555