Netbox API returns duplicate resources during paging with offset #10806

Closed
opened 2025-12-29 21:36:06 +01:00 by adam · 5 comments

Originally created by @dankotrajkovic on GitHub (Feb 25, 2025).

Originally assigned to: @bctiemann on GitHub.

Deployment Type

Self-hosted

NetBox Version

v4.2.3

Python Version

3.12

Steps to Reproduce

Use Python or Postman to GET the clusters from the NetBox API by paging through the resources.
The real-world use case is to load a larger list of clusters using the paging mechanism.

To reproduce, run the following code:

import requests


def main():
    """
    Pull netbox clusters for demo.netbox.dev using limit=5
    The idea of small limit is just to simulate a bigger database where we have to do multiple
    requests of 50 items to load a larger list of for example 250 clusters.

    The idea is to show that within the returned items few duplicates appear.
    :return:
    """

    cluster_list = []  # List that accumulates the clusters returned by each request
    cluster_unique_names = set()  # Names of clusters already seen, used to spot duplicates


    # Collect the clusters
    headers = {
        'Accept': 'application/json',
        'Authorization': 'Token 6a768e6363830a536ffa07abf261c1d64d365b9a'
    }

    parameters = {
        'limit': 5,
        'offset': 0
    }
    while True:
        response = requests.get('https://demo.netbox.dev/api/virtualization/clusters/', headers=headers,
                                params=parameters)
        response.raise_for_status()  # Fail loudly on any non-200 response
        data = response.json()
        print(f'Collected clusters from Netbox with offset: {parameters["offset"]}, limit: {parameters["limit"]}')
        cluster_list.extend(data['results'])
        parameters['offset'] += parameters['limit']
        if not data['next']:
            break

    # Check if there are any duplicates in the clusters
    for cluster in cluster_list:
        if cluster['name'] not in cluster_unique_names:
            cluster_unique_names.add(cluster['name'])
        else:
            print(f'Duplicate Cluster. Name: {cluster["name"]}, ID: {cluster["id"]}')


if __name__ == '__main__':
    main()

To reproduce in Postman, issue a GET request with the following path:
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=5
and then issue it again with:
https://demo.netbox.dev/api/virtualization/clusters/?limit=5&offset=30

You will see that the cluster with ID 10 appears in both responses. The specific ID may differ between runs.

The example above uses demo.netbox.dev, but we see the same behavior on our on-prem, self-hosted instance. With 150 clusters we see about 30-35 duplicates.

Expected Behavior

No duplicates should be returned as we page through the clusters.

Observed Behavior

Clusters with duplicate IDs are present in the responses.

We see that a few IDs are duplicated in the responses:

Collected clusters from Netbox with offset: 0, limit: 5
Collected clusters from Netbox with offset: 5, limit: 5
Collected clusters from Netbox with offset: 10, limit: 5
Collected clusters from Netbox with offset: 15, limit: 5
Collected clusters from Netbox with offset: 20, limit: 5
Collected clusters from Netbox with offset: 25, limit: 5
Collected clusters from Netbox with offset: 30, limit: 5
Duplicate Cluster. Name: gc-us-west1, ID: 10
Duplicate Cluster. Name: gc-europe-west4, ID: 22
adam added the type: bug, status: accepted, severity: medium labels 2025-12-29 21:36:07 +01:00
adam closed this issue 2025-12-29 21:36:07 +01:00

@bctiemann commented on GitHub (Feb 27, 2025):

This seems fairly high severity as the API pagination ought to be predictable and orderly. Is this reproducible in any other models?

@dankotrajkovic commented on GitHub (Feb 27, 2025):

In our local environment, we can reproduce this with the IPAddress model. That is where we initially found it, but we have not been able to reproduce it on the public NetBox instance, so we held back from raising the issue.

We thought the duplicates were caused by our having 500,000 IPAddresses in NetBox spread across various VRFs. But even then the duplicates were not severe: fetching with limit=1000 (so 500 pages), we were getting only about 10 duplicates. That is still very problematic for our code, and we had to write methods to recover from it, but luckily the issue is severe enough on the model in this issue that it should lead to quicker discovery of the problem. It's possible we are doing something wrong, but either way, knowing how to fix this would really help us.
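
Pending a fix, a client-side recovery of the kind described above might look like the sketch below. It is only a sketch: `fetch_all` is a hypothetical helper, and the endpoint and token are the demo values reused from the reproduction script.

```
import requests

BASE_URL = 'https://demo.netbox.dev/api/virtualization/clusters/'
HEADERS = {
    'Accept': 'application/json',
    'Authorization': 'Token 6a768e6363830a536ffa07abf261c1d64d365b9a',
}


def fetch_all(limit=50):
    """Page through all clusters, silently dropping any repeated IDs."""
    seen_ids = set()
    results = []
    params = {'limit': limit, 'offset': 0}
    while True:
        response = requests.get(BASE_URL, headers=HEADERS, params=params)
        response.raise_for_status()
        payload = response.json()
        for item in payload['results']:
            if item['id'] not in seen_ids:  # keep only the first occurrence of each ID
                seen_ids.add(item['id'])
                results.append(item)
        if not payload['next']:
            break
        params['offset'] += limit
    return results
```

De-duplicating only masks the symptom: when an item appears twice across pages, another item has typically been skipped entirely, so forcing a deterministic ordering (as discussed later in this thread) is the more reliable workaround.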

@atownson commented on GitHub (Feb 27, 2025):

I have seen this issue as well, when performing GET requests for Services.

@cruse1977 commented on GitHub (Mar 4, 2025):

https://demo.netbox.dev/api/virtualization/clusters/?offset=5&limit=5
https://demo.netbox.dev/api/virtualization/clusters/?offset=5&limit=30

ID 10 shown in both

@bctiemann commented on GitHub (Mar 4, 2025):

It looks like the issue is just that Django isn't obeying the model's ordering setting when annotation is applied to the queryset, i.e. in the case of ClusterViewSet:

https://github.com/netbox-community/netbox/blob/913405a3ae93ec28b8970a2dbdd81c99508dd557/netbox/virtualization/api/views.py#L37-L41

Note that ordering = ["name"] for Cluster:

In [29]: queryset = Cluster.objects.all()

In [30]: [(r.id, r.name) for r in queryset[0:10]]
Out[30]: 
[(9, 'DO-AMS3'),
 (8, 'DO-BLR1'),
 (7, 'DO-FRA1'),
 (6, 'DO-LON1'),
 (1, 'DO-NYC1'),
 (2, 'DO-NYC3'),
 (3, 'DO-SFO3'),
 (5, 'DO-SGP1'),
 (4, 'DO-TOR1'),
 (36, 'gc-asia-east1')]
In [27]: queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    ...:         allocated_vcpus=Sum('virtual_machines__vcpus'),
    ...:         allocated_memory=Sum('virtual_machines__memory'),
    ...:         allocated_disk=Sum('virtual_machines__disk'),
    ...:     )

In [28]: [(r.id, r.name) for r in queryset[0:10]]
Out[28]: 
[(4, 'DO-TOR1'),
 (34, 'gc-asia-southeast1'),
 (40, 'gc-asia-northeast3'),
 (10, 'gc-us-west1'),
 (9, 'DO-AMS3'),
 (7, 'DO-FRA1'),
 (35, 'gc-asia-southeast2'),
 (38, 'gc-asia-northeast1'),
 (15, 'gc-us-east1'),
 (6, 'DO-LON1')]

If .order_by("name") is added to the custom queryset, it sorts predictably. Same if you add &ordering=name to the query parameters on the API call.

https://code.djangoproject.com/ticket/32811

We may need to identify all the ViewSets that use annotation in this way and add explicit ordering to the queryset statement.
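
For illustration, a sketch of the explicit ordering being suggested, applied to the annotated queryset shown above (a sketch under the assumptions in this comment, not the actual patch):

```
from django.db.models import Sum

from virtualization.models import Cluster

# Sketch only: re-apply the model's Meta.ordering after annotation so that
# LIMIT/OFFSET pagination sees a stable, deterministic row order.
queryset = Cluster.objects.prefetch_related('virtual_machines').annotate(
    allocated_vcpus=Sum('virtual_machines__vcpus'),
    allocated_memory=Sum('virtual_machines__memory'),
    allocated_disk=Sum('virtual_machines__disk'),
).order_by(*Cluster._meta.ordering)
```

Reusing Cluster._meta.ordering rather than hard-coding "name" would keep the same pattern applicable to other annotated ViewSets; appending a final "pk" term would also guarantee a total order where the ordering field is not unique.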

Reference: starred/netbox#10806