Incorrect API result "virtual_disk_count" for some VM #11859

Open
opened 2025-12-29 21:50:47 +01:00 by adam · 13 comments

Originally created by @stavr666 on GitHub (Nov 21, 2025).

Originally assigned to: @jeremystretch on GitHub.

NetBox Edition

NetBox Community

NetBox Version

v4.4.6

Python Version

3.11

Steps to Reproduce

Encountered on v4.3.4. Reproducible after upgrading to v4.4.6. Not all (new) VMs affected for some reason.

  1. Create VM
  2. Create virtual disk for VM

Expected Behavior

  1. The VM shows the number of disks on its "Virtual Disks" tab
  2. The API returns the correct "virtual_disk_count" for the VM

Observed Behavior

The disk count shows zero in both the UI and the API response.

*(Screenshots attached.)*

@jnovinger commented on GitHub (Nov 21, 2025):

Thanks for the report, @stavr666. I'm not able to reproduce your STR exactly, but I do see something very similar.

In the API representation of my new VM (from both the `/api/virtualization/virtual-machines/` list endpoint and the `/api/virtualization/virtual-machines/541/` detail endpoint), the nested `role` object shows zeros for `device_count` and `virtualmachine_count` (which is inaccurate on its face, since it's nested in a VM with that role!).

*(Screenshot attached.)*

However, when I view the detail of that device role (`/api/dcim/device-roles/7/`), the counts match what is displayed in the web UI.
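A mismatch like this can be surfaced mechanically by diffing the counter fields of the nested object against the detail endpoint's response for the same object. The helper below is a hypothetical sketch (the function name and sample payloads are invented for illustration; it operates on already-fetched JSON dicts, not on NetBox itself):

```python
def find_count_mismatches(nested_obj: dict, detail_obj: dict,
                          fields=("device_count", "virtualmachine_count")) -> dict:
    """Compare counter fields between a nested representation (e.g. the
    `role` object embedded in a VM response) and the authoritative detail
    endpoint for the same object. Returns {field: (nested, detail)} for
    every field whose values disagree."""
    return {
        f: (nested_obj.get(f), detail_obj.get(f))
        for f in fields
        if nested_obj.get(f) != detail_obj.get(f)
    }

# Example with values like those described above: the nested role reports
# zeros, while the detail endpoint reports the true counts.
nested_role = {"id": 7, "device_count": 0, "virtualmachine_count": 0}
detail_role = {"id": 7, "device_count": 3, "virtualmachine_count": 5}
print(find_count_mismatches(nested_role, detail_role))
# → {'device_count': (0, 3), 'virtualmachine_count': (0, 5)}
```

Running this over every nested object in a paginated listing would make it easy to tell whether all nested counts are zeroed or only some.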


@stavr666 commented on GitHub (Nov 21, 2025):

> However, when I view the detail of that device role (`/api/dcim/device-roles/7/`), the counts match what is displayed in the web UI.

Yes, the fact that the UI and the API agree on the wrong values also seems strange to me. I've encountered this rare UI bug before (I never reported it, since it wasn't critical). But now, several days after trying to fix it by upgrading to v4.6.4, it's causing our automation to break in the "compare by disk count" pipeline. We can rewrite our scripts to work around it, but that would make them slow again.

Can I somehow collect any technical details that would help diagnose the source of the error? An SQL query or something?


@jnovinger commented on GitHub (Nov 21, 2025):

> Can I somehow collect any technical details that would help diagnose the source of the error? An SQL query or something?

Python stack traces (in the case of unhandled exceptions), SQL queries, version numbers, and things along those lines are the most useful for really isolating where a problem originates.

Although, in this case, I suspect it has something to do with our `CounterCacheField` and how it's being handled by the API serializers.
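For anyone wanting to capture those SQL queries: one generic Django approach (an assumption about the deployment, not NetBox-specific documentation) is to enable the `django.db.backends` logger, which emits one DEBUG record per executed SQL statement while `DEBUG = True`:

```python
# Hypothetical logging snippet for a Django settings module; logs every
# SQL query Django executes to the console. Queries are only recorded
# when DEBUG is enabled, so don't leave this on in production.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Django emits one DEBUG record per executed SQL statement here.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}
```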


@stavr666 commented on GitHub (Nov 21, 2025):

> it has something to do with our `CounterCacheField`

Is there any way to bump its refresh manually?


@jnovinger commented on GitHub (Nov 21, 2025):

I don't actually believe the count itself is wrong, so much as the serializer isn't reading the value correctly and is defaulting to zero. But that's just speculation; I haven't had any time to dig into this one yet.


@jnovinger commented on GitHub (Nov 21, 2025):

Seems like this and #19976 are likely related.


@stavr666 commented on GitHub (Nov 21, 2025):

It's not that critical for now, so I'll keep track of the mentioned issue.


@pheus commented on GitHub (Nov 21, 2025):

I might be wrong here, but to me this looks like two slightly different problems.

The API nested object counts (e.g. the `device_count` on `role`) seem to be coming from queryset annotations. Those I can reproduce pretty easily, including on the public demo.

By contrast, the `virtual_disk_count` field is a cached integer field (a `CounterCacheField`) on the model. While working on #19523 I ran into similar problems with cached counters and opened #20697 to track a `CounterCacheField` double-counting bug. In that investigation I managed to push some counters into negative values (for example `-2` devices) when the initial value was `0` and a related `Device` was deleted. With the `CounterCacheField` mechanism in place, every creation bumps the counter by `+1` and every deletion by `-1`, so if the counter ever gets out of sync, it can drift into odd values.

There is a management command that recalculates all `CounterCacheField` values:

```bash
python3 netbox/manage.py calculate_cached_counts
```

@stavr666 Could you try running this on your instance and then repeat the steps you used to trigger the issue? It would be helpful to know whether the problem persists after the counters have been rebuilt, or if it only affected stale values from before.

If I'm misunderstanding the root cause here, please feel free to correct me. Just sharing what I've seen while working with the counter fields recently. 🙌
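The drift mechanism described above can be illustrated outside NetBox with a toy counter; the `CounterCache` class below is purely hypothetical and only mimics the +1/-1 bookkeeping and the full-recount rebuild step:

```python
class CounterCache:
    """Toy model of a delta-maintained counter cache: creations add 1,
    deletions subtract 1, and the stored value is never re-derived from
    the real objects unless rebuild() is called."""

    def __init__(self, initial: int = 0):
        self.cached = initial      # what the API/UI would report
        self.objects = set()       # the actual related objects

    def create(self, obj):
        self.objects.add(obj)
        self.cached += 1

    def delete(self, obj):
        self.objects.discard(obj)
        self.cached -= 1

    def rebuild(self):
        """Analogue of `calculate_cached_counts`: recount from scratch."""
        self.cached = len(self.objects)


# Imagine a missed creation event left the cache one low before we start:
c = CounterCache(initial=-1)       # out-of-sync starting point
c.create("disk1")
c.create("disk2")
print(c.cached, len(c.objects))    # → 1 2  (the error persists through deltas)

c.rebuild()
print(c.cached)                    # → 2  (rebuilding fixes it)
```

This is why a rebuild repairs existing objects but cannot prevent new drift if some create/delete path bypasses the delta updates.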

@stavr666 commented on GitHub (Nov 24, 2025):

> There is a management command that recalculates all `CounterCacheField` values:
>
> `python3 netbox/manage.py calculate_cached_counts`

It helped. Both the UI and the API show correct values now:

*(Screenshots attached.)*

@pheus commented on GitHub (Nov 26, 2025):

Thanks for confirming, @stavr666! Glad to hear the values look correct now! 🙌

If you have a moment, could you try to repeat the steps that originally triggered the mismatch (both with existing objects and with newly created ones) and see whether you can still reproduce the issue?

That would help a lot to confirm whether this was just a case of stale cached counters or if there’s still an underlying bug we should keep digging into. No pressure if you don’t have time right away, of course 🙂


@stavr666 commented on GitHub (Nov 28, 2025):

@pheus
New VMs show the same caching issue:

*(Screenshots attached.)*

@jeremystretch commented on GitHub (Dec 17, 2025):

@stavr666 I'm not able to reproduce this on NetBox v4.4.8. If you're still encountering this issue after upgrading, could you please share updated reproduction steps?


@github-actions[bot] commented on GitHub (Dec 25, 2025):

This is a reminder that additional information is needed in order to further triage this issue. If the requested details are not provided, the issue will soon be closed automatically.

Reference: starred/netbox#11859