Provide a "dry-run" functionality for bulk import/update #8622

Closed
opened 2025-12-29 20:38:59 +01:00 by adam · 8 comments
Owner

Originally created by @peteeckel on GitHub (Sep 15, 2023).

NetBox version

v3.6.1

Feature type

New functionality

Proposed functionality

Based on the brief discussion in #13773 I suggest implementing a "dry-run" functionality for bulk import/update data.

Use case

This FR should be seen in conjunction with #13775 and #13777. Bulk imports and updates can be complex and involve large amounts of data, and certain errors, such as mis-spelled or mis-cased headers, currently result in data being silently ignored.

On import, columns with invalid header names are currently silently ignored while the remaining columns are imported, which requires a subsequent update run with a new data set including IDs. It would be helpful to validate the input data and check for this kind of error before the import is actually executed, so errors can be fixed beforehand.
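For illustration, a minimal sketch of such a pre-import header check. The `KNOWN_FIELDS` set here is a hypothetical stand-in; in NetBox itself the list of importable fields would come from the relevant bulk-import form's declared fields:

```python
import csv
import io

# Hypothetical set of importable field names for an IPAddress import;
# a real implementation would derive this from the import form.
KNOWN_FIELDS = {"id", "address", "status", "dns_name", "description", "tenant"}

def check_headers(csv_text):
    """Return the header names that would currently be silently ignored."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])
    return [h for h in headers if h.strip() not in KNOWN_FIELDS]

# "dnsname" is misspelled and would currently be dropped without warning.
print(check_headers("address,status,dnsname\n10.0.0.1/16,active,node1.example.com"))
# → ['dnsname']
```

Running this check before any row is processed would let the user fix the header and re-submit instead of discovering the problem after the import.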

Database changes

None

External dependencies

None

adam added the type: feature label 2025-12-29 20:38:59 +01:00
adam closed this issue 2025-12-29 20:39:00 +01:00
Author
Owner

@jeremystretch commented on GitHub (Sep 15, 2023):

I don't see how this would really help. After completing a "dry run" import, you would presumably only be able to see what attributes are listed in the resulting import table; any others that happen not to be displayed in the table cannot be verified.

Additionally, the mechanism by which imported objects are displayed would not permit this behavior. After objects are imported, NetBox redirects the user to a list of objects filtered by request ID. This would not be feasible if the imported objects don't actually exist.


@peteeckel commented on GitHub (Sep 15, 2023):

In combination with the features suggested in #13775 and especially #13777 a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

Without the "dry-run", the import is performed, but without the affected columns. Currently this might not even be noticed; even with a notice that these columns haven't been imported, fixing the data would at the very least require a bulk update, which needs an additional ID column - which in turn requires exporting the records and generating a new import set.

With a dry-run the user will be aware that there are some problematic columns and can fix them for the import, thus avoiding the need for the subsequent update.


@jeremystretch commented on GitHub (Sep 15, 2023):

> a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

How? Maybe an example would help.


@peteeckel commented on GitHub (Sep 15, 2023):

Let's assume someone is trying to import the following data set:

```
address,status,dnsname
10.0.0.1/16,active,node1.zone1.example.com
10.0.0.2/16,active,node2.zone1.example.com
10.0.0.3/16,active,node3.zone1.example.com
10.0.0.4/16,active,node4.zone1.example.com
10.0.0.5/16,active,node5.zone1.example.com
10.0.0.6/16,active,node6.zone1.example.com
10.0.0.7/16,active,node7.zone1.example.com
[...]
10.0.0.254/16,active,node7.zone1.example.com
10.0.1.1/16,active,node1.zone2.example.com
10.0.1.2/16,active,node2.zone2.example.com
10.0.1.3/16,active,node3.zone2.example.com
10.0.1.4/16,active,node4.zone2.example.com
10.0.1.5/16,active,node5.zone2.example.com
10.0.1.6/16,active,node6.zone2.example.com
10.0.1.7/16,active,node7.zone2.example.com
[...]
10.0.1.254/16,active,node7.zone2.example.com
[...]
10.0.16.1/16,active,node1.zone16.example.com
10.0.16.2/16,active,node2.zone16.example.com
10.0.16.3/16,active,node3.zone16.example.com
10.0.16.4/16,active,node4.zone16.example.com
10.0.16.5/16,active,node5.zone16.example.com
10.0.16.6/16,active,node6.zone16.example.com
10.0.16.7/16,active,node7.zone16.example.com
[...]
10.0.16.254/16,active,node7.zone16.example.com
```

(and imagine the data being less schematic to add a bit of complexity).

What currently happens, as the `dns_name` field is optional, is that all data would be imported without the `dns_name`, because the column header is not spelled correctly.

Now since the field is missing from all records, the only way to fix it is a bulk update. For that, the user needs the IDs of the IPAddress objects in question, so the CSV data needs to be amended:

```
id,address,status,dns_name
1,10.0.0.1/16,active,node1.zone1.example.com
2,10.0.0.2/16,active,node2.zone1.example.com
3,10.0.0.3/16,active,node3.zone1.example.com
4,10.0.0.4/16,active,node4.zone1.example.com
5,10.0.0.5/16,active,node5.zone1.example.com
6,10.0.0.6/16,active,node6.zone1.example.com
7,10.0.0.7/16,active,node7.zone1.example.com
[...]
8,10.0.0.254/16,active,node7.zone1.example.com
9,10.0.1.1/16,active,node1.zone2.example.com
10,10.0.1.2/16,active,node2.zone2.example.com
11,10.0.1.3/16,active,node3.zone2.example.com
12,10.0.1.4/16,active,node4.zone2.example.com
13,10.0.1.5/16,active,node5.zone2.example.com
14,10.0.1.6/16,active,node6.zone2.example.com
15,10.0.1.7/16,active,node7.zone2.example.com
[...]
16,10.0.1.254/16,active,node7.zone2.example.com
[...]
17,10.0.16.1/16,active,node1.zone16.example.com
18,10.0.16.2/16,active,node2.zone16.example.com
19,10.0.16.3/16,active,node3.zone16.example.com
20,10.0.16.4/16,active,node4.zone16.example.com
21,10.0.16.5/16,active,node5.zone16.example.com
22,10.0.16.6/16,active,node6.zone16.example.com
23,10.0.16.7/16,active,node7.zone16.example.com
[...]
24,10.0.16.254/16,active,node7.zone16.example.com
```

A dry run would have returned a message like 'Field "dnsname" is unknown and will not be imported' (provided #13777 gets implemented) without actually importing anything, giving the user the chance to fix the issue by correcting the header field.
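One common way to implement such a dry run is to stage the full import inside a database transaction, collect all warnings, and then roll the transaction back so nothing is persisted. The sketch below is illustrative only, using an in-memory SQLite table as a stand-in; NetBox itself would run the import form's validation inside Django's `transaction.atomic()` with a forced rollback:

```python
import sqlite3

def dry_run_import(conn, rows):
    """Stage rows inside a transaction, collect warnings, then roll back.

    `rows[0]` is the CSV header row; the known-field set is a hypothetical
    stand-in for the import form's declared fields.
    """
    known = {"id", "address", "status", "dns_name"}
    warnings = [
        f'Field "{h}" is unknown and will not be imported'
        for h in rows[0] if h not in known
    ]
    try:
        for row in rows[1:]:
            conn.execute(
                "INSERT INTO ipaddress (address, status) VALUES (?, ?)",
                (row[0], row[1]),
            )
    finally:
        conn.rollback()  # dry run: nothing is persisted
    return warnings

# Minimal demonstration with an in-memory table standing in for IPAddress.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ipaddress (address TEXT, status TEXT)")
conn.commit()
rows = [
    ["address", "status", "dnsname"],
    ["10.0.0.1/16", "active", "node1.zone1.example.com"],
]
print(dry_run_import(conn, rows))
# → ['Field "dnsname" is unknown and will not be imported']
print(conn.execute("SELECT COUNT(*) FROM ipaddress").fetchone()[0])
# → 0
```

Because the rollback discards the staged rows, the user sees exactly the warnings the real import would produce while the database remains untouched.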


@jeremystretch commented on GitHub (Sep 15, 2023):

Ok, I think I understand the concern better, thanks. I believe this would be addressed by #11617, which seeks to raise a validation error on the presence of an unrecognized column header.

In general I don't like the concept of dry runs because in the best case they cost extra time, and in the worst case the user forgets to use them in the first place.


@peteeckel commented on GitHub (Sep 15, 2023):

Absolutely d'accord, but in #13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug, and he suggested the dry-run feature as a way to solve the issue. I'd prefer the error message in combination with not accepting erroneous data as well.


@jeremystretch commented on GitHub (Sep 15, 2023):

> in https://github.com/netbox-community/netbox/issues/13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug

I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the [principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).


@peteeckel commented on GitHub (Sep 15, 2023):

> I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the [principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).

Since I was quite astounded when I stumbled across this behaviour today, I'm totally with you on that. Especially since in many cases you won't even notice that something is missing, e.g. when the misspelled column is not among the columns displayed in the table that pops up after the import.

Reference: starred/netbox#8622