[Enhancement]: Authors with alternative names/nicknames to prevent duplicate authors #582

Open
opened 2026-04-24 23:13:37 +02:00 by adam · 9 comments

Originally created by @wtanksleyjr on GitHub (Aug 12, 2022).

Describe the feature/enhancement

J.R.R. Tolkien, C.S. Lewis, L.E. Modesitt Jr., G.R.R. Martin: all have something in common. Their names are spelled slightly differently across import sources, which sometimes insert a space between the initials and sometimes do not.

My first (and only planned) proposal is to keep a cache of all such initialisms, and when a newly imported book contains an author with an initialism, scan that cache in a space-insensitive way, so that you can collapse newly input authors into whatever was entered into the database first. The user can then adapt the spelling using the author's page if desired (also updating the cache, removing the entry if the user changes the author's name to not include an initialism).
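
A minimal sketch of the cache described above, in illustrative Python rather than code from the project. The names `InitialismCache`, `has_initialism`, `cache_key`, `resolve`, and `rename` are hypothetical, and the rule "a single letter followed by a period counts as an initialism" is an assumption.

```python
import re

# Hypothetical helpers for illustration; none of these names come from the project.
INITIAL = re.compile(r"\b[A-Za-z]\.")  # a single letter followed by a period


def has_initialism(name: str) -> bool:
    """True if the name contains at least one initial such as 'J.'."""
    return INITIAL.search(name) is not None


def cache_key(name: str) -> str:
    """Space-insensitive key: drop all whitespace and ignore case."""
    return "".join(name.split()).casefold()


class InitialismCache:
    def __init__(self) -> None:
        self._canonical: dict[str, str] = {}

    def resolve(self, name: str) -> str:
        """Return the spelling that was entered first for this initialism."""
        if not has_initialism(name):
            return name
        return self._canonical.setdefault(cache_key(name), name)

    def rename(self, old: str, new: str) -> None:
        """The user edited the author's name: drop the old entry, and
        re-add the new spelling only if it still contains an initialism."""
        self._canonical.pop(cache_key(old), None)
        if has_initialism(new):
            self._canonical[cache_key(new)] = new
```

With this sketch, importing "J.R.R. Tolkien" first and "J. R. R. Tolkien" later resolves both to "J.R.R. Tolkien" until the user renames the author.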

adam added the enhancement label 2026-04-24 23:13:37 +02:00

@advplyr commented on GitHub (Aug 13, 2022):

I believe I saw this in other software where you could list alternative names for an author/artist.
I think the first step would just be to add an alternative names multi-input for authors and have the scanner check against those names when looking for matching authors. Adding the authors would be manual until we figure out a good way to auto-populate it.
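
A rough sketch of how a scanner could check alternative names when looking for a matching author, in illustrative Python. The `Author` shape, the `alternative_names` field, and `find_author` are assumptions made for this example, not the project's actual model or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Author:
    # Hypothetical shape: a primary name plus manually entered alternatives.
    name: str
    alternative_names: list[str] = field(default_factory=list)


def find_author(scanned_name: str, authors: list[Author]) -> Optional[Author]:
    """Return the first author whose name or any alternative name equals
    the scanned name, ignoring case and surrounding whitespace."""
    wanted = scanned_name.strip().casefold()
    for author in authors:
        candidates = [author.name, *author.alternative_names]
        if any(wanted == c.strip().casefold() for c in candidates):
            return author
    return None
```

Because the alternatives are entered manually, a spelling that nobody has listed (for example "C.S. Lewis" when only "C. S. Lewis" exists) still produces no match in this sketch.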


@wtanksleyjr commented on GitHub (Aug 13, 2022):

That's a lot more flexible, yes. Good sense.


@hobesman commented on GitHub (Dec 27, 2022):

Yes to this!

Sometimes authors add a middle initial and sometimes don't. (Patrick Laplante vs. Patrick G. Laplante). Sometimes they use initials and other times spell out the name (Katherine Applegate vs. K.A. Applegate).

And even Audible sometimes has different versions for different books, so matching doesn't always ensure consistency.


@ZLoth commented on GitHub (Sep 28, 2024):

I think this should be handled under some sort of aliasing. The parent name is J. R. R. Tolkien, and aliases would be J.R.R. Tolkien.


@zzyzx-dc commented on GitHub (Sep 28, 2024):

> I think this should be handled under some sort of aliasing. The parent name is J. R. R. Tolkien, and aliases would be J.R.R. Tolkien.

That's what I was suggesting. Having the metadata analysis just lump them all under one name automatically. Sort of like how Discogs will list multiple variations of band names under one band.

I suppose the 'album artist' and 'artist' tag could be used here - I use that to organize music. That way the band in the artist list is 'The Smashing Pumpkins' but if a particular album was released as 'Smashing Pumpkins' you just make the albumartist tag 'The Smashing Pumpkins' and it is organized correctly but still shows up faithfully to the original release.

I think the current recommendation from the devs is to manually change each author name in the author list to one author name. That will merge them all together.


@Valentin-Metz commented on GitHub (Oct 1, 2024):

Suggestion: Treat `.`, ` ` and `. ` as identical in author names.
Maybe enforce the latter with an autoformatter.
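
One way to read this suggestion, as a minimal Python sketch: build a comparison key in which any run of periods and whitespace collapses to a single space. The function name `author_key` and the choice to also ignore case are assumptions for illustration.

```python
import re


def author_key(name: str) -> str:
    """Comparison key that treats '.', ' ' and '. ' as the same separator.

    Any run of periods and/or whitespace becomes one space, so the three
    separator styles produce the same key. Case is ignored as well.
    """
    return re.sub(r"[.\s]+", " ", name).strip().casefold()
```

Under this key, "J.R.R. Tolkien", "J. R. R. Tolkien", and "J R R Tolkien" compare equal. A spelling with no separators at all, such as "JRR Tolkien", does not, because the key only unifies separators that are present.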

@mdeeter commented on GitHub (Jan 26, 2025):

I started to create an enhancement request and finally found this one...

I was going to suggest:

It's annoying that some matches save author tags inconsistently. When I'm cleaning up authors, I often find that I have duplicate authors only because the initials in the name are formatted with different spacing.

Example:
A.G. Riddle
A. G. Riddle

As @Valentin-Metz mentioned, I was going to suggest forcing some normalization of the formatting. Personally, I'd prefer a rule that automatically adds a space after any period in the author name (like the second example).
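
The "space after any period" rule could look roughly like this in Python. This is only a sketch: `format_author_name` is a hypothetical name, and leaving a trailing period (as in "Jr.") untouched is an assumption.

```python
import re


def format_author_name(name: str) -> str:
    """Put exactly one space after every period that is followed by more text."""
    # '\.\s*(?=\S)' matches a period plus any whitespace, but only when a
    # non-space character follows, so a trailing period is left alone.
    spaced = re.sub(r"\.\s*(?=\S)", ". ", name)
    # Collapse any other repeated whitespace and trim the ends.
    return " ".join(spaced.split())
```

Applied to the example above, "A.G. Riddle" becomes "A. G. Riddle" and "A. G. Riddle" is left unchanged. One caveat of so simple a rule is that it also splits abbreviations that are not initials, such as "Ph.D.".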

However, the suggestion of aliases would provide a lot of flexibility (which we all know, flexibility = expensive... in implementation, development time).

As great as it would be to somehow automate the tracking of aliases or allow the user to manage aliases, would it be simpler to provide some UI that the admin could use to manage authors that may have duplicate entries? Perhaps a button next to `Match All Authors` that says `Find Possible Duplicates`? When selected, the page would show authors that have partial matches of first/last names. The admin could then update/match authors to the name they want to use. Establishing a good algorithm for the search/filter is probably the most complex piece of that solution.
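
As a starting point for that search/filter, here is a deliberately crude Python sketch that groups authors sharing a last name and a first initial. The heuristic and the names `duplicate_key` and `find_possible_duplicates` are assumptions for illustration; the groups are only candidates for an admin to review.

```python
import re
from collections import defaultdict


def duplicate_key(name: str) -> tuple[str, str]:
    """(last name, first initial), ignoring case, periods and commas."""
    tokens = re.sub(r"[.,]", " ", name).casefold().split()
    if not tokens:
        return ("", "")
    return (tokens[-1], tokens[0][0])


def find_possible_duplicates(names: list[str]) -> list[list[str]]:
    """Group author names that share a key; keep groups with 2+ spellings."""
    groups: dict[tuple[str, str], set[str]] = defaultdict(set)
    for name in names:
        groups[duplicate_key(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

This would pair "A.G. Riddle" with "A. G. Riddle", and "Patrick Laplante" with "Patrick G. Laplante". It also produces false positives (any two authors with the same surname and first initial) and mishandles suffixes such as "Jr.", which is why the result is a list for review rather than an automatic merge.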

It's definitely not an elegant solution... but trying to think towards a simpler implementation to let us solve the problem sooner.


@ReaderGuy42 commented on GitHub (Feb 23, 2025):

Any news on this? This is something I've been looking for.


@Bishop-trevorstuart commented on GitHub (Feb 6, 2026):

I'd offer to actually help on this, but I'm not even good at vibe coding. I'd think some sort of regex could be used here, though. An AI suggested implementing some sort of Dynamic Regex Generator. Just thinking out loud in case it helps at all...
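
One possible reading of a "dynamic regex generator", sketched in Python: build a pattern from a known author name that tolerates any mix of periods and spaces between its parts. The function name `variant_pattern` and the matching rules are assumptions, not something specified in this thread.

```python
import re


def variant_pattern(name: str) -> re.Pattern:
    """Compile a regex matching spacing/period variants of an author name."""
    # Split the known name into its parts, discarding periods and whitespace.
    tokens = [t for t in re.split(r"[.\s]+", name) if t]
    # Allow any run of periods/whitespace (including none) between parts.
    body = r"[.\s]*".join(re.escape(t) for t in tokens)
    return re.compile(rf"^\s*{body}[.\s]*$", re.IGNORECASE)
```

A pattern built from "J.R.R. Tolkien" matches "J. R. R. Tolkien", "J R R Tolkien", and "JRR Tolkien", but not "J.R. Tolkien". Because separators are optional, it is quite permissive and would also accept a run-together form like "JRRTolkien".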

Reference: starred/audiobookshelf#582