mirror of
https://github.com/advplyr/audiobookshelf.git
synced 2026-05-30 23:40:40 +02:00
Reference: starred/audiobookshelf#1691
Originally created by @sevenlayercookie on GitHub (Jan 25, 2024).
Describe the feature/enhancement
Include a hash of audiobooks with the metadata so that if files are moved to a different location, the metadata will still apply.
Currently: I have a perfectly good library and perfect metadata setup. If I move the files to a different location, the metadata no longer applies.
Enhancement: if the app automatically hashed (MD5, SHA, whatever) every book/file and stored and linked that hash with the metadata, then anytime that specific book is added to the library, the metadata would be applied to it.
This hash could also be used for other purposes, such as matching chapter data from a community-driven database to audiobooks in ABS. (Plex uses a similar strategy for credits markers.)
@Silther commented on GitHub (Jan 25, 2024):
Can you move the books if the metadata is inside the bookfolder?
@sevenlayercookie commented on GitHub (Jan 25, 2024):
I haven't tried since I store my metadata in the default centralized location, but I suppose that could be a solution, assuming nothing else changed about the file such as its file name.
@nichwall commented on GitHub (Jan 25, 2024):
ABS uses the inode to detect moved files. This doesn't work on every filesystem (especially network shares) or if you're performing an operation which copies or deletes the old file since that deletes the inode associated with the file. If you're using filesystem move commands, the inode should be preserved so the updated path is reflected in ABS.
There is some ongoing discussion around improving this functionality.
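To illustrate the inode behaviour described above, here is a minimal Python sketch. It is not ABS code; the helper name `file_identity` and the file names are made up for the demo. It assumes a local POSIX filesystem, where a rename keeps the inode and a copy creates a new one.

```python
import os
import shutil
import tempfile


def file_identity(path):
    """Return (device, inode); the pair identifies a file on one filesystem."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def demo():
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "book.m4b")
        with open(original, "wb") as fh:
            fh.write(b"audio bytes")
        before = file_identity(original)

        # A copy is a new file, so it gets its own inode.
        copied = os.path.join(root, "copy.m4b")
        shutil.copyfile(original, copied)
        copy_identity = file_identity(copied)

        # A rename within the same filesystem keeps the inode.
        moved = os.path.join(root, "moved.m4b")
        os.rename(original, moved)
        after = file_identity(moved)

        return before, after, copy_identity


if __name__ == "__main__":
    before, after, copy_identity = demo()
    print("rename kept inode:", before == after)
    print("copy kept inode:", before == copy_identity)
```

On network shares the reported inode may not be stable, which is the failure case discussed in this thread.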
@Hallo951 commented on GitHub (Jan 25, 2024):
@nichwall
Where is the discussion about improving this function? I would be very interested in an improvement, as my audiobooks are on my NAS and ABS runs on a separate server. This means that the watcher, which is based on this inode, does not work properly...
@nichwall commented on GitHub (Jan 25, 2024):
There isn't a good central spot where all of the discussion has happened.
It mostly comes up periodically in Discord (searches for "inode" and "hash" should return results). There was a recent discussion about switching from using the ctime to using the mtime to help with inodes changing on NAS that @FreedomBen was thinking of testing out (references https://github.com/advplyr/audiobookshelf/issues/2509). Near the end of that conversation there was discussion of using hashing in addition to/instead of the inode, but one of the main concerns is how that impacts scan performance for large libraries or network shares.
A few server releases ago, the filename has priority over inode changes in case of the network share, but it doesn't seem to work all the time right now (server version 2.7.2 is currently latest)
Also this issue
https://github.com/advplyr/audiobookshelf/issues/1447
@advplyr commented on GitHub (Jan 25, 2024):
This should be working. Where did you get the impression it wasn't?
@nichwall commented on GitHub (Jan 25, 2024):
Specifically this and following comments where only the inode changed, but I could also have gotten confused.
https://github.com/advplyr/audiobookshelf/issues/2509#issuecomment-1890828274
@sevenlayercookie commented on GitHub (Jan 25, 2024):
I'm far from an expert in hashing etc., but since security isn't an issue and this is simply for convenience, what if instead of hashing the entire file, only portions were hashed? Such as 1 MB from the beginning, 1 MB from the end, and 1 MB from somewhere in the middle? Include file size for good measure. Would be very efficient computationally, and should do a good enough job of preventing collisions.
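A rough Python sketch of the sampled-hash idea, to make it concrete. The function name `sampled_hash` and the choice of SHA-256 are assumptions for illustration, not anything ABS implements. It hashes the file size plus three fixed-size samples (start, middle, end) and falls back to hashing the whole file when it is small.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MB per sample, as suggested above


def sampled_hash(path, chunk=CHUNK):
    """Hash the file size plus samples from the start, middle and end."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mix in the size so files of different length never share a hash input.
    digest.update(str(size).encode("ascii"))
    digest.update(b":")
    with open(path, "rb") as fh:
        if size <= 3 * chunk:
            # Small file: the samples would overlap, so hash everything.
            digest.update(fh.read())
        else:
            middle = (size - chunk) // 2
            for offset in (0, middle, size - chunk):
                fh.seek(offset)
                digest.update(fh.read(chunk))
    return digest.hexdigest()
```

The trade-off is visible in the code: a change that falls between the samples and leaves the size unchanged is not detected, so this is an identity hint for convenience, not an integrity check.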
@nichwall commented on GitHub (Jan 25, 2024):
Well that's a clever idea. Looks like ID3 tags are stored at the end of the file for ID3-v1 and beginning of files for ID3-v2, and the 1MB should get some of the data itself too in case of a reencode (and tags aren't edited). Not sure if every container keeps metadata at the beginning or end of file, but I would assume the file size changing could catch that.
That would probably work for ebooks and other files (cover images, sidecar metadata).
@sevenlayercookie commented on GitHub (Jan 25, 2024):
I'll experiment within my own library and see if it seems reproducible and avoiding collisions... small sample size.
On another note, I've been experimenting with xxHash on my RPi 4. It's incredibly fast even on this device. Once a file was loaded in memory, it ran at 1000 MB/s. A 100 GB library could be hashed in under 2 minutes (hard drive speed is the real bottleneck).
ABS could be programmed to only run the hash when the file is already loaded into memory for other reasons (playback) to prevent redundant reads (or when a 'force hash' command is given on the library).
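One way to picture "hash while the file is being read anyway" is a generator that passes chunks through to the consumer and updates a digest on the side. This is only a sketch: `stream_and_hash` is a hypothetical helper, and it uses BLAKE2b from Python's standard library as a stand-in for xxHash, which is a third-party package.

```python
import hashlib


def stream_and_hash(fileobj, chunk_size=64 * 1024):
    """Return (chunk iterator, digest); the digest fills in as chunks are consumed."""
    digest = hashlib.blake2b(digest_size=16)

    def chunks():
        while True:
            block = fileobj.read(chunk_size)
            if not block:
                break
            digest.update(block)  # piggyback on the read that happens anyway
            yield block

    return chunks(), digest
```

The digest only equals the full-file hash after the iterator has been consumed from start to end, so a playback session that seeks or stops early would not produce a usable hash and a forced hash pass would still be needed.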
@agittins commented on GitHub (Jan 26, 2024):
Would audio fingerprinting, as used in MusicBrainz Picard and similar apps, work well with spoken material? This is a pretty efficient way to identify music at least. The Chromaprint fpcalc utility calculates the fingerprint (it took about 3 seconds on a 100 MB m4b over NFS, vs 7 seconds for md5sum even after fpcalc had cached some of it), and the resulting fingerprint can be stored in the metadata and used to query / submit to the MusicBrainz (or perhaps BookBrainz?) services.
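For reference, calling fpcalc from a script could look roughly like the sketch below. It assumes fpcalc is installed and on the PATH, and that its `-json` output contains `duration` and `fingerprint` keys; the function names are made up for this example.

```python
import json
import subprocess


def parse_fpcalc_json(output):
    """Extract (duration in seconds, fingerprint string) from fpcalc -json output."""
    data = json.loads(output)
    return float(data["duration"]), data["fingerprint"]


def fingerprint_file(path):
    """Run fpcalc on one file; raises CalledProcessError if fpcalc fails."""
    result = subprocess.run(
        ["fpcalc", "-json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fpcalc_json(result.stdout)
```

The returned pair is what would be stored alongside the book's metadata or sent to a lookup service.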
The AcoustID fingerprinting uses characteristics of the recording rather than relying on the exact file or bitstream - so it can recognise a given recording even if it's been transcoded to another format or if the metadata has been edited or stripped.
The web service allows users to collaboratively share metadata and tie together editions, releases, works, authors etc. I suspect the bookbrainz service might be fairly new (I only learned of it while writing this comment) but the whole musicbrainz thing is really well thought-out and makes a huge difference to organising a music collection, I am sure it could be leveraged to do the same for audiobooks, if only it were adopted by more apps. The database is open, and indeed you can download a full dump of their postgresql if one feels the need!
I couldn't see any existing issues specifically pointing to its use (but it was mentioned, perhaps in passing in some other threads I've not read).
@sevenlayercookie commented on GitHub (Jan 27, 2024):
I was wondering this too, would be nice to identify editions of audiobooks despite different encodings. But I wonder with how much dynamic range compression and filtering audiobooks get and how non-dynamic spoken word is in general how accurate it would be. Seems worth experimenting with.
@nichwall commented on GitHub (Jan 27, 2024):
That would probably only work for single file books, since books broken up into multiple files are not consistent (by chapter, fixed length, fixed count, etc).
@sevenlayercookie commented on GitHub (Jan 27, 2024):
I believe MusicBrainz/AcoustID compares a fingerprint of the entire sample file against whole-file fingerprints in its database, so it seems like it would be less helpful for multi-file books. However, algorithms like Shazam, EchoPrint, and Panako excel at matching short segments, which I think would work well with multiple mp3s. Maybe they could even align the mp3s with the "gold standard" recording in the database.
Or to go another step, take an audiobook edition with known time stamped chapters, fingerprint the five seconds around every chapter name and then upload those fingerprints to a database, allowing users to run each chapter name fingerprint against their own files, then ABS would timestamp the user's files when a match is made. Seems like that would work regardless of how books are divided among files.
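The bookkeeping around that idea is simple enough to sketch in Python. Both helpers below are hypothetical: `chapter_windows` picks the five-second spans to fingerprint in the reference edition, and `book_time` converts a match found inside one of the user's files back to a position in the whole book. The fingerprint matching itself is not shown.

```python
def chapter_windows(chapter_starts, total_duration, window=5.0):
    """Return (begin, end) spans of about `window` seconds around each chapter start."""
    half = window / 2
    spans = []
    for start in chapter_starts:
        begin = max(0.0, start - half)
        end = min(total_duration, begin + window)
        spans.append((begin, end))
    return spans


def book_time(file_index, offset, file_durations):
    """Convert an offset inside one file to a timestamp in the whole book."""
    if not 0 <= file_index < len(file_durations):
        raise IndexError("file_index out of range")
    return sum(file_durations[:file_index]) + offset
```

Because `book_time` only depends on the durations of the files that come before the match, it works the same way whether the book is split by chapter, by fixed length, or by fixed count.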
@agittins commented on GitHub (Jan 27, 2024):
Spoken word has way more dynamic range than popular music, and probably more than most classical music, as spoken word has gaps between words/sentences, ie silence, which many (most?) forms of music only have periodically, if at all. MusicBrainz still works despite the loudness wars, I doubt dynamic range will be an issue.
What speech does lack is spectral diversity - we humans mostly just honk around 350Hz or so with not a lot of variety compared to many musical sources. This might affect how well the AcoustID algo performs, but that's just my speculation, and pretty low quality speculation at that :-)
As far as I can tell, AcoustID appears to work well for things humans can hear. An hypothesis going counter to that would probably need some evidence.
Hashing is designed to find identical things, fingerprinting (in this context) is to match similar things.
These aren't un-examined possibilities, it's part of the design of musicbrainz (and bookbrainz) that is fairly obvious once you start looking at how they structure things. Chapters/books, tracks/albums, editions/release variants. No new ground here, we don't need to reinvent the wheel.
What I am proposing is that MB/BB already has this problem solved, and ABS could probably implement it, if there's an appetite to do so.
In order of Minimum Viable Products, the features could look like:
fpcalc and store it in the local database, optionally writing it to the existing file's metadata.
Only the first step is required to make this a usable feature. The second two can be performed by Picard musicbrainz or other client tools that already exist.
MusicBrainz already has an official style guide for audiobook metadata and how it should be handled.
@jwillikers commented on GitHub (Mar 30, 2025):
I've been adding audiobooks to MusicBrainz and BookBrainz for the past few months as a way to solve metadata / chapters for my audiobooks, with the hope of that data being utilized more broadly. I've been submitting AcoustIDs for these with Picard, but this is only possible for tracks below a certain length, somewhere in the 9-10 hour range. Lifting that limit looks like it will require expanding the size of the integer type used for the length field in the AcoustID database. See https://github.com/acoustid/acoustid-server/issues/43.
AcoustID fingerprinting only occurs for the first two minutes of a track, so it may not be helpful in situations where books share the first two minutes, though I haven't come across that in my experience. I think I may have read somewhere that it also takes into account a portion at the end of the track, too. Problems of AcoustID are documented at the bottom of this page.
For books split into multiple files, MusicBrainz will work best for this when those files are distributed as part of a release, since MusicBrainz really tries to capture releases. I primarily buy my audiobooks from Libro.fm these days, and I add the mp3s they distribute as a release, so in these cases, it will work for multiple tracks. Hopefully in the future, something like Alternative Tracklists could help with common ways of splitting up the tracks as well as chapters.
The last thing to be aware of is that Audible releases end up being distinct from other releases, like those from Libro.fm. This is in part related to the fact that Audible releases are associated with an ASIN instead of an ISBN, but it's also important to be aware of the fact that Audible's releases contain a little snippet at the beginning and end of the book, making the audio different, along with all of the chapter offsets.
@ekellstrand commented on GitHub (Sep 18, 2025):
Any updates on this?
Alternate idea: Would people be open to ABS writing a custom abs_uid tag to the audio files when importing to the library? Or dropping an abs.uid sidecar file? The uid "wouldn't have to be" a globally unique hash or id. Or would it be considered poor form to write a tag that could only ever be useful to that one ABS instance?
My setup:
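A minimal Python sketch of the sidecar variant, assuming one folder per book. The file name abs.uid comes from the comment above; the helper `ensure_uid` and the use of a random UUID are illustrative choices, not an ABS feature.

```python
import uuid
from pathlib import Path

SIDECAR_NAME = "abs.uid"  # name taken from the comment above


def ensure_uid(book_dir):
    """Return the book's uid, creating the sidecar file on first use."""
    sidecar = Path(book_dir) / SIDECAR_NAME
    if sidecar.exists():
        return sidecar.read_text(encoding="utf-8").strip()
    uid = uuid.uuid4().hex  # random, only meaningful to this library
    sidecar.write_text(uid + "\n", encoding="utf-8")
    return uid
```

As long as the sidecar travels with the folder, the uid survives moves and copies, including across filesystems where the inode changes.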
@freitagdavid commented on GitHub (Jan 23, 2026):
I definitely wouldn't love writing an actual tag; I think the idea of hashing is perfectly reasonable. Just take the file, strip the metadata, and hash only the audio stream. This would prevent out-of-band metadata changes from stopping a match.
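For MP3 files with ID3 tags, "strip the metadata and hash the rest" can be sketched in Python as below. This is an illustration under narrow assumptions: it only handles a leading ID3v2 tag and a trailing ID3v1 tag, reads the whole file into memory, and ignores other containers such as m4b as well as other tag formats. The function names are made up.

```python
import hashlib


def strip_id3(data):
    """Return the bytes between a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)  # 7 useful bits per byte
        start = 10 + size
        if data[5] & 0x10:  # footer flag: 10 more bytes after the tag
            start += 10
    # ID3v1: fixed 128-byte block at the very end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_hash(data):
    """Hash only the audio payload, ignoring ID3 metadata."""
    return hashlib.sha256(strip_id3(data)).hexdigest()
```

With this, editing the title or cover in a tag editor leaves the hash unchanged, while a re-encode still changes it.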