[Enhancement] Embed metadata in audio files & scan OPF file metadata #74

Closed
opened 2026-04-24 22:57:42 +02:00 by adam · 21 comments

Originally created by @rubbo898 on GitHub (Oct 25, 2021).

I'm using Calibre to fetch metadata from online data sources (primarily Amazon and Google).
For its internal logic it creates .opf files for backup/restore purposes. It would be great to have a feature that embeds this metadata from the .opf files into the audio files (e.g. as Mp3tag does).

More broadly: embed metadata from an external data source (online? a file?) into the audio files.

adam added the enhancement label 2026-04-24 22:57:42 +02:00
adam closed this issue 2026-04-24 22:57:42 +02:00

@advplyr commented on GitHub (Oct 25, 2021):

I think this will be 2 separate things:

  1. Detect .opf files, parse them, then populate the audiobookshelf book details (on scans).
  2. An option to sync your audio file meta tags with the audiobookshelf book details.

OPF file metadata has been requested a few times, and it should be fairly straightforward.
Updating audio file metadata has been a long time coming and is a bigger project. I want to be sure that anything done to users' files is explicit and well documented. When syncing your audio file meta tags, it will need to be clear which book details are mapped to which ID3 tags, and on which audio files.

Just talking it out.

I think you will need to elaborate on your last suggestion about external datasources.


@jarrodCoombes commented on GitHub (Oct 26, 2021):

Personally I'd be happy if the metadata was pulled from the internet and simply stored in a text file alongside the audio files, then parsed into the database from there, similar to how the covers work. This might be a good place to start.

I get nervous when you start talking about modifying the MP3s directly (mostly because of the mess that is the metadata in all my audiobook files). Then again, if I had the option of doing it manually by book, author, series, or the entire library, I'd welcome Audiobookshelf cleaning up the ID3 tags on my files. It is a logical step IMO, just not automatically just yet.

The more I think about it, a "Save Metadata to ID3" button would be pretty awesome, especially after all the data has been scraped and verified. Documenting what goes where could be done via a simple wiki page and a ? icon in the confirmation dialogue.


@rubbo898 commented on GitHub (Oct 26, 2021):

@advplyr forget my last suggestion. It's more or less what @jarrodCoombes said in the first couple of lines of his comment: data pulled from the internet (as Calibre does) and stored in a file (text or a different format) that Audiobookshelf can use for the "embed to ID3" feature. My idea was to support more than just .opf files for wider compatibility, but I understand that it is a huge amount of work even with a single extension.

@jarrodCoombes totally agree with you. If I understood correctly, what you described first is the "job" performed by Calibre. Audiobookshelf could get involved in the last step, working with the data that Calibre has scraped and verified.


@advplyr commented on GitHub (Oct 27, 2021):

Audiobookshelf will never modify or add files to your filesystem outside of the /config and /metadata folders unless manually opted in with sufficient explanation.

Side note: The config and metadata directories are poorly named because their purposes have changed since I started the project. Config holds the database; metadata holds streams, backups, metadata, and downloads. Unfortunately, I also learned that Unraid Community Apps overrides the default mapping I chose for /config, so most people have their metadata inside their config. It's a proper mess now.

Back to the point, I think the 2 items I initially identified are still the way to go. When you update metadata in calibre it is first stored in the .opf file, then you can press some buttons to embed that data into your ebook. I think audiobookshelf should function the same way.

When scanning, if an .opf file is present it will be parsed and mapped to audiobookshelf fields.
There is an order of precedence in audiobookshelf for mapping data. For example, the folder names will take precedence over audio file ID3 tags, ID3 tags will take precedence over an .opf file, etc.

Eventually data will be pulled from the internet, but so far I haven't come across a good reliable source yet.
If audiobookshelf can take advantage of all the information from your current files though, it should be pretty solid for most people.

Hopefully that isn't too much rambling and makes sense.


@rubbo898 commented on GitHub (Oct 27, 2021):

All crystal clear!!

> When scanning, if an .opf file is present it will be parsed and mapped to audiobookshelf fields.
> There is an order of precedence in audiobookshelf for mapping data. For example, the folder names will take precedence over audio file ID3 tags, ID3 tags will take precedence over an .opf file, etc.

About this, I can only suggest allowing users to explicitly choose their own priority.
For example, in my case Calibre messes up both the Author (folder) and Title (audio file) names (it replaces ":" with "_" and truncates long file names). I don't know why; maybe I'm doing something wrong...
Also, the ID3 tags are not consistent. I should use a tool to erase them all and re-tag according to the library structure (but then the issue above would come back...)

Anyway! If I were able to give the .opf file priority over both folder/file names and ID3 tags, it would resolve all my pains.


@advplyr commented on GitHub (Oct 28, 2021):

Metadata from the internet was discussed here, so I want to announce the new experimental match feature #157


@advplyr commented on GitHub (Nov 9, 2021):

I just pushed v1.6.13 that looks for a metadata.opf or metadata.xml file and extracts the following:

title, author, narrator, publishYear, publisher, isbn

I skipped description for now because calibre stores description as HTML and audiobookshelf uses plain text only. It may make sense for audiobookshelf to support HTML in the future.

The current order of precedence for setting metadata is

  1. Folder & filenames
  2. desc.txt & reader.txt
  3. metadata.opf OR metadata.xml
  4. Audio file ID3 tags

This order is not adjustable yet.
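
To make the precedence list above concrete, here is a minimal Python sketch of a "first non-empty value wins" merge. It is only an illustration: the source names (`folder`, `text_files`, `opf`, `id3`) and the function `merge_metadata` are hypothetical, and the thread does not say that the scanner resolves conflicts exactly this way.

```python
from typing import Optional

# Sources from highest to lowest precedence, following the list above.
# The key names are hypothetical labels chosen for this sketch.
PRECEDENCE = ["folder", "text_files", "opf", "id3"]


def merge_metadata(sources: dict[str, dict[str, Optional[str]]]) -> dict[str, str]:
    """Merge per-source metadata; the first non-empty value for a field wins."""
    merged: dict[str, str] = {}
    for name in PRECEDENCE:
        for field, value in sources.get(name, {}).items():
            # Keep a value only if no higher-precedence source already set the field.
            if value and field not in merged:
                merged[field] = value
    return merged
```

With this approach a title derived from the folder name would override the title from an .opf file, while a publisher found only in the .opf file would still be picked up.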


@rubbo898 commented on GitHub (Nov 9, 2021):

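What great news! Does it look for an .opf file with the same name as the audio file, or must the .opf file be named "metadata"?

I ran some tests and it seems that the .opf file isn't recognized:

![InkedImmagine 2021-11-09 095444_LI](https://user-images.githubusercontent.com/55589568/140894446-da00be26-d1f4-4296-8157-6adb0d44998f.jpg)
![InkedImmagine 2021-11-09 095511_LI](https://user-images.githubusercontent.com/55589568/140894451-250e2ec5-c93d-4572-be1d-94f8abc3cc9b.jpg)

Here is the .opf file:

[Il sangue della lupa.zip](https://github.com/advplyr/audiobookshelf/files/7503387/Il.sangue.della.lupa.zip)

OT: I'm trying to set up Traefik as a reverse proxy. It works via the web UI but not with the app.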


@advplyr commented on GitHub (Nov 10, 2021):

Thanks for sharing your opf file, I was able to extract more information after seeing your example.

I updated the scanner to find any file with the extension .opf, instead of just metadata.opf.

I noticed that the description in the one you shared was plain text and not html, so I added a check to see if it is plain text and to use it if it is.
I added genres too, so the genres in your example will populate.
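
One simple way to do a plain-text check like the one described above is to look for anything that resembles an HTML tag. The Python sketch below is an assumption about how such a check could work, not the code that was actually added; `is_plain_text` and `usable_description` are hypothetical names.

```python
import re
from typing import Optional

# Matches things that look like an opening or closing HTML tag, e.g. <p> or </div>.
_TAG_PATTERN = re.compile(r"</?[a-zA-Z][^>]*>")


def is_plain_text(description: str) -> bool:
    """Return True when the text contains nothing that looks like an HTML tag."""
    return _TAG_PATTERN.search(description) is None


def usable_description(description: Optional[str]) -> Optional[str]:
    """Keep the description only when it is non-empty plain text."""
    if description and is_plain_text(description):
        return description.strip()
    return None
```

A heuristic like this will not catch HTML entities such as `&amp;`, so a real implementation may need to be stricter.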

The new full list of metadata extracted is:

title, author, narrator, publishYear, publisher, isbn, description, genres, language

To test it I removed all the details of a book in my library, put your opf file in the directory, then pressed re-scan.
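![image](https://user-images.githubusercontent.com/67830747/141040080-2cb96645-2b49-4771-a518-c42e8a40422f.png)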

This is v1.6.14
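
For readers who want to see where these fields live in a Calibre-style OPF file, here is a rough Python sketch using only the standard library. It is not the project's parser. The mapping is an assumption: authors are taken from `dc:creator` elements with role `aut` (or no role), narrators from role `nrt`, the year from the first four characters of `dc:date`, and the ISBN from a `dc:identifier` whose `opf:scheme` is `ISBN`. The sample document and the name `parse_opf` are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "opf": "http://www.idpf.org/2007/opf",
}
ROLE = "{http://www.idpf.org/2007/opf}role"
SCHEME = "{http://www.idpf.org/2007/opf}scheme"

# Made-up sample in the shape Calibre typically writes.
SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Example Title</dc:title>
    <dc:creator opf:role="aut">Example Author</dc:creator>
    <dc:creator opf:role="nrt">Example Narrator</dc:creator>
    <dc:date>2020-05-01T00:00:00+00:00</dc:date>
    <dc:publisher>Example Press</dc:publisher>
    <dc:identifier opf:scheme="ISBN">9780000000002</dc:identifier>
    <dc:description>A plain text summary.</dc:description>
    <dc:subject>Fantasy</dc:subject>
    <dc:subject>Adventure</dc:subject>
    <dc:language>en</dc:language>
  </metadata>
</package>"""


def _text(parent, tag):
    """Return the stripped text of the first matching child, or None."""
    node = parent.find(tag, NS)
    return node.text.strip() if node is not None and node.text else None


def parse_opf(xml_text):
    """Extract the fields listed above from an OPF document (assumed mapping)."""
    root = ET.fromstring(xml_text)
    meta = root.find("opf:metadata", NS)
    if meta is None:
        return {}
    authors, narrators = [], []
    for creator in meta.findall("dc:creator", NS):
        name = (creator.text or "").strip()
        if not name:
            continue
        role = creator.get(ROLE, "aut")  # assumption: no role means author
        if role == "nrt":
            narrators.append(name)
        elif role == "aut":
            authors.append(name)
    isbn = None
    for ident in meta.findall("dc:identifier", NS):
        if (ident.get(SCHEME) or "").upper() == "ISBN":
            isbn = (ident.text or "").strip() or None
            break
    date = _text(meta, "dc:date")
    return {
        "title": _text(meta, "dc:title"),
        "author": ", ".join(authors) or None,
        "narrator": ", ".join(narrators) or None,
        "publishYear": date[:4] if date else None,
        "publisher": _text(meta, "dc:publisher"),
        "isbn": isbn,
        "description": _text(meta, "dc:description"),
        "genres": [s.text.strip() for s in meta.findall("dc:subject", NS) if s.text],
        "language": _text(meta, "dc:language"),
    }
```

Calling `parse_opf(SAMPLE)` returns a dictionary keyed by the field names in the list above; any field missing from the file comes back as `None` (or an empty list for genres).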


@jarrodCoombes commented on GitHub (Nov 10, 2021):

Would it be possible to add an option to save the meta data out to a file in the folder with the MP3s, similar to what happens with cover images?


@advplyr commented on GitHub (Nov 10, 2021):

That's what the "Save Metadata" button is supposed to do, although it is probably not complete since it was written before a lot of the details were added.
Is that not working for you?


@jarrodCoombes commented on GitHub (Nov 10, 2021):

Ahh, ok. I honestly had no idea what that button did, and completely missed the tool tip when you hover it. I did test it, and it does work.


@rubbo898 commented on GitHub (Nov 13, 2021):

It's working as described, awesome!

Where can I find the language tag? I didn't spot it in any menu.
Embedding metadata in the ID3 tags is the last piece, but I understand it is a critical step. Thanks a lot for your efforts!


@advplyr commented on GitHub (Nov 13, 2021):

I haven't exposed the language field yet. I added it because I anticipate this will be requested at some point. I try not to clutter up the UI with stuff people won't use.


@jarrodCoombes commented on GitHub (Nov 14, 2021):

> I haven't exposed the language field yet. I added it because I anticipate this will be requested at some point. I try not to clutter up the UI with stuff people won't use.

This would suggest the need for an option on the "Settings" page where people can turn this on and off.


@rubbo898 commented on GitHub (Nov 16, 2021):

> I haven't exposed the language field yet. I added it because I anticipate this will be requested at some point. I try not to clutter up the UI with stuff people won't use.

Got the point!

Two secondary questions:

  1. Is there any way to perform bulk actions, such as scanning and grabbing metadata from .opf files and saving metadata, on the whole library or a portion of it? Select all + modify is missing this feature right now. Are you planning to add it in the future?
  2. "Save metadata" now creates a "metadata.nfo" file. From a library perspective, do you think a metadata file named after the audio file would be more "accurate"?

@advplyr commented on GitHub (Dec 17, 2021):

There is still no option for batch saving metadata; I think there is another issue open for this. I'm doing some housecleaning now because it seems I lost track of a lot of threads.
I'm not sure about the naming of the metadata file; it seems to be common practice to use the same name in each folder.


@wtanksleyjr commented on GitHub (Jan 18, 2022):

> When scanning, if an .opf file is present it will be parsed and mapped to audiobookshelf fields.

Can someone who's gotten this working please post an example, ideally using all of the supported fields? There doesn't seem to be any document that defines this format in the terms used in this discussion (the fields of that format don't include an "author", for example, but rather a "contributor" with a subfield something like "aut" IIRC). It's incredibly complicated so far as I can tell.

Is it possible to set a book's series using this OPF file? I can't tell how to do that at all, but some of my books appear in series for reasons I ... can't tell at all. Some of them appear in a series named as the author, I'm guessing because I'm following the wrong naming conventions.


@advplyr commented on GitHub (Jan 18, 2022):

There is no documentation written for the specifics of how OPF files are mapped to abs details. Here is the relevant code for the parsing: https://github.com/advplyr/audiobookshelf/blob/master/server/utils/parseOpfMetadata.js#L73.
I linked to the line that will pull the series from the OPF file, using tag calibre:series
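
As an illustration of the `calibre:series` tag mentioned above, Calibre stores the series as a `<meta>` element with `name` and `content` attributes, usually next to a `calibre:series_index` entry. The Python sketch below reads both. It is not a port of the linked parser; `read_series` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Made-up sample showing only the series-related entries.
SERIES_SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata>
    <meta name="calibre:series" content="Example Series"/>
    <meta name="calibre:series_index" content="2.0"/>
  </metadata>
</package>"""


def read_series(xml_text):
    """Return (series, series_index) from calibre <meta> tags, or None for each if absent."""
    root = ET.fromstring(xml_text)
    found = {}
    for meta in root.iterfind(".//opf:meta", OPF_NS):
        name = meta.get("name")
        content = meta.get("content")
        if name in ("calibre:series", "calibre:series_index") and content:
            found[name] = content
    return found.get("calibre:series"), found.get("calibre:series_index")
```

The index is returned as the raw string from the file, so a caller would need to convert it if a number is wanted.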

The incorrect series getting filled out for you is most likely due to your folder structure. See https://www.audiobookshelf.org/docs#structure
You can use the folder structures:

<Author>/<Series>/<Title>/
<Author>/<Title>/
<Title>/
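
The three folder layouts above can be told apart by counting path segments. The following Python sketch shows the idea under the assumption that the path is given relative to the library root; `parse_book_path` is a hypothetical helper, not the scanner's actual logic.

```python
from pathlib import PurePosixPath


def parse_book_path(relative_path: str) -> dict:
    """Map a book folder path (relative to the library root) to author/series/title."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) == 1:
        author, series, title = None, None, parts[0]
    elif len(parts) == 2:
        author, series, title = parts[0], None, parts[1]
    elif len(parts) == 3:
        author, series, title = parts
    else:
        # Deeper or empty paths are outside the three documented layouts.
        raise ValueError(f"unsupported folder structure: {relative_path!r}")
    return {"author": author, "series": series, "title": title}
```

This also shows how a book can end up in a series named after its author: with a layout like `<Library>/<Author>/<Title>/` scanned from one level too high, the author folder lands in the series position.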

@wtanksleyjr commented on GitHub (Jan 18, 2022):

Thank you for both pointers! That doesn't look too hard to figure out. And of course I see what my directory structure got wrong.

-Wm


@advplyr commented on GitHub (Jun 18, 2022):

This was added a while back and is just getting refined now.

Reference: starred/audiobookshelf#74