[Bug]: Some podcasts are showing incorrect characters #943

Closed
opened 2026-04-24 23:27:10 +02:00 by adam · 3 comments

Originally created by @adocampo on GitHub (Feb 7, 2023).

Describe the issue

I have several podcasts, and in some of them (2 at the moment) letters with accents (e.g. á, à, é, è, í, ò, ó and ú) and other non-English letters like ç are downloaded and shown as �, both in audiobookshelf and in the file system. Other Spanish and Catalan podcasts download and display their text just fine; the problem is only with these two.

The RSS feeds are: https://dinamics.ccma.cat/public/podcast/catradio/xml/9/5/podprograma1859.xml (another from the same source, with the same issue: http://dinamics.ccma.cat/public/podcast/catradio/xml/4/4/podprograma1744.xml) and https://www.ivoox.com/feed_fg_f1455079_filtro_1.xml
As you can see, the source has the proper characters:
image
But once downloaded, they are all shown as �:
image
image
Or, in the feed from ivoox, with this &#XXX; encoding. Marked in green are some entries with the proper encoding, which is weird...
image
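For reference, sequences of the form `&#XXX;` are HTML/XML numeric character references: the number is the code point of the intended character. The following is a minimal Python sketch of how such text can be turned back into readable characters. It is only an illustration, not Audiobookshelf's code, and the helper name `decode_char_refs` and the sample strings are made up for this example.

```python
import html


def decode_char_refs(text: str) -> str:
    """Replace numeric (&#237; or &#xE7;) and named (&amp;) character references."""
    # html.unescape is part of the Python standard library.
    return html.unescape(text)


# Hypothetical sample strings, not taken from the actual feed.
print(decode_char_refs("Cap&#237;tulo 1"))   # Capítulo 1
print(decode_char_refs("Fran&#xE7;a"))       # França
```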

I've noticed they are already shown wrong right after adding the show:
image

After renaming the files at the filesystem level, they show properly in ABS. Manually editing the details in ABS and saving them also shows the text just fine. The problem only occurs when downloading new episodes.

Steps to reproduce the issue

  1. Add these podcasts with non-English text: https://dinamics.ccma.cat/public/podcast/catradio/xml/9/5/podprograma1859.xml, http://dinamics.ccma.cat/public/podcast/catradio/xml/4/4/podprograma1744.xml and https://www.ivoox.com/feed_fg_f1455079_filtro_1.xml
  2. See how the strange characters are shown both in ABS and in the file system when the episodes are downloaded.

Audiobookshelf version

v2.2.12

How are you running audiobookshelf?

Docker

adam added the bug label 2026-04-24 23:27:10 +02:00
adam closed this issue 2026-04-24 23:27:10 +02:00

@advplyr commented on GitHub (Feb 11, 2023):

I was only able to see mojibake with the first RSS feed.

With https://www.ivoox.com/feed_fg_f1455079_filtro_1.xml this was coming back with utf-8 encoding.

This one is not returning anything: http://dinamics.ccma.cat/public/podcast/catradio/xml/4/4/podprograma1744.xml

The solution I implemented relies on the Content-Type header of the response including a charset, like charset=iso-8859-1.
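To illustrate the general idea of decoding a feed according to the charset declared in the Content-Type header, here is a minimal Python sketch. It is not the actual Audiobookshelf implementation; the function names `charset_from_content_type` and `decode_feed`, the UTF-8 default, and the fallback behaviour are all assumptions made for this example.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(content_type: Optional[str], default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or the default."""
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset, or None if absent.
    return msg.get_content_charset() or default


def decode_feed(body: bytes, content_type: Optional[str]) -> str:
    """Decode raw feed bytes using the charset declared by the server."""
    charset = charset_from_content_type(content_type)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back to UTF-8 and mark undecodable
        # bytes with U+FFFD (the � seen in this issue).
        return body.decode("utf-8", errors="replace")


# Hypothetical sample: Latin-1 bytes, as a feed served with charset=iso-8859-1 would send.
raw = "Català".encode("iso-8859-1")
print(decode_feed(raw, "application/xml; charset=iso-8859-1"))  # Català
print(decode_feed(raw, "application/xml"))                      # Catal�
```

The second call shows how the � symptom arises when ISO-8859-1 bytes are treated as UTF-8. Note that this approach only helps when the server actually sends a charset in the header.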


@advplyr commented on GitHub (Feb 12, 2023):

Fixed in v2.2.15 (https://github.com/advplyr/audiobookshelf/releases/tag/v2.2.15)


@nation-wide commented on GitHub (Oct 30, 2024):

Is it possible to have this same fix for the .nfo file parser? If an .nfo file's charset is iso-8859-1, the audiobook's "ä, å, ö" etc. are shown as �.
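A local file such as an .nfo has no Content-Type header to consult, so a decoder has to guess. One common heuristic, sketched below in Python, is to try strict UTF-8 first and fall back to ISO-8859-1 when that fails. This is only an illustration under the assumption that the file is one of those two encodings; it is not how Audiobookshelf parses .nfo files, and `decode_nfo` is a hypothetical helper name.

```python
def decode_nfo(raw: bytes) -> str:
    """Decode .nfo bytes as UTF-8 if valid, otherwise as ISO-8859-1."""
    try:
        # "utf-8-sig" also strips a leading byte order mark, if present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never raises.
        return raw.decode("iso-8859-1")


# Hypothetical sample content in both encodings.
print(decode_nfo("ä, å, ö".encode("iso-8859-1")))  # ä, å, ö
print(decode_nfo("ä, å, ö".encode("utf-8")))       # ä, å, ö
```

The heuristic works because accented ISO-8859-1 text is rarely valid UTF-8, but it would misread files in other legacy encodings such as Windows-1252 or CP437.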


Reference: starred/audiobookshelf#943