[Enhancement]: TTS (Text to speech) #362

Open
opened 2026-04-24 23:06:03 +02:00 by adam · 15 comments

Originally created by @kobemtl on GitHub (May 17, 2022).

Describe the feature/enhancement

Hi, first of all, thanks a lot for the great app. I am just wondering if it's possible to integrate TTS? I use some apps with a TTS feature to listen to epub or aws ebooks all the time. It would be nice if audiobookshelf could have this feature. Thanks again.

adam added the enhancement, ebooks labels 2026-04-24 23:06:03 +02:00

@kobemtl commented on GitHub (Jun 14, 2022):

I have been using Moon+ Reader for many years.


@danielzrob commented on GitHub (Mar 22, 2023):

Oh, so much this. I would LOVE to see some TTS in this. My dumb brain will only allow books to go in through my ears, and there are so many books that have no audio option.

I currently use SayIt on Android, but with all this cool AI TTS going on right now I am trying to find something better.


@p0n1 commented on GitHub (Nov 10, 2023):

Just noticed this feature request. I'm a heavy audiobookshelf user. Yes, many books I love to read don't have an audio version. So I made this simple tool, https://github.com/p0n1/epub_to_audiobook, to convert epub books into audiobooks. It works great for me. I have consumed many books using this tool together with the great audiobookshelf.

I just added support for OpenAI TTS, which is unbelievably awesome. Not sure if it's a good idea to integrate the TTS features into audiobookshelf itself, because we can convert outside and import easily.


@uniquePWD commented on GitHub (Dec 7, 2025):

There hasn't been any activity in this bug since 2023?

Is this something the project would be interested in? In #1189 someone even offered a PR and that seems to have fallen through the cracks. It was also requested in #601

My own request was in #1743 where I requested

```
### Describe the Feature/Enhancement

Can I bother you to request Text To Speech on Android.

https://developer.android.com/reference/android/speech/tts/TextToSpeech

Double bubble would be if it uses a proper media notification rather than creates a custom notification.

https://developer.android.com/media/implement/surfaces/mobile

### Why would this be helpful?

Well it is ABS after all. So allowing people to listen to their books seems like a great idea.

### Future Implementation (Screenshot)

It would add a play button to the navigation bar.

### Audiobookshelf App Version

Android App - 0.11.0

### Current Implementation (Screenshot)

N/A
```

@Phoenix-Grand commented on GitHub (Dec 7, 2025):

I would very much like this TTS feature.
+1 vote from me.


@uniquePWD commented on GitHub (Dec 11, 2025):

@advplyr would you be willing to accept a patch from @cutiepoka?


@nichwall commented on GitHub (Dec 11, 2025):

> There hasn't been any activity in this bug since 2023?

There are currently over 1000 open bug reports and feature requests between the server and app repositories, and this is a project we work on for fun in our free time (we all have day jobs).

> Is this something the project would be interested in? In #1189 someone even offered a PR and that seems to have fallen through the cracks. It was also requested in #601

Not sure what you are referencing, because I'm not finding a PR numbered 1189 and this is 601.

@Oxika95 commented on GitHub (Jan 23, 2026):

> Not sure if it's a good idea to integrate the TTS features into audiobookshelf itself because we can convert outside and import easily.

Generating full-length audio for a book seems a bit overkill, and while possible, it certainly isn't easy. I have tried several times to get full-length generation set up, but I don't have hardware good enough to support it on my server. I have also tried it on a decently powered home PC, where it can take hours and is prone to failure. It isn't even a better solution in my opinion, as I like being able to switch between reading and listening, which TTS enables. I feel like TTS, at least for the mobile apps, would be a qualitative upgrade. Simply allowing for on-board device TTS would be great. Really hope to see a turnaround on the stance regarding TTS support.

@nichwall commented on GitHub (Jan 24, 2026):

> Generating full length audio for a book seems a bit overkill and while possible it certainly isn't easy. I have tried several times to get full length generation set up but I don't have hardware good enough to support it on my server. I have also tried it on a decently powered home PC and it can take hours and is prone to failure. It isn't even a better solution in my opinion as I like being able to switch between reading and listening which TTS enables. I feel like TTS at least for mobile apps would be qualitative upgrade. Simply allowing for onboard device TTS would be great. Really hope to see a turn around on the stance regarding TTS support.

Not sure exactly what we would need to do here since most devices already have built in TTS engines as you mentioned that can read what is on the screen. We don't include automatic caption generation support either (speech to text) because this is also supported natively by many devices, and generating text ahead of time (or audio like mentioned above) is basically just combining the ebook and audiobook in the same library item as has already been discussed.

@Oxika95 commented on GitHub (Jan 24, 2026):

On-board TTS engines are great, but the screen reader functionality is absolutely awful. Screen readers don't allow continuous reading and often require you to monitor the screen and restart TTS after it finishes. They don't track progress, and they often turn off if the screen or app switches. They are usually part of an accessibility service that reads out everything, like incoming calls, texts, and notifications, and that names the buttons or elements you click, interrupting or preventing clicks until it has finished announcing what you clicked last.

Look at any TTS e-reader and how it's implemented. Look at @Voice Aloud Reader, literally the gold standard, if dated in appearance. TTS can be so much better, and now with options for neural AI voices it is frequently better than many voice actors, imho.

A play button that initiates internal TTS, or is linked to the built-in engine, and that can track progress would be huge. 75% of my library is ebooks. I don't use audiobookshelf to read them, even though I would love to, because manually downloading them and uploading them to another app with TTS is a better experience.

@uniquePWD commented on GitHub (Jan 24, 2026):

**Clarification on the Request: Client-Side TTS with Progress Syncing**

There appears to be a misunderstanding regarding the desired implementation. This request is **not** for the server to generate audio files (which is storage/CPU intensive).

The request is for a **Client-Side** feature for the mobile app that utilizes the device's native capabilities.

**The Goal**

For the Android app to open an epub and use the device's native System TTS (or a user-selected engine) to read the text aloud. This playback should behave exactly like a standard audio player, appearing in the notification shade, continuing playback with the screen off, responding to Bluetooth controls, and syncing progress to the server.

**Technical Implementation**

This feature would likely require integrating the standard Android TTS engine into a Foreground Service that manages a MediaSession.

* **Text-to-Speech:** The app would utilize the `TextToSpeech` class. This allows the app to offload generation to the user's preferred installed engine (supporting modern **Neural/AI on-device models** without extra app overhead).
  * [Android Docs: TextToSpeech](https://developer.android.com/reference/android/speech/tts/TextToSpeech)
* **Media Controls:** To ensure the TTS acts like a media player (lock screen controls, Bluetooth play/pause), the audio stream must be managed via a `MediaSession` (or `MediaSessionService` in Media3).
  * [Android Docs: Background playback with MediaSessionService](https://developer.android.com/media/media3/session/background-playback)
* **Background Playback:** To keep reading while the screen is off, the app must run a Foreground Service with the `mediaPlayback` type.
  * [Android Docs: Foreground service types (Media Playback)](https://developer.android.com/develop/background-work/services/fgs/service-types#media-playback)
* **Progress Syncing:** The `UtteranceProgressListener` can be used to track which sentence/paragraph is currently being spoken, allowing the app to calculate progress and sync it back to the AudioBookShelf server.
  * [Android Docs: UtteranceProgressListener](https://developer.android.com/reference/android/speech/tts/UtteranceProgressListener)
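
To make the Progress Syncing point a bit more concrete, here is a minimal sketch in plain Java of the bookkeeping it implies: split a chapter into utterances, then turn "utterance N finished" into a progress fraction. Everything in it is an assumption for illustration. `TtsProgressSketch`, `splitIntoUtterances` and `progressAfter` are hypothetical names, not part of audiobookshelf or the Android SDK, and the regex sentence splitter is deliberately naive. On a device, each utterance would presumably be queued with `TextToSpeech.speak(...)` under an utterance ID, with `UtteranceProgressListener.onDone(utteranceId)` mapped back to an index. The `maxLength` parameter stands in for the engine's input limit, which Android exposes as `TextToSpeech.getMaxSpeechInputLength()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not audiobookshelf or Android SDK code. */
public class TtsProgressSketch {

    /**
     * Naively splits text into sentence-sized utterances at ., ! or ? followed by
     * whitespace. Any sentence longer than maxLength is hard-split so that no
     * single utterance exceeds the (assumed) engine input limit.
     */
    public static List<String> splitIntoUtterances(String text, int maxLength) {
        List<String> utterances = new ArrayList<>();
        for (String sentence : text.trim().split("(?<=[.!?])\\s+")) {
            while (sentence.length() > maxLength) {
                utterances.add(sentence.substring(0, maxLength));
                sentence = sentence.substring(maxLength);
            }
            if (!sentence.isEmpty()) {
                utterances.add(sentence);
            }
        }
        return utterances;
    }

    /**
     * Fraction (0.0 to 1.0) of characters spoken once the utterance at
     * lastDoneIndex has finished. Pass -1 when nothing has been spoken yet.
     */
    public static double progressAfter(List<String> utterances, int lastDoneIndex) {
        long total = 0;
        long done = 0;
        for (int i = 0; i < utterances.size(); i++) {
            int length = utterances.get(i).length();
            total += length;
            if (i <= lastDoneIndex) {
                done += length;
            }
        }
        return total == 0 ? 0.0 : (double) done / total;
    }

    public static void main(String[] args) {
        List<String> utterances =
                splitIntoUtterances("Call me Ishmael. Some years ago I went to sea. It was cold.", 4000);
        for (int i = 0; i < utterances.size(); i++) {
            // In an app, this is roughly where a progress update would be sent to the server.
            System.out.printf("utterance %d done -> %.2f%n", i, progressAfter(utterances, i));
        }
    }
}
```

Weighting by character count rather than utterance count keeps the reported progress roughly proportional to listening time when sentence lengths vary. How that fraction maps onto the server's ebook progress format is not covered by this sketch.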

@uniquePWD commented on GitHub (Feb 15, 2026):

I've been following the progress on PR #1747 (Media3 Architecture), and it looks fantastic.

Just wanted to note that the move to Media3 and MediaSessionService effectively solves the biggest technical hurdle for this request. With that architecture in place, plugging in a Local TTS engine as a playback source becomes much cleaner since the MediaSession handling and background service logic are now standardized.

Exciting to see the groundwork being laid!


@iredmedia commented on GitHub (Feb 23, 2026):

@uniquePWD Link seems wrong? Links to "Add dutch translation".


@uniquePWD commented on GitHub (Feb 23, 2026):

> @uniquePWD Link seems wrong? Links to "Add dutch translation".

Try this link: https://github.com/advplyr/audiobookshelf-app/pull/1747/commits/

@v4u6h4n commented on GitHub (Mar 15, 2026):

I would love to be able to have TTS speech options with ebooks. There is just so much of what I read that isn't available in audiobook form, and I am really not happy with the options on linux for listening to ebooks.


Reference: starred/audiobookshelf#362