[Enhancement]: Add audioteka.com.pl as metadata provider #2073

Closed
opened 2026-04-25 00:03:10 +02:00 by adam · 2 comments
Owner

Originally created by @izikeros on GitHub (Jun 27, 2024).

Type of Enhancement

Server Backend

Describe the Feature/Enhancement

Audioteka is a service that offers a wide collection of audiobooks along with metadata for them. Audiobookshelf already has several metadata and cover providers; Audioteka would be another possible source.

Why would this be helpful?

It would complement the existing sources because it can provide metadata in Polish. Use case: a user has a collection of Polish audiobooks (e.g. the titles are in Polish) and would like to enrich the library with metadata.

Future Implementation (Screenshot)

I gave an AI some context and had it generate code that has some chance of working. I'm not a JavaScript programmer and can't run or debug it, but it might serve as a starting point.

const axios = require('axios').default
const cheerio = require('cheerio')
const Logger = require('../Logger')

class AudiotekaProvider {
  #responseTimeout = 30000

  constructor() {}

  /**
   * Search for an audiobook on audioteka.com.pl
   * @param {string} title
   * @param {string} author
   * @param {string} isbn
   * @param {string} providerSlug
   * @param {string} mediaType
   * @param {number} [timeout] response timeout in ms
   * @returns {Promise<Object[]>}
   */
  async search(title, author, isbn, providerSlug, mediaType, timeout = this.#responseTimeout) {
    if (!timeout || isNaN(timeout)) timeout = this.#responseTimeout

    const encodedTitle = encodeURIComponent(title)
    const url = `https://audioteka.com/pl/search?query=${encodedTitle}`

    try {
      const response = await axios.get(url, { timeout })
      const $ = cheerio.load(response.data)

      const results = []
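      $('.product-tile').each((index, element) => {
        const href = $(element).find('a').attr('href')
        // Skip tiles without a link; resolve relative hrefs against the search URL
        if (!href) return
        results.push(this.scrapeAudiobookDetails(new URL(href, url).href))
      })

      // scrapeAudiobookDetails returns null on failure, so drop those entries
      return (await Promise.all(results)).filter(Boolean)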
    } catch (error) {
      Logger.error('[AudiotekaProvider] Search error', error)
      return []
    }
  }

  /**
   * Scrape audiobook details from a specific URL
   * @param {string} url
   * @returns {Promise<Object>}
   */
  async scrapeAudiobookDetails(url) {
    try {
      const response = await axios.get(url, { timeout: this.#responseTimeout })
      const $ = cheerio.load(response.data)

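      // Uses the first JSON-LD block on the page; its exact shape on Audioteka is unverified
      const jsonLd = JSON.parse($('script[type="application/ld+json"]').first().html())

      const title = jsonLd.name
      // JSON-LD may encode author as a string, an object with a name, or an array of either
      const authors = [].concat(jsonLd.author || []).map((a) => (typeof a === 'string' ? a : a?.name)).filter(Boolean)
      const narrator = jsonLd.readBy
      // publisher may likewise be a plain string or an object with a name
      const publisher = typeof jsonLd.publisher === 'object' ? jsonLd.publisher?.name : jsonLd.publisher
      // Avoid NaN when datePublished is missing
      const publishedYear = jsonLd.datePublished ? new Date(jsonLd.datePublished).getFullYear() : null
      const description = $('article p').text().trim()
      const cover = jsonLd.image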

      return {
        title,
        subtitle: null,
        author: authors.join(', '),
        narrator,
        publisher,
        publishedYear,
        description,
        cover,
        isbn: null,
        asin: null,
        genres: null,
        tags: null,
        series: null,
        language: 'pl',
        duration: null
      }
    } catch (error) {
      Logger.error('[AudiotekaProvider] Scraping error', error)
      return null
    }
  }
}

module.exports = AudiotekaProvider

Audiobookshelf Server Version

v2.10.1

Current Implementation (Screenshot)

None

adam added the enhancement label 2026-04-25 00:03:10 +02:00
adam closed this issue 2026-04-25 00:03:10 +02:00
Author
Owner

@izikeros commented on GitHub (Jun 27, 2024):

Some people have tried scraping the data and producing an OPF file in #602.

Author
Owner

@nichwall commented on GitHub (Jun 27, 2024):

Duplicate of https://github.com/advplyr/audiobookshelf/issues/2598

Can you provide public API documentation? If there is not a public API (requiring scraping the web page within ABS), this would fit better as a custom metadata provider instead of being within ABS.

https://www.audiobookshelf.org/guides/custom-metadata-providers
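
For anyone going the custom provider route, a rough sketch of what such a standalone service could look like is below. It uses only Node's built-in `http` module. The `/search` route, the `query` and `author` parameters, and the `{ matches: [...] }` response wrapper are assumptions from memory and should be checked against the guide linked above. `exampleLookup`, `buildSearchResponse`, and `createProviderServer` are hypothetical names, and `exampleLookup` is only a placeholder for the real Audioteka scraping logic.

```javascript
const http = require('http')

// Hypothetical placeholder for the real scraping logic: returns book objects
// using the same field names as the snippet in this issue.
async function exampleLookup(query, author) {
  return [{ title: query, author: author || null, language: 'pl' }]
}

// Build the status and JSON body for one search request.
// ASSUMPTION: the { matches: [...] } wrapper is not confirmed by this issue.
async function buildSearchResponse(query, author, lookup) {
  if (!query) return { status: 400, body: { error: 'query is required' } }
  const results = await lookup(query, author)
  // Drop null entries, e.g. detail pages that failed to scrape
  return { status: 200, body: { matches: results.filter(Boolean) } }
}

// Create (but do not start) an HTTP server exposing GET /search
function createProviderServer(lookup) {
  return http.createServer(async (req, res) => {
    const { pathname, searchParams } = new URL(req.url, 'http://localhost')
    if (req.method !== 'GET' || pathname !== '/search') {
      res.writeHead(404)
      res.end()
      return
    }
    try {
      const { status, body } = await buildSearchResponse(searchParams.get('query'), searchParams.get('author'), lookup)
      res.writeHead(status, { 'Content-Type': 'application/json' })
      res.end(JSON.stringify(body))
    } catch (error) {
      res.writeHead(500, { 'Content-Type': 'application/json' })
      res.end(JSON.stringify({ error: 'lookup failed' }))
    }
  })
}
```

To try it locally, something like `createProviderServer(exampleLookup).listen(3001)` should work (the port is arbitrary), after which the service URL can be registered in ABS as a custom metadata provider. Keeping the scraping in a separate service means ABS itself never depends on Audioteka's page layout.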

Reference: starred/audiobookshelf#2073