I admit it! Audiobooks are great. I listen to them while washing dishes, doing laundry, hitting the gym, or commuting. Everyone thought that books were a thing of the past. But no sir! Audiobooks have changed the landscape completely. They are handy when multitasking and also suit anyone looking for accessible content. That’s why I wanted to share this post!

A recent paper, “Large-Scale Automatic Audiobook Creation,” introduced an AI-driven system. This system uses Large Language Models (like ChatGPT) and Audio Generation to convert e-books into quality audiobooks. Let’s take a look at what that means.

Issues with Traditional Audiobooks

Producing audiobooks is both time-intensive and costly. Typically, they rely on a human narrator to read the text, so quality varies. We have all started listening to an audiobook and had to stop because we just didn’t like the narrator. A pity, as the book itself might have been good.

With books being published at a rapid pace, keeping up with narration only gets harder. But what if AI could simplify this? What if you could have your favorite reader narrating every book? Or change the narrator if you didn’t like the current one?

AI: The Game Changer

Researchers from Microsoft and MIT collaborated with Project Gutenberg on a project to bring AI into the audiobook game. Their goal was to develop a system for automatic e-book to audiobook conversion. The researchers tapped into Project Gutenberg’s vast e-book collection and harnessed advanced neural text-to-speech techniques. As a result, thousands of audiobooks emerged, rivalling human quality.

Yes, similar systems have existed before, and companies such as Eleven Labs and Descript have created text-to-voice systems. But this system has a unique feature: it can discern content types within a book. So, non-fiction gets a clear, neutral narration, while fiction with dialogue receives a dynamic reading, mimicking “acting.” Moreover, users can personalize the narration style and even the voice, making it truly their own.
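To make the idea of “discerning content types” a bit more concrete, here is a minimal Python sketch. It is not the method from the paper. It simply assumes that anything inside double quotes is dialogue and everything else is narration, and tags each piece with a made-up style label (“neutral” or “expressive”) that a text-to-speech engine could use. The names `Segment` and `split_by_style` are hypothetical and exist only for this illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: dialogue is whatever sits inside straight or curly double quotes.
QUOTE_PATTERN = re.compile(r'"[^"]*"|“[^”]*”')


@dataclass
class Segment:
    text: str
    style: str  # hypothetical labels: "neutral" or "expressive"


def split_by_style(paragraph: str) -> List[Segment]:
    """Split a paragraph into narration and dialogue segments."""
    segments = []
    cursor = 0
    for match in QUOTE_PATTERN.finditer(paragraph):
        # Text before the quote is treated as plain narration.
        narration = paragraph[cursor:match.start()].strip()
        if narration:
            segments.append(Segment(narration, "neutral"))
        # The quoted text itself is treated as dialogue.
        segments.append(Segment(match.group(), "expressive"))
        cursor = match.end()
    # Whatever remains after the last quote is narration again.
    tail = paragraph[cursor:].strip()
    if tail:
        segments.append(Segment(tail, "neutral"))
    return segments


if __name__ == "__main__":
    for segment in split_by_style('She smiled. "Come in," she said.'):
        print(segment.style, "|", segment.text)
```

A real system would need far more than a quote-matching rule (for example, working out who is speaking and with what emotion), but the sketch shows the basic shape: split the text first, then choose a reading style per segment.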

Sample the Magic!

I checked some of the books available at Project Gutenberg and I have to say, they are pretty good! Here is one example from the collaboration, an audiobook of Edgar Allan Poe’s texts. For more, check the link here. Impressive, right?

A Leap for Audiobook Accessibility

The project yielded over 5,000 open-license audiobooks, totaling about 35,000 hours of speech. Project Gutenberg is a treasure trove that spans from literary classics to biographies. You can go and check it out. There’s also an interactive demo, where users can sample titles, adjust voices, and even generate narration in their own voice.

Wrapping Up

This AI system is a landmark for audiobooks. Merging AI with Audio Generation offers quality, speed, and affordability. With demand for audiobooks growing, advancements like this herald an era of richer listening experiences. What a time to be alive.

Explore the full AI-audiobook collection here.

Curious about Large Language Models? Dive into our detailed blog post!

Copyright Orkestr.io Oy 2023