How Long Does It Take to Make an Audiobook?

Traditional audiobook production runs weeks to months. Here's where the time goes — casting, recording, editing, retail review — and how full-cast production compresses it to days.

Midsummerr|July 5, 2026|7 min read

TL;DR

A traditional human-narrated audiobook usually takes several weeks to a few months from decision to on-sale: casting and auditions, then production (about six hours of narrator and studio work per finished hour), then retail review of roughly two to three weeks. A finished hour of audio is about 9,300 words, so a 60,000-word novel is around 6.5 listening hours — and about 40 hours of production work before a single retail step. Cinematic full-cast production generates the entire performance — cast, music, sound design — in a few hours; what remains is tweaking, so a fully dramatized audiobook ships in days rather than months.

In this article

  1. The short answer
  2. One narrator, full cast, or cinematic?
  3. Where the time actually goes in traditional production
  4. A worked example: a 60,000-word novel
  5. How full-cast production compresses the timeline
  6. Does faster mean lower quality?
  7. FAQ
  8. Takeaways

If you are asking how long it takes to make an audiobook, the honest answer is that it depends far more on scheduling than on writing. The manuscript is already done. What stretches the timeline is everything that happens after: finding a narrator, booking studio time, recording in roughly real time, editing, proofing, and then waiting in a retail review queue. For a full-length novel produced the traditional way, that chain usually runs several weeks to a few months.

This guide breaks the timeline down stage by stage so you can estimate your own book, then shows where full-cast production removes the slowest steps entirely.

The short answer

A finished hour of audiobook is about 9,300 words — the standard conversion rate used across the industry and by ACX, based on a natural reading pace of roughly 150–160 words per minute. So a 60,000-word novel is around 6.5 finished hours of listening.
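The conversion above is easy to check in a few lines of Python. This is a minimal sketch, not a production tool: the function and constant names are invented for this example, and 9,300 words per finished hour is the rule of thumb quoted in this guide rather than an exact figure for any particular narrator.

```python
WORDS_PER_FINISHED_HOUR = 9_300  # rule of thumb quoted in this guide


def finished_hours(word_count: int) -> float:
    """Estimate listening length in hours from a manuscript word count."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return word_count / WORDS_PER_FINISHED_HOUR


def implied_words_per_minute() -> float:
    """Reading pace implied by the words-per-hour figure."""
    return WORDS_PER_FINISHED_HOUR / 60


if __name__ == "__main__":
    # 9,300 words per hour works out to 155 words per minute,
    # inside the 150-160 range mentioned above.
    print(f"Implied pace: {implied_words_per_minute():.0f} words per minute")
    print(f"60,000 words: about {finished_hours(60_000):.1f} finished hours")
```

Swap in your own word count to get a first estimate of listening length; a slower or faster reading pace will shift the result.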

Producing those 6.5 hours the traditional way takes real calendar time:

| Stage | Traditional human narration | Cinematic full-cast production (Midsummerr) |
| --- | --- | --- |
| Casting / auditions | Days to weeks | Built in: choose from the cast library |
| Recording | ~2 studio hours per finished hour | Generated from the manuscript in a few hours |
| Editing, proofing, mastering | Brings total work to ~6 hours per finished hour | Tweaking in the timeline editor, as deep as you want |
| Agreement → finished audio | 3–6 weeks (often longer) | Days |
| Retail review (platform-dependent) | ~2–3 weeks | Same platform step applies |

The writing is finished before any of this starts. The time you are really asking about is production time — and that is the part the format you choose can change dramatically.

One narrator, full cast, or cinematic?

Timelines depend on what you are producing, so it helps to name the three formats. A single-narrator audiobook is one voice reading everything — the traditional default, and what the human timeline below describes. A full-cast production gives each character a distinct voice. A cinematic (dramatized) adaptation goes a step further: full cast plus music and sound design, produced like an audio drama rather than a read-through.

On the traditional route, each step up that ladder multiplies the schedule: more actors to cast, more sessions to book, more mixing. That is why dramatized audiobooks have historically been reserved for flagship titles. ("Full cast vs single narrator" covers when the format is worth it.) Midsummerr produces the cinematic format directly from the manuscript, which is why its timeline does not grow the way a studio production's does.

Where the time actually goes in traditional production

The reason a finished manuscript still takes weeks to reach listeners is that each stage is gated by human availability and near-real-time recording.

Casting and auditions. Before a word is recorded, you have to find the right voice. On a marketplace like ACX — Audible's production and distribution platform — you post the title, wait for auditions, review samples, and negotiate terms. This alone can take days to weeks, and longer for a book that wants a specific accent, age, or dual-narrator pairing.

Recording. This is the hard floor of the traditional timeline. An experienced narrator records at roughly two studio hours for every finished hour of audio, so a 10-hour book is about 20 hours in the booth, and that is before mistakes, retakes, and breaks. Newer narrators can take closer to five hours per finished hour. Recording cannot be sped up much, because narration is captured in real time: an hour of audio takes at least an hour to speak.

Editing, proofing, and mastering. Raw narration is not a finished audiobook. Once you add cleanup, proofing against the text (about 1.5 hours per finished hour on its own), pickups, and mastering, total production work reaches roughly six hours per finished hour for an experienced team. For our 6.5-hour novel, that is around 40 hours of skilled work spread across a schedule.

Retail review. After the audio is finished, most retail paths add a review queue. ACX, for example, runs a quality review of about 14 business days, after which a passing title is posted for sale within roughly 10 business days (Apple Books can need a couple more). That adds two to three weeks, sometimes more, after your audio is already done.

Add it up and the common range from "yes, let's make the audiobook" to "it's on sale" is a few weeks at the absolute fastest and, realistically, one to three months.
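To see how the stages add up, here is a small Python sketch that sums a fastest and a slowest case. The production and retail review ranges come from this guide; the casting range of 3 to 21 days is an assumption chosen to stand in for "days to weeks", so treat the total as illustrative.

```python
# Stage ranges in calendar days as (fastest, slowest).
STAGES = {
    "casting": (3, 21),               # ASSUMPTION: stands in for "days to weeks"
    "production": (3 * 7, 6 * 7),     # 3-6 weeks, agreement to finished audio
    "retail_review": (2 * 7, 3 * 7),  # roughly 2-3 weeks
}


def total_range(stages: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the fastest and slowest cases, assuming stages run back to back."""
    fastest = sum(low for low, _ in stages.values())
    slowest = sum(high for _, high in stages.values())
    return fastest, slowest


if __name__ == "__main__":
    fastest, slowest = total_range(STAGES)
    print(f"{fastest}-{slowest} days "
          f"({fastest / 7:.1f}-{slowest / 7:.1f} weeks)")
```

With these inputs the total comes to 38 to 84 days, or roughly five to twelve weeks, which is consistent with the one-to-three-month range. Stages that overlap or stall would move the result in either direction.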

A worked example: a 60,000-word novel

Say you have finished a 60,000-word novel and want it in audio.

  • Length: ~6.5 finished hours (60,000 ÷ 9,300).
  • Recording: ~13 studio hours for an experienced narrator, spread across booked sessions.
  • Full production work: ~40 hours including editing, proofing, and mastering.
  • Calendar time: 3–6 weeks of production after casting, then ~2–3 weeks of retail review.
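The figures in the list above can be reproduced with a short Python sketch. It uses only the ratios given in this guide (9,300 words per finished hour, about 2 studio hours and about 6 total work hours per finished hour); the class and function names are hypothetical, and real projects will vary with the narrator and the text.

```python
from dataclasses import dataclass

WORDS_PER_FINISHED_HOUR = 9_300          # rule of thumb from this guide
RECORDING_HOURS_PER_FINISHED_HOUR = 2    # experienced narrator
TOTAL_WORK_HOURS_PER_FINISHED_HOUR = 6   # recording, editing, proofing, mastering


@dataclass
class ProductionEstimate:
    finished_hours: float
    recording_hours: float
    total_work_hours: float


def estimate_traditional(word_count: int) -> ProductionEstimate:
    """Estimate traditional production effort from a word count."""
    hours = word_count / WORDS_PER_FINISHED_HOUR
    return ProductionEstimate(
        finished_hours=hours,
        recording_hours=hours * RECORDING_HOURS_PER_FINISHED_HOUR,
        total_work_hours=hours * TOTAL_WORK_HOURS_PER_FINISHED_HOUR,
    )


if __name__ == "__main__":
    est = estimate_traditional(60_000)
    print(f"Finished audio:  {est.finished_hours:.1f} h")
    print(f"Recording:       {est.recording_hours:.1f} h")
    print(f"Total work:      {est.total_work_hours:.1f} h")
```

Without intermediate rounding, a 60,000-word novel comes out at about 12.9 recording hours and 38.7 total work hours, which this guide rounds to ~13 and ~40.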

None of those hours are your writing. They are the cost of turning a finished book into finished audio one recorded minute at a time. That is the bottleneck full-cast production is built to remove.

How full-cast production compresses the timeline

Full-cast production changes the timeline because it does not record audio in real time. Instead of booking a narrator and capturing 6.5 hours minute by minute, Midsummerr works from the manuscript directly: it identifies characters, assigns distinct voices from a cast library, and generates the narration, music, and sound design as a produced piece rather than a single read-through.

That collapses the two slowest stages at once. There is no casting queue, because you pick voices from the library. There is no recording floor, because the audio is generated rather than performed live. What remains is the part that genuinely deserves human attention — review. You listen back in the timeline editor, adjust pacing, fix a pronunciation, re-cast a character, or tighten a scene, and the changes apply without re-booking anyone.

The practical numbers: generating the cinematic version of a full novel — cast, music, and sound design — takes a few hours, not weeks. The rest of the timeline is yours: tweaking the result in the timeline editor, which takes as long as you want to be meticulous. Some authors ship after a quick review pass; others fine-tune every scene's pacing and pauses, swap a casting choice, or fix a pronunciation. Bottom line: a fully dramatized audiobook in days, with the calendar time spent on your own polish rather than someone else's schedule.

Cost tracks the same shift. Our 60,000-word novel runs about $300 on Self-Serve at $5 per 1,000 words, versus the multi-thousand-dollar range typical of human production. If you want the full breakdown, see what an audiobook actually costs.
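The Self-Serve figure is straight per-word arithmetic, sketched below in Python. The only rate used is the $5 per 1,000 words quoted above; the function name is hypothetical, and the sketch assumes purely proportional pricing with no minimums, tiers, or rounding rules, none of which are stated in this guide.

```python
from decimal import Decimal

SELF_SERVE_RATE_PER_1000_WORDS = Decimal("5")  # USD, as quoted in this guide


def self_serve_cost(word_count: int) -> Decimal:
    """Estimate Self-Serve cost in USD, assuming purely proportional pricing."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return Decimal(word_count) / Decimal(1000) * SELF_SERVE_RATE_PER_1000_WORDS


if __name__ == "__main__":
    print(f"60,000 words: ${self_serve_cost(60_000):.2f}")
```

`Decimal` is used instead of floats so that currency amounts stay exact.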

You can hear what a produced, full-cast result sounds like on any of the listening samples, which are the same output that the process described above produces in days.

Does faster mean lower quality?

Speed is only worth it if the result holds up, so it is a fair question. The time you save in full-cast production comes from removing scheduling and real-time recording — not from skipping the craft decisions that make an audiobook good.

A skilled human narrator brings interpretive depth, and for some titles a single trusted voice is exactly right. Full-cast production makes a different trade: distinct voices per character, integrated music and sound design, and unlimited editing, produced without the studio schedule. The full production process is the same set of decisions (casting, direction, mixing) compressed into days of reviewing instead of weeks of booking.

FAQ

How long does it take to make an audiobook? A traditional human-narrated audiobook typically takes several weeks to a few months from decision to on-sale: casting and auditions, then production at about six hours of work per finished hour, then a retail review of roughly two to three weeks. Full-cast production compresses that to days by generating the narration instead of recording it live.

How many hours of audio is a typical novel? About 9,300 words equals one finished hour, so a 60,000-word novel is roughly 6.5 hours, an 80,000-word book is about 8.6 hours, and a 90,000-word book is about 10 hours.

Why does recording take so long? Audio plays back in real time, so it cannot be recorded faster than it is spoken. An experienced narrator needs about two studio hours per finished hour, and total production work reaches about six hours per finished hour once editing, proofing, and mastering are included.

How long is the retail review after the audio is finished? It depends on the platform. ACX, for example, runs a quality review of about 14 business days, then posts a passing title for sale within roughly 10 business days. That review step applies to whichever platform you distribute through, on top of production time.

Can Midsummerr produce a full audiobook in a day? Generation itself takes a few hours, even for a full-length novel — and that includes the full cast, music, and sound design. The remaining time is tweaking the result, which depends on how meticulous you want to be. In practice, a fully dramatized audiobook is a matter of days.

Takeaways

The manuscript is not the bottleneck; scheduling and real-time recording are. Traditional production spends roughly six hours of skilled work per finished hour and adds weeks of retail review, which is why a finished book still takes weeks to months to reach listeners. Cinematic full-cast production generates the whole performance in hours and leaves the tweaking to you, so a fully dramatized audiobook ships in days.

Key takeaways

  • The industry standard is about 9,300 words per finished hour, so a 60,000-word novel is roughly 6.5 listening hours.
  • Human narration takes about six hours of work per finished hour once you include recording, editing, and proofing — before any retail review.
  • The slowest steps in traditional production are scheduling and recording, not the writing — which is exactly what full-cast production removes.
  • Midsummerr generates the full cinematic production — cast, music, sound design — in a few hours; the remaining time is your own tweaking, so a fully dramatized audiobook ships in days instead of months.
