CatScribe Docs

#Working With EPUB, DOCX, and PDF

EPUB, DOCX, and PDF files behave differently during translation because each stores text and structure in its own way. Choosing the right workflow for the format helps preserve structure and reduces cleanup time.

#EPUB Workflow

EPUB is usually the best format for book translation when the source is available as reflowable text.

Recommended steps:

  1. Confirm the EPUB opens correctly before importing.
  2. Translate by chapter or section.
  3. Maintain a glossary for names, places, titles, and recurring terms.
  4. Review headings, footnotes, and links after export.
  5. Open the exported EPUB in an EPUB reader before delivery.
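
Steps 1 and 5 can be partly automated before opening the file in a reader. The following is a minimal sketch, not a CatScribe feature: `check_epub` is a hypothetical helper that relies only on the EPUB container rules (a ZIP archive whose first entry is `mimetype` containing `application/epub+zip`, plus a `META-INF/container.xml` file). It does not validate the content documents themselves.

```python
import zipfile


def check_epub(path):
    """Return a list of container problems; an empty list means the basic checks passed.

    `path` may be a filename or a binary file-like object.
    """
    if not zipfile.is_zipfile(path):
        return ["not a ZIP container"]

    problems = []
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every entry and returns the first one with a bad CRC.
        bad_entry = zf.testzip()
        if bad_entry is not None:
            problems.append(f"corrupt entry: {bad_entry}")

        names = zf.namelist()
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        elif zf.read("mimetype") != b"application/epub+zip":
            problems.append("unexpected mimetype content")

        if "META-INF/container.xml" not in names:
            problems.append("missing META-INF/container.xml")
    return problems
```

Calling `check_epub("book.epub")` (the filename is only an example) on both the source and the exported file gives a quick first signal. Opening the file in an EPUB reader is still needed to judge headings, footnotes, and links.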

#DOCX Workflow

DOCX is useful when you expect to keep editing after translation.

Recommended steps:

  1. Before importing, accept or reject tracked changes and delete any comments that are not part of the translation.
  2. Translate a small section and check whether headings, lists, and tables survive export.
  3. Review the translated document in a word processor.
  4. Reapply complex styles manually if needed.
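
To see whether a document still carries comments or tracked changes (step 1), the DOCX package can be inspected directly, since a DOCX file is a ZIP archive of XML parts. This is a rough sketch under stated assumptions: `docx_review_markup` is a hypothetical helper, and it assumes the conventional `w:` namespace prefix that word processors normally write, so it counts tags by pattern rather than parsing namespaces properly.

```python
import re
import zipfile


def docx_review_markup(path):
    """Count comments and tracked insertions/deletions left in a DOCX file."""
    with zipfile.ZipFile(path) as zf:
        body = zf.read("word/document.xml").decode("utf-8")
        comments = ""
        # The comments part only exists when the document has comments.
        if "word/comments.xml" in zf.namelist():
            comments = zf.read("word/comments.xml").decode("utf-8")

    # The trailing [ >] keeps <w:comments>, <w:instrText>, and <w:delText>
    # from being counted as comments, insertions, or deletions.
    return {
        "comments": len(re.findall(r"<w:comment[ >]", comments)),
        "insertions": len(re.findall(r"<w:ins[ >]", body)),
        "deletions": len(re.findall(r"<w:del[ >]", body)),
    }
```

If all three counts are zero, the file is probably free of review markup. Non-zero counts are a cue to resolve the changes in a word processor before importing.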

#PDF Workflow

PDF is often the hardest format because it preserves visual layout rather than document structure. Some PDFs contain selectable text; others are scanned images with no text layer.

Recommended steps:

  1. Test one or two pages before translating the full file.
  2. Check whether paragraphs appear in the right order.
  3. Watch for merged words, broken line breaks, headers inside paragraphs, and missing characters.
  4. Use a cleaner source format such as EPUB or DOCX when available.
  5. For scanned PDFs, run OCR to convert the document to searchable text before importing.
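
The checks in step 3 can be roughed out on the text extracted from the one or two test pages. The sketch below assumes the page text is already available as a plain string (how it is extracted is out of scope here). `flag_extraction_problems` is a hypothetical helper, and its thresholds are arbitrary starting points, not tuned values.

```python
import re
from collections import Counter


def flag_extraction_problems(text, max_word_len=25, header_repeats=3):
    """Return (line_number, issue) pairs for common PDF extraction artifacts."""
    lines = text.splitlines()
    # Lines that repeat many times are often running headers or footers.
    counts = Counter(line.strip() for line in lines if line.strip())

    issues = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # U+FFFD is the replacement character left by failed decoding.
        if "\ufffd" in stripped:
            issues.append((number, "missing or undecodable character"))
        # A letter followed by a hyphen at end of line suggests a split word.
        if re.search(r"[^\W\d_]-$", stripped):
            issues.append((number, "word hyphenated across a line break"))
        # Very long purely alphabetic tokens may be several words merged.
        if any(w.isalpha() and len(w) > max_word_len for w in stripped.split()):
            issues.append((number, "possibly merged words"))
        if stripped and counts[stripped] >= header_repeats:
            issues.append((number, "repeated line, possibly a running header"))
    return issues
```

These are heuristics only. Languages with long compound words will trigger the merged-word check more often, so `max_word_len` may need raising, and paragraph order still has to be checked by reading the pages.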

#When To Convert First

Convert the source before translation when:

  • The PDF text order is wrong.
  • The file contains scanned images instead of selectable text.
  • Text flow matters more than the original page layout.
  • Tables, footnotes, or columns are not extracted clearly.

After conversion, review the converted source before importing it into CatScribe. A clean source file usually produces a cleaner translation.