# Working With EPUB, DOCX, and PDF
Different file formats behave differently during translation. Choosing the right workflow helps preserve structure and reduces cleanup time.
# EPUB Workflow
EPUB is usually the best format for book translation when the source is available as reflowable text.
Recommended steps:
- Confirm the EPUB opens correctly before importing.
- Translate by chapter or section.
- Maintain a glossary for names, places, titles, and recurring terms.
- Review headings, footnotes, and links after export.
- Open the exported EPUB in an EPUB reader before delivery.
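
The first and last steps above can be partly automated. The following is a minimal sketch, not part of CatScribe, that checks the basic container layout of an EPUB file: it must be a valid ZIP archive, `mimetype` must be the first entry and contain `application/epub+zip`, and `META-INF/container.xml` must exist. The function name `check_epub` is hypothetical, and passing this check does not replace opening the file in an EPUB reader.

```python
import zipfile


def check_epub(path):
    """Return a list of container problems found in an EPUB (empty if none).

    Hypothetical helper for illustration; it only inspects the ZIP container,
    not the book content itself.
    """
    if not zipfile.is_zipfile(path):
        return ["not a ZIP archive"]

    problems = []
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the name of the first corrupt entry, or None.
        corrupt = zf.testzip()
        if corrupt is not None:
            problems.append(f"corrupt entry: {corrupt}")

        names = zf.namelist()
        # The EPUB container format expects "mimetype" as the first entry.
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        elif zf.read("mimetype").strip() != b"application/epub+zip":
            problems.append("unexpected mimetype content")

        # container.xml points the reader at the package document.
        if "META-INF/container.xml" not in names:
            problems.append("missing META-INF/container.xml")
    return problems
```

An empty list means the container looks structurally sound. Any reported problem is a reason to repair or re-export the file before importing or delivering it.
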
# DOCX Workflow
DOCX is useful when you expect to keep editing after translation.
Recommended steps:
- Before importing, remove comments and resolve tracked changes that are not part of the translation.
- Translate a small section and check whether headings, lists, and tables survive export.
- Review the translated document in a word processor.
- Reapply complex styles manually if needed.
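
To see whether a DOCX file still carries comments or tracked changes before importing, one option is to inspect the package directly. The sketch below is illustrative only: the function name `docx_review_markup` is hypothetical, and it assumes the common (transitional) WordprocessingML namespace, so files saved in strict Open XML format would report zero counts. It counts `w:ins` and `w:del` elements in `word/document.xml` and `w:comment` elements in `word/comments.xml`.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional WordprocessingML namespace (assumed; strict files use another).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_review_markup(path):
    """Count tracked insertions, tracked deletions, and comments in a DOCX.

    Hypothetical helper for illustration; it reads the file but never
    modifies it.
    """
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
        body = ET.fromstring(zf.read("word/document.xml"))
        counts = {
            "insertions": sum(1 for _ in body.iter(W + "ins")),
            "deletions": sum(1 for _ in body.iter(W + "del")),
            "comments": 0,
        }
        # The comments part only exists when the document has comments.
        if "word/comments.xml" in names:
            comments = ET.fromstring(zf.read("word/comments.xml"))
            counts["comments"] = sum(1 for _ in comments.iter(W + "comment"))
    return counts
```

If any count is above zero, accept or reject the changes and delete the comments in a word processor, then save a clean copy for import.
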
# PDF Workflow
PDF is often the hardest format because it preserves visual layout more than document structure. Some PDFs contain selectable text. Others are scanned images.
Recommended steps:
- Test one or two pages before translating the full file.
- Check whether paragraphs appear in the right order.
- Watch for merged words, stray line breaks, headers or footers embedded in paragraphs, and missing characters.
- Use a cleaner source format such as EPUB or DOCX when available.
- For scanned PDFs, run OCR to convert the document to searchable text before importing.
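
Some of the checks above can be approximated on a test page. The sketch below assumes you have already extracted one page's text as a string with a PDF tool of your choice. The function name `page_text_warnings` and both thresholds are hypothetical and chosen only for illustration. It flags pages with almost no text (possibly scanned), Unicode replacement characters (missing glyphs), unusually long letter runs (possibly merged words), and words hyphenated across a line break.

```python
import re


def page_text_warnings(text, min_chars=50, max_word_len=30):
    """Return heuristic warnings for one page of extracted PDF text.

    Hypothetical helper; thresholds are arbitrary and should be tuned
    to the language and document at hand.
    """
    if len(text.strip()) < min_chars:
        # Nothing else is worth checking on a near-empty page.
        return ["little or no text (possibly a scanned page)"]

    warnings = []
    if "\ufffd" in text:
        warnings.append("replacement characters (missing glyphs)")
    # Runs of letters only, so digits and punctuation do not inflate length.
    if any(len(w) > max_word_len for w in re.findall(r"[^\W\d_]+", text)):
        warnings.append("very long token (possibly merged words)")
    # A letter, a hyphen, a newline, then another letter.
    if re.search(r"[^\W\d_]-\n[^\W\d_]", text):
        warnings.append("hyphenated line break")
    return warnings
```

These heuristics cannot detect wrong paragraph order or headers merged into body text, so a manual read of the test pages is still needed.
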
# When To Convert First
Convert the source before translation when:
- The PDF text order is wrong.
- The file contains scanned images instead of selectable text.
- The file prioritizes visual layout over text flow, for example with multi-column or heavily designed pages.
- Tables, footnotes, or columns are not extracted clearly.
After conversion, review the converted source before importing it into CatScribe. A clean source file usually produces a cleaner translation.