PDF to Word for TTS

Upload a scanned PDF that already has an embedded OCR text layer and download a clean Word document ready for text-to-speech. Optimised for Hungarian OCR output. All processing happens in your browser; no files leave your device.

Optional: enable the left/right heuristic for two-column scanned layouts.

Your PDF is processed entirely in the browser using PDF.js. No data is sent to any server.

How It Works

Upload

Drop a scanned PDF with an embedded text layer.

Extract

Text blocks are read with PDF.js, with an optional column-detection mode for two-column layouts.
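As a rough illustration of what a left/right column heuristic could look like, the sketch below orders text items into reading order for a two-column page. It is not taken from the tool's source. It assumes items shaped like those PDF.js returns from `page.getTextContent()` (a `str` field and a `transform` array whose elements 4 and 5 hold the x and y position), and it assumes the column boundary sits at the horizontal midpoint of the page. The function name `orderTwoColumns` and the `pageWidth` parameter are hypothetical.

```javascript
// Hypothetical helper: order text items from a two-column page.
// Assumption: each item looks like a PDF.js text item, { str, transform },
// where transform[4] is x and transform[5] is y in PDF coordinates
// (origin at the bottom-left, so a larger y is higher on the page).
function orderTwoColumns(items, pageWidth) {
  const mid = pageWidth / 2; // assumed column boundary
  const left = [];
  const right = [];

  for (const item of items) {
    const x = item.transform[4];
    (x < mid ? left : right).push(item);
  }

  // Top of the page first; items on the same line go left to right.
  const readingOrder = (a, b) =>
    b.transform[5] - a.transform[5] || a.transform[4] - b.transform[4];

  left.sort(readingOrder);
  right.sort(readingOrder);

  // Read the whole left column, then the whole right column.
  return [...left, ...right].map((item) => item.str);
}
```

A midpoint split is the simplest possible rule. It will misplace full-width headings and any page whose gutter is off-centre, so a real implementation would likely need extra checks.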

Clean

Hungarian-specific OCR corrections fix common misreads, join hyphenated words, and strip citations.
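The three cleaning passes can be sketched roughly as follows. The actual correction rules used by the tool are not listed here, so everything in this example is an assumption: the character map covers only one well-known misread (ő and ű showing up as õ and û), the hyphen rule joins any word split across a line break, and citations are assumed to be numeric and in square brackets, such as [12] or [3, 4]. The name `cleanOcrText` is hypothetical.

```javascript
// Assumed misread map: õ and û are not Hungarian letters, so they are
// treated as misrecognised ő and ű.
const MISREADS = { "õ": "ő", "Õ": "Ő", "û": "ű", "Û": "Ű" };

// Hypothetical cleaning pipeline, not the tool's actual rule set.
function cleanOcrText(text) {
  return (
    text
      // 1. Replace the assumed misread characters.
      .replace(/[õÕûÛ]/g, (ch) => MISREADS[ch])
      // 2. Join words hyphenated across a line break: "könyv-\ntár" -> "könyvtár".
      //    Caveat: a genuine hyphen that happens to fall at a line end is lost too.
      .replace(/(\p{L})-[ \t]*\r?\n[ \t]*(\p{L})/gu, "$1$2")
      // 3. Strip bracketed numeric citations (assumed format), together with
      //    the whitespace in front of them.
      .replace(/\s*\[\d+(?:\s*[,\-–]\s*\d+)*\]/g, "")
  );
}
```

Running the character fixes first means the hyphen rule already sees correct letters. Stripping citations last keeps a marker such as [12] from being read aloud as a stray number.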

Download

Get a .docx file ready for Word's built-in Read Aloud feature.
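For the final step, a minimal sketch of turning cleaned text into a Word document with the docx library might look like this. It assumes paragraphs are separated by blank lines and that single line breaks inside a paragraph are only layout from the scan. The function name `buildDocument` is hypothetical, and the docx module is passed in as an argument so the sketch makes no assumption about how the page loads it.

```javascript
// Hypothetical helper: build a docx Document from cleaned plain text.
// `docx` is the module object of the docx library, for example the result
// of `import * as docx from "docx"`.
function buildDocument(text, docx) {
  const paragraphs = text
    .split(/\n\s*\n/) // assumed: a blank line marks a paragraph break
    .map((p) => p.replace(/\s+/g, " ").trim()) // collapse layout line breaks
    .filter((p) => p.length > 0);

  return new docx.Document({
    sections: [
      {
        children: paragraphs.map(
          (p) => new docx.Paragraph({ children: [new docx.TextRun(p)] })
        ),
      },
    ],
  });
}

// In the browser, the result could then be packed into a downloadable Blob:
//   const blob = await docx.Packer.toBlob(buildDocument(text, docx));
```

Keeping each paragraph as one continuous run, without hard line breaks, should help a screen reader or Read Aloud avoid pausing in the middle of a sentence.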

Source

tools/pdf-to-word

Client-side PDF text extraction and Word document generation, with Hungarian OCR corrections.

JavaScript, PDF.js, docx