@@ -165,6 +165,43 @@ followed by the foreground image, which uses the mask as its alpha layer.
165 165
166 166 Usage
167 167 -----
168+ Example: Re-encoding a Scanned Multipage PDF
169+ --------------------------------------------
170+
171+ Important:
172+ Re-encoding a scanned PDF requires an hOCR file.
173+ If an hOCR file is not provided, the tools may fail with confusing errors.
174+
175+ This section provides a minimal, user-friendly example for a common workflow.
176+
177+ Step 1: Start with a scanned PDF::
178+
179+     scan.pdf
180+
181+ Step 2: Generate an hOCR file
182+ One way to generate an hOCR file is to use an OCR tool such as ``ocrmypdf``::
183+
184+     ocrmypdf --sidecar scan.hocr scan.pdf scan_searchable.pdf
185+
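186+ This command is intended to produce:
187+ - ``scan_searchable.pdf`` (PDF with text layer)
188+ - ``scan.hocr`` (sidecar OCR output; ``ocrmypdf --sidecar`` normally writes plain text, so confirm the file is valid hOCR before re-encoding)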
189+
190+ Step 3: Re-encode the PDF using archive-pdf-tools::
191+
192+     recode_pdf \
193+         --from-pdf scan_searchable.pdf \
194+         --hocr-file scan.hocr \
195+         --out-pdf output.pdf
196+
197+ Common Pitfall
198+ ~~~~~~~~~~~~~~
199+
200+ Running ``recode_pdf`` without providing an hOCR file may result in errors such as::
201+
202+     AttributeError: 'NoneType' object has no attribute 'seek'
203+
204+ This indicates that an hOCR file is required for this workflow.
168 205
169 206 Creating a PDF from a set of images is pretty straightforward::
170 207
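The pitfall described in the added section (a missing or unusable hOCR file surfacing as an ``AttributeError``) could be caught earlier with a small pre-flight check. The sketch below is not part of archive-pdf-tools; ``looks_like_hocr`` is a hypothetical helper name, and it assumes only that an hOCR document contains at least one element with the ``ocr_page`` class, as the hOCR format defines for pages. It is a rough heuristic, not a validator.

```python
from pathlib import Path


def looks_like_hocr(path):
    """Heuristic check: the file exists and mentions an hOCR page element.

    Hypothetical helper for illustration only; it does not parse the
    document, it only looks for the ``ocr_page`` class name.
    """
    p = Path(path)
    if not p.is_file():
        # A missing file is the case behind the confusing error above.
        return False
    text = p.read_text(encoding="utf-8", errors="replace")
    # A plain-text OCR sidecar will not contain this marker.
    return "ocr_page" in text
```

A wrapper script might call ``looks_like_hocr("scan.hocr")`` and stop with a clear message before invoking ``recode_pdf`` when it returns ``False``.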