Naming Split PDF Files By Text Search
AutoSplit plug-in for Adobe® Acrobat®
- Introduction
- This tutorial shows how to name output PDF files in the "Split Document" operation provided by the AutoSplit™ plug-in for
Adobe® Acrobat®. The "Split Document" operation can automatically name output PDF files using the document's text.
Multiple naming methods can be combined to create a wide variety of file naming schemes.
- We are going to split a multi-page PDF document into single-page files and automatically name each output file with text extracted from its first page. A text search is used to extract the text from the document. The tutorial illustrates the use of text patterns.
- Every page in the sample input PDF document is stamped with a Bates number in the "XYZ-ABC123456" format. We will use a text pattern search to find matching text on the first page of each output document. The matching text will be used to name the output files from the splitting process. As a result, the input PDF document will be split into 12 single-page documents, each named by its corresponding Bates number.
- This operation is also available in the Guided Actions (aka Action Wizard) tool and can be used to automate document processing workflows. It allows applying the same processing to multiple PDF files without manually opening files and using menus.
- Prerequisites
- You need a copy of Adobe® Acrobat® with the AutoSplit™ plug-in installed on your computer to use this tutorial. Trial versions of both Adobe® Acrobat® and AutoSplit™ are available for download.
- Step 1 - Open a PDF Document
- Start the Adobe® Acrobat® application and open a PDF document using the "File > Open..." menu.
- Please note that if an input PDF document does not contain any searchable text, then it cannot be used for any text-based processing. If you are using a scanned paper document, make sure the "Recognize Text" operation (also known as "Optical Character Recognition" or OCR) is applied to the document prior to processing.
- Step 2 - Open the "Split Document Settings" Dialog
- Select "Plug-ins > Split Documents > Split Document..." from the
main Adobe® Acrobat® menu to open the "Split Document Settings" dialog.
- Step 3 - Select Split Method
- Specify the desired document splitting method. This tutorial focuses on naming the output files generated by the document splitting operation. It does not matter which splitting method is used, because file naming works the same way for all of them. As an example, we have chosen to split the input document into equal-size output documents (one page per file).
- Step 4 - Specify Output File Naming
- We are going to show how to use the "Text By Search" method for naming the files.
- Press the "Add" button in the "Output Naming and Destination" section to add a new component to the name.
- Select the "Text By Search" option. Click "Next >>".
- Enter a search pattern (using regular expression syntax) into the "Find what" box.
- For example, enter \b[A-Z\-]+\d{6}\b to match Bates numbers that follow the "XYZ-ABC123456" format: one or more capital letters or hyphens followed by six digits, with a word boundary on each side. This search expression finds text that conforms to the pattern and uses it as part of the filename. To learn more about regular expression syntax, press the "?..." button next to the "Find what:" text box. Regular expressions are a commonly used method for searching text for matching patterns, and many online resources cover all aspects of their usage.
- Click "OK" in the "Find text" dialog to close it.
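- Before entering a pattern into the "Find what" box, it can help to try it on a few sample strings. The following is a minimal sketch that uses Python's `re` module, not the plug-in itself. It assumes the plug-in's regular expression engine treats this particular pattern the same way Python does. The sample strings and the `first_match` helper are made up for illustration.

```python
import re

# Pattern from the tutorial: capital letters and hyphens, then exactly six digits.
BATES_PATTERN = re.compile(r"\b[A-Z\-]+\d{6}\b")

# Hypothetical page text, invented for this example.
samples = [
    "Confidential XYZ-ABC123456 Page 1 of 12",  # well-formed Bates number
    "Ref: xyz-abc123456",                       # lowercase letters
    "Ref: XYZ-ABC12345",                        # only five digits
    "Ref: XYZ-ABC1234567",                      # seven digits
]


def first_match(text):
    """Return the first substring matching the pattern, or None."""
    match = BATES_PATTERN.search(text)
    return match.group(0) if match else None


for sample in samples:
    print(repr(sample), "->", first_match(sample))
```

Only the first sample produces a match ("XYZ-ABC123456"). The other three are rejected because the pattern requires capital letters and exactly six digits between word boundaries.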
- Step 5 - Specify an Output Folder
- Press the "Browse..." button to select an output folder. Optionally, specify a name prefix and/or a base filename.
- Click "OK" to start splitting the document.
- Step 6 - Select Input File(s)
- Select either the currently open PDF document or one or more PDF files for splitting. Click the "OK" button to proceed.
- Step 7 - Inspect the Results
- Check the list of output files displayed in the "AutoSplit Results" dialog. Click "Open Output Folder" to inspect the output PDF files.
- The output folder contains 12 single-page documents, each named after its corresponding Bates number.
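- The relationship between page text and output file names can be sketched in a few lines of Python. This only illustrates the naming idea and is not how AutoSplit is implemented. The function `name_pages`, the fallback name for pages without a match, and the numeric suffix for repeated names are all assumptions, since the tutorial does not describe how the plug-in handles those cases.

```python
import re


def name_pages(page_texts, pattern, fallback="page"):
    """Return one PDF file name per page, based on the first pattern match."""
    regex = re.compile(pattern)
    names = []
    used = set()
    for number, text in enumerate(page_texts, start=1):
        match = regex.search(text)
        # Assumption: a page without a match gets a numbered fallback name.
        stem = match.group(0) if match else f"{fallback}_{number:03d}"
        # Assumption: a repeated name gets a numeric suffix to stay unique.
        candidate, copy = stem, 1
        while candidate in used:
            copy += 1
            candidate = f"{stem}_{copy}"
        used.add(candidate)
        names.append(candidate + ".pdf")
    return names


# Hypothetical text of three single-page output documents.
pages = [
    "Invoice XYZ-ABC000001",
    "This page has no Bates number",
    "Duplicate stamp XYZ-ABC000001",
]
print(name_pages(pages, r"\b[A-Z\-]+\d{6}\b"))
# ['XYZ-ABC000001.pdf', 'page_002.pdf', 'XYZ-ABC000001_2.pdf']
```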