AutoSplit: Naming Output PDF Files By Text Search
- Introduction
- This tutorial shows how to name output PDF files in the "Split Document" operation provided by the AutoSplit™ plug-in for Adobe® Acrobat®. The "Split Document" operation can automatically name output PDF files using the document's text. Multiple naming methods can be combined to create a wide variety of file naming schemes.
- We are going to split a multi-page PDF document into single-page files and automatically name each output file with text extracted from its first page. A text search is used to extract the text from the document. The tutorial illustrates the use of text patterns.
- Every page in the sample input PDF document is stamped with a Bates number in the "XYZ-ABC123456" format. We will use a text pattern search to find matching text on the first page of each output document. The matching text will be used to name the output files from the splitting process. As a result, the input PDF document will be split into 12 single-page documents, each named by its Bates number.
- This operation is also available in the Action Wizard (Acrobat's batch processing tool) and can be used to automate document processing workflows.
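- To make the naming scheme concrete, here is a minimal Python sketch of the idea, independent of AutoSplit™ and Acrobat®. It assumes the text of each page is already available as a string. The `plan_output_names` helper and the `page_N` fallback name are hypothetical and only illustrate the logic; they do not describe how the plug-in is implemented.

```python
import re

# Bates numbers in the sample document look like "XYZ-ABC123456".
BATES = re.compile(r"\b[A-Z\-]+\d{6}\b")

def plan_output_names(page_texts):
    """Map each page's text to an output file name (one file per page)."""
    names = []
    for index, text in enumerate(page_texts, start=1):
        match = BATES.search(text)
        # Hypothetical fallback for pages that contain no Bates number.
        stem = match.group(0) if match else f"page_{index}"
        names.append(stem + ".pdf")
    return names

pages = [
    "Invoice summary XYZ-ABC123456",
    "Continuation sheet XYZ-ABC123457",
    "Blank page",
]
print(plan_output_names(pages))
```

- The first two sample pages are named after their Bates numbers, while the third page has no match and falls back to the hypothetical `page_3.pdf` name.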
- Prerequisites
- You need a copy of Adobe® Acrobat® with the AutoSplit™ plug-in installed on your computer in order to use this tutorial. You can download trial versions of both Adobe® Acrobat® and AutoSplit™.
- Step 1 - Open a PDF Document
- Start the Adobe® Acrobat® application and open a PDF document using the "File > Open..." menu.
- Please note that if an input PDF document does not contain any searchable text, then it cannot be used for any text-based processing. If you are using a scanned paper document, make sure the "Recognize Text" operation (also known as "Optical Character Recognition" or OCR) is applied to the document prior to processing.
- Step 2 - Open the "Split Document Settings" Dialog
- Select "Plug-ins > Split Documents > Split Document..." from the main Adobe® Acrobat® menu to open the "Split Document Settings" dialog.
- Step 3 - Select Split Method
- Select the desired document splitting method. In this example, we split the input document into equal-size output documents (one page per file).
- Step 4 - Specify Output File Naming
- We are going to show how to use the "Text By Search" method for naming the files.
- Press the "Add" button in the "Output Naming and Destination" section to add a new component to the name.
- Select the "Text By Search" option. Click "Next >>".
- Enter a search pattern (using regular expression syntax) into the "Find what" box.
- For example, enter "\b[A-Z\-]+\d{6}\b" to match Bates numbers that follow the "XYZ-ABC123456" format: one or more uppercase letters or hyphens followed by exactly six digits, delimited by word boundaries. This search expression will find all text that conforms to this pattern and use it as part of the filename.
- Click "OK" in the "Find text" dialog to close it.
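- Before typing a pattern into the "Find what" box, it can help to try it on a few sample strings. The sketch below does this with Python's standard `re` module. This is an assumption made for illustration: the regular expression engine used by AutoSplit™ may differ from Python's in some details, and the `find_bates` helper is hypothetical.

```python
import re

# The search pattern from this step, written as a Python raw string.
PATTERN = re.compile(r"\b[A-Z\-]+\d{6}\b")

def find_bates(text):
    """Return every substring of text that matches the pattern."""
    return PATTERN.findall(text)

samples = [
    "Doc XYZ-ABC123456 end",     # matches the sample format
    "AB123456 and CD-654321",    # two matches in one string
    "xyz-abc123456",             # lowercase letters: no match
    "XYZ-ABC12345",              # only five digits: no match
    "XYZ-ABC1234567",            # seven digits: no match
]
for sample in samples:
    print(repr(sample), "->", find_bates(sample))
```

- The pattern is case-sensitive and requires exactly six digits, so strings with lowercase letters or a different digit count are not matched.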
- Step 5 - Specify an Output Folder
- Press the "Browse..." button to select an output folder. Optionally, specify a name prefix and/or a base filename.
- Click "OK" to start splitting the document.
- Step 6 - Start Extraction Process
- Click "OK" in the confirmation dialog.
- Step 7 - Inspect the Results
- Check the list of the output files displayed in the "AutoSplit Results" dialog. Click "Open Output Folder" to inspect output PDF files.
- The output folder contains 12 single-page documents, each named after its Bates number.
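- For larger jobs, the output folder can also be checked with a short script instead of by eye. The Python sketch below lists PDF files whose names are not a Bates number. It assumes that no name prefix or base filename was set in Step 5, so each file name is just the Bates number plus ".pdf"; the `unexpected_names` helper is hypothetical, and a temporary folder stands in for the real output folder.

```python
import re
import tempfile
from pathlib import Path

# A file stem is expected to be a Bates number such as "XYZ-ABC123456".
BATES_STEM = re.compile(r"[A-Z\-]+\d{6}")

def unexpected_names(folder):
    """Return sorted names of PDF files whose stem is not a Bates number."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.pdf")
        if not BATES_STEM.fullmatch(path.stem)
    )

# Demo: a temporary folder with two well-named files and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("XYZ-ABC123456.pdf", "XYZ-ABC123457.pdf", "Untitled.pdf"):
        (Path(tmp) / name).touch()
    demo_result = unexpected_names(tmp)
    print(demo_result)
```

- An empty list means every PDF file in the folder is named after a Bates number.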