Splitting PDF Documents By a Keyword List
- Introduction
- It is often necessary to split a PDF document at pages that contain specific keywords. The AutoSplit™ software searches a PDF document and checks every page for the presence of user-specified keywords. If at least one keyword appears on a page, that page is marked as a splitting page. The document is split at these pages, producing multiple PDF documents.
- Sample Document Description
- The sample PDF document used in this tutorial contains 20 pages with Bates numbers from ABC-200001 to ABC-200020 in the lower right corner of each page. The goal is to split the PDF document at pages with specific Bates numbers from a user-specified list and to name each output PDF document using the corresponding Bates number.
- Splitting Approach
- We are going to use the "Page with Keywords From List" separator option to split the PDF document at every page that contains any of the following Bates numbers (keywords): ABC-200001, ABC-200004, ABC-200005, ABC-200008, ABC-200012.
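- The idea behind this separator can be sketched in a few lines of Python. This is only an illustration of the logic, not the plug-in's implementation: `find_split_pages` and `page_texts` are hypothetical names, and the page text is a made-up stand-in for the 20-page sample document.

```python
def find_split_pages(page_texts, keywords):
    """Return 1-based page numbers whose text contains at least one keyword."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if any(keyword in text for keyword in keywords)
    ]


# Hypothetical stand-in for the sample document: one Bates number per page,
# ABC-200001 through ABC-200020.
page_texts = [f"Sample page text ABC-{200000 + n}" for n in range(1, 21)]

keywords = ["ABC-200001", "ABC-200004", "ABC-200005", "ABC-200008", "ABC-200012"]

split_pages = find_split_pages(page_texts, keywords)
print(split_pages)  # [1, 4, 5, 8, 12]
```

- A page becomes a splitting page as soon as any single keyword from the list is found on it, so the order of the keywords in the list does not matter.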
- Prerequisites
- You need a copy of Adobe® Acrobat® with the AutoSplit™ Pro plug-in installed on your computer to follow this tutorial. Trial versions of both Adobe® Acrobat® and the AutoSplit™ Pro plug-in are available for download.
- Step 1 - Open A PDF Document
- Start the Adobe® Acrobat® application and open a PDF document using the "File > Open…" menu.
- Step 2 - Open The "Split Document Settings" Dialog
- Select "Plug-ins > Split Documents > Split Document…" from the main Adobe® Acrobat® menu to open the "Split Document Settings" dialog.
- Step 3 - Select Splitting Method
- Check the "Use separator:" box and select "Page With Keywords From List" from the menu.
- Step 4 - Specify List of Keywords
- Click "Options..." to open the "Specify List of Keywords" dialog.
- Enter the split keywords in the text field, one keyword per line. In this example, the following Bates numbers are entered into the keyword list: ABC-200001, ABC-200004, ABC-200005, ABC-200008, ABC-200012.
- Optionally, check the "Match text case" option to perform a case-sensitive search. Check the "Match whole words" option to match only whole words and ignore partial matches where a keyword appears as part of another word. Click "OK" once done.
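- The effect of the two matching options can be approximated with Python's `re` module. This is a rough sketch under stated assumptions: `page_matches` is a hypothetical helper, the search is assumed to be case-insensitive when "Match text case" is unchecked, and the plug-in's exact definition of a "whole word" may differ from the one used here.

```python
import re


def page_matches(text, keyword, match_case=False, whole_words=False):
    """Check whether a keyword occurs in the page text.

    match_case  -- assumed equivalent of "Match text case"
    whole_words -- assumed equivalent of "Match whole words"
    """
    # Escape the keyword so characters such as "-" or "." are taken literally.
    pattern = re.escape(keyword)
    if whole_words:
        # Reject matches that are glued to a letter, digit, or underscore.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = 0 if match_case else re.IGNORECASE
    return re.search(pattern, text, flags) is not None


# Lower-case text matches only when case is ignored.
print(page_matches("ref abc-200004", "ABC-200004"))                   # True
print(page_matches("ref abc-200004", "ABC-200004", match_case=True))  # False

# A keyword embedded in a longer token is a partial match.
print(page_matches("XABC-2000045", "ABC-200004"))                     # True
print(page_matches("XABC-2000045", "ABC-200004", whole_words=True))   # False
```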
- Step 5 - Specify Output File Naming
- Press the "Add" button in the "Output Naming and Destination" section.
- Check the "Text By Search" option. Click "Next>>".
- The goal is to name each output file after the Bates number of its splitting page. Enter the search expression "ABC-\d{6}\b" into the "Find what" box; the text it finds is added to the output file name. Press the "OK" button once done.
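- To see what the search expression matches, it can be tried out in Python, whose `re` module reads this particular pattern the same way: the literal text "ABC-", exactly six digits, then a word boundary. The sketch below is illustrative only. `output_name` and its `fallback` parameter are hypothetical, and it assumes the expression is applied to the text of the first page of each output document.

```python
import re

# "ABC-" followed by exactly six digits, ending at a word boundary.
BATES_PATTERN = re.compile(r"ABC-\d{6}\b")


def output_name(first_page_text, fallback="untitled"):
    """Build an output file name from the first Bates number found in the text."""
    match = BATES_PATTERN.search(first_page_text)
    label = match.group(0) if match else fallback
    return f"{label}.pdf"


print(output_name("Confidential ABC-200004"))  # ABC-200004.pdf

# Seven digits: the word boundary after the sixth digit fails, so nothing matches.
print(output_name("Confidential ABC-2000041"))  # untitled.pdf
```

- The trailing "\b" is what keeps the expression from picking up the first six digits of a longer number.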
- Step 6 - Specify An Output Folder
- Specify an output folder via the "Browse..." button. Click "OK" to proceed.
- Step 7 - Start Splitting Process
- Click "OK" in the dialog box to start the splitting process.
- Step 8 - Examine The "AutoSplit Results" Dialog
- The "AutoSplit Results" dialog appears on screen once the processing is completed. It shows a list of files that have been created. Click "Open Output Folder" to inspect the results.
- Step 9 - Inspect The Results
- The AutoSplit™ plug-in has split the input PDF document at pages with specific Bates numbers and created 5 output PDF documents:
- The first document with pages from ABC-200001 to ABC-200003.
- The second document with page ABC-200004 only.
- The third document with pages from ABC-200005 to ABC-200007.
- The fourth document with pages from ABC-200008 to ABC-200011.
- The fifth document with pages from ABC-200012 to the last page of the input PDF document.
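- The page ranges listed above follow directly from the splitting pages. As a quick check, the arithmetic can be sketched in Python. `split_ranges` is a hypothetical helper that assumes each splitting page starts a new output document; it does not handle pages that precede the first splitting page, which does not arise here because page 1 is itself a splitting page.

```python
def split_ranges(split_pages, page_count):
    """Turn splitting pages into (first_page, last_page) ranges, 1-based inclusive."""
    starts = sorted(set(split_pages))
    # Each document ends on the page before the next splitting page;
    # the last document runs to the end of the input.
    ends = [start - 1 for start in starts[1:]] + [page_count]
    return list(zip(starts, ends))


# Pages 1, 4, 5, 8, 12 carry ABC-200001, -200004, -200005, -200008, -200012.
ranges = split_ranges([1, 4, 5, 8, 12], page_count=20)
print(ranges)  # [(1, 3), (4, 4), (5, 7), (8, 11), (12, 20)]
```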