Splitting PDF Documents By a Keyword List
Introduction
It is often necessary to split a PDF document at pages that contain specific keywords. The AutoSplit™ software searches a PDF document and checks every page for the presence of user-specified keywords. If at least one keyword appears on a page, that page is marked as a splitting page. The document is split at these pages, and multiple PDF documents are created.
Sample Document Description
The sample PDF document used in this tutorial contains 20 pages with Bates numbers from ABC-200001 to ABC-200020 in the lower right corner of each page. The goal is to split the PDF document at pages with specific Bates numbers from a user-specified list and to name each output PDF document using the corresponding Bates number.
Splitting Approach
We are going to use the "Page with Keywords From List" separator option to split the PDF document at pages that contain the following Bates numbers (keywords): ABC-200001, ABC-200004, ABC-200005, ABC-200008, ABC-200012. The PDF document will be split at every page that contains any of these Bates numbers, and each such page starts a new output document.
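The splitting approach can be illustrated with a short Python sketch. This is not the AutoSplit™ implementation, only a minimal model of the idea: the function name split_ranges and the plain substring test are assumptions made for illustration, and pages that come before the first splitting page are simply ignored here because the tutorial does not say how the plug-in handles them.

```python
def split_ranges(page_texts, keywords):
    """Return (start, end) page indexes (0-based, inclusive), one pair per output document."""
    # A page is a splitting page if it contains at least one keyword.
    starts = [
        i for i, text in enumerate(page_texts)
        if any(keyword in text for keyword in keywords)
    ]
    ranges = []
    for n, start in enumerate(starts):
        # Each document runs up to the page before the next splitting page,
        # or to the last page of the input for the final document.
        end = starts[n + 1] - 1 if n + 1 < len(starts) else len(page_texts) - 1
        ranges.append((start, end))
    return ranges


# Model of the sample document: 20 pages, each carrying one Bates number.
pages = [f"ABC-{200000 + n}" for n in range(1, 21)]
keywords = ["ABC-200001", "ABC-200004", "ABC-200005", "ABC-200008", "ABC-200012"]
ranges = split_ranges(pages, keywords)
```

With the sample data, ranges holds five pairs, one for each output document that the splitting is expected to produce.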
Prerequisites
You need a copy of Adobe® Acrobat® along with the AutoSplit™ Pro plug-in installed on your computer in order to follow this tutorial. You can download trial versions of both Adobe® Acrobat® and the AutoSplit™ Pro plug-in.
Step 1 - Open A PDF Document
Start the Adobe® Acrobat® application and open a PDF document using the "File > Open…" menu.
Step 2 - Open The "Split Document Settings" Dialog
Select "Plug-ins > Split Documents > Split Document…" from the main Adobe® Acrobat® menu to open the "Split Document Settings" dialog.
Step 3 - Select Splitting Method
Check the "Use separator:" box and select "Page With Keywords From List" from the menu.
Step 4 - Specify List of Keywords
Click "Options..." to open the "Specify List of Keywords" dialog.
Enter the split keywords in the text field, one keyword per line. In this example, the following Bates numbers are entered into the keyword list: ABC-200001, ABC-200004, ABC-200005, ABC-200008, ABC-200012.
Optionally, check the "Match text case" option to perform a case-sensitive search. Check the "Match whole words" option to match only whole words and ignore partial matches where a keyword appears as part of another word. Click "OK" once done.
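The effect of the two matching options can be sketched in Python. The helper page_matches and its parameters are hypothetical, and the exact rule the plug-in uses to decide what counts as a "whole word" is not described in this tutorial; the sketch assumes a keyword is a whole word when it is not directly preceded or followed by a letter, digit, or underscore.

```python
import re


def page_matches(text, keyword, match_case=False, whole_words=False):
    """Return True if the keyword occurs in the page text under the given options."""
    # Escape the keyword so characters such as "-" or "." are matched literally.
    pattern = re.escape(keyword)
    if whole_words:
        # Assumption: "whole word" means no word character on either side.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = 0 if match_case else re.IGNORECASE
    return re.search(pattern, text, flags) is not None


# Case-insensitive by default, case-sensitive on request.
found_any_case = page_matches("abc-200001", "ABC-200001")
found_exact_case = page_matches("abc-200001", "ABC-200001", match_case=True)

# A keyword embedded in a longer number is a partial match only.
found_partial = page_matches("ABC-2000011", "ABC-200001")
found_whole = page_matches("ABC-2000011", "ABC-200001", whole_words=True)
```

In this model, whole-word matching prevents the keyword ABC-200001 from triggering a split on a page that carries a longer identifier such as ABC-2000011.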
Step 5 - Specify Output File Naming
Press the "Add" button in the "Output Naming and Destination" section.
Check the "Text By Search" option. Click "Next>>".
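The goal is to name each output file with its splitting Bates number. Enter the search expression "ABC-\d{6}\b" into the "Find what" box to find the text that will be added to the output file name. The expression matches the text "ABC-" followed by exactly six digits and a word boundary. Press the "OK" button once done.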
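To see what the search expression matches, it can be tried out in Python. This is only a sketch: Python's re module may differ in details from the regular expression engine used by the plug-in, and the helper output_name, the ".pdf" suffix, and the fallback name are assumptions made for illustration.

```python
import re

# The search expression from the "Find what" box.
BATES_PATTERN = re.compile(r"ABC-\d{6}\b")


def output_name(page_text, fallback="output"):
    """Build a file name from the first Bates number found in the page text."""
    match = BATES_PATTERN.search(page_text)
    # Hypothetical fallback for pages where no Bates number is found.
    base = match.group(0) if match else fallback
    return base + ".pdf"


name_found = output_name("Confidential ABC-200004 Page 4")
# Seven digits: \d{6} matches the first six, but \b fails before the seventh.
name_missing = output_name("Confidential ABC-2000045")
```

The trailing \b is what keeps the expression from matching the first six digits of a longer number.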
Step 6 - Specify An Output Folder
Specify an output folder via the "Browse..." button. Click "OK" to proceed.
Step 7 - Start Splitting Process
Click "OK" in the dialog box to start the splitting process.
Step 8 - Examine The "AutoSplit Results" Dialog
The "AutoSplit Results" dialog appears on screen once the processing is completed. It shows a list of files that have been created. Click "Open Output Folder" to inspect the results.
Step 9 - Inspect The Results
The AutoSplit™ plug-in has split the input PDF document at pages with specific Bates numbers and created 5 output PDF documents:
  • The first document with pages from ABC-200001 to ABC-200003.
  • The second document with page ABC-200004 only.
  • The third document with pages from ABC-200005 to ABC-200007.
  • The fourth document with pages from ABC-200008 to ABC-200011.
  • The fifth document with pages from ABC-200012 to the last page of the input PDF document.