Assign PDF Page Labels Via a Text Search
Introduction
Changing page labels manually, one at a time, can be time-consuming. The AutoSplit™ plug-in can automatically search the pages of a document for specific text, extract it, and use it to assign new page labels. This operation searches each page of a document for text that matches a chosen page label ‘style’, and relabels the page if matching text is found.
Page labels are used by Adobe Acrobat in the “Page Thumbnails” navigation panel and in any page selection tools. They offer a way to name PDF pages using any combination of letters and numbers. For example, the plug-in can search for page numbers (e.g. 1, 2, 3) or Roman numerals (e.g. II, VII). A custom search expression can also be typed to search for text that matches a specific pattern. If no matching page label is found on a page, then it is assigned a label that matches its page number within the document.
In the tutorial below, we will look at how to change page labels in a sample input document. The original labels are simply the page numbers; the goal is to assign new page labels using the unique "invoice numbers" that appear on each page:
before/after labels
Input Document Description
The sample PDF document used in the steps below is a collection of single-page invoices. Each invoice features "INVOICE NUMBER: XXXXXXXX" in the same location on each page. We will use the "Search for custom text pattern" method to identify the presence of this text, extract it, and use it to rename page labels.
input invoices
Prerequisites
You need a copy of Adobe® Acrobat® along with the AutoSplit™ plug-in installed on your computer in order to use this tutorial. Both are available as trial versions.
Step 1 - Open the “Create Page Labels By Text Search” Dialog
Open the PDF document that you want to create page labels for in Adobe® Acrobat®, then select “Plug-ins > Merge Documents > Assign Page Labels By Text Search…” from the main Acrobat menu.
open labels dialog
Step 2 - Select a Page Label Style
First, select the page label style to search the document for. You can search for page numbers (the default option), Roman numerals (I, IV, XIII etc.), or various alphanumeric label formats (e.g. A-5, A-15(1), C&A-1). There is also the option to enter a custom pattern.
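To give a feel for what each label style looks for, here is a minimal Python sketch. The three patterns and the names LABEL_STYLES and find_label are illustrative assumptions; they only approximate the examples listed above and are not the plug-in's actual matching rules.

```python
import re

# Hypothetical patterns approximating the label styles named in this step.
# They are assumptions for illustration, not AutoSplit's internal rules.
LABEL_STYLES = {
    # Plain page numbers: 1, 2, 3 ...
    "page_number": re.compile(r"(?<!\d)\d+(?!\d)"),
    # Upper-case Roman numerals: I, IV, XIII ...
    # The lookahead prevents an empty match; (?!\w) rejects words like "Mix".
    "roman_numeral": re.compile(
        r"(?<!\w)(?=[MDCLXVI])M{0,3}(?:CM|CD|D?C{0,3})"
        r"(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})(?!\w)"
    ),
    # Alphanumeric labels such as A-5, A-15(1), C&A-1
    "alphanumeric": re.compile(r"(?<!\w)[A-Z](?:&[A-Z])?-\d+(?:\(\d+\))?"),
}


def find_label(text, style):
    """Return the first text fragment matching the given style, or None."""
    match = LABEL_STYLES[style].search(text)
    return match.group(0) if match else None
```

Note that a simple pattern cannot tell a Roman numeral "I" from the English word "I", which is one reason restricting the search to a specific page area (Step 5) matters.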
select label style
Step 3 - Enter a Text Pattern
In this example, we will assign new page labels by searching for a custom text pattern. Check "Search for a custom text pattern" and enter an expression using regular expression syntax next to "Pattern to find:". Here, we've used INVOICE NUMBER: \K\d+. This searches for any occurrence of the text "INVOICE NUMBER: " followed by a number consisting of one or more digits (\d+). The "\K" component discards everything matched before it from the result, so only the invoice number itself is extracted. For example, a resulting page label would be "0123456" instead of "INVOICE NUMBER: 0123456".
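The effect of this pattern can be tried outside Acrobat. The sketch below uses Python, whose built-in re module does not support \K, so a capture group is swapped in to achieve the same result. The function name extract_invoice_number and the sample text are assumptions made for illustration.

```python
import re

# Python's `re` module has no \K, so a capture group stands in for it:
# the prefix is matched but only the digits in group 1 are returned.
INVOICE_PATTERN = re.compile(r"INVOICE NUMBER: (\d+)")


def extract_invoice_number(page_text):
    """Return the digits following 'INVOICE NUMBER: ', or None if absent."""
    match = INVOICE_PATTERN.search(page_text)
    return match.group(1) if match else None


# Hypothetical page text for demonstration.
sample = "ACME Corp\nINVOICE NUMBER: 0123456\nDate: 2020-01-01"
print(extract_invoice_number(sample))
```

A fixed-width lookbehind, (?<=INVOICE NUMBER: )\d+, is another way to get the same result in engines that lack \K.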
type search pattern
Step 4 - Optional: Insert Text Before/After Page Labels
It's also possible to insert custom text before or after the newly assigned page labels. Enter the text to be inserted before or appended after the new page label into the relevant box. On its own, the search pattern entered in Step 3 produces purely numerical page labels, so in this example we will insert "Invoice " before the extracted invoice numbers. Note that spaces entered here are kept in the result (e.g. "Invoice 0123456" rather than "Invoice0123456").
The following processing options can also be used. Optionally check "Append a found page label to the existing page label" to add to existing labels, instead of completely replacing them. "Remove spaces from page labels" can be used to ensure that resulting page labels do not contain spaces.
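The way these options combine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function build_label and its parameters are hypothetical, space removal is assumed to apply to the extracted text only, and appending is assumed to join the existing and new labels without a separator. The plug-in may behave differently in both respects.

```python
def build_label(found, prefix="", suffix="", existing="",
                append_to_existing=False, remove_spaces=False):
    """Compose a page label from extracted text and the optional settings.

    Assumptions (not confirmed by the plug-in documentation):
    - spaces are stripped from the extracted text only, not the prefix/suffix
    - appending joins the existing label and the new one with no separator
    """
    if remove_spaces:
        found = found.replace(" ", "")
    label = f"{prefix}{found}{suffix}"
    if append_to_existing:
        label = f"{existing}{label}"
    return label


# The trailing space in the prefix is preserved in the final label.
print(build_label("0123456", prefix="Invoice "))
print(build_label("0123456", prefix="Invoice"))
```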
insert additional text
Step 5 - Add a Search Area
It's necessary to define at least one page area to search for matching text. The desired page label text will typically be located in the same place on each page of a document. Search areas provide the ability to search for text only in these specific areas of each page and greatly decrease the probability of finding similar text patterns that are not suitable page labels.
Note: if matching text is not found inside a chosen search area, the page is assigned a page label that matches its physical page number. For example, if no label is found on page 5, it is assigned the page label “5”.
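The fallback rule in the note above can be expressed as a short Python sketch. The function assign_labels, the capture-group pattern, and the sample page texts are assumptions for illustration; the list of strings stands in for the text found inside the search area of each page.

```python
import re


def assign_labels(page_texts, pattern):
    """Return one label per page: the captured text if the pattern matches,
    otherwise the 1-based physical page number as a string."""
    regex = re.compile(pattern)
    labels = []
    for page_number, text in enumerate(page_texts, start=1):
        match = regex.search(text)
        labels.append(match.group(1) if match else str(page_number))
    return labels


# Hypothetical search-area text for three pages; page 2 has no invoice number.
pages = ["INVOICE NUMBER: 1001", "Terms and conditions", "INVOICE NUMBER: 1003"]
print(assign_labels(pages, r"INVOICE NUMBER: (\d+)"))
```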
Press "Add Search Area..." to define a page location.
add search area
Use the zoom and draw tools in the upper right corner of the dialog to draw a box around the desired search area. A sample page from the currently opened document will be displayed in the preview box. Here, we have drawn a box around where the "INVOICE NUMBER: XXXXXXXX" is located in each invoice. Page area boundaries can also be manually typed into the boxes to the left of the preview.
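Conceptually, a search area limits matching to the text whose position falls inside a rectangle. The Python sketch below illustrates the idea under several assumptions: the Rect class, the text_in_area function, and the word coordinates are all hypothetical, coordinates are taken to be in PDF points with the origin at the bottom-left of the page, and a word counts only if its box lies fully inside the area. Whether the plug-in requires full containment or accepts partial overlap is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (assumed: PDF points)."""
    left: float
    bottom: float
    right: float
    top: float

    def contains(self, other):
        """True if `other` lies entirely inside this rectangle."""
        return (self.left <= other.left and other.right <= self.right
                and self.bottom <= other.bottom and other.top <= self.top)


def text_in_area(words, area):
    """Join the words (text, Rect) whose boxes fall inside the search area."""
    return " ".join(text for text, box in words if area.contains(box))


# Hypothetical word positions on one invoice page.
words = [
    ("INVOICE", Rect(400, 700, 450, 712)),
    ("NUMBER:", Rect(455, 700, 510, 712)),
    ("0123456", Rect(515, 700, 560, 712)),
    ("Total", Rect(50, 100, 80, 112)),
]
search_area = Rect(390, 690, 580, 720)
print(text_in_area(words, search_area))
```

Only the text returned for the search area would then be tested against the pattern from Step 3, which is why similar-looking numbers elsewhere on the page do not end up as labels.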
Press "OK" to proceed.
draw search area
Step 6 - Optional: Save and Reuse Settings
Optionally save this settings configuration for future reuse by pressing "Save Settings...". These settings can be loaded later via the "Load Settings..." button, to replace page labels in other documents using the same method.
save/reuse settings
Choose a suitable folder and optionally rename the settings file before pressing "Save". The default file name will be "Page Label Settings", and the settings file will be saved with a *.plabels file extension.
save settings file
Step 7 - Run the Procedure
Press "OK" to confirm the settings and run the operation.
confirm settings
A dialog box reports the number of pages that have been assigned new labels. Press "OK" to close it.
close report box
Step 8 - Inspect the Results
Click on the "Page Thumbnails" icon to expand the thumbnails panel in Acrobat.
expand thumbnails panel
Check the new page labels. In this example, they now follow an "Invoice N" format.
check new labels
Page labels are also visible when using various other Acrobat tools, such as "Organize Pages":
see acrobat tools
Step 9 - Review the Processing Report
If any pages fail to be assigned a new page label (e.g.: matching text is not found within the search area on a page), the plug-in prompts the user to view a processing report. Press "OK" to close this dialog and view the report.
open processing report
The report will be opened in the default web browser. It details the page label renaming settings used, and the pages where the operation was unsuccessful.
check processing report
You can find more AutoSplit tutorials here.