Deleting Pages Via a Text Search Using a Command-line BAT File
Manually deleting PDF pages can be a slow process. This tutorial explains how the AutoSplit™ plug-in can be used to automatically delete pages containing specific text, using a command-line BAT file. This is a script file containing 'instructions' for searching pages of a document for specific text (or a text pattern), and deleting them. The first step is to make a custom "Delete Pages by Text Search" configuration in AutoSplit, which will be used to create the BAT file. The BAT file instructs AutoSplit to run this search on a specific input file, delete the relevant pages, and place the remaining ones in a unique output location.
Input Files and Page Deletion Method
The input file used to demonstrate this method contains a collection of invoices. Some invoices contain the text: "PAID" or "TOTAL DUE: 0.00".
The goal is to have these pages removed so that the output file contains only the invoices without this text.
You need a copy of Adobe® Acrobat® along with the AutoSplit plug-in installed on your computer in order to use this tutorial. Both are available as trial versions.
Step 1 - Open the "Find And Delete Pages with Matching Text" Dialog
With the file to be processed open in Acrobat, select "Plug-Ins > Split Documents > Delete Pages By Text Search" from the main menu.
Step 2 - Specify Text Search Options
Use this dialog to configure the text search. In this example, the goal is to delete pages that contain the words “PAID” or “Total due: 0.00”. Type the text to search for in the entry box, one item per line.
Pages found to contain any of these search items will be deleted. See the separate tutorial on how to delete PDF pages via a text search for detailed help with configuring these settings and more examples.
Press "Save..." to save these settings as a text search settings file.
Step 3 - Save the Text Search Settings
Choose a folder and rename the file, which will be saved with a *.textsearch extension. We will save this example as "Settings.textsearch".
Press "Save" to proceed.
Step 4 - Create the BAT File
See the separate tutorial for detailed help on running an operation from a command-line BAT file.
Create a BAT file using any plain text editor (such as Notepad). Begin with a blank text file, then add the following lines, replacing the folder paths and filenames with the ones you are using:
SET AUTOSPLIT_CONFIG_FILE=C:\Data\Settings.textsearch
SET AUTOSPLIT_MODE=DeletePages
SET AUTOSPLIT_INPUT_FILE=C:\Data\Input\Invoices.pdf
SET AUTOSPLIT_OUTPUT_FOLDER=C:\Data\Output
SET AUTOSPLIT_LOG_FILE=C:\Data\DeletedPagesLog.txt
"C:\Program Files (x86)\Adobe\Acrobat DC\Acrobat\Acrobat.exe" /n /h
AUTOSPLIT_CONFIG_FILE specifies a full file path to the text search settings file created in steps 2 & 3.
AUTOSPLIT_MODE specifies the processing type, which for this operation is "DeletePages".
AUTOSPLIT_INPUT_FILE specifies a full file path to the input file.
AUTOSPLIT_OUTPUT_FOLDER specifies the folder (C:\Data\Output) where the modified files are saved. Input files are not overwritten. If the output folder already contains a file with the same name, regular Windows-style duplicate filename resolution is applied.
Overall, the BAT file needs to specify three locations: the settings file, an input PDF file or folder, and an output folder.
Use the AUTOSPLIT_LOG_FILE variable to specify a log file location - useful for troubleshooting and record keeping. If a log file does not exist, it will be automatically created. If a log file already exists, then new records will be appended to the file.
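The variables described above can be combined with a few basic checks, so that a mistyped path is reported before Acrobat starts. The following is only a sketch under stated assumptions: it reuses the example paths from this tutorial, ACROBAT_EXE is a helper variable introduced here for readability (it is not an AutoSplit setting), and creating the output folder in advance is a precaution rather than a documented AutoSplit requirement.

```bat
@ECHO OFF
REM Sketch only: the tutorial's variables plus basic checks before launching Acrobat.
SET "AUTOSPLIT_CONFIG_FILE=C:\Data\Settings.textsearch"
SET "AUTOSPLIT_MODE=DeletePages"
SET "AUTOSPLIT_INPUT_FILE=C:\Data\Input\Invoices.pdf"
SET "AUTOSPLIT_OUTPUT_FOLDER=C:\Data\Output"
SET "AUTOSPLIT_LOG_FILE=C:\Data\DeletedPagesLog.txt"

REM Helper variable for this script only (an assumption, not an AutoSplit setting).
SET "ACROBAT_EXE=C:\Program Files (x86)\Adobe\Acrobat DC\Acrobat\Acrobat.exe"

REM Stop early if the settings file is missing.
IF NOT EXIST "%AUTOSPLIT_CONFIG_FILE%" (
    ECHO Settings file not found: "%AUTOSPLIT_CONFIG_FILE%"
    PAUSE
    EXIT /B 1
)

REM Stop early if the input PDF is missing.
IF NOT EXIST "%AUTOSPLIT_INPUT_FILE%" (
    ECHO Input file not found: "%AUTOSPLIT_INPUT_FILE%"
    PAUSE
    EXIT /B 1
)

REM Stop early if Acrobat is not installed at the expected location.
IF NOT EXIST "%ACROBAT_EXE%" (
    ECHO Acrobat not found: "%ACROBAT_EXE%"
    PAUSE
    EXIT /B 1
)

REM Precaution: create the output folder if it does not exist yet.
IF NOT EXIST "%AUTOSPLIT_OUTPUT_FOLDER%\" MKDIR "%AUTOSPLIT_OUTPUT_FOLDER%"

REM Launch Acrobat with the same switches used in this tutorial.
"%ACROBAT_EXE%" /n /h
```

The SET "NAME=value" form stores the same values as the unquoted form. The quotes only guard against stray trailing spaces. The PAUSE commands keep the console window open long enough to read an error message when the file is run by double-clicking.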
Step 5 - Save the BAT File
Press "File > Save As..." to save the text as a BAT file.
By default, Notepad offers to save the text as a *.txt file. Choose a folder and use the "Save as type:" list to select "All Files". Name the file, manually adding a .bat file extension, then press "Save".
Step 6 - Run the BAT File
Double-click on the BAT file to run it.
Note that the BAT file will open Adobe Acrobat and may display a progress bar whilst processing. The /h switch on Acrobat's command line, included in the example in step 4, runs Acrobat in a minimized window.
Step 7 - Inspect the Results
Open the output folder to view the new file. Note that the log file has also been created.
Open the output file.
All pages containing the text specified in step 2 have been deleted from the document.
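The same check can be made from the command line. The sketch below assumes the example output folder and log file paths used in step 4. It lists the PDF files in the output folder and prints the log file if one exists. The layout of the log entries is not described in this tutorial, so the file is simply printed as-is.

```bat
@ECHO OFF
REM Sketch only: paths are the example values from step 4; adjust them to your own.
SET "OUTPUT_FOLDER=C:\Data\Output"
SET "LOG_FILE=C:\Data\DeletedPagesLog.txt"

REM List the PDF files that were written to the output folder.
ECHO Files in "%OUTPUT_FOLDER%":
DIR /B "%OUTPUT_FOLDER%\*.pdf"

REM Print the log file, if it has been created.
IF EXIST "%LOG_FILE%" (
    ECHO.
    ECHO Contents of "%LOG_FILE%":
    TYPE "%LOG_FILE%"
)

REM Keep the window open when the file is run by double-clicking.
PAUSE
```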
You can find more AutoSplit tutorials here.