Merge PDF Files With Similar Filenames Using Text Patterns

AutoSplit plug-in for Adobe® Acrobat®

Overview

Text patterns (regular expressions) can be used to match a specific sequence of characters in each filename and to compare files by that sequence. Only the text that matches the pattern is used to compare filenames and decide whether files can be merged; files whose matching text is identical are merged into a single output file. This functionality is available starting with AutoSplit version 6.8.5 via the Plug-ins > Merge Documents > Merge Multiple Documents With Similar Filenames menu in Adobe Acrobat.

Regular Expressions

Regular expression (regex for short) syntax is used to define text patterns. A regex is a string of text that describes a pattern used to match, locate, and manage text. The syntax is supported by virtually all text processing applications.

Configuring Settings

Select the Compare filenames using the text pattern option and type the regular expression into the text box on the Merge Files with Similar Filenames dialog.

Type text pattern into corresponding entry field

Examples of Text Patterns

Comparing Text to the Left of Underscore
Comparing Text to the Right of Underscore
Comparing Digits at the End of the Filename
Advanced Text Comparison and File Naming with Custom Format

Example 1: Comparing Text to the Left of Underscore

The following example shows how to merge files while using only text to the left of the underscore in the filename.
Use the following text pattern:

^[^_]+

Explanation of the pattern:
^ - indicates that matching starts at the beginning of the text/filename.
[^_]+ - matches one or more characters that are NOT an underscore.

Here is a list of input files for this example. The text used for comparing filenames is the part to the left of the underscore:

Edgar Alan Poe_Summary.pdf
Edgar Alan Poe_DataSheet.pdf
Edgar Alan Poe_Details.pdf
Mark Twain_Summary.pdf
Mark Twain_DataSheet.pdf
Mark Twain_Details.pdf
Charles Dickens_Summary.pdf
Charles Dickens_DataSheet.pdf
Charles Dickens_Details.pdf
Charles Dickens_Attachments.pdf

Here is a list of output files for this example:

Edgar Alan Poe.pdf that includes 3 files: Edgar Alan Poe_Summary.pdf, Edgar Alan Poe_DataSheet.pdf, Edgar Alan Poe_Details.pdf.
Mark Twain.pdf that includes 3 files: Mark Twain_Summary.pdf, Mark Twain_DataSheet.pdf, Mark Twain_Details.pdf.
Charles Dickens.pdf that includes 4 files: Charles Dickens_Summary.pdf, Charles Dickens_DataSheet.pdf, Charles Dickens_Details.pdf, Charles Dickens_Attachments.pdf.
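
The grouping in this example can be reproduced outside of Acrobat to preview which files would end up together. The following is a minimal Python sketch, not AutoSplit's actual implementation. It assumes the pattern is applied to the filename without the .pdf extension (this is inferred from the output names above), and the function name group_by_pattern is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Text pattern from this example: everything before the first underscore.
PATTERN = re.compile(r"^[^_]+")

def group_by_pattern(filenames):
    """Group filenames by the text that matches PATTERN (hypothetical helper)."""
    groups = defaultdict(list)
    for name in filenames:
        stem = Path(name).stem  # assumption: the extension is not compared
        match = PATTERN.search(stem)
        if match:  # files without a match are left out of every group
            groups[match.group(0)].append(name)
    return dict(groups)

files = [
    "Edgar Alan Poe_Summary.pdf",
    "Edgar Alan Poe_DataSheet.pdf",
    "Mark Twain_Summary.pdf",
    "Mark Twain_Details.pdf",
    "Charles Dickens_Attachments.pdf",
]

groups = group_by_pattern(files)
for key, members in groups.items():
    print(f"{key}.pdf <- {', '.join(members)}")
```

Each dictionary key corresponds to one output file name, and its list holds the input files that share the same matching text.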

Example 2: Comparing Text to the Right of Underscore

The following example shows how to merge files while using only text to the right of the first underscore in the filename. If there is a second underscore in the filename, then only the text between the first and second underscores is used. If there is only one underscore in the filename, then all text to the right of the underscore is used for comparison.
Use the following text pattern:

(?<=_)[^_]+

Explanation of the pattern:
(?<=_) - indicates that matching starts right after an underscore and does not include the underscore in the match.
[^_]+ - matches one or more characters that are NOT an underscore.

Here is a list of input files for this example. The text used for comparing filenames is the part that follows the first underscore (Summary, DataSheet, or Details):

Edgar Alan Poe_Summary.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_029049.pdf
Mark Twain_Summary.pdf
Mark Twain_DataSheet.pdf
Mark Twain_Details_23231.pdf
Charles Dickens_Summary.pdf
Charles Dickens_DataSheet_2025.pdf
Charles Dickens_Details_440029.pdf

Here is a list of output files for this example. Note that only the text after the first underscore was used to merge the files, so the documents are now grouped by document type rather than by author name.

Summary.pdf that includes 3 files: Edgar Alan Poe_Summary.pdf, Mark Twain_Summary.pdf, Charles Dickens_Summary.pdf
DataSheet.pdf that includes 3 files: Edgar Alan Poe_DataSheet_2025.pdf, Mark Twain_DataSheet.pdf, Charles Dickens_DataSheet_2025.pdf
Details.pdf that includes 3 files: Edgar Alan Poe_Details_029049.pdf, Mark Twain_Details_23231.pdf, Charles Dickens_Details_440029.pdf
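
To see how the lookbehind behaves with one versus two underscores, the pattern can be tried on individual filenames. This is a small Python sketch under the assumption that the first match in the filename (without the extension) is the text used for comparison; compare_text is a hypothetical helper name, not part of AutoSplit.

```python
import re
from pathlib import Path

# Text pattern from this example: the text right after an underscore,
# up to the next underscore or the end of the name.
PATTERN = re.compile(r"(?<=_)[^_]+")

def compare_text(filename):
    """Return the first match of PATTERN in the name, or None (hypothetical helper)."""
    stem = Path(filename).stem  # assumption: the extension is not compared
    match = PATTERN.search(stem)  # search() returns the leftmost match only
    return match.group(0) if match else None

samples = [
    "Mark Twain_Summary.pdf",              # one underscore
    "Edgar Alan Poe_DataSheet_2025.pdf",   # two underscores
    "Charles Dickens_Details_440029.pdf",  # two underscores
]

keys = {name: compare_text(name) for name in samples}
for name, key in keys.items():
    print(f"{name} -> {key}")
```

With two underscores the trailing number is ignored, because the match stops at the second underscore.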

Example 3: Comparing Digits at the End of the Filename

The following example shows how to merge files while using only a sequence of digits at the end of the filenames.
Use the following text pattern:

\d+$

Explanation of the pattern:
\d+ - matches one or more digits.
$ - indicates that matching should align with the end of the text string/filename.

Here is a list of input files for this example. The text used for comparing filenames is the sequence of digits at the end of each filename (2025, 2024, or 2023):

Edgar Alan Poe_Summary_2025.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_2025.pdf
Edgar Alan Poe_Summary_2024.pdf
Edgar Alan Poe_DataSheet_2024.pdf
Edgar Alan Poe_Details_2024.pdf
Edgar Alan Poe_Summary_2023.pdf
Edgar Alan Poe_DataSheet_2023.pdf
Edgar Alan Poe_Details_2023.pdf

Here is a list of output files for this example.

2025.pdf that includes 3 files: Edgar Alan Poe_Summary_2025.pdf, Edgar Alan Poe_DataSheet_2025.pdf, Edgar Alan Poe_Details_2025.pdf
2024.pdf that includes 3 files: Edgar Alan Poe_Summary_2024.pdf, Edgar Alan Poe_DataSheet_2024.pdf, Edgar Alan Poe_Details_2024.pdf
2023.pdf that includes 3 files: Edgar Alan Poe_Summary_2023.pdf, Edgar Alan Poe_DataSheet_2023.pdf, Edgar Alan Poe_Details_2023.pdf
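
Because $ anchors the match to the end of the text, this pattern only works if the .pdf extension is not part of the compared text. The Python sketch below illustrates that point and groups the files by year. It assumes, based on the output names above, that the extension is removed before matching; the variable names are illustrative only.

```python
import re
from collections import defaultdict
from pathlib import Path

# Text pattern from this example: one or more digits at the end of the name.
PATTERN = re.compile(r"\d+$")

files = [
    "Edgar Alan Poe_Summary_2025.pdf",
    "Edgar Alan Poe_Details_2025.pdf",
    "Edgar Alan Poe_Summary_2024.pdf",
    "Edgar Alan Poe_DataSheet_2023.pdf",
]

# With the extension included, the name ends in "pdf", so nothing matches.
with_extension = PATTERN.search(files[0])

by_year = defaultdict(list)
for name in files:
    match = PATTERN.search(Path(name).stem)  # assumption: extension removed
    if match:
        by_year[match.group(0)].append(name)

print(with_extension)
for year, members in by_year.items():
    print(f"{year}.pdf <- {', '.join(members)}")
```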

Example 4: Advanced Text Comparison and File Naming with Custom Format

The following example shows how to merge files based on several parts of the filename while ignoring the rest. It combines documents for multiple persons based on name and year, producing a separate report file for each person and year. A custom format is used so that only a portion of the matching text takes part in the file comparison.
Use the following text pattern:

^([^_]+)[^\d]+(\d+)$(?#format=Report for \1 \2)

Explanation of the pattern:
^([^_]+) - matches all text from the start of the filename up to the first underscore. Note the (...) brackets around this part of the regex. They create a matching group that can be referred to later from a custom format as \1.
[^\d]+ - matches one or more characters that are not digits.
(\d+)$ - matches one or more digits at the end of the filename ($) and stores them in a second matching group, \2.
(?#format=Report for \1 \2) - custom format expression that combines the literal text "Report for", the first matching group \1 (text up to the first underscore), a space, and the second matching group \2 (sequence of digits at the end of the filename). The text created by the custom format is used both to compare filenames and to name output files.

Here is a list of input files for this example. The text used for comparing filenames is the name before the first underscore together with the digits at the end of the filename:

Edgar Alan Poe_Summary_2025.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_2025.pdf
Edgar Alan Poe_Summary_2024.pdf
Edgar Alan Poe_DataSheet_2024.pdf
Edgar Alan Poe_Details_2024.pdf
Mark Twain_Summary_2025.pdf
Mark Twain_DataSheet_2025.pdf
Mark Twain_Details_2025.pdf
Mark Twain_Summary_2024.pdf
Mark Twain_DataSheet_2024.pdf
Mark Twain_Details_2024.pdf

Here is a list of output files for this example.

Report for Edgar Alan Poe 2025.pdf that includes 3 files: Edgar Alan Poe_Summary_2025.pdf, Edgar Alan Poe_DataSheet_2025.pdf, Edgar Alan Poe_Details_2025.pdf
Report for Edgar Alan Poe 2024.pdf that includes 3 files: Edgar Alan Poe_Summary_2024.pdf, Edgar Alan Poe_DataSheet_2024.pdf, Edgar Alan Poe_Details_2024.pdf
Report for Mark Twain 2025.pdf that includes 3 files: Mark Twain_Summary_2025.pdf, Mark Twain_DataSheet_2025.pdf, Mark Twain_Details_2025.pdf
Report for Mark Twain 2024.pdf that includes 3 files: Mark Twain_Summary_2024.pdf, Mark Twain_DataSheet_2024.pdf, Mark Twain_Details_2024.pdf
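
The custom format can be emulated in Python to preview the resulting output names. Note that (?#format=...) is an AutoSplit convention; in Python's re module a (?#...) construct is only a comment and has no effect, so this sketch separates the format text from the regex by hand and applies it with Match.expand. The helpers split_pattern and merge_key are hypothetical, and the sketch again assumes that the extension is removed before matching.

```python
import re
from collections import defaultdict
from pathlib import Path

AUTOSPLIT_PATTERN = r"^([^_]+)[^\d]+(\d+)$(?#format=Report for \1 \2)"

def split_pattern(pattern):
    """Split a pattern into (regex, format); format is None if absent (hypothetical)."""
    m = re.search(r"\(\?#format=(.*)\)$", pattern)
    if m is None:
        return pattern, None
    return pattern[:m.start()], m.group(1)

def merge_key(filename, regex, fmt):
    """Build the text used for comparison and output naming (hypothetical)."""
    match = re.search(regex, Path(filename).stem)  # assumption: no extension
    if match is None:
        return None
    # expand() substitutes \1 and \2 with the captured groups.
    return match.expand(fmt) if fmt else match.group(0)

regex, fmt = split_pattern(AUTOSPLIT_PATTERN)

files = [
    "Edgar Alan Poe_Summary_2025.pdf",
    "Edgar Alan Poe_Details_2025.pdf",
    "Edgar Alan Poe_Summary_2024.pdf",
    "Mark Twain_DataSheet_2024.pdf",
    "Mark Twain_Details_2024.pdf",
]

reports = defaultdict(list)
for name in files:
    key = merge_key(name, regex, fmt)
    if key is not None:
        reports[key].append(name)

for key, members in reports.items():
    print(f"{key}.pdf <- {', '.join(members)}")
```

Files are grouped only when both the name and the year agree, because both captured groups are part of the formatted text.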