Merge PDF Files With Similar Filenames Using Text Patterns
AutoSplit plug-in for Adobe® Acrobat®
- Overview
- Text patterns (regular expressions) can be used to match a specific sequence of characters in a filename and use it to compare files. Files whose matched text is identical are merged into a single output file. Only the text that matches the pattern is used to compare filenames and to decide whether files are merged into a single document. This functionality is available starting with AutoSplit version 6.8.5 via the Plug-ins > Merge Documents > Merge Multiple Documents With Similar Filenames menu in Adobe Acrobat.
- Regular Expressions
- Regular expression (or regex for short) syntax is used to define text patterns. A regex is a string of text that describes a pattern used to match, locate, and manage text. Regular expressions are supported by most text processing applications and programming languages.
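To make the idea of comparing filenames by matched text concrete, here is a minimal Python sketch. It is not the plug-in's implementation: the function name group_by_pattern is hypothetical, and the sketch assumes that the file extension is ignored, that the first match in the name is used, and that files without a match are left out of every group.

```python
import re
from collections import defaultdict
from pathlib import PurePath


def group_by_pattern(filenames, pattern):
    """Group filenames by the text that `pattern` matches in each name."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for name in filenames:
        stem = PurePath(name).stem  # assumption: the extension is ignored
        match = regex.search(stem)  # assumption: the first match is used
        if match:  # assumption: files without a match are skipped
            groups[match.group(0)].append(name)
    return dict(groups)


files = ["A_1.pdf", "A_2.pdf", "B_1.pdf"]
print(group_by_pattern(files, r"^[^_]+"))
# {'A': ['A_1.pdf', 'A_2.pdf'], 'B': ['B_1.pdf']}
```

Each dictionary key corresponds to one merged output file, and the list under it holds the input files that would go into that file.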
- Configuring Settings
- Select the Compare filenames using the text pattern option and type the regular expression into the text box in the Merge Files with Similar Filenames dialog.
- Examples of Text Patterns
- Example 1: Comparing Text to the Left of Underscore
The following example shows how to merge files using only the text to the left of the underscore in the filename. Use the following text pattern:
^[^_]+
Explanation of the pattern:
^ - indicates that matching starts at the beginning of the text/filename.
[^_]+ - matches one or more characters that are NOT an underscore.
Here is a list of input files for this example. The text used for comparing filenames is the part to the left of the underscore:
Edgar Alan Poe_Summary.pdf
Edgar Alan Poe_DataSheet.pdf
Edgar Alan Poe_Details.pdf
Mark Twain_Summary.pdf
Mark Twain_DataSheet.pdf
Mark Twain_Details.pdf
Charles Dickens_Summary.pdf
Charles Dickens_DataSheet.pdf
Charles Dickens_Details.pdf
Charles Dickens_Attachments.pdf
Here is a list of output files for this example:
- Edgar Alan Poe.pdf that includes 3 files: Edgar Alan Poe_Summary.pdf, Edgar Alan Poe_DataSheet.pdf, Edgar Alan Poe_Details.pdf.
- Mark Twain.pdf that includes 3 files: Mark Twain_Summary.pdf, Mark Twain_DataSheet.pdf, Mark Twain_Details.pdf.
- Charles Dickens.pdf that includes 4 files: Charles Dickens_Summary.pdf, Charles Dickens_DataSheet.pdf, Charles Dickens_Details.pdf, Charles Dickens_Attachments.pdf.
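The pattern from Example 1 can be tried outside Acrobat with a short Python sketch. It only illustrates what the regular expression matches and assumes the pattern is applied to the filename without its extension.

```python
import re
from pathlib import PurePath

PATTERN = re.compile(r"^[^_]+")

names = [
    "Edgar Alan Poe_Summary.pdf",
    "Mark Twain_DataSheet.pdf",
    "Charles Dickens_Attachments.pdf",
]

# Assumption: the ".pdf" extension is dropped before matching.
keys = [PATTERN.search(PurePath(n).stem).group(0) for n in names]
print(keys)
# ['Edgar Alan Poe', 'Mark Twain', 'Charles Dickens']
```

Files that produce the same key end up in the same output document.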
- Example 2: Comparing Text to the Right of Underscore
The following example shows how to merge files using only the text to the right of the first underscore in the filename. If the filename contains a second underscore, only the text between the first and second underscores is used. If the filename contains only one underscore, all text to the right of it is used for comparison. Use the following text pattern:
(?<=_)[^_]+
Explanation of the pattern:
(?<=_) - indicates that matching starts immediately after an underscore, without including the underscore in the match.
[^_]+ - matches one or more characters that are NOT an underscore.
Here is a list of input files for this example. The text used for comparing filenames is the part that follows the first underscore, up to the next underscore if there is one:
Edgar Alan Poe_Summary.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_029049.pdf
Mark Twain_Summary.pdf
Mark Twain_DataSheet.pdf
Mark Twain_Details_23231.pdf
Charles Dickens_Summary.pdf
Charles Dickens_DataSheet_2025.pdf
Charles Dickens_Details_440029.pdf
Here is a list of output files for this example. Note that only the text in the middle of the filename is used to merge the files, which produces a very different grouping of the documents than in Example 1.
- Summary.pdf that includes 3 files: Edgar Alan Poe_Summary.pdf, Mark Twain_Summary.pdf, Charles Dickens_Summary.pdf
- DataSheet.pdf that includes 3 files: Edgar Alan Poe_DataSheet_2025.pdf, Mark Twain_DataSheet.pdf, Charles Dickens_DataSheet_2025.pdf
- Details.pdf that includes 3 files: Edgar Alan Poe_Details_029049.pdf, Mark Twain_Details_23231.pdf, Charles Dickens_Details_440029.pdf
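The following Python sketch illustrates why only the text between the first and second underscores is used in Example 2: a regex search returns the first match, even though the pattern could also match later in the name. It assumes the plug-in uses the first match, which is consistent with the behavior described above, and that the extension is not part of the matched text.

```python
import re

PATTERN = re.compile(r"(?<=_)[^_]+")

stem = "Edgar Alan Poe_Details_029049"  # filename without ".pdf"

first = PATTERN.search(stem).group(0)  # first match only
every = PATTERN.findall(stem)          # all non-overlapping matches

print(first)  # Details
print(every)  # ['Details', '029049']
```

For a name with a single underscore, such as Edgar Alan Poe_Summary, the first match is simply everything after the underscore.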
- Example 3: Comparing Digits at the End of the Filename
The following example shows how to merge files using only the sequence of digits at the end of the filename. Use the following text pattern:
\d+$
Explanation of the pattern:
\d+ - matches one or more digits.
$ - indicates that the match must end at the end of the text string/filename.
Here is a list of input files for this example. The text used for comparing filenames is the sequence of digits at the end of the filename:
Edgar Alan Poe_Summary_2025.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_2025.pdf
Edgar Alan Poe_Summary_2024.pdf
Edgar Alan Poe_DataSheet_2024.pdf
Edgar Alan Poe_Details_2024.pdf
Edgar Alan Poe_Summary_2023.pdf
Edgar Alan Poe_DataSheet_2023.pdf
Edgar Alan Poe_Details_2023.pdf
Here is a list of output files for this example.
- 2025.pdf that includes 3 files: Edgar Alan Poe_Summary_2025.pdf, Edgar Alan Poe_DataSheet_2025.pdf, Edgar Alan Poe_Details_2025.pdf
- 2024.pdf that includes 3 files: Edgar Alan Poe_Summary_2024.pdf, Edgar Alan Poe_DataSheet_2024.pdf, Edgar Alan Poe_Details_2024.pdf
- 2023.pdf that includes 3 files: Edgar Alan Poe_Summary_2023.pdf, Edgar Alan Poe_DataSheet_2023.pdf, Edgar Alan Poe_Details_2023.pdf
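The output of Example 3 suggests that the pattern is compared against the filename without its ".pdf" extension, since the digits are not literally the last characters of the full name. This is an inference from the results, not a documented fact. A small Python sketch shows the difference:

```python
import re
from pathlib import PurePath

PATTERN = re.compile(r"\d+$")

name = "Edgar Alan Poe_Summary_2025.pdf"

with_ext = PATTERN.search(name)                    # full name ends in "f"
without_ext = PATTERN.search(PurePath(name).stem)  # name without ".pdf"

print(with_ext)              # None
print(without_ext.group(0))  # 2025
```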
- Example 4: Advanced Text Comparison and File Naming with Custom Format
The following example shows how to merge files based on several parts of the filename, using some of the text and excluding the rest. It combines documents for multiple persons based on name and year, producing a separate report file for each person and year. A custom format is used so that only part of the matched text takes part in the file comparison. Use the following text pattern:
^([^_]+)[^\d]+(\d+)$(?#format=Report for \1 \2)
Explanation of the pattern:
^([^_]+) - matches all text from the start of the filename up to the first underscore. Note the (...) parentheses around this part of the regex: they create a matching group that the custom format can refer to later as \1.
[^\d]+ - matches one or more characters that are not digits.
(\d+)$ - matches one or more digits at the end of the filename ($) and captures them as the second matching group.
(?#format=Report for \1 \2) - custom format expression that combines the literal text "Report for", the first matching group \1 (text before the first underscore), a space, and the second matching group \2 (sequence of digits at the end of the filename). The text produced by the custom format is used to compare filenames and to name the output files.
Here is a list of input files for this example. The text used for comparing filenames is the name before the first underscore and the digits at the end:
Edgar Alan Poe_Summary_2025.pdf
Edgar Alan Poe_DataSheet_2025.pdf
Edgar Alan Poe_Details_2025.pdf
Edgar Alan Poe_Summary_2024.pdf
Edgar Alan Poe_DataSheet_2024.pdf
Edgar Alan Poe_Details_2024.pdf
Mark Twain_Summary_2025.pdf
Mark Twain_DataSheet_2025.pdf
Mark Twain_Details_2025.pdf
Mark Twain_Summary_2024.pdf
Mark Twain_DataSheet_2024.pdf
Mark Twain_Details_2024.pdf
Here is a list of output files for this example.
- Report for Edgar Alan Poe 2025.pdf that includes 3 files: Edgar Alan Poe_Summary_2025.pdf, Edgar Alan Poe_DataSheet_2025.pdf, Edgar Alan Poe_Details_2025.pdf
- Report for Edgar Alan Poe 2024.pdf that includes 3 files: Edgar Alan Poe_Summary_2024.pdf, Edgar Alan Poe_DataSheet_2024.pdf, Edgar Alan Poe_Details_2024.pdf
- Report for Mark Twain 2025.pdf that includes 3 files: Mark Twain_Summary_2025.pdf, Mark Twain_DataSheet_2025.pdf, Mark Twain_Details_2025.pdf
- Report for Mark Twain 2024.pdf that includes 3 files: Mark Twain_Summary_2024.pdf, Mark Twain_DataSheet_2024.pdf, Mark Twain_Details_2024.pdf
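The custom format in Example 4 can be imitated in Python as a rough sketch. The (?#format=...) part is an AutoSplit convention. Python treats (?#...) as an ordinary regex comment and ignores it, so the sketch extracts the format text itself. The helper name comparison_key and the extraction regex FORMAT_RE are hypothetical, and the extension is assumed to be ignored.

```python
import re
from pathlib import PurePath

PATTERN = r"^([^_]+)[^\d]+(\d+)$(?#format=Report for \1 \2)"

# Hypothetical helper regex: pulls the text out of the (?#format=...) comment.
FORMAT_RE = re.compile(r"\(\?#format=([^)]*)\)")


def comparison_key(filename, pattern=PATTERN):
    """Return the text used to compare files, or None if nothing matches."""
    fmt = FORMAT_RE.search(pattern)
    match = re.search(pattern, PurePath(filename).stem)
    if match is None:
        return None
    # With a format, substitute \1 and \2; otherwise use the whole match.
    return match.expand(fmt.group(1)) if fmt else match.group(0)


print(comparison_key("Mark Twain_Details_2024.pdf"))
# Report for Mark Twain 2024
```

Because the middle part of the name (Summary, DataSheet, Details) is not captured by any group, it does not appear in the key, so all three documents for the same person and year share one key.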