Merging Multiple PDF Files Using Control File
Introduction
The AutoSplit plug-in for Adobe Acrobat provides a flexible way of merging multiple PDF and non-PDF files (such as JPEG, TIFF, MS Word and other supported formats) into multiple PDF documents. Use the “Plug-ins > Merge Documents > Merge Multiple Documents Using Control File…” menu to merge one or more files into one or more PDF documents using a special control file. The merge control file is a plain text document that contains instructions on which documents to merge (combine) and what options to use. Use any plain text editor (such as Notepad) to create this file. At a minimum, the control file should specify the input folder(s), the output folder, and a list of documents to merge into at least one output PDF file. There is no limit on the number of output files that can be produced using this method.
What is a Merge Control File?
The control file is a collection of keywords (used to define processing options) and file names. For example, the following control file produces 3 output documents (First.pdf, Second.pdf, Third.pdf) by merging 9 different files from the c:\data2\input folder:
inputfolder=c:\data2\input
outputfolder=c:\data2\output
File1.pdf,File2.pdf, File3.pdf,>First.pdf
File4.pdf,File5.pdf,File6.pdf,>Second.pdf
File7.pdf,File8.pdf,File9.pdf,>Third.pdf
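When many output documents are needed, a control file like the one above can be generated by a script instead of typed by hand. The following is a minimal Python sketch, not part of AutoSplit: the function name build_control_file and the job list are hypothetical, and only the inputfolder=, outputfolder= and > syntax shown above is used.

```python
def build_control_file(input_folder, output_folder, jobs):
    """Return control-file text for a list of (input_files, output_name) jobs."""
    lines = [f"inputfolder={input_folder}", f"outputfolder={output_folder}"]
    for input_files, output_name in jobs:
        # One output document per line: inputs first, then ">" plus the output name.
        lines.append(",".join(input_files) + ",>" + output_name)
    return "\n".join(lines) + "\n"


text = build_control_file(
    r"c:\data2\input",
    r"c:\data2\output",
    [
        (["File1.pdf", "File2.pdf", "File3.pdf"], "First.pdf"),
        (["File4.pdf", "File5.pdf", "File6.pdf"], "Second.pdf"),
    ],
)
print(text)
```

The resulting text can be saved with any file name and opened from the merge menu like a hand-written control file.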
Input and Output Folders
Each control file should contain at least one inputfolder= and one outputfolder= instruction at the beginning of the file. You can use more than one inputfolder= or outputfolder= keyword to set the required input and output folders for different parts of the job. For example, you can put a separate inputfolder= keyword before each merging line to designate a different input folder:
// Enter comments or description here
inputfolder=c:\data2\input
outputfolder=c:\data2\output
File1.pdf,File2.pdf, File3.pdf,>First.pdf
inputfolder=c:\data2\input2
File4.pdf,File5.pdf,File6.pdf,>Second.pdf

Merging Files - Single Line Mode
Each output file is defined on a single text line that consists of a comma-separated list of input filenames followed by the output filename. The first line of the following example defines an output file Output1.pdf that is created by merging 3 input files: File1.pdf, File2.pdf and File3.pdf. If a file extension is omitted, the .pdf extension is assumed and added automatically.
File1.pdf,File2.pdf,File3.pdf,>Output1.pdf
DocA.pdf,DocB.pdf,DocC.pdf,>Output2.pdf
Merging Files - Multiple Line Mode
Sometimes, entering a long list of files on a single line makes the control file hard to read. Use the <begindoc> and <enddoc> keywords to define a single output document across multiple lines. The multi-line format makes the control file a lot more manageable.
<begindoc>
File1.pdf
File2.pdf
File3.pdf
document=Output.pdf
<enddoc>
The code editor automatically displays a dashed horizontal line after the <begindoc> and <enddoc> keywords when they appear immediately at the beginning of a line. The lines help to visually separate different output file definitions. You can suppress the line separator by entering an extra space before the keyword.
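The multi-line format is also easy to produce from a script. The sketch below is an illustration in Python under the assumption that a block consists only of input file names and a document= line, as in the example above; the helper name multiline_block is hypothetical.

```python
def multiline_block(input_files, output_name):
    """Return the lines of one <begindoc>...<enddoc> output definition."""
    lines = ["<begindoc>"]              # keyword starts at column 1, no leading space
    lines.extend(input_files)           # one input file per line
    lines.append(f"document={output_name}")
    lines.append("<enddoc>")
    return lines


print("\n".join(multiline_block(["File1.pdf", "File2.pdf", "File3.pdf"], "Output.pdf")))
```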
Specifying Input Files
The input files can be specified either by listing them "as-is" in a comma-separated list (for example File1.pdf,File2.pdf,File3.pdf) or by using the filename= and filepath= keywords. Use the filename= keyword to specify the name of a file that is located in the currently selected input folder (specified by the inputfolder= keyword). Only the filename, without any path, should appear in the value of this keyword:
filename=File1.pdf
filename=File2.pdf
Use the filepath= keyword to specify a full path to the input file. The input folder location is ignored by this keyword, so you have to provide a complete path to the file:
filepath=c:\Data\File1.pdf
filepath=c:\Data\File2.pdf
Skip Missing Files
Use the skipmissing=yes and skipmissing=no keywords to control how missing files are handled. Sometimes it is necessary to designate some files as optional. If skipmissing=no is used (this is the default value), the merge operation is not performed if any of the input files is missing. If skipmissing=yes is set, missing files are ignored and do not stop the merge operation from executing.
The sample code below shows how to designate some input files as optional.
filepath=c:\Data\File1.pdf
skipmissing=yes
filepath=c:\Data\OptionalDoc1.pdf
filepath=c:\Data\OptionalDoc2.pdf
skipmissing=no
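The same pattern can be generated from two lists, one of required files and one of optional files. This is a small Python sketch, assuming (as the sample above suggests) that skipmissing= applies to the files listed after it until the next skipmissing= line; the helper name input_lines is hypothetical.

```python
def input_lines(required, optional):
    """Return filepath= lines, wrapping the optional files in skipmissing=yes/no."""
    lines = [f"filepath={path}" for path in required]
    if optional:
        lines.append("skipmissing=yes")   # files listed from here on may be absent
        lines.extend(f"filepath={path}" for path in optional)
        lines.append("skipmissing=no")    # restore the default strict behavior
    return lines


for line in input_lines(
    [r"c:\Data\File1.pdf"],
    [r"c:\Data\OptionalDoc1.pdf", r"c:\Data\OptionalDoc2.pdf"],
):
    print(line)
```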
Defining Output Filenames
There are two ways to designate an output file name for the merged file:
  1. By using document= keyword
  2. By using > symbol
The > symbol should appear immediately in front of the output file name. The document= keyword takes the output file name as its value.
File1.pdf,File2.pdf,>Output1.pdf
File3.pdf,File4.pdf,document=Output2.pdf
If the output file name definition is omitted, the output file is named after the first file in the input file list. The following instruction produces File1.pdf in the output folder by merging File1.pdf, File2.pdf and File3.pdf from the input folder:
File1.pdf,File2.pdf,File3.pdf
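The naming rules in this section can be summarized in a few lines of Python. This is only a sketch of the rules as described here, not the plug-in's actual parser: it handles plain file names, the > symbol and the document= keyword, and it treats every other entry as an input file. The function name parse_merge_line is hypothetical.

```python
def parse_merge_line(line):
    """Split one single-line merge instruction into (input_files, output_name)."""
    inputs, output = [], None
    for item in (part.strip() for part in line.split(",")):
        if not item:
            continue
        if item.startswith(">"):
            output = item[1:].strip()
        elif item.lower().startswith("document="):
            output = item[len("document="):].strip()
        else:
            inputs.append(item)
    if output is None and inputs:
        # No explicit output name: fall back to the name of the first input file.
        output = inputs[0]
    return inputs, output


print(parse_merge_line("File1.pdf,File2.pdf,>Output1.pdf"))
print(parse_merge_line("File3.pdf,File4.pdf,document=Output2.pdf"))
print(parse_merge_line("File1.pdf,File2.pdf,File3.pdf"))
```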
Merging All Files From Folder
Use *.pdf syntax to merge all files of the specified file type from the input folder:
inputfolder=c:\data2\ProjectFiles
outputfolder=c:\data2\OutputFiles
*.pdf,>ProjectFiles.pdf
Merging By Page Numbers
Here is an example of merging PDF files by page number. The following script extracts the first page from every PDF file in the input folder and puts these pages into output1.pdf, combines the second pages into output2.pdf, and combines the third pages into output3.pdf.
inputfolder=c:\data\A
outputfolder=c:\data\B
page={1},*.pdf,>output1.pdf
page={2},*.pdf,>output2.pdf
page={3},*.pdf,>output3.pdf
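For documents with many pages, the per-page lines follow a regular pattern and can be generated. The Python sketch below reproduces the three lines shown above for any page count; the helper name page_merge_lines and the output naming scheme output<N>.pdf are taken from this example only and are otherwise assumptions.

```python
def page_merge_lines(page_count, pattern="*.pdf"):
    """Return one merge line per page number, 1 through page_count."""
    return [
        f"page={{{n}}},{pattern},>output{n}.pdf"   # "{{" and "}}" emit literal braces
        for n in range(1, page_count + 1)
    ]


for line in page_merge_lines(3):
    print(line)
```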
Using Wildcards
Use the filter= keyword along with wildcards to select multiple files that match a specific naming scheme. The following instructions merge all PDF files from a folder (and all its sub-folders) whose names start with "Invoices":
inputfolder=c:\data\input
outputfolder=c:\data
subfolders=yes
filter=Invoices*.pdf,>ProjectFiles.pdf
Searching for Files in Subfolders
The following code will search for files CoverPage1.pdf and CoverPage2.pdf inside the c:\data\input folder and all its subfolders to use with the merge:
inputfolder=c:\data\input
outputfolder=c:\data
subfolders=yes
filter=CoverPage1.pdf,filter=CoverPage2.pdf,>ProjectFiles.pdf
Merging non-PDF files
The following example merges all Microsoft Word files (with *.doc and *.docx extensions) from the input folder into Report.pdf. All file types supported by Adobe Acrobat can be merged. The actual list of supported formats (for conversion to PDF) may differ depending on the Acrobat version. Use the "Edit > Preferences..." menu to review or configure format conversion settings.
inputfolder=c:\data2\ProjectFiles
outputfolder=c:\data2\OutputFiles
*.doc,*.docx,>Report.pdf
Report File
The merge process creates a report file that lists all input and output files as well as any errors encountered during processing. The report file is generated in HTML format and can be viewed in any browser.
Selecting a Page Range
Many keywords can be applied to multiple files at once if a wildcard file selection is used. Use the pagerange= keyword to specify a page range to be extracted from the input file. Only pages specified by the pagerange= keyword are included in the merged output. Here is an example of the pagerange= keyword applied to all PDF files in the input folder. The keyword selects the first 10 pages from each input document for use in the merge operation:
inputfolder=c:\data\input
outputfolder=c:\data
subfolders=yes
pagerange=1-10,*.pdf,>ProjectFiles.pdf
There is also a page= keyword for extracting just a single page from the input document. The following code will extract page 5 from File1.pdf and save it as SinglePageExtract.pdf:
<begindoc>
page={5},File1.pdf
document=SinglePageExtract.pdf
<enddoc>
Using Bookmarks to Refer to Pages
The pagerange= and page= keywords can use page labels, named destinations and bookmark names to refer to pages. The following code illustrates how to extract a page range defined by two bookmarks, "FirstPage" and "LastPage":
pagerange={b:FirstPage}-{b:LastPage},File1.pdf,>ExtractByBookmarks.pdf
It is recommended to use the {...} syntax when defining a page reference. The text inside the braces can contain any character or digit except a newline and a dash.
If "FirstPage" bookmark points to page 5, and "LastPage" bookmark points to page 8, then the above code is equivalent to extracting pages 5-8 from the File1.pdf and saving them as the ExtractByBookmarks.pdf.
Using Destinations and Page Labels
The following code shows how to use named destinations (d: prefix) and page labels (l: prefix) in the pagerange= keyword.
pagerange={d:DestinationA}-{d:DestinationB},File1.pdf,>ExtractByDestination.pdf
pagerange={l:A525}-{l:A538},File1.pdf,>ExtractByPageLabels.pdf
A page label is a custom name/alias that can be assigned to a PDF page to better reflect the logical structure of the document. Page labels can be assigned in the "Page Thumbnails" pane of Adobe Acrobat. A page label can be any combination of symbols, not only a number. For example, Roman numerals are frequently used as page labels (ii, vii, xii).
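The three reference types differ only in their prefix, so a small helper can build them consistently and reject names that contain a newline or a dash. This Python sketch is an illustration only; the names PREFIXES, page_ref and pagerange_keyword are hypothetical and are not part of AutoSplit.

```python
# Prefixes documented above: b: bookmark, d: named destination, l: page label.
PREFIXES = {"bookmark": "b:", "destination": "d:", "label": "l:"}


def page_ref(kind, name):
    """Return a {prefix:name} page reference, e.g. {b:FirstPage}."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown reference kind: {kind!r}")
    if "-" in name or "\n" in name:
        # A dash would be confused with the range separator in pagerange=.
        raise ValueError("page reference text cannot contain a dash or a newline")
    return "{" + PREFIXES[kind] + name + "}"


def pagerange_keyword(kind, first, last):
    """Return a pagerange= keyword spanning two references of the same kind."""
    return f"pagerange={page_ref(kind, first)}-{page_ref(kind, last)}"


print(pagerange_keyword("bookmark", "FirstPage", "LastPage") + ",File1.pdf,>ExtractByBookmarks.pdf")
```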
Using "Last" keyword
Use the "Last" keyword to refer to the last page in the PDF document:
pagerange={1}-{Last},File1.pdf,>ExtractByDestination.pdf
page={Last},File1.pdf,>ExtractByPageLabels.pdf
Entering Comments
Use // to enter comments. Comments are ignored during the processing and are used for adding readable annotations to the control instructions.
// Enter comments or description here
Use /// to enter comments that appear on a gray background for better visual appearance. Use this kind of comment to separate different parts of the control file.
List of Supported Keywords
Keyword: pagerange
Definition: Defines a page range to use from the next input PDF document. Format: pagerange=StartingPageNumber-EndingPageNumber. Page numbering starts from 1. Specify 0 to indicate the last page of the document. This instruction should appear before an input document entry and affects only the next input file.
Examples:
pagerange=1-2,File1.pdf
pagerange=10-0,File1.pdf

Keyword: page
Definition: Defines a single page to use from the next input PDF document. Format: page=PageNumber. Page numbering starts from 1. Specify 0 to indicate the last page of the document. This instruction should appear before an input document entry and affects only the next input file.
Examples:
page=4,File1.pdf
page=10,File1.pdf

Keyword: padtoeven
Definition: Turns on automatic padding of each input file with a blank page if the number of pages in the document is odd. Use padtoeven=yes to turn padding on and padtoeven=no to turn it off. This instruction can be used anywhere in the control file. Please note that there is no space either before or after the = symbol.
Examples:
padtoeven=yes
padtoeven=no

Keyword: extractnth
Definition: Specifies that only every Nth page of the next input document is extracted. For example, setting this value to 2 extracts pages 1, 3, 5, 7, 9 and so on. Setting this value to 3 extracts pages 1, 4, 7, 10 and so on. This value cannot be less than 1. This instruction should appear before an input document entry and affects only the next input file.
Examples:
extractnth=2,File1.pdf

Keyword: inputfolder
Definition: Defines an input folder where input files are located. This keyword is required. There should be at least one inputfolder= keyword at the beginning of the control file. This instruction can be used multiple times anywhere in the control file.
Examples:
inputfolder=C:\Data\Input

Keyword: outputfolder
Definition: Defines an output folder where merged documents are placed. This keyword is required. There should be at least one outputfolder= keyword at the beginning of the control file. This instruction can be used multiple times anywhere in the control file.
Examples:
outputfolder=C:\Data\Output

Keyword: filter
Definition: Defines a file name filter. Use the * and ? wildcard symbols to specify multiple files that match a specific file naming pattern. Can be used to search for a file inside subfolders (if the subfolders=yes keyword is set).
Examples:
filter=Invoices*.pdf
filter=*.pdf
filter=CoverPage1.pdf

Keyword: reportfile
Definition: Specifies a full path with filename for the report document. The report contains all details about input and output files, as well as any errors encountered during processing. The report file is produced in HTML format and should have the *.htm file extension. By default, if this option is not used, ReportFile.htm is created in the first output folder listed in the control file.
Examples:
reportfile=C:\Project\Reports\ProcessingLog.htm

Keyword: password
Definition: Password-protects the output file. This instruction should occur on the same line as the list of input files and defines the password used to secure the output document.
Examples:
File1.pdf,File2.pdf,password=3kf8f81$!

Keyword: bookmark
Definition: Defines a bookmark to use for a specific input file in the output document. This instruction needs to be specified before the name of the input file. By default, all sub-documents are bookmarked using the input file name.
Examples:
Bookmark=First Document,File1.pdf,Bookmark=Second Document,File2.pdf

Keyword: copybookmarks
Definition: Controls the transfer of bookmarks from input documents to the output.
Examples:
copybookmarks=yes
copybookmarks=no

Keyword: overwrite
Definition: Defines whether output files are overwritten if a file with the same name already exists in the output folder. This option is global and should be specified once per control file.
Examples:
overwrite=yes
overwrite=no

Keyword: filename
Definition: Specifies an input filename without any path. The file is located in the folder specified by the inputfolder= keyword.
Examples:
filename=File1.pdf

Keyword: filepath
Definition: Specifies a full path to the input file.
Examples:
filepath=c:\Data\Project\File1.pdf

Keyword: subfolders
Definition: Use this keyword to include files from subfolders when using file name templates such as *.pdf.
Examples:
subfolders=yes
subfolders=no

Keyword: author
Definition: Sets the "Author" metadata record for the output document. This keyword can be used multiple times. It affects all merged documents that follow the keyword. It needs to be specified on a separate line only. Do not use commas in the text of this field.
Examples:
author=Acme Consulting Inc.

Keyword: title
Definition: Sets the "Title" metadata record for the output document. This keyword can be used multiple times. It affects all merged documents that follow the keyword. It needs to be specified on a separate line only. Do not use commas in the text of this field.
Examples:
title=Account Terms And Conditions

Keyword: subject
Definition: Sets the "Subject" metadata record for the output document. This keyword can be used multiple times. It affects all merged documents that follow the keyword. It needs to be specified on a separate line only. Do not use commas in the text of this field.
Examples:
subject=Account Statement

Keyword: keywords
Definition: Sets the "Keywords" metadata record for the output document. This keyword can be used multiple times. It affects all merged documents that follow the keyword. It needs to be specified on a separate line only. Do not use commas in the text of this field.
Examples:
keywords=Keyword1 Keyword2 Keyword 3
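The page arithmetic behind the extractnth and padtoeven keywords can be checked with a short calculation. The Python sketch below models only the behavior stated in the keyword descriptions (every Nth page starting from page 1, and one blank page added to documents with an odd page count); the function names are hypothetical and the plug-in itself is not invoked.

```python
def extractnth_pages(total_pages, n):
    """Return the 1-based page numbers selected by extractnth=n."""
    if n < 1:
        raise ValueError("extractnth value cannot be less than 1")
    return list(range(1, total_pages + 1, n))


def padded_page_count(total_pages):
    """Return the page count after padtoeven=yes adds a blank page to odd documents."""
    return total_pages + (total_pages % 2)


print(extractnth_pages(10, 2))
print(extractnth_pages(10, 3))
print(padded_page_count(7))
```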
Here is an example of a control file that uses many of these keywords:
inputfolder=c:\data2\input
outputfolder=c:\data2\output
reportfile=c:\data2\ReportLog.htm
overwrite=no
padtoeven=yes
author=Acme Consulting LLC
title=Customer Account Statement
subject=Second Quarter 2013
keywords=Account Second Quarter
pagerange=1-5,File1.pdf,File2.pdf, File3.pdf,>First.pdf
bookmark=First Document,File4.pdf,bookmark=Second Document,File5.pdf,bookmark=Third Document,File6.pdf,>Second.pdf
pagerange=2-3,File7.pdf,pagerange=1-1,File8.pdf,pagerange=2-2,File9.pdf,>Third.pdf,password=ab1492t%