23.2.2016
Petr Bravenec has recently introduced the application GeoSign, which aims to facilitate the work of Czech surveyors. GeoSign needs to manipulate PDF documents. This manipulation is handled by PDF Manipulation Utility (PDFMU), a utility we created for this purpose. We are releasing the source code of PDFMU under the AGPL license. The current version of the program is 1.0.
PDFMU can perform various operations on PDF documents, for example adding digital signatures or attachments. It uses a command line interface, so it is suitable for batch processing. Its standard Windows name is pdfmu (or pdfmu.exe). I will use this name in the following sections to denote the PDFMU executable.
Since it is implemented in Java, it is necessary to have JRE (version 7 or newer) installed in order to run PDFMU.
PDFMU supports the following operations on PDF documents:
update-version
: changes the PDF version
update-properties
: changes the properties
attach
: attaches a file
sign
: adds a digital signature
inspect
: displays PDF version, properties and digital signatures
In the following sections I will describe these operations.
update-version
Every PDF document contains information about the version of PDF used to encode the document. The PDF versions released so far are backwards compatible, so it is possible to raise the PDF version of a document without damaging the document's consistency. However, changing the version of a document using PDFMU invalidates all the embedded digital signatures, so it is recommended to change the version (if necessary) before signing the document.
pdfmu update-version document.pdf --force --version=1.6
: changes the PDF version of the document document.pdf to the value 1.6
PDFMU supports PDF versions 1.2 through 1.7.
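A script that drives PDFMU could check the requested version against the supported range before calling the tool. The following Python sketch only builds the argument list shown in the example above; the helper name update_version_command and the decision to reject unsupported versions up front are assumptions made for illustration, not part of PDFMU.

```python
import shlex

# PDF versions the article says PDFMU supports (1.2 through 1.7).
SUPPORTED_VERSIONS = ("1.2", "1.3", "1.4", "1.5", "1.6", "1.7")


def update_version_command(pdf_path, version, executable="pdfmu"):
    """Build the argument list for `pdfmu update-version` (hypothetical helper)."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported PDF version: {version!r}")
    # Mirrors the documented call:
    #   pdfmu update-version document.pdf --force --version=1.6
    return [executable, "update-version", pdf_path, "--force", f"--version={version}"]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without PDFMU.
    print(shlex.join(update_version_command("document.pdf", "1.6")))
```

The returned list can be handed to subprocess.run on a machine where pdfmu is installed. Because changing the version invalidates embedded signatures, such a call belongs before any signing step.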
update-properties
PDF documents contain so-called properties. Every property in a document has a unique name and a value. PDFMU can set and remove any property in a PDF document with the exception of the properties Producer and ModDate.
Some of the properties have special support in the PDF specification; we call them standard properties. The operation pdfmu update-properties has special options to set the standard properties:
Internal name | English name | Description | PDFMU option |
---|---|---|---|
Title | Title | | --Title |
Author | Author | The name of the person who created the document. | --Author |
Subject | Subject | | --Subject |
Keywords | Keywords | | --Keywords |
Creator | Application | The name of the product that created the document in the original format. | --Creator |
Producer | PDF Producer | The name of the product that converted the document from the original format to PDF. | |
CreationDate | Created | The date and time the document was created. | --CreationDate |
ModDate | Modified | The date and time the document was most recently modified. | |
Trapped | | Has the document been modified to include trapping information? | --Trapped |
PDFMU sets the properties Producer and ModDate (date and time of last modification) automatically.
pdfmu update-properties document.pdf --force --Title="My document"
: sets the property Title in the document document.pdf to the value "My document"
pdfmu update-properties document.pdf --force --set "document owner" "Al Pine"
: sets the property "document owner" of the document document.pdf to the value "Al Pine"
pdfmu update-properties document.pdf --force --clear "document owner"
: removes the property "document owner" from the document document.pdf
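The choice between a dedicated option and the generic --set form can be made mechanically. The Python sketch below picks the option from the property name and refuses the two properties PDFMU manages itself. The helper names are hypothetical, the value formats expected by --CreationDate and --Trapped are not covered here, and using --clear with a standard property name is an assumption based on the statement that any other property can be removed.

```python
import shlex

# Standard properties that have a dedicated PDFMU option, per the table above.
STANDARD_PROPERTIES = ("Title", "Author", "Subject", "Keywords",
                       "Creator", "CreationDate", "Trapped")
# Properties PDFMU sets automatically; the user cannot change them.
READ_ONLY_PROPERTIES = ("Producer", "ModDate")


def _check_writable(name):
    if name in READ_ONLY_PROPERTIES:
        raise ValueError(f"property {name!r} is set by PDFMU automatically")


def set_property_command(pdf_path, name, value, executable="pdfmu"):
    """Argument list that sets one property (hypothetical helper)."""
    _check_writable(name)
    base = [executable, "update-properties", pdf_path, "--force"]
    if name in STANDARD_PROPERTIES:
        # Documented form: --Title="My document"
        return base + [f"--{name}={value}"]
    # Documented form: --set "document owner" "Al Pine"
    return base + ["--set", name, value]


def clear_property_command(pdf_path, name, executable="pdfmu"):
    """Argument list that removes one property (hypothetical helper)."""
    _check_writable(name)
    # Documented form: --clear "document owner"
    return [executable, "update-properties", pdf_path, "--force", "--clear", name]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(set_property_command("document.pdf", "Title", "My document")))
    print(shlex.join(clear_property_command("document.pdf", "document owner")))
```

Building a list of arguments rather than a single string avoids quoting problems with values that contain spaces, such as "My document".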
attach
A PDF document may contain other files as attachments. Each attachment has a name and a description. A call to pdfmu attach attaches one file and optionally sets its name (option --rename) and description (option --description).
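Since one call attaches one file, attaching several files to the same document means several calls. The Python sketch below builds one argument list per attachment; the helper name attach_commands and the tuple layout are assumptions for illustration, and the --force option is simply copied from this article's examples.

```python
import shlex


def attach_commands(pdf_path, attachments, executable="pdfmu"):
    """Build one `pdfmu attach` argument list per attachment (hypothetical helper).

    `attachments` is a list of (file_path, name, description) tuples; name and
    description may be None, in which case the option is left out.
    """
    commands = []
    for file_path, name, description in attachments:
        command = [executable, "attach", pdf_path, file_path, "--force"]
        if name is not None:
            command.append(f"--rename={name}")
        if description is not None:
            command.append(f"--description={description}")
        commands.append(command)
    return commands


if __name__ == "__main__":
    plan = [
        ("attachment.txt", None, None),
        ("attachment.txt", "my file.txt", "Important information"),
    ]
    # Print the commands instead of running them.
    for command in attach_commands("document.pdf", plan):
        print(shlex.join(command))
```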
pdfmu attach document.pdf attachment.txt --force
: attaches the file attachment.txt to the document document.pdf (under the name attachment.txt)
pdfmu attach document.pdf attachment.txt --force --rename="my file.txt" --description="Important information"
: attaches the file attachment.txt to the document document.pdf under the name my file.txt with the description "Important information"
sign
Adding digital signatures is the most complex operation of PDFMU with respect to both the user interface and the implementation. For a better understanding of digital signatures in PDF documents I recommend the book Digital signatures for PDF documents. In the following paragraphs I will attempt to summarize the basic knowledge necessary to work with digital signatures in PDF.
A digital signature of a file is information that is tied to the content of the file and to the identity of the person or institution that created the signature. A signature allows us to verify that the file has not changed since the moment of signing and that it was really signed by whoever claims to be the signer. Signatures are usually used to confirm the authorship or authenticity of a document.
A timestamp of a file is information that allows us to verify that the file already existed in its current form at the time recorded in the timestamp.
A timestamp authority is a server that issues timestamps. If we trust the server, we can also trust the (valid) timestamps issued by this server.
An internal digital signature (or timestamp) of a file is a digital signature (or timestamp) embedded in the file.
A PDF document may contain one or more internal digital signatures. Every signature contains its signing time. If the signing time is recorded in the form of a timestamp, it can be verified.
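In PDFMU these concepts correspond to a handful of options of the sign operation: a keystore that holds the private key and certificate, and optionally a timestamp authority. The Python sketch below assembles such a call using only options that appear in this article's examples; the helper name sign_command and its parameter names are hypothetical, and the sketch does not verify that a given combination of options is accepted by PDFMU.

```python
import shlex


def sign_command(pdf_path, keystore_type, keystore=None, key_alias=None,
                 tsa_url=None, tsa_username=None, tsa_password=None,
                 executable="pdfmu"):
    """Build the argument list for `pdfmu sign` (hypothetical helper)."""
    command = [executable, "sign", pdf_path, "--force",
               f"--keystore-type={keystore_type}"]
    if keystore is not None:
        # Keystore file, e.g. a PKCS #12 file such as cert.p12.
        command.append(f"--keystore={keystore}")
    if key_alias is not None:
        # Name of the key inside the keystore, e.g. a Windows friendly name.
        command.append(f"--key-alias={key_alias}")
    if tsa_url is not None:
        # With a timestamp authority, the signing time becomes verifiable.
        command.append(f"--tsa-url={tsa_url}")
        if tsa_username is not None:
            command.append(f"--tsa-username={tsa_username}")
        if tsa_password is not None:
            command.append(f"--tsa-password={tsa_password}")
    return command


if __name__ == "__main__":
    # Print the command instead of running it.
    print(shlex.join(sign_command("document.pdf", "pkcs12", keystore="cert.p12",
                                  tsa_url="http://example.com/tsa")))
```

Credentials are ignored when no timestamp authority URL is given, which is a design choice of this sketch: a username without a server to send it to has no meaning.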
pdfmu sign document.pdf --force --keystore-type=pkcs12 --keystore=cert.p12
: signs the document document.pdf using the private key and certificate saved in the file cert.p12
pdfmu sign document.pdf --force --keystore-type=pkcs12 --keystore=cert.p12 --tsa-url="http://example.com/tsa" --tsa-username=name --tsa-password=password
: same as the previous example, and additionally adds a timestamp from the timestamp authority http://example.com/tsa, using the username name and the password password for authorization
pdfmu sign document.pdf --force --keystore-type=Windows-MY --key-alias="Al Pine"
: signs the document document.pdf using the private key and certificate saved in the Windows certificate store under the friendly name "Al Pine"
inspect
The operation inspect prints out the PDF version, properties and digital signatures of a document, that is, all the information that can be modified using the operations update-version, update-properties and sign.
pdfmu inspect document.pdf
: prints out the information about the document document.pdf
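Because inspect does not modify the document, it is a natural candidate for batch processing. The Python sketch below collects the PDF files in a directory and builds one inspect call per file; the helper name inspect_commands and the rule of matching files by the .pdf extension are assumptions made for illustration.

```python
import shlex
from pathlib import Path


def inspect_commands(directory, executable="pdfmu"):
    """One `pdfmu inspect` argument list per PDF file in `directory`.

    Hypothetical helper: files are matched by a case-insensitive .pdf
    extension and returned in sorted order.
    """
    pdfs = sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
    return [[executable, "inspect", str(path)] for path in pdfs]


if __name__ == "__main__":
    # Print the commands for the current directory instead of running them.
    for command in inspect_commands("."):
        print(shlex.join(command))
```

On a machine where pdfmu is on the PATH, each list can be passed to subprocess.run(command, check=True) to execute the inspection.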
Besides the options shown in the examples above, PDFMU offers many more, for example printing in JSON format or timestamp authority authorization using a certificate. You can display all the supported options using the option --help. Calling pdfmu --help displays the basic options, and calling, for example, pdfmu sign --help displays the options specific to the operation sign.
To try PDFMU, head to the GitHub repository, download the source code and build the program using Maven. You can find more detailed instructions in the README.
PDFMU is written in Java. The core functionality of each of the supported operations is implemented in the library iText; PDFMU essentially exposes a part of the iText API through a command line interface.
Other than that, PDFMU uses the library Argparse4j for parsing command line arguments and the library Jackson for formatting the output in JSON.
The source code of PDFMU is freely available in the GitHub repository.