How To Split PDF Files From The Linux Terminal Using PDFtk

One of the best ways to split PDF files on Linux isn’t with a GUI tool like Evince or Okular. Instead, it’s with a terminal app called PDFtk. Not only can it split PDF files, it can also join and modify them.

Install PDFtk

This application has been around for a while and can be easily installed on most Linux distributions. Open up a terminal window and follow the instructions below for your distribution.

Note: the package instructions below cover Ubuntu, Debian, Arch Linux, Fedora and OpenSUSE. If you are not running any of these Linux distributions, follow the instructions under Building From Source.

Ubuntu

sudo apt install pdftk

Debian

sudo apt-get install pdftk

Arch Linux

PDFtk is usable on Arch Linux, but it can’t be installed from the official Arch Linux repositories. Instead, it has to be built from the Arch Linux AUR. To start the installation of PDFtk on Arch, open up a terminal and use the Pacman package manager to install the Git tool.

Note: there is another PDFtk package on the AUR that makes installing the program easier, as it unpacks a ready-built program, rather than building from source. We do not recommend going this route, as there are problems with the ready-built GCC-GCJ package.

sudo pacman -S git

Now that Git is installed on Arch Linux, you’ll be able to use it to download the latest PDFtk AUR snapshot. In the terminal, use git clone to download the build files.

git clone https://aur.archlinux.org/pdftk.git

Using the cd command, move the terminal from the user’s Home directory to the newly cloned pdftk folder.

cd pdftk

Inside the PDFtk sources folder, start the build by running makepkg with the -si flags. The -s flag tells makepkg to install any missing dependencies from the Arch repositories, and -i installs the finished package once it has compiled. If, however, the builder fails to grab a dependency (for example, one that only exists in the AUR), you’ll need to install it manually. All dependencies are listed on the PDFtk package’s AUR page.
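The build step can be sketched as follows. The guard around makepkg and the build_status variable are illustrative additions, not part of the AUR package; on a working Arch system, running makepkg -si on its own inside the cloned folder is enough.

```shell
# Run this from inside the cloned pdftk folder (the one containing PKGBUILD).
if [ -f PKGBUILD ] && command -v makepkg >/dev/null 2>&1; then
    # -s: install missing dependencies with pacman
    # -i: install the built package when compilation finishes
    makepkg -si
    build_status=$?
else
    echo "PKGBUILD or makepkg not found; run this inside the cloned pdftk folder on Arch Linux." >&2
    build_status=1
fi
```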

Fedora

Currently, there isn’t a Fedora PDFtk package in the software repositories. Luckily, it’s easy to get the OpenSUSE packages working. Start off by using the cd command to move the terminal to the Downloads folder.

cd ~/Downloads

Next, use wget to download the necessary packages.

wget https://ftp.gwdg.de/pub/opensuse/distribution/leap/42.3/repo/oss/suse/x86_64/pdftk-2.02-10.1.x86_64.rpm

wget https://ftp.gwdg.de/pub/opensuse/distribution/leap/42.3/repo/oss/suse/x86_64/libgcj48-4.8.5-24.14.x86_64.rpm

Lastly, use the DNF package manager to install PDFtk:

sudo dnf install libgcj48-4.8.5-24.14.x86_64.rpm pdftk-2.02-10.1.x86_64.rpm -y

OpenSUSE

sudo zypper install pdftk
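Whichever package you installed, it can help to confirm that the pdftk command is on your PATH before moving on. A minimal check, where pdftk_status is just an illustrative variable name:

```shell
# Check whether the pdftk command can be found on the PATH.
if command -v pdftk >/dev/null 2>&1; then
    pdftk_status="installed"
    # Print the installed version for reference.
    pdftk --version
else
    pdftk_status="missing"
fi
echo "pdftk is $pdftk_status"
```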

Building From Source

Building PDFtk from source doesn’t take too much effort, as there are pre-configured build files inside of the source directory. Before you start, make sure that you have GCC, GCJ, and libgcj installed on your Linux PC, as PDFtk needs them to build correctly.

To build the program, you’ll first need to download the code. Move the terminal to the Downloads folder, then use the wget downloading tool to get the source archive.

cd ~/Downloads

wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip

Next, use the unzip command to extract the PDFtk zip archive inside the Downloads folder. Don’t have unzip installed? Search your package manager for “unzip”, and install it.

unzip pdftk-2.02-src.zip

Extracting the PDFtk source zip archive should create a “pdftk-2.02-dist” folder inside Downloads (run ls to confirm the name if it differs on your system). In the terminal, use the cd command to enter it.

cd pdftk-2.02-dist

There is nothing to build in the root of the source folder. To compile anything, we need to move the terminal to the pdftk sub-folder.

cd pdftk

The pdftk sub-folder has a number of specialized Makefiles that the user can use to build for different operating systems. Using the ls command, list the contents of the directory.

ls

Look through the list, find the Makefile that matches your system, and start the build process with the command below. Please remember to replace “Makefile.filename” in the command with the name of the Makefile you need to use.

make -f Makefile.filename
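As a concrete illustration, the sketch below assumes a Debian-style system and a Makefile named Makefile.Debian. That filename is an assumption; substitute whichever Makefile the ls output shows for your platform.

```shell
# Assumed example name; replace with the Makefile that matches your system.
makefile="Makefile.Debian"

if [ -f "$makefile" ]; then
    # Build PDFtk using the chosen Makefile.
    make -f "$makefile"
else
    # Fall back to listing the Makefiles that are actually present.
    echo "$makefile not found; available Makefiles:" >&2
    ls Makefile.* 2>/dev/null || true
fi
```

If the build succeeds, the compiled pdftk binary should appear in the current folder.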

Using PDFtk

One of the main draws of PDFtk is its ability to join and split PDF files. For example, to break up a PDF file so that each page of the document is its own file, try using the burst command:

pdftk testfile.pdf burst

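PDFtk writes the split files to the directory you ran the command from. By default they are named pg_0001.pdf, pg_0002.pdf and so on, alongside a doc_data.txt file that holds the document’s metadata.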
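If you would rather keep the split pages out of the working directory, or only need part of the document, the following sketch shows two variations. The pages folder and the first-three.pdf filename are made up for illustration, and the page range assumes testfile.pdf has at least three pages.

```shell
# Illustrative names: pages/ and first-three.pdf are not created by PDFtk itself.
mkdir -p pages

if command -v pdftk >/dev/null 2>&1 && [ -f testfile.pdf ]; then
    # Write one file per page into pages/ using a printf-style name pattern.
    pdftk testfile.pdf burst output pages/pg_%04d.pdf
    # Extract only pages 1 through 3 into a single new PDF.
    pdftk testfile.pdf cat 1-3 output first-three.pdf
else
    echo "pdftk or testfile.pdf not found; nothing was split." >&2
fi
```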

Want to recombine all of the split PDF files back into one? Start out by renaming the original PDF file, both for safety and so that it isn’t picked up along with the split pages.

mv testfile.pdf testfile.bak

Now that the original PDF file is safely renamed, recombine everything with PDFtk. First, use the ls command to view the files in the directory.

ls

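Next, re-run the ls command, but this time use it to store all of the PDF filenames in a text file. Using > rather than >> overwrites the file, so running the command twice won’t list every page twice.

ls *.pdf > pdf-filenames.txt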

Assign the contents of pdf-filenames.txt to a Bash variable. Because burst zero-pads the page numbers in the filenames, the alphabetical listing from ls matches the page order, and keeping the list in a file lets you review or rearrange it before recombining.

value=$(<pdf-filenames.txt)
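To see how the stored list keeps the pages in order, here is a small sketch that uses empty placeholder files named the way burst names its output. The demo_dir folder is a throwaway temporary directory, and no real PDFs are involved.

```shell
# Create empty placeholder files that mimic burst's zero-padded names.
demo_dir=$(mktemp -d)
touch "$demo_dir/pg_0001.pdf" "$demo_dir/pg_0002.pdf" "$demo_dir/pg_0010.pdf"

# List them into a text file, then load the list into a variable.
(cd "$demo_dir" && ls *.pdf > pdf-filenames.txt)
value=$(cat "$demo_dir/pdf-filenames.txt")

# Unquoted expansion splits the list on whitespace, one filename per word.
printf '%s\n' $value

# Remove the throwaway directory.
rm -r "$demo_dir"
```

The three names print in page order, with pg_0010.pdf last. Note that this word splitting only works when the filenames contain no spaces.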

Lastly, recombine the PDF file with PDFtk and $value.

pdftk $value cat output recombined-document.pdf
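As a quick sanity check, you can compare the number of split pages with the page count of the recombined file. This sketch assumes the split files still carry burst's default pg_ prefix; split_count is an illustrative variable name.

```shell
# Count the burst page files in the current directory.
split_count=$(ls pg_*.pdf 2>/dev/null | wc -l)
echo "Split page files: $split_count"

if command -v pdftk >/dev/null 2>&1 && [ -f recombined-document.pdf ]; then
    # dump_data prints document metadata, including a NumberOfPages line.
    pdftk recombined-document.pdf dump_data | grep NumberOfPages
fi
```

If the two numbers match, every page made it back into the recombined document.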
