TSSFL TECHNOLOGY STACK

Posted: **Sat Feb 27, 2016 11:13 am**

Sometimes you need to extract or remove specific pages from a PDF document. You can do this without much hassle from the Ubuntu command line. Here is an example that keeps only pages 10-90 of the original 100-page document, File_1.pdf, and writes them to a new document, File_2.pdf:

Code:

pdftk File_1.pdf cat 10-90 output File_2.pdf

Posted: **Fri Dec 09, 2016 7:58 am**

Here is another trick: removing a single page from a PDF document with pdftk (the PDF Toolkit). Suppose you have a PDF document with 10 pages and you would like to remove only page 7. Run the following command:

Code:

pdftk document.pdf cat 1-6 8-end output new_document.pdf

Here document.pdf is the input file with ten pages and new_document.pdf is the output file with the seventh page stripped out.
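The same idea can be generalized to any page number. The sketch below is only an illustration: the helper names ranges_without_page and remove_page are hypothetical (they are not part of pdftk), and it assumes that pdftk's dump_data operation prints a "NumberOfPages:" line for the input file.

```shell
# Print the pdftk page ranges that keep every page except page $1
# of a document with $2 pages. Hypothetical helper, not part of pdftk.
ranges_without_page() {
    local page="$1"
    local total="$2"
    if [ "$total" -lt 2 ] || [ "$page" -lt 1 ] || [ "$page" -gt "$total" ]; then
        echo "cannot remove page $page from a $total-page document" >&2
        return 1
    fi
    if [ "$page" -eq 1 ]; then
        echo "2-end"                                    # drop the first page
    elif [ "$page" -eq "$total" ]; then
        echo "1-$(( total - 1 ))"                       # drop the last page
    else
        echo "1-$(( page - 1 )) $(( page + 1 ))-end"    # drop a middle page
    fi
}

# Remove page $2 from the PDF $1 and write the result to $3.
remove_page() {
    local input="$1"
    local page="$2"
    local output="$3"
    local total ranges
    # Assumption: dump_data reports the page count as "NumberOfPages: N".
    total=$(pdftk "$input" dump_data | awk '/^NumberOfPages:/ {print $2}')
    ranges=$(ranges_without_page "$page" "$total") || return 1
    # $ranges is left unquoted on purpose so each range is a separate argument.
    pdftk "$input" cat $ranges output "$output"
}

# Example call (not run here):
#   remove_page document.pdf 7 new_document.pdf
```

For page 7 of a 10-page file, the helper produces the same ranges as the command above, 1-6 8-end.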

You can go further, for example by processing every PDF file in the current directory and writing a copy of each one, with a certain page removed, to a new directory. In the example below, we create a directory named Trimmed and write a copy of every PDF file to it with only the first page stripped out:

Code:

mkdir Trimmed
for i in *.pdf ; do pdftk "$i" cat 2-end output "Trimmed/$i" ; done
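A slightly more defensive variant of this loop might look like the sketch below. The function name trim_first_page_all and its two arguments (source and destination directory) are made up for illustration. It skips the loop when no PDF files match and reports files that pdftk could not process, which is presumably what happens with a one-page file, since it has no page 2.

```shell
# Strip the first page from every PDF in directory $1 (default: current
# directory) and write the results to directory $2 (default: Trimmed).
# Hypothetical helper name; only the pdftk call comes from the post above.
trim_first_page_all() {
    local src="${1:-.}"
    local dest="${2:-Trimmed}"
    local f
    mkdir -p "$dest"
    for f in "$src"/*.pdf; do
        # If no file matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        if ! pdftk "$f" cat 2-end output "$dest/$(basename "$f")"; then
            echo "skipped: $f" >&2
        fi
    done
}

# Example call (not run here):
#   trim_first_page_all . Trimmed
```

Quoting "$f" and the output path keeps file names with spaces intact.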

Posted: **Fri Jan 04, 2019 11:10 am**

Install pdftk in Ubuntu 18.04

pdftk is not installed by default in Ubuntu 18.04 (Bionic) because the pdftk package in Ubuntu (and its upstream Debian package) depends on the deprecated GCJ runtime.

You can install pdftk from PPA as follows:

Code:

sudo add-apt-repository ppa:malteworld/ppa
sudo apt update
sudo apt install pdftk
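To check whether pdftk is available, something like the following sketch can help. The helper name have_command is made up for illustration; command -v is the standard shell way to look up a command on the PATH.

```shell
# Return success if the command named in $1 is available on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command pdftk; then
    pdftk --version
else
    echo "pdftk not found; install it first" >&2
fi
```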

Posted: **Tue Dec 17, 2019 12:21 am**

Here is how to insert a range of PDF pages into another, larger PDF file. The command below inserts all pages of "Insert_pages.pdf" into the "Big.pdf" document after page 99 and writes the result to a new PDF file, "Bigger.pdf":

Code:

pdftk A=Big.pdf B=Insert_pages.pdf cat A1-99 B A100-end output Bigger.pdf
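The insertion point can be made a parameter. The sketch below only builds the argument string for pdftk's cat operation. The helper name insert_ranges is hypothetical, the handles A and B follow the command above, and the page count of the big document has to be supplied by the caller.

```shell
# Print the pdftk "cat" arguments that insert all pages of handle B
# after page $1 of handle A, where A has $2 pages in total.
# Hypothetical helper, not part of pdftk.
insert_ranges() {
    local after="$1"
    local total="$2"
    if [ "$after" -lt 0 ] || [ "$after" -gt "$total" ]; then
        echo "insertion point $after is outside 0..$total" >&2
        return 1
    fi
    if [ "$after" -eq 0 ]; then
        echo "B A"                                  # insert before the first page
    elif [ "$after" -eq "$total" ]; then
        echo "A B"                                  # append after the last page
    else
        echo "A1-$after B A$(( after + 1 ))-end"    # insert in the middle
    fi
}

# Example call (not run here), assuming Big.pdf has 300 pages:
#   pdftk A=Big.pdf B=Insert_pages.pdf cat $(insert_ranges 99 300) output Bigger.pdf
```

With an insertion point of 99, the helper prints the same ranges as the command above, A1-99 B A100-end.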

Posted: **Wed Jun 08, 2022 11:29 am**

Removing specific pages from a document often goes together with checking word counts. Counting words in a PDF document under Unix/Linux can be done as follows:

Count all words, including page numbers, header text, etc.:

Code:

pdftotext yourfile.pdf - | wc -w

Or count only words that start with a letter ([A-Za-z]), or with a letter or digit ([A-Za-z0-9]), respectively. First build a list of the unique matching words:

Code:

pdftotext yourfile.pdf - | tr " " "\n" | sort | uniq | grep "^[A-Za-z]" > words

Code:

pdftotext yourfile.pdf - | tr " " "\n" | sort | uniq | grep "^[A-Za-z0-9]" > words

After building the word list in the file "words", match it against the output of pdftotext and count the hits (the -x option makes grep match whole words rather than substrings):

Code:

pdftotext yourfile.pdf - | tr " " "\n" | grep -Fxf words | wc -l
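If the intermediate word list is not needed, the count can probably be obtained in one pass. This is a sketch of an alternative, not the method from the reference: the function name count_letter_words is made up, and it treats any run of whitespace as a word separator.

```shell
# Count the whitespace-separated tokens on standard input that start
# with an ASCII letter. Hypothetical helper for illustration.
count_letter_words() {
    # Put every token on its own line, then count lines starting with a letter.
    tr -s '[:space:]' '\n' | grep -c '^[A-Za-z]'
}

# Example call (not run here):
#   pdftotext yourfile.pdf - | count_letter_words
```

Replacing the pattern with '^[A-Za-z0-9]' also counts tokens that start with a digit.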

See a reference.

See also an online tool here for counting words in LaTeX documents.

Remove/Extract Specific Pages From pdf File Using Terminal