Xournal [768 kb] is a little-known, lightweight Linux application that can read, edit and export (save) PDF documents. It inserts text in a range of sizes and fonts directly onto the page, erases (whites out) existing text and images, highlights, draws shapes and freehand lines, and more. The image above-right illustrates some possible edits done by me. It was my first time, so please excuse the sloppiness. In general, Xournal can be used for notetaking, sketching and keeping a journal with a stylus, in a variety of document forms. Unfortunately, it has neither bookmarks nor a search feature. A lightweight, portable PDF application that does offer bookmarking and searching, along with other useful tools, is PDF-XChange, which runs under Wine. I have not had a chance to use it, but I have heard good things about Master PDF Editor, which has a Linux version: sudo apt-get install master-pdf-editor3.
Linux has many simple command-line tools and special-purpose applications that handle PDF-related tasks efficiently and accurately. Three general tool collections that can be easily installed in most distributions are coreutils, x11-utils and x11-apps.
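Before installing anything, it can help to check which commands are already on the PATH. The following is only a minimal sketch: the function name `missing_tools` is made up for illustration, and the example commands (csplit from coreutils, xwd from x11-apps, xwininfo from x11-utils) reflect how Debian-style distributions package them, which may differ elsewhere.

```shell
# Hypothetical helper: print the name of every given command
# that cannot be found on the PATH (prints nothing if all are present).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools csplit xwd xwininfo` lists only those of the three commands that still need to be installed.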
The accompanying chart illustrates what some command-line tools can do. The main toolset for PDF documents is poppler-utils (PDF Utilities) [456 kb]. Each command, as printed in the table, should be run from the directory containing the pertinent files. The resulting files will appear in the same directory unless the command specifies a different path. Finally, Ghostscript, which is installed on most Linux distributions, can be used to merge or extract pages from .pdf and/or .ps files, although the commands are long (see the last 2 entries in the accompanying table).
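To keep results out of the working directory, the output pattern can carry a path. Here is a small sketch of that idea using pdfseparate; the helper name `split_into_dir` and the directory name in the usage note are assumptions chosen for illustration.

```shell
# Hypothetical helper: split a PDF into single pages inside a chosen directory.
# usage: split_into_dir INPUT.pdf OUTPUT_DIR
split_into_dir() {
    # create the target directory if it does not exist yet
    mkdir -p "$2" || return 1
    # pdfseparate replaces %d with the page number
    pdfseparate "$1" "$2/p-%d.pdf"
}
```

Running `split_into_dir xx.pdf pages` should leave xx.pdf untouched and write pages/p-1.pdf, pages/p-2.pdf, and so on.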
COMMAND APPLIED TO FILE XX | RESULT |
---|---|
pdfseparate xx.pdf p-%d.pdf | separates xx.pdf into separate pages: p-1.pdf, p-2.pdf, ... |
pdfseparate -f 5 xx.pdf p-%d.pdf | separates from page 5 to end: p-5.pdf, p-6.pdf, ... |
pdfseparate -f 2 -l 3 xx.pdf p-%d.pdf | separates from page 2 to page 3: p-2.pdf, p-3.pdf |
pdfseparate -f 2 -l 2 xx.pdf p-%d.pdf | separates page 2: p-2.pdf |
pdfimages xx.pdf y | extracts all images, saved as y-000.ppm, y-001.ppm,... |
pdfunite xx.pdf yy.pdf zz.pdf | unites xx.pdf and yy.pdf into zz.pdf |
pdftotext xx.pdf | extracts text, saved as xx.txt |
pdftoppm xx.pdf y | converts each page to a ppm image, saved as y-1.ppm, y-2.ppm, ... |
pdftohtml xx.pdf | PDF to HTML converter |
pdftops xx.pdf | PDF to PostScript (PS) converter |
pdfinfo xx.pdf | document information for xx.pdf |
pdffonts xx.pdf | font analyzer for xx.pdf |
html2text xx.html \| tee ~/xx.tex | converts xx.html, including some special html symbols, to ~/xx.tex (also printed to the terminal) |
vilistextum -rcn xx.html xx.tex | converts xx.html, including empty space but not symbols, to xx.tex |
convert xx.xwd yy.jpg | converts image types (imagemagick) |
convert xx.tif -compress jpeg xx.pdf | converts xx.tif to xx.pdf |
csplit -k xx.tex 22 37 | splits xx.tex before lines 22 and 37 into xx00, xx01, xx02; line 22 starts the second file, etc |
lxsplit -j xx.rar.001 | joins files of like type - creates xx.rar from xx.rar.001, xx.rar.002, etc |
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -dFirstPage=3 -dLastPage=5 -sOutputFile=fileout.pdf filein.pdf | extracts pages 3-5 of filein.pdf into fileout.pdf |
gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=fileout.pdf filein1.pdf filein2.pdf | merges filein1.pdf and filein2.pdf into fileout.pdf |
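
Since the two Ghostscript commands are long, one option is to wrap them in shell functions, for instance in ~/.bashrc. This is only a sketch: the function names `pdf_extract_pages` and `pdf_merge` and their argument order are my own assumptions, while the gs options are exactly the ones in the table.

```shell
# Hypothetical wrapper: extract pages FIRST..LAST of INPUT into OUTPUT.
# usage: pdf_extract_pages FIRST LAST INPUT.pdf OUTPUT.pdf
pdf_extract_pages() {
    gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress \
       -dFirstPage="$1" -dLastPage="$2" -sOutputFile="$4" "$3"
}

# Hypothetical wrapper: merge the input files, in the order given, into OUTPUT.
# usage: pdf_merge OUTPUT.pdf INPUT1.pdf INPUT2.pdf ...
# The ( ... ) body runs in a subshell, so the variable "out" does not leak.
pdf_merge() (
    out=$1
    shift
    gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress \
       -sOutputFile="$out" "$@"
)
```

With these defined, `pdf_extract_pages 3 5 filein.pdf fileout.pdf` and `pdf_merge fileout.pdf filein1.pdf filein2.pdf` stand in for the last two table entries.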