Convert PDF to HTML - Shell




Linux has a command-line utility for converting PDF files to HTML.
The pdftohtml command is available on Fedora and Ubuntu.

Syntax

pdftohtml -c src.pdf dst.html

If you wish to ignore images, add the -i option.

The command converts the PDF into multiple interlinked HTML files.
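
The invocation above can be wrapped in a small shell function. This is only a minimal sketch: convert_pdf, its "noimages" argument and the PDFTOHTML variable are hypothetical names introduced here for illustration, and naming the output after the source file is an assumption, not something pdftohtml requires.

```shell
#!/bin/sh
# PDFTOHTML is a hypothetical override (handy for dry runs);
# it defaults to the real pdftohtml command.
PDFTOHTML=${PDFTOHTML:-pdftohtml}

# Hypothetical helper: convert SRC.pdf to SRC.html.
# Pass "noimages" as the second argument to add -i.
convert_pdf() {
    src=$1
    dst=${src%.pdf}.html                  # report.pdf -> report.html
    if [ "${2:-}" = "noimages" ]; then
        "$PDFTOHTML" -c -i "$src" "$dst"  # -i : ignore images
    else
        "$PDFTOHTML" -c "$src" "$dst"     # -c : generate complex document
    fi
}
```

For example, "convert_pdf report.pdf noimages" runs pdftohtml -c -i report.pdf report.html.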

Other options

Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
  -f <int>          : first page to convert
  -l <int>          : last page to convert
  -q                : don't print any messages or errors
  -h                : print usage information
  -help             : print usage information
  -p                : exchange .pdf links by .html
  -c                : generate complex document
  -i                : ignore images
  -noframes         : generate no frames
  -stdout           : use standard output
  -zoom <fp>        : zoom the pdf document (default 1.5)
  -xml              : output for XML post-processing
  -hidden           : output hidden text
  -nomerge          : do not merge paragraphs
  -enc <string>     : output text encoding name
  -dev <string>     : output device name for Ghostscript (png16m, jpeg etc)
  -v                : print copyright and version info
  -opw <string>     : owner password (for encrypted files)
  -upw <string>     : user password (for encrypted files)
  -nodrm            : override document DRM settings
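
Several of the options listed above can be combined. The sketch below converts only a range of pages, without frames and without messages. The function name convert_pages, its argument order and the PDFTOHTML variable are hypothetical and exist only for this example; the page-range check is an added safeguard, not part of pdftohtml itself.

```shell
#!/bin/sh
# PDFTOHTML is a hypothetical override; it defaults to pdftohtml.
PDFTOHTML=${PDFTOHTML:-pdftohtml}

# Hypothetical helper: convert pages FIRST..LAST of SRC into DST.
convert_pages() {
    first=$1
    last=$2
    src=$3
    dst=$4
    # Reject an inverted range before calling the converter.
    if [ "$first" -gt "$last" ]; then
        echo "first page must not exceed last page" >&2
        return 1
    fi
    # -q : quiet, -noframes : generate no frames, -f / -l : page range
    "$PDFTOHTML" -q -noframes -f "$first" -l "$last" "$src" "$dst"
}
```

For example, "convert_pages 2 5 src.pdf dst.html" runs pdftohtml -q -noframes -f 2 -l 5 src.pdf dst.html.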

