pandoc is a tool for converting documents from one format into another.

You could use it to read an HTML document and convert it to markdown.
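
As a rough sketch of that HTML-to-markdown conversion: the file names and sample content below are made up for illustration, and the conversion is skipped if pandoc is not installed.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny HTML document to convert (made-up sample content).
printf '<h1>Hello</h1>\n<p>Some <em>emphasised</em> text.</p>\n' > sample.html

# -f names the input format, -t the output format, -o the output file.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -f html -t markdown -o sample.md sample.html
    cat sample.md
else
    echo "pandoc is not installed; skipping the conversion" >&2
fi
```

pandoc can usually guess both formats from the file extensions, so the -f and -t options are only spelled out here to make the direction of the conversion obvious.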

I prefer to use it to produce nice PDFs, writing in markdown instead of LaTeX.

To do this, you probably want both pandoc and texlive-full.

Unfortunately, this can be a rather large thing to install:

$ sudo apt-get install pandoc texlive-full
Reading package lists... Done
Building dependency tree
Reading state information... Done
…
The following NEW packages will be installed
  aglfn chktex cm-super cm-super-minimal context context-modules dvidvi
  dvipng feynmf fonts-cabin fonts-comfortaa fonts-crosextra-caladea
  fonts-crosextra-carlito fonts-ebgaramond fonts-ebgaramond-extra
  fonts-font-awesome fonts-freefont-otf fonts-gfs-artemisia
  fonts-gfs-baskerville fonts-gfs-bodoni-classic fonts-gfs-complutum
  fonts-gfs-didot fonts-gfs-didot-classic fonts-gfs-gazis
  fonts-gfs-neohellenic fonts-gfs-olga fonts-gfs-porson fonts-gfs-solomos
  fonts-gfs-theokritos fonts-hosny-amiri fonts-inconsolata
  fonts-ipaexfont-gothic fonts-ipaexfont-mincho fonts-junicode fonts-lato
  fonts-linuxlibertine fonts-lmodern fonts-lobster fonts-lobstertwo
  fonts-oflb-asana-math fonts-roboto fonts-sil-gentium
  fonts-sil-gentium-basic fonts-sil-gentiumplus fonts-stix fonts-texgyre
  fragmaster lacheck latex-cjk-all latex-cjk-chinese
  latex-cjk-chinese-arphic-bkai00mp latex-cjk-chinese-arphic-bsmi00lp
  latex-cjk-chinese-arphic-gbsn00lp latex-cjk-chinese-arphic-gkai00mp
  latex-cjk-common latex-cjk-japanese latex-cjk-japanese-wadalab
  latex-cjk-korean latex-cjk-thai latexdiff latexmk lcdf-typetools
  libfile-homedir-perl libfile-which-perl libintl-perl libplot2c2
  libpoppler-qt4-4 libpotrace0 libpstoedit0v5 libptexenc1 libsynctex1
  libtexlua52 libtexluajit2 libtext-unidecode-perl libxml-libxml-perl
  libxml-namespacesupport-perl libxml-sax-base-perl libxml-sax-expat-perl
  libxml-sax-perl libzzip-0-13 lmodern m-tx musixtex pandoc pandoc-data
  pfb2t1c2pfb pmx prerex preview-latex-style prosper ps2eps pstoedit
  psutils purifyeps tex-common tex-gyre tex4ht tex4ht-common texinfo
  texlive-base texlive-bibtex-extra texlive-binaries texlive-extra-utils
  texlive-font-utils texlive-fonts-extra texlive-fonts-extra-doc
  texlive-fonts-recommended texlive-fonts-recommended-doc
  texlive-formats-extra texlive-full texlive-games texlive-generic-extra
  texlive-generic-recommended texlive-humanities texlive-humanities-doc
  texlive-lang-african texlive-lang-arabic texlive-lang-chinese
  texlive-lang-cjk texlive-lang-cyrillic texlive-lang-czechslovak
  texlive-lang-english texlive-lang-european texlive-lang-french
  texlive-lang-german texlive-lang-greek texlive-lang-indic
  texlive-lang-italian texlive-lang-japanese texlive-lang-korean
  texlive-lang-other texlive-lang-polish texlive-lang-portuguese
  texlive-lang-spanish texlive-latex-base texlive-latex-base-doc
  texlive-latex-extra texlive-latex-extra-doc texlive-latex-recommended
  texlive-latex-recommended-doc texlive-luatex texlive-math-extra
  texlive-metapost texlive-metapost-doc texlive-music texlive-omega
  texlive-pictures texlive-pictures-doc texlive-plain-extra
  texlive-pstricks texlive-pstricks-doc texlive-publishers
  texlive-publishers-doc texlive-science texlive-science-doc texlive-xetex
  tipa ttf-adf-accanthis ttf-adf-gillius ttf-adf-universalis vprerex
0 to upgrade, 161 to newly install, 0 to remove and 0 not to upgrade.
Need to get 1,782 MB of archives.
After this operation, 3,466 MB of additional disk space will be used.

You may find it convenient to define a simple make rule for turning pandoc markdown documents into PDFs, as follows:

$ cat >Makefile <<'EOF'
%.pdf : %.mdwn
	pandoc -o $@ $<
EOF


Now you may write markdown files, and convert them into PDFs, simply by running make foo.pdf to turn foo.mdwn into foo.pdf.
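
If you want to check what the rule will do before installing anything, a dry run is enough. This is only a sketch: it rebuilds the same Makefile in a scratch directory and uses an empty foo.mdwn as a stand-in for a real document.

```shell
# Work in a scratch directory so an existing Makefile is not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the rule; printf's \t emits the tab that make requires
# at the start of a recipe line.
printf '%%.pdf : %%.mdwn\n\tpandoc -o $@ $<\n' > Makefile

# An empty source file is enough for a dry run.
touch foo.mdwn

# -n prints the commands make would run without executing them.
dry_run=$(make -n foo.pdf)
echo "$dry_run"
```

This should print the expanded recipe, pandoc -o foo.pdf foo.mdwn, which shows that $@ stands for the target and $< for the source file.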

If you have the document open in a PDF viewer such as evince and then generate a new version, the viewer will refresh itself to show the new version.

You can make your PDF viewer show live updates to your document by combining this make rule with an inotify watch command:

$ while true; do inotifywait -e close_write,move_self foo.mdwn; make foo.pdf; done
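
If you do this for more than one document, the loop can be wrapped in a small shell function. The sketch below makes a few assumptions: pdf_target_for and watch_pdf are names I have made up, inotifywait comes from the inotify-tools package, and the Makefile rule above is in the current directory.

```shell
# pdf_target_for and watch_pdf are hypothetical helper names.

# Map a source name such as foo.mdwn to its target, foo.pdf.
pdf_target_for() {
    printf '%s\n' "${1%.mdwn}.pdf"
}

# Rebuild the PDF each time the source is saved; stop with Ctrl-C.
watch_pdf() {
    src=$1
    target=$(pdf_target_for "$src")
    while true; do
        # -q suppresses inotifywait's informational messages.
        inotifywait -q -e close_write,move_self "$src"
        make "$target"
    done
}

# Show the target name that would be rebuilt for foo.mdwn.
pdf_target_for foo.mdwn
```

Running watch_pdf foo.mdwn then behaves like the one-liner, with the target name derived from the source name instead of typed twice.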


You can write in-line LaTeX to include content that is not expressible in markdown.
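
A small sketch of what that looks like in practice: the file name and its contents are invented for illustration, and the output is converted to LaTeX rather than PDF so you can see how pandoc treats the raw commands.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The quoted 'EOF' stops the shell from interpreting the backslashes.
cat > inline.mdwn <<'EOF'
Some *markdown* text.

\newpage

This paragraph starts a new page and uses \textsc{small caps}.
EOF

# Converting to LaTeX shows the raw commands alongside the converted markdown.
if command -v pandoc >/dev/null 2>&1; then
    latex_out=$(pandoc -f markdown -t latex inline.mdwn)
    printf '%s\n' "$latex_out"
else
    echo "pandoc is not installed; skipping the conversion" >&2
fi
```

The markdown emphasis should come out as LaTeX markup, while \newpage and \textsc{small caps} should be copied through as written.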

markdown also lacks a way of providing metadata, but pandoc has an extension called yaml_metadata_block, which parses the header of a document as a YAML document.

---
title: Foo
author: Joe Bloggs
header-includes:
- \usepackage{bytefield}
geometry: margin=3cm
---
# Title

This is an example bytefield figure:

\begin{bytefield}{16}
\wordbox{1}{A 16-bit field} \\
\bitbox{8}{8 bits} & \bitbox{8}{8 more bits} \\
\wordbox{2}{A 32-bit field. Note that text wraps within the box.}
\end{bytefield}
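
To see what the metadata block does, you can ask pandoc for a standalone LaTeX document and look at the preamble. This is a sketch under assumptions: meta.mdwn is a made-up file name, the \usepackage line is assumed to sit under pandoc's header-includes field, and the check is skipped if pandoc is not installed.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A cut-down document carrying the same metadata as the example above.
cat > meta.mdwn <<'EOF'
---
title: Foo
author: Joe Bloggs
header-includes:
- \usepackage{bytefield}
geometry: margin=3cm
---
Body text.
EOF

# -s (--standalone) makes pandoc emit a whole document, preamble included.
if command -v pandoc >/dev/null 2>&1; then
    preamble_hits=$(pandoc -s -f markdown -t latex meta.mdwn | grep -F 'bytefield')
    printf '%s\n' "$preamble_hits"
else
    echo "pandoc is not installed; skipping the check" >&2
fi
```

If the metadata was picked up, the \usepackage{bytefield} line should appear in the generated preamble.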


Without the header, you would need to write a separate file containing the \usepackage{bytefield} directive and invoke pandoc with --template=use-bytefield.latex.