pandoc is a tool for converting documents from one format into another.

You could use it to read an HTML document and convert it to markdown.
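
A minimal sketch of that conversion, assuming a made-up input file called page.html. The -f and -t options name the input and output formats and -o names the output file; the pandoc call is skipped if pandoc is not installed.

```shell
# Create a small HTML file to convert (page.html is a hypothetical example name).
cat > page.html <<'EOF'
<h1>Hello</h1>
<p>Some <em>emphasised</em> text.</p>
EOF

# Convert HTML to markdown; guarded so the sketch is harmless without pandoc.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -f html -t markdown -o page.md page.html
fi
```

If the formats are left out, pandoc tries to guess them from the file extensions.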

I prefer to use it to produce nice PDFs, writing the source in markdown instead of LaTeX.

To do this, you probably want both pandoc and texlive-full, since pandoc produces PDFs by generating LaTeX and running it through a LaTeX engine.

Unfortunately, this can be a rather large install, close to 1.8 GB to download and about 3.5 GB on disk in this example:

$ sudo apt-get install pandoc texlive-full
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed
  aglfn chktex cm-super cm-super-minimal context context-modules
  dvidvi dvipng feynmf fonts-cabin fonts-comfortaa
  fonts-crosextra-caladea fonts-crosextra-carlito fonts-ebgaramond
  fonts-ebgaramond-extra fonts-font-awesome fonts-freefont-otf
  fonts-gfs-artemisia fonts-gfs-baskerville fonts-gfs-bodoni-classic
  fonts-gfs-complutum fonts-gfs-didot fonts-gfs-didot-classic
  fonts-gfs-gazis fonts-gfs-neohellenic fonts-gfs-olga
  fonts-gfs-porson fonts-gfs-solomos fonts-gfs-theokritos
  fonts-hosny-amiri fonts-inconsolata fonts-ipaexfont-gothic
  fonts-ipaexfont-mincho fonts-junicode fonts-lato
  fonts-linuxlibertine fonts-lmodern fonts-lobster fonts-lobstertwo
  fonts-oflb-asana-math fonts-roboto fonts-sil-gentium
  fonts-sil-gentium-basic fonts-sil-gentiumplus fonts-stix
  fonts-texgyre fragmaster lacheck latex-cjk-all latex-cjk-chinese
  latex-cjk-chinese-arphic-bkai00mp latex-cjk-chinese-arphic-bsmi00lp
  latex-cjk-chinese-arphic-gbsn00lp latex-cjk-chinese-arphic-gkai00mp
  latex-cjk-common latex-cjk-japanese latex-cjk-japanese-wadalab
  latex-cjk-korean latex-cjk-thai latexdiff latexmk lcdf-typetools
  libfile-homedir-perl libfile-which-perl libintl-perl libplot2c2
  libpoppler-qt4-4 libpotrace0 libpstoedit0v5 libptexenc1 libsynctex1
  libtexlua52 libtexluajit2 libtext-unidecode-perl libxml-libxml-perl
  libxml-namespacesupport-perl libxml-sax-base-perl
  libxml-sax-expat-perl libxml-sax-perl libzzip-0-13 lmodern m-tx
  musixtex pandoc pandoc-data pfb2t1c2pfb pmx prerex
  preview-latex-style prosper ps2eps pstoedit psutils purifyeps
  tex-common tex-gyre tex4ht tex4ht-common texinfo texlive-base
  texlive-bibtex-extra texlive-binaries texlive-extra-utils
  texlive-font-utils texlive-fonts-extra texlive-fonts-extra-doc
  texlive-fonts-recommended texlive-fonts-recommended-doc
  texlive-formats-extra texlive-full texlive-games
  texlive-generic-extra texlive-generic-recommended
  texlive-humanities texlive-humanities-doc texlive-lang-african
  texlive-lang-arabic texlive-lang-chinese texlive-lang-cjk
  texlive-lang-cyrillic texlive-lang-czechslovak texlive-lang-english
  texlive-lang-european texlive-lang-french texlive-lang-german
  texlive-lang-greek texlive-lang-indic texlive-lang-italian
  texlive-lang-japanese texlive-lang-korean texlive-lang-other
  texlive-lang-polish texlive-lang-portuguese texlive-lang-spanish
  texlive-latex-base texlive-latex-base-doc texlive-latex-extra
  texlive-latex-extra-doc texlive-latex-recommended
  texlive-latex-recommended-doc texlive-luatex texlive-math-extra
  texlive-metapost texlive-metapost-doc texlive-music texlive-omega
  texlive-pictures texlive-pictures-doc texlive-plain-extra
  texlive-pstricks texlive-pstricks-doc texlive-publishers
  texlive-publishers-doc texlive-science texlive-science-doc
  texlive-xetex tipa ttf-adf-accanthis ttf-adf-gillius
  ttf-adf-universalis vprerex
0 to upgrade, 161 to newly install, 0 to remove and 0 not to upgrade.
Need to get 1,782 MB of archives.
After this operation, 3,466 MB of additional disk space will be used.

You may find it convenient to define a simple make rule for turning pandoc markdown documents into PDFs, as follows:

$ cat >Makefile <<'EOF'
%.pdf : %.mdwn
	pandoc -o $@ $<
EOF

Now you can write markdown files and convert them into PDFs simply by running make foo.pdf, which turns foo.mdwn into foo.pdf. Note that the recipe line in the Makefile must begin with a tab character, not spaces.
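
Since make insists that recipe lines begin with a tab, which is easy to lose when copying and pasting, one option is to generate the Makefile with printf, which spells the tab out as \t. This is only a sketch: it overwrites any Makefile in the current directory, and make -n just prints the command that would be run.

```shell
# Write the pattern rule; \t becomes the literal tab make requires before a recipe,
# and %% is printf's way of writing a single % character.
printf '%%.pdf : %%.mdwn\n\tpandoc -o $@ $<\n' > Makefile

# Dry run: show what make would do for foo.pdf without actually running pandoc.
touch foo.mdwn
if command -v make >/dev/null 2>&1; then
    make -n foo.pdf
fi
```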

If you use a PDF viewer such as evince and have the document open when you generate a new version, the viewer will refresh itself to show the new version.

You can make your PDF viewer show live updates to your document by combining this make rule with an inotifywait loop (inotifywait comes from the inotify-tools package):

$ while true; do inotifywait -e close_write,move_self foo.mdwn; make foo.pdf; done
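
If you do this often, the loop could be wrapped in a small shell function. The sketch below assumes a hypothetical helper name, watch_pdf, and differs slightly from the one-liner: it uses inotifywait itself as the loop condition, so the loop ends if the file can no longer be watched, and it passes -q to keep inotifywait quiet.

```shell
# watch_pdf FILE.mdwn: rebuild FILE.pdf each time FILE.mdwn is saved.
# (watch_pdf is a hypothetical helper name, not part of pandoc or inotify-tools.)
watch_pdf() {
    src=$1
    pdf=${src%.mdwn}.pdf
    # inotifywait exits non-zero if the file cannot be watched (for example,
    # because it was deleted), which ends the loop instead of spinning forever.
    while inotifywait -q -e close_write,move_self "$src"; do
        # A failed build does not stop the loop; fix the document and save again.
        make "$pdf"
    done
}
```

Run it as watch_pdf foo.mdwn and press Ctrl-C to stop watching.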

You can write in-line LaTeX to include content that is not expressible in markdown.
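
As a small illustration, the sketch below writes a markdown file containing two LaTeX commands and asks pandoc for LaTeX output rather than a PDF, so you can see the commands being passed through. The file name inline.mdwn is made up for the example, and the pandoc call is skipped if pandoc is not installed.

```shell
# A markdown file mixing ordinary markdown with raw LaTeX commands.
cat > inline.mdwn <<'EOF'
This paragraph is *markdown*, with \textsc{small caps} supplied by LaTeX.

\newpage

This paragraph starts on a new page.
EOF

# Print the generated LaTeX instead of building a PDF.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -t latex inline.mdwn
fi
```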

markdown also lacks a way of providing metadata, but pandoc has an extension called yaml_metadata_block, which parses a block at the top of the document, opened by a line of three hyphens and closed by a line of three hyphens or three dots, as a YAML document.

---
title: Foo
author: Joe Bloggs
header-includes:
- \usepackage{bytefield}
geometry: margin=3cm
---

# Title

This is an example bytefield figure:

\begin{bytefield}{16}
  \wordbox{1}{A 16-bit field} \\
  \bitbox{8}{8 bits} & \bitbox{8}{8 more bits} \\
  \wordbox{2}{A 32-bit field. Note that text wraps within the box.}
\end{bytefield}

Without the header, you would otherwise need to write a separate file containing the \usepackage{bytefield} directive, and invoke pandoc with --include-in-header=use-bytefield.latex, or maintain a complete custom template and pass it with --template.
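
For comparison, the separate-file approach might look like the sketch below. It assumes pandoc's --include-in-header option, which inserts a file's contents into the LaTeX preamble, and reuses the file names foo.mdwn and use-bytefield.latex from above. The pandoc call only runs if pandoc is installed and foo.mdwn exists.

```shell
# Put the extra preamble line in its own file; %s makes printf print it verbatim.
printf '%s\n' '\usepackage{bytefield}' > use-bytefield.latex

# Ask pandoc to splice that file into the LaTeX preamble when building the PDF.
if command -v pandoc >/dev/null 2>&1 && [ -f foo.mdwn ]; then
    pandoc --include-in-header=use-bytefield.latex -o foo.pdf foo.mdwn
fi
```

Keeping the directive in the metadata block means the document carries its own requirements, so the plain make rule keeps working unchanged.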