Categories
Tools Writing

Using MultiMarkdown and GNU Make to generate HTML

In a previous post, I said I was going to start talking about how I do my writing and how I generate HTML, PDF, and ePub files. For me, it all starts with the HTML, and MultiMarkdown is the tool I use to turn Markdown into HTML. From that HTML, I generate the other final formats.

The MultiMarkdown website does a good job of describing what the tool does. Here is an excerpt:

Writing with MultiMarkdown allows you to separate the content and structure of your document from the formatting. You focus on the actual writing, without having to worry about making the styles of your chapter headers match, or ensuring the proper spacing between paragraphs. And with a little forethought, a single plain text document can easily be converted into multiple output formats without having to rewrite the entire thing or format it by hand. Even better, you don’t have to write in “computer-ese” to create well formatted HTML or LaTeX commands. You just write, MultiMarkdown takes care of the rest.

I diverge from MultiMarkdown's full feature set because I do not use it to generate PDF or ePub formats. I only use it to generate HTML. The main reason is that I could not figure out how to get LaTeX to work! When I installed LaTeX on my Mac by way of MacTeX, I constantly got errors when I tried to generate LaTeX documents. I am sure I could have figured it out eventually, but I didn't want to. Not really. I already knew CSS really well, and I knew I could make the HTML look exactly the way I wanted. Going through MultiMarkdown's LaTeX output meant that the PDF would not look like the HTML; it would look like the default LaTeX styles that come with MultiMarkdown. These styles are nice, but they're not what I wanted, and I didn't want to learn LaTeX to change them. So, my goal was to generate HTML, and from that I would generate the other formats.

Using Make

Now that my goal was to use MultiMarkdown to generate html, I wanted to use GNU Make to automatically build html when Markdown files change. The simplest way to do this is to author a very simple Makefile:

[code]

# Build X.html from X.md (the recipe line must start with a tab)
%.html: %.md
	multimarkdown -o $@ $<

[/code]

In Make parlance, $@ is the target (the output filename) and $< is the first prerequisite (the input file). This rule says that any X.html file depends on a file named X.md, and that the way to create it is to run multimarkdown -o $@ $<.
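To make the substitution concrete, here is a small shell sketch of what the pattern rule expands to for one target. The function name show_rule and the file name chapter1.html are made up for illustration; Make does this matching itself, so this is only a model of it.

```shell
# Hypothetical helper: show_rule TARGET
# Prints the command the %.html: %.md rule would run for TARGET.
show_rule() {
    target=$1                 # plays the role of $@
    stem=${target%.html}      # the part matched by % in %.html
    source="$stem.md"         # plays the role of $<
    printf 'multimarkdown -o %s %s\n' "$target" "$source"
}

show_rule chapter1.html
```

With the real Makefile, running make -n chapter1.html should show the same kind of command without executing it.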

I also added a clean rule:

[code]

.PHONY: clean
clean:
	rm -f *.html

[/code]

MultiMarkdown Headers

MultiMarkdown extends standard Markdown with some attributes you can set in your header. These attributes can define the CSS file to use, insert arbitrary HTML into the document's <head> element, set the author, title, etc. Lots of these directives are used for LaTeX formatting as well, but I largely ignore those. Here is a sample header:

[code]

Title: Avonia
Language: en
Author: Nick Cody
LaTeX XSLT: manuscript-novel.xslt
Surname: Cody
Base Header Level: 1
Comment: This is a work-fragment; it is the middle of a story. It is destined to be trashed.

[/code]

When this is compiled to html, it looks like this:

[code lang="html"]
Avonia

[/code]

CSS

The CSS was a bit trickier. You can use a MultiMarkdown CSS: directive, but that would link to a file. I wanted the CSS to be embedded so the HTML file could be e-mailed to someone and it would have everything they needed. I tried uploading the CSS to my website and using that absolute URL as the CSS location, but accessing a remote server when trying to look at a local HTML file made me feel dirty.

So, instead, I used the HTML Header: MultiMarkdown directive. I use make to take a standard CSS file and remove all newline characters so the CSS could be embedded. The enhanced rule for that is as follows:

[code]

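# NOTE: the recipe lines of this first rule were lost from the post; they are
# reconstructed here from the description below (wrap the CSS in <style> tags,
# strip newlines with tr).
%.mdcss: %.css Makefile
	printf 'HTML Header: <style>' > $@
	tr -d '\n' < $< >> $@
	printf '</style>\n' >> $@

# Prepend the metadata header and the embedded CSS, then convert.
%.html: %.md header.md novel-style.mdcss Makefile
	cat header.md > tmp
	cat novel-style.mdcss >> tmp
	cat $< >> tmp
	multimarkdown -o $@ tmp
	rm -f tmp

[/code]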

A few things are happening here. First, I take the regular CSS file and create a new file type, .mdcss. This is the MultiMarkdown directive with the whole CSS on a single line. This is very much like CSS and JavaScript minification. Notice I use the tr command to strip out newlines.

Then, I have an enhanced html rule, which takes my original MultiMarkdown header, concatenates that with the mdcss, and then concatenates that with the actual writing content. The result is an html file that can be viewed directly. I have a sample file you can look at here: http://primordia.com/upload/lorem_ipsum.html

You can look at the Markdown source, here: http://primordia.com/upload/lorem_ipsum.md
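The two steps described above can also be sketched as plain shell functions, which is handy for trying the idea outside of Make. The function names css_to_mdcss and assemble_source are made up for this sketch, and wrapping the CSS in <style> tags is my assumption about what goes into the HTML Header: line.

```shell
# Hypothetical helper: css_to_mdcss CSS_FILE
# Prints a single-line "HTML Header:" directive with the CSS inside <style> tags.
css_to_mdcss() {
    printf 'HTML Header: <style>'
    # Drop newlines and carriage returns so the directive stays on one line.
    tr -d '\n\r' < "$1"
    printf '</style>\n'
}

# Hypothetical helper: assemble_source HEADER_FILE CSS_FILE BODY_FILE
# Prints the metadata header, then the embedded CSS, then the writing itself.
assemble_source() {
    cat "$1"
    css_to_mdcss "$2"
    cat "$3"
}

# Example (not run here):
#   assemble_source header.md novel-style.css chapter.md > tmp
#   multimarkdown -o chapter.html tmp
```

One caveat: deleting newlines outright can glue together two CSS tokens that were separated only by a line break, so a stylesheet written that way may need tr '\n' ' ' instead.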

Enhancing <hr> with fancy awesomeness

Notice that in Markdown, *** gets turned into <hr>. In my CSS, I don't show the standard rule; instead, I display a Unicode character that I turned into a 300dpi PNG. This PNG has enough pixels to look good on the screen and on the printer. I make sure it's the same size in both places by using the background-size CSS attribute, along with specifying width and height in inches and not in pixels:

[code lang="css"]

hr {
    background-image: url(0F05.png);
    background-size: 100%;
    margin-left: auto;
    margin-right: auto;
    margin-top: 1em;
    margin-bottom: 1em;
    width: 0.33in;
    height: 0.33in;
    border: 0px;
}

[/code]

Notice the 0F05.png. That image weighs in at 396 pixels square and I render it at 0.33in. This yields 1200dpi... good enough for printing, and the stylesheet I created prints awesome. Here is the image:

But I don't really want to reference that image as a file. I already embedded the CSS, so I figure it would be best to embed the image, too. You can do this by base64 encoding the image data. That turns my stylesheet into this:

[code lang="css"]

hr {
background-image: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAYwAAAGMCAYAAADJOZVKAAAACXBIWXMAAL
...
}

[/code]

The ellipsis is a big ellipsis. Lots of data in base64 encoding follows, but I omitted it for brevity. I created the encoding using the Mac's builtin base64 command-line program and I created a helper rule:

[code]

$(CPRESS_DIR)images/%.base64: $(CPRESS_DIR)%.png $(CPRESS_DIR)cpress.mak
	base64 $< | sed -e "s/.\{76\}/&~/g" | tr '~' '\n' | tr -d ' ' > $@
[/code]

That breaks the continuous stream of bytes into another stream with newlines every 76 characters. Some editors cry when you put too many characters on one line.
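The same wrapping can probably be done without the ~ placeholder trick by using fold, which is a standard POSIX utility. This is only a sketch of an alternative, not the rule I actually used; the function name wrap76 is made up.

```shell
# Hypothetical helper: wrap76
# Reads base64 text on stdin, removes any spaces and existing line breaks,
# then re-wraps it at 76 characters per line.
wrap76() {
    { tr -d ' \n'; echo; } | fold -w 76
}

# Example (not run here):
#   base64 < 0F05.png | wrap76 > 0F05.base64
```

Stripping the existing newlines first matters because some base64 implementations already wrap their output, and folding text that is already wrapped would give ragged lines.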

If I were more clever, I'd awk the CSS file and replace the image URL with the data URI on the fly. Unfortunately, I'm not that clever, at least not yet anyway. I'm an awk n00b.

The advantage here is the stylesheet is completely contained in the html, including the image. This is awesome!
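For what it's worth, the on-the-fly replacement mentioned above might look something like this. It is a rough sketch under a few assumptions: the stylesheet references the image as url(NAME.png) with no quotes or path, the function name inline_png is made up, and the encoded image is small enough to pass to awk as a command-line variable.

```shell
# Hypothetical helper: inline_png CSS_FILE PNG_FILE
# Prints the stylesheet with url(<png name>) replaced by a base64 data URI.
inline_png() {
    css_file=$1
    png_file=$2
    # Encode on one line; some base64 tools wrap their output, so strip newlines.
    data=$(base64 < "$png_file" | tr -d '\n')
    name=$(basename "$png_file")
    awk -v needle="url($name)" -v repl="url(data:image/png;base64,$data)" '
        {
            out = ""
            line = $0
            # Literal (non-regex) replacement of every occurrence on the line.
            while ((pos = index(line, needle)) > 0) {
                out = out substr(line, 1, pos - 1) repl
                line = substr(line, pos + length(needle))
            }
            print out line
        }
    ' "$css_file"
}

# Example (not run here):
#   inline_png novel-style.css 0F05.png > novel-style-inline.css
```

Using index and substr instead of a regex substitution avoids having to escape the dots and parentheses in the file name.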

Printing Background Images

When you print HTML docs, background images don't typically print. This has been the default behavior in browsers for as long as I can remember. In my case, I wanted background graphics to print by default since I use them for the horizontal rule elements. That's easy, so I added this to my CSS:

[code lang="css"]

@media print {
* {-webkit-print-color-adjust: exact;}
}

[/code]

That probably only works in Safari and Chrome since they use WebKit, but for now that was good enough for me.

Wrapping it up

So, that's all for now. In another post, I'll talk about how I used wkhtmltopdf to generate a PDF that looks identical to the HTML (as rendered in Chrome or Safari). I'll also talk about how I use Calibre to generate ePub format. On the surface, Calibre is a GUI program and it would appear to violate my UNIX-style approach of using Make and command-line scripts. But inside the Calibre package is a set of powerful command-line utilities that I bent to my will. Stay tuned for more on that, because it's so exciting!

Markdown for Writing Projects

I use Markdown for writing because it's simple, vendor neutral, and easy to process. Using Markdown, I'm not locked into a particular word processor or proprietary format. I work with text and text is awesome. I want to describe how I use Markdown to write and generate artifacts such as HTML, PDF, and various e-book formats like ePub.

The writing solution I wanted had these requirements:

  • I want to edit in plain-text, Markdown
  • I want to control how the HTML looks by writing the CSS myself
  • I want the PDF to look like the HTML
  • I want the e-book format to look like the HTML and PDF
  • I want the HTML, PDF, and e-book formats to be built from the Markdown source, automatically
  • I want to edit remotely on my iPad and have my local and remote work synchronized
  • I want to retrieve past revisions in case I paint myself into a corner and I need to get back to the place I was before

I'll write a few posts over the course of the next few weeks that will serve as a general introduction to how I used MultiMarkdown, make, git, and Dropbox to address these requirements. For now, let's talk about Markdown.

Why Markdown?

Most people I know are already well versed in the beauty of Markdown and plain text editing. Markdown is used all over the place. GitHub uses its own flavor to power all of its READMEs, messages, comments, and more. There is blogging software that uses it. Lots of editors can do some basic coloring and bolding of Markdown text to make it look pretty without a full conversion to HTML markup. It's also just nice to read as plain text, since being readable as plain text is one of the primary features of Markdown.

You may ask what software is available to convert the plain-text Markdown into something fancier. There are tons of options here. Here are just a few:

  • Jekyll - Takes markdown files and can generate a static website, like a blog
  • Marked - A Markdown editor with HTML preview and PDF generation capabilities
  • Scrivener - A full-fledged writer's tool that uses Markdown, manages characters, to-dos, scenes, etc.
  • Byword - A simple iPhone/iPad/Mac editor
  • TextMate - An awesome text editor for the Mac. Unlike vim, easily allows wrapping margins.

Oh how I wish WordPress would allow me to use Markdown as the native editor format! There is a wp-markdown plugin, but I haven't had the guts to try it out yet. I'm so afraid of being disappointed. It works by taking Markdown and converting it to WordPress HTML, and it converts it back to Markdown when you edit a post. That scares the crap out of me.

Most of the software options listed previously allow you to write in Markdown and they can convert to something like HTML or a PDF with some canned stylesheets. And they probably do this through the GUI. For my purposes, I wanted to use TextMate and I wanted to write the stylesheet myself. I originally tried to use vim as my Markdown editor, but I found out that Vim sucks noodles at Markdown editing.

I didn't want to use a complex tool that has a lot of features. Scrivener might be nice, I never tried it, but it looks awfully complex. I didn't want to use a GUI. I didn't want to pull down a menu to generate my HTML. I wanted to use the command line because the command-line is awesome. In a nutshell, I wanted Markdown to be my code and I wanted a build system that produced my programs: the HTML, PDF, and e-book formats.

I wanted to use make to notice when my Markdown files are modified and have my HTML generated automatically. I wanted to write chapters as individual Markdown files and have them auto-magically aggregated into a book as a post-process. Just like a linker!
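As a rough illustration of the "linker" idea, the aggregation step could be as small as a shell function that concatenates chapter files in name order. The function name assemble_book and the chapters/ directory layout are hypothetical; nothing here is the actual build.

```shell
# Hypothetical helper: assemble_book CHAPTER_DIR
# Prints every .md file in CHAPTER_DIR, in the shell's sorted glob order,
# with a blank line after each so adjacent chapters don't run together.
assemble_book() {
    for chapter in "$1"/*.md; do
        [ -f "$chapter" ] || continue   # skip the unexpanded glob if no files match
        cat "$chapter"
        printf '\n'
    done
}

# Example (not run here):
#   assemble_book chapters > book.md
```

Naming the files with zero-padded prefixes such as 01-, 02- keeps the glob order the same as the reading order.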

In my next post, I'll talk about MultiMarkdown, why it's awesome, and how I use it to generate HTML with my own CSS. I even tricked the HTML generation into inserting JavaScript to produce automatic hyphenation, since hyphenation is still something that HTML5/CSS3 don't seem to do well in most browsers I tried.