What's different in MultiMarkdown 3.0?
- Maintaining a growing collection of nested regular expressions was going to become increasingly difficult. I don't plan on adding much (if any) in the way of new syntax features, but the existing code was a mess.
- Performance on longer documents was poor. The nested Perl regular expressions were slow, even on a relatively fast computer. Performance on something like an iPhone would probably have been miserable.
- The reliance on Perl made installation fairly complex on Windows. That didn't bother me too much, but it is a factor.
- Perl can't be run on an iPhone/iPad, and I would like to be able to have MultiMarkdown on an iOS device, and not just regular Markdown (which exists in C versions).
- I was interested in learning about PEGs and revisiting C programming.
- The syntax has been fairly stable, and it would be nice to be able to formalize it a bit --- which happens by definition when using a PEG.
- I wanted to revisit the syntax and features and clean things up a bit.
- Did I mention how much faster this is? And that it could (eventually) run on an iPhone?
A "snippet" is a section of HTML (or LaTeX) that is not a complete, fully-formed document. It doesn't contain the header information to make it a valid XML document. It can't be compiled with LaTeX into a PDF without further commands.
For example:
# This is a header #
And a paragraph.
becomes the following HTML snippet:
<h1 id="thisisaheader">This is a header</h1>
<p>And a paragraph.</p>
and the following LaTeX snippet:
\part{This is a header}
\label{thisisaheader}
And a paragraph.
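The id and \label values in the snippets above look like the header text lowercased with spaces removed. A minimal Python sketch of that derivation follows. The function name make_label is hypothetical, and dropping everything except ASCII letters and digits is an assumption; the real rules may keep some punctuation.

```python
import re

def make_label(text):
    """Lowercase the text and drop everything except ASCII letters and digits.

    Assumption: this only mirrors the examples shown in this document;
    the actual MultiMarkdown rules may differ for punctuation.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())

print(make_label("This is a header"))
```

The same helper applied to "alt text" gives "alttext", which matches the image id behavior described for MultiMarkdown 2.0.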
It was not possible to create a LaTeX snippet with the original MultiMarkdown, because it relied on having a complete XHTML document that was then converted to LaTeX via an XSLT document (requiring a whole separate program). This was powerful, but complicated.
Now, I have come full-circle. peg-multimarkdown will now output LaTeX directly, without requiring XSLT. This allows the creation of LaTeX snippets, or complete documents, as necessary.
To create a complete document, simply include metadata. You can include a title, author, date, or whatever you like. If you don't want to include any real metadata, including "format: complete" will still trigger a complete document, just like it used to.
NOTE: If the only metadata present is Base Header Level, then a complete document will not be triggered. This can be useful when combining various documents together.
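The rule for choosing between a snippet and a complete document can be sketched in Python as follows. This is an illustration, not the actual implementation: the function names are hypothetical, and comparing keys with case and whitespace removed is an assumption.

```python
def normalize_key(key):
    """Lowercase a metadata key and remove whitespace (assumed normalization)."""
    return "".join(key.lower().split())

def is_complete_document(metadata_keys):
    """Return True when any key other than Base Header Level is present."""
    keys = {normalize_key(k) for k in metadata_keys}
    return bool(keys - {"baseheaderlevel"})

print(is_complete_document(["Title", "Author"]))
print(is_complete_document(["Base Header Level"]))
```

Under these assumptions, a document with only Base Header Level, or with no metadata at all, stays a snippet, while a lone "format: complete" line still triggers a complete document.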
The old approach (even though it was hidden from most users) was a bit of a kludge, and this should be more elegant, and more flexible.
When metadata was repeated in MultiMarkdown 2.0, the second instance
"overwrote" the first instance. In MultiMarkdown 3.0, each instance of a
repeated metadata key is used. This is necessary for something like LaTeX Input
which can require multiple instances.
This means you can't "erase" an unnecessary metadata value by including a second, empty, copy. This was a trick used, for example, to erase undesired style information inserted into the document by older versions of Scrivener.
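The difference between the two behaviors can be sketched in Python. The parser below is a deliberately simplified assumption (one "key: value" pair per line, keys lowercased), not the real metadata grammar.

```python
def parse_metadata(lines):
    """Return (key, value) pairs in source order, keeping repeated keys."""
    pairs = []
    for line in lines:
        key, sep, value = line.partition(":")
        if sep:
            pairs.append((key.strip().lower(), value.strip()))
    return pairs

header = [
    "LaTeX Input: mmd-memoir-header",
    "Title: Sample MultiMarkdown Document",
    "LaTeX Input: mmd-memoir-begin-doc",
]
pairs = parse_metadata(header)

# MultiMarkdown 2.0 style: a later value overwrites the earlier one.
mmd2_style = dict(pairs)

# MultiMarkdown 3.0 style: every instance of a repeated key is kept.
mmd3_style = [value for key, value in pairs if key == "latex input"]
```

With the 2.0-style dictionary only the last LaTeX Input survives; with the 3.0-style list both values are available, which is also why an empty second copy no longer erases the first.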
LaTeX documents are created a bit differently than under the old system. You no longer have to use an XSLT file to convert from XHTML to LaTeX. You can go straight from MultiMarkdown to LaTeX, which is faster and more flexible.
To create a complete LaTeX document, you can process your file as a snippet, and then place it in a LaTeX template that you already have. Alternatively, you can use metadata to trigger the creation of a complete document. You can use the LaTeX Input metadata to insert a \input{file} command. You can then store various template files in your texmf directory and call them with metadata, or with embedded raw LaTeX commands in your document. For example:
LaTeX Input: mmd-memoir-header
Title: Sample MultiMarkdown Document
Author: Fletcher T. Penney
LaTeX Mode: memoir
LaTeX Input: mmd-memoir-begin-doc
LaTeX Footer: mmd-memoir-footer
This would include several template files in the order that you see. The LaTeX Footer metadata inserts a template at the end of your document. Note that the order and placement of the LaTeX Input statements are important.
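To make the ordering concrete, here is a rough Python sketch of how the metadata above could map onto \input{} commands. The function name is hypothetical, and emitting the footer as an \input{} command is an assumption; the sketch ignores everything else that appears in a real LaTeX file.

```python
def latex_input_commands(pairs):
    """Split metadata into \\input commands for the top and the end of the file.

    `pairs` is a list of (key, value) tuples in source order.
    """
    inputs = []
    footers = []
    for key, value in pairs:
        name = key.strip().lower()
        if name == "latex input":
            inputs.append(f"\\input{{{value}}}")   # kept in source order
        elif name == "latex footer":
            footers.append(f"\\input{{{value}}}")  # placed after the body
    return inputs, footers

inputs, footers = latex_input_commands([
    ("LaTeX Input", "mmd-memoir-header"),
    ("Title", "Sample MultiMarkdown Document"),
    ("LaTeX Input", "mmd-memoir-begin-doc"),
    ("LaTeX Footer", "mmd-memoir-footer"),
])
```

Swapping the two LaTeX Input lines in the metadata would swap the two \input{} commands, which is why their order matters.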
The LaTeX Mode metadata allows you to specify that MultiMarkdown should use the memoir or beamer output format. This introduces subtle differences in the output document for compatibility with those respective classes.
This system isn't quite as powerful as the XSLT approach, since it doesn't
alter the actual MultiMarkdown to LaTeX conversion process. But it is probably
much more familiar to LaTeX users who are accustomed to using \input{}
commands and doesn't require knowledge of XSLT programming.
I recommend checking out the default LaTeX Support Files that are available on GitHub. They are designed to serve as a starting point for your own needs.
Note: You can still use this version of MultiMarkdown to convert text into XHTML, and then process the XHTML using XSLT to create a LaTeX document, just like you used to in MMD 2.0.
In HTML, images have three pieces of "metadata" relevant to MMD --- alt, title, and id. In Markdown and MultiMarkdown 2.0, these were structured in the following manner:
This is an image ![alt text](file.png "This is a title")
In MultiMarkdown 2.0, the "alt text" was processed to "alttext" and used as an id attribute.
The problem was that LaTeX, and later on ODF, didn't really need the alt or title metadata --- those formats really needed a caption instead. Using a caption in some formats, but not others, leads to strange inconsistencies in the documents created from the same MultiMarkdown source. And while I understand the differences between the alt and title attributes, it really doesn't make a lot of sense.
So, in MultiMarkdown 3.0, the following is used instead:
This is an image ![This is alt text](file.png "This is a title")
or
This is an image ![Another *alt*][fig]
![This is a *caption*][fig2]
The following image has no `alt` or `caption`:
![][fig2]
[fig]: file2.png "This is another title"
[fig2]: file3.png "This is another title"
The id attribute comes into play when an image is specified as a reference. This allows you to link to the image from elsewhere in your document.
When an image is the only thing constituting a paragraph, it becomes wrapped in a <figure> tag, and instead of an alt attribute, it has a caption. This is demonstrated by fig2 above. This caption is used regardless of output format, providing consistency between HTML, LaTeX, and ODF. When this happens, the alt attribute is a stripped down version of the caption, since it can't have any markup applied.
So, in MultiMarkdown 3.0, there are now four pieces of metadata --- alt, title, id, and an optional caption. There are only three places to describe this metadata, so one piece has to be duplicated. Currently, the alt is a duplicate of the caption, sans any markup. If an image is contained within a paragraph, an alt attribute is created, but no caption.
An alternative plan I am considering is to use what currently generates the title attribute to instead generate the alt attribute in all instances, and the caption can be ignored if an image is not considered a figure.
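The current behavior (not the alternative plan) can be summarized with a short Python sketch. The function names are hypothetical, and treating only *, _, and backticks as markup is a simplifying assumption.

```python
import re

def strip_markup(text):
    """Remove emphasis and code markers; an assumed, simplified notion of markup."""
    return re.sub(r"[*_`]", "", text)

def image_attributes(label, alone_in_paragraph):
    """Return the alt text and, for figures, the caption derived from the label.

    `label` is the text between the square brackets of the image.
    """
    attrs = {"alt": strip_markup(label)}
    if alone_in_paragraph:
        # A standalone image is treated as a figure and keeps its marked-up caption.
        attrs["caption"] = label
    return attrs

print(image_attributes("This is a *caption*", alone_in_paragraph=True))
print(image_attributes("Another *alt*", alone_in_paragraph=False))
```

The title and id still come from the quoted title string and the reference label, so they are left out of this sketch.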
Footnotes work slightly differently than before. This is partly on purpose, and partly out of necessity. Specifically:
- Footnotes are anchored based on number, rather than the label used in the MMD source. This won't show a visible difference to the reader, but the XHTML source will be different.
- Footnotes can be used more than once. Each reference will link to the same numbered note, but the "return" link will only link to the first instance.
- Footnote "return" links are a separate paragraph after the footnote. This is due to the way peg-markdown works, and it's not worth the effort to me to change it. You can always use CSS to change the appearance however you like.
- Footnote numbers are surrounded by "[]" in the text.
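The numbering behavior in the list above can be illustrated with a small Python sketch. It assumes footnote references are written as [^label] and simply replaces each one with its bracketed number; the function name is hypothetical and no links or note bodies are generated.

```python
import re

def number_footnotes(text):
    """Replace [^label] references with [n], numbering labels by first use."""
    numbers = {}

    def replace(match):
        label = match.group(1)
        if label not in numbers:
            numbers[label] = len(numbers) + 1  # first appearance fixes the number
        return f"[{numbers[label]}]"

    return re.sub(r"\[\^([^\]]+)\]", replace, text), numbers

numbered, mapping = number_footnotes("First[^note], second[^other], again[^note].")
print(numbered)
```

A label that is referenced twice gets the same number both times, which matches the second point in the list.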
Because the original MultiMarkdown processed the text document into XHTML first, and then processed the entire XHTML document into LaTeX, it couldn't tell the difference between raw HTML and HTML that was created from plaintext. This version, however, uses the original plain text to create the LaTeX document. This means that any raw HTML inside your MultiMarkdown document is not converted into LaTeX.
The benefit of this is that you can embed one piece of the document in two formats --- one for XHTML, and one for LaTeX:
<blockquote>
<p>Release early, release often!</p>
<blockquote><p>Linus Torvalds</p></blockquote>
</blockquote>
<!-- \epigraph{Release early, release often!}{Linus Torvalds} -->
In this section, when the document is converted into XHTML, the blockquote sections will be used as expected, and the epigraph will be ignored since it is inside a comment. Conversely, when processed into LaTeX, the raw HTML will be ignored, and the comment will be processed as raw LaTeX.
You shouldn't need to use this feature, but if you want to specify exactly how a certain part of your document is processed into LaTeX, it's a neat trick.
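A rough Python sketch of the LaTeX side of this trick is shown below. It assumes the document has already been split into blocks and uses a crude test (a block starting with "<") to recognize raw HTML; the real parser is far more careful, and the function name is hypothetical.

```python
def latex_blocks(blocks):
    """Keep what a LaTeX export would use from a list of source blocks."""
    kept = []
    for block in blocks:
        stripped = block.strip()
        if stripped.startswith("<!--") and stripped.endswith("-->"):
            # The body of an HTML comment is passed through as raw LaTeX.
            kept.append(stripped[4:-3].strip())
        elif stripped.startswith("<"):
            # Raw HTML is ignored for LaTeX output.
            continue
        else:
            # Ordinary text would go through the normal conversion.
            kept.append(block)
    return kept

result = latex_blocks([
    "<blockquote>\n<p>Release early, release often!</p>\n</blockquote>",
    "<!-- \\epigraph{Release early, release often!}{Linus Torvalds} -->",
    "An ordinary paragraph.",
])
```

For XHTML output nothing needs to be filtered, since the browser ignores the comment on its own.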
MultiMarkdown 2.0 supported ASCIIMathML embedded within MultiMarkdown documents. This syntax was then converted to MathML for XHTML output, and then further processed into LaTeX when creating LaTeX output. The benefit of this was that the ASCIIMathML syntax was pretty straightforward. The downside was that only a handful of browsers actually support MathML, so most of the time it was only useful for LaTeX. Many MMD users who are interested in LaTeX output already knew LaTeX, so they sometimes preferred native math syntax, which led to several hacks.
MultiMarkdown 3.0 does not have built-in support for ASCIIMathML. In fact, I would probably have to write a parser from scratch to do anything useful with it, which I have little desire to do. So I came up with a compromise.
ASCIIMathML is no longer supported by MultiMarkdown. Instead, you can use LaTeX to code for math within your document. When creating a LaTeX document, the source is simply passed through, and LaTeX handles it as usual. If you desire, you can add a line to your header when creating XHTML documents that will allow MathJax to appropriately display your math.
Normally, MathJax and LaTeX support using \[ math \] or \( math \) to indicate that math is included. MMD stumbled on this due to some issues with escaping, so instead we use \\[ math \\] and \\( math \\). See an example:
latex input: mmd-article-header
Title: MultiMarkdown Math Example
latex input: mmd-article-begin-doc
latex footer: mmd-memoir-footer
xhtml header: <script type="text/javascript"
src="http://localhost/~fletcher/math/mathjax/MathJax.js">
</script>
An example of math within a paragraph --- \\({e}^{i\pi }+1=0\\)
--- easy enough.
And an equation on its own:
\\[ {x}_{1,2}=\frac{-b\pm \sqrt{{b}^{2}-4ac}}{2a} \\]
That's it.
You would, of course, need to change the xhtml header metadata to point to your own installation of MathJax.
Note: MultiMarkdown doesn't actually do anything with the code inside the brackets. It simply strips away the extra backslash and passes the LaTeX source unchanged, where it is handled by MathJax if it's properly installed, or by LaTeX. If you're having trouble, you can certainly email the MultiMarkdown Discussion List, but I do not provide support for LaTeX code.
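The "strip away the extra backslash" step can be pictured with a short Python sketch. The function name is hypothetical, and it only touches a doubled backslash that sits directly in front of a bracket or parenthesis; everything else is passed through unchanged.

```python
import re

def unescape_math_delimiters(text):
    """Turn the doubled-backslash math delimiters into the single-backslash forms."""
    return re.sub(r"\\\\([\[\]()])", r"\\\1", text)

print(unescape_math_delimiters(r"\\({e}^{i\pi }+1=0\\)"))
```

Commands such as \pi or \frac inside the math are untouched, since they use a single backslash and are not followed by a delimiter.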