How to export org-mode to HTML
In a previous post, one notable omission among the ways to generate static websites was the use of something that isn’t a static-site generator by itself, but has the capabilities of static-site generation (see is-a vs has-a). In this post we will look at one such candidate: org-mode, a package that is distributed as part of GNU Emacs.
In this article we will limit ourselves to converting a single org file to HTML, without making drastic changes to how various syntactic constructs are converted to HTML. As such, the following are presently out of scope and will be addressed in subsequent articles:
- Significantly altering the generated HTML: Depending on how much alteration is desired, this may be best handled by creating a “derived export backend” (deriving from, in this case, ox-html).
- Building an entire website: When building a website we want to export multiple pages that are linked together. This can be done by leveraging org-mode’s Publishing capabilities.
This article uses org-mode version 9.6.7.
Let’s dig in!
What is org-mode?
org-mode is an overloaded term. It can refer to:
- the org-mode package in Emacs,
- the file syntax that the org-mode package can parse, or
- the interactive major-mode that the org-mode package in Emacs provides when you open a file in org-mode syntax.
In this post we’ll refer to the syntax as org (or, org syntax) and we’ll refer to the package as org-mode. Interactive capabilities of the major-mode are out of scope of this article, but they are numerous and worth exploring if the reader is unaware of them. To make the comparison with static-site generators easier, below is a summary in a format we’ve used previously.
Org-mode
Extensibility, both for general markup and specifically for syntax highlighting, requires Emacs Lisp knowledge.
- Written in Emacs Lisp
- License: GPL-3.0-or-later
- Initial release: 2003 (initial release of the HTML-export functionality seems to have been in 2011)
- Stable release: 2023
- Input formats
- Supported frontend frameworks
  - Dated (and likely incompatible with the latest release) Bootstrap support
- Themes
  - A modest collection of themes
  - ox-tufte (in fact, we use it in this blog) for tufte-css (a CSS theme inspired by Edward Tufte’s books and handouts which, among other things, supports notes in the right margin)
- Syntax highlighting via Emacs’ major-modes
- Support for migration to Org
  - Only indirectly, by first migrating to Markdown or similar format using something like Jekyll’s migration support and then using Pandoc.
Convert org to HTML
- Use an editor of your choice to write a file in org syntax (a brief summary of org syntax appears in the cheatsheet at the end of this article).
- Use org-mode’s export mechanism (specifically, ox-html) to convert the org file to HTML.
I.e., if so desired (but, really, why?), the reliance on Emacs can be limited to the step of converting to HTML.
For instance, given an org file with the following content:
#+TITLE: A weary traveler who is lost

* What they might look like
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
It can be exported using the command below, where $file.org stands for the name of the file:
emacs --batch --no-init-file --file=$file.org --eval "(progn (require 'org) (setq org-export-allow-bind-keywords t) (org-html-export-to-html))"
The above command invokes Emacs in “batch” mode, ensures that it doesn’t process any custom initialization (with --no-init-file), opens the org file in question and then specifies, via the --eval argument, the Emacs Lisp code to be executed, which:
- Ensures that org-mode is loaded. (Not strictly necessary, since org-mode is included in recent Emacs and would be loaded by visiting the org file.)
- Enables the use of the BIND keyword.
- Exports the loaded org file to HTML.
The generated HTML is as follows:
Configure the export process
In this section, we will see in practice how to alter the export process to meet our needs. We will focus on two aspects of the auto-generated HTML: the table of contents and the postamble.
Configure the Table of Contents
The table of contents includes all headlines in the document. Its depth is therefore the same as the headline levels in the file. If you need to use a different depth, or turn it off entirely, set the org-export-with-toc variable accordingly. You can achieve the same on a per file basis, using the toc item in the OPTIONS keyword.
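As a rough illustration of the per-file setting, the toc item accepts either a depth or nil. The two lines below are alternatives rather than settings to combine, and the depth of 2 is only an example value:

```org
# Include only the first two headline levels in the table of contents.
#+OPTIONS: toc:2

# Alternatively, omit the default table of contents altogether.
#+OPTIONS: toc:nil
```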
Using the keyword syntax, we can turn off the table of contents entirely:
#+OPTIONS: toc:nil
#+TITLE: A weary traveler who is lost

* What they might look like
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
Which, upon export, will result in the below HTML:
Using the num item in the OPTIONS keyword, we can also toggle section numbers:
#+OPTIONS: num:nil
#+TITLE: A weary traveler who is lost

* What they might look like
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
Which, upon export, will result in the below HTML:
We could also combine the above settings:
#+OPTIONS: toc:nil
#+OPTIONS: num:nil
#+TITLE: A weary traveler who is lost

* What they might look like
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
Which, upon export, will result in the below HTML:
We can also tweak the text in the table of contents entry using property syntax:
Normally Org uses the headline for its entry in the table of contents. But with ALT_TITLE property, a different entry can be specified for the table of contents.
#+TITLE: A weary traveler who is lost

* What they might look like
:PROPERTIES:
:ALT_TITLE: Looks
:END:
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
:PROPERTIES:
:ALT_TITLE: Thoughts
:END:
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
Which, upon export, will result in the below HTML:
We can, if we so desire, alter the placement of the table of contents as well:
Org normally inserts the table of contents directly before the first headline of the file. To move the table of contents to a different location, first turn off the default with org-export-with-toc variable or with #+OPTIONS: toc:nil. Then insert #+TOC: headlines N at the desired location(s).
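A minimal sketch of the relocation described above. The introductory sentence and the depth of 2 are illustrative assumptions, not taken from the example file:

```org
#+OPTIONS: toc:nil
#+TITLE: A weary traveler who is lost

* What they might look like
Some introductory text.

# The table of contents (two levels deep) is inserted here instead.
#+TOC: headlines 2

* What they might be thinking
```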
Configure the Postamble
However, there are some export settings that require us to alter some elisp (Emacs Lisp) variables. For instance, in order to alter the postamble we have to:
Set org-html-preamble to a string to override the default format string.[…]
The above also applies to org-html-postamble and org-html-postamble-format.
In order to set these variables during the export process, we have to use the BIND keyword.
After consulting the source code, we can now modify the postamble:
#+BIND: org-html-postamble "<p class=\"author\">Author: %a</p>\n<p class=\"date\">Date: %d</p>"
#+AUTHOR: Suhail
#+DATE: [2023-08-06 Sun]
#+BIND: org-html-metadata-timestamp-format "%F"
#+OPTIONS: toc:nil
#+OPTIONS: num:nil
#+TITLE: A weary traveler who is lost

* What they might look like
#+caption: A weary traveller
[[../../../static/old-man-suitcase.png]]

* What they might be thinking
#+begin_verse
I grow wearier...
Possibly I am lost,
but I am not yet done.
-- Suhail
#+end_verse
Which, upon export, will result in the below HTML:
When the documentation is insufficient
While org-mode is quite well documented, it also offers a great many ways of configuring its different aspects, including the export process. There are times when the documentation proves to be insufficient. In those moments, we have to look at the documentation in the source code.
For exporting org to HTML, there are two places of interest:
- Documentation of org-export-options-alist. (For different versions of org, alter the git tag as needed.)
- Values of :options-alist for the HTML backend (ox-html) that have OPTIONS and KEYWORD strings as nil.
For instance, to figure out how to modify how timestamps are formatted, we look through the options in ox-html until we come across the line with the option :html-metadata-timestamp-format:
(:html-metadata-timestamp-format nil nil org-html-metadata-timestamp-format)
The format of the above options is the same as that for org-export-options-alist. After the property name, the values, in order, are:
- KEYWORD: A string which denotes the keyword that sets the value of this property.
- OPTION: A string which denotes how to set the value of this property via the OPTIONS keyword.
- DEFAULT: The default value of the property and also the variable whose value can be used to alter the value of the property.
- BEHAVIOR: How to handle multiple keywords, when possible, for the same property. If not provided, the default behaviour is to keep the first value.
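To make the mapping concrete, here is a small sketch that pairs three entries with the org line that sets each property. The first two entries are quoted from org-export-options-alist as we understand it for this version of org, so treat their exact form as an assumption. The BIND line only takes effect when org-export-allow-bind-keywords is enabled, as in the export command used earlier.

```org
# (:author "AUTHOR" nil user-full-name parse)
# KEYWORD is "AUTHOR", so a keyword line sets the property.
#+AUTHOR: Suhail

# (:with-toc nil "toc" org-export-with-toc)
# OPTION is "toc", so the OPTIONS keyword sets the property.
#+OPTIONS: toc:nil

# (:html-metadata-timestamp-format nil nil org-html-metadata-timestamp-format)
# KEYWORD and OPTION are both nil, so the variable itself is bound.
#+BIND: org-html-metadata-timestamp-format "%F"
```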
The documentation for org-html-metadata-timestamp-format confirms our hypothesis that this variable controls how the timestamps are formatted.
"Format used for timestamps in preamble, postamble and metadata. See `format-time-string' for more information on its components."
The referenced function format-time-string is an Emacs function documented in the GNU Emacs Lisp Reference Manual. Among other things it notes:
%F This stands for the ISO 8601 date format, which is like %+4Y-%m-%d except that any flags or field width override the + and (after subtracting 6) the 4.
Equipped with this information we are now finally able to customize the postamble to meet our needs.
Reference: org syntax cheatsheet
Org is primarily about organizing and searching through your plain-text notes. However, it also provides a lightweight yet robust markup language for rich text formatting and more.
Unlike Markdown (which refers to a collection of similar syntaxes), org is a single syntax, similar in that respect to reStructuredText. While the Orgmode website does an excellent job of documenting the details, below we summarize some of the highlights.
Metadata syntax
The “metadata syntax” corresponds to the “and more” part of the above quote. In the present context (that of exporting org-mode to HTML), this syntax is used to affect the export process.
Comment syntax
Lines starting with zero or more whitespace characters followed by one # and a whitespace are treated as comments and, as such, are not exported.
Likewise, regions surrounded by #+BEGIN_COMMENT … #+END_COMMENT are not exported.
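A short sketch of both comment forms; the sentences themselves are placeholder text:

```org
# This line is a comment and is not exported.
   # Leading whitespace before the hash is allowed too.

#+BEGIN_COMMENT
Everything in this region is skipped during export,
even though it spans several lines.
#+END_COMMENT

This paragraph is exported as usual.
```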
Keyword syntax
Keywords are structured according to the following pattern:
#+KEY: VALUE
KEY: A string consisting of any non-whitespace characters, other than call (which would form a babel call element).
VALUE: A string consisting of any characters but a newline.
Some notable keywords of relevance to the export process:
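The keywords used elsewhere in this article, for instance, can appear together at the top of a file. This is only a sketch assembled from the earlier examples, not an exhaustive list of export keywords:

```org
#+TITLE: A weary traveler who is lost
#+AUTHOR: Suhail
#+DATE: [2023-08-06 Sun]
#+OPTIONS: toc:nil num:nil
#+BIND: org-html-metadata-timestamp-format "%F"
```

Note that several items can share a single OPTIONS line, which is equivalent to the separate OPTIONS lines used earlier.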
Property syntax
Properties are key–value pairs. When they are associated with a single entry or with a tree they need to be inserted into a special drawer (see Drawers) with the name PROPERTIES, which has to be located right below a headline, and its planning line (see Deadlines and Scheduling) when applicable. Each property is specified on a single line, with the key, surrounded by colons, first, and the value after it. Keys are case-insensitive. Here is an example:
* CD collection
** Classic
*** Goldberg Variations
:PROPERTIES:
:Title: Goldberg Variations
:Composer: J.S. Bach
:END:
Rich-text syntax
Links
The general link format, … looks like this:
[[LINK][DESCRIPTION]]
or alternatively
[[LINK]]
Additionally, several ways of defining internal links (i.e., within a file such as foo.org) are supported:
- [[#my-custom-id]] will point to a node with CUSTOM_ID property set to my-custom-id.
- [[*My section]] will point to a headline with the name My section.
- [[my target]] will first try and look for (and match to) an occurrence of <<my target>> in the file; if none found, it’ll try and match to an element with the NAME set to my target.
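The three kinds of internal link can be sketched in a single file as follows. The headline names, the custom ID and the target text are the same illustrative names used in the list above:

```org
* My section
:PROPERTIES:
:CUSTOM_ID: my-custom-id
:END:
This paragraph contains a dedicated target: <<my target>>

* Elsewhere in the same file
- By custom ID: [[#my-custom-id][link via CUSTOM_ID]]
- By headline: [[*My section][link via headline name]]
- By target: [[my target][link via dedicated target]]
```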
Paragraphs and text formatting
Paragraphs are separated by at least one empty line. If you need to enforce a line break within a paragraph, use \\ at the end of a line.
In addition, there are also ways to represent blocks of text that preserve line breaks (verse block), quote a passage from another document (quote block), and center some text (center block).
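A brief sketch of a forced line break and of the three block types; all of the text inside the blocks is placeholder content:

```org
First line of a paragraph, \\
forced onto a second line.

#+BEGIN_VERSE
Line breaks in this block
   and its indentation
are preserved on export.
#+END_VERSE

#+BEGIN_QUOTE
A passage quoted from another document.
#+END_QUOTE

#+BEGIN_CENTER
Some centered text.
#+END_CENTER
```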
Text can also be italicized, etc.:
You can make words *bold*, /italic/, _underlined_, =verbatim= and ~code~, and, if you must, +strike-through+. Text in the code and verbatim string is not processed for Org specific syntax; it is exported verbatim.[…]
Sometimes, when marked text also contains the marker character itself, the result may be unsettling… You can use zero width space (Unicode 0x200B) to help Org sorting out the ambiguity.
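For instance, a paragraph using each of these markers might look like the following sketch (the wording is made up for illustration):

```org
This sentence has *bold*, /italic/ and _underlined_ words,
as well as =verbatim text= and ~inline code~.
A +struck-through+ phrase is also possible.
Markup inside ~*code*~ is left untouched.
```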
You can also have superscripts and subscripts:
^ and _ are used to indicate super- and subscripts. To increase the readability of ASCII text, it is not necessary, but OK, to surround multi-character sub- and superscripts with curly braces.
And horizontal lines:
A line consisting of only dashes, and at least 5 of them, is exported as a horizontal line.
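A small sketch of both features, with made-up example text:

```org
The area is 25 m^2 and water is H_2O.
Braces delimit longer scripts: x^{10} and a_{ij}.

-----

Text after the horizontal line.
```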
We can also specify custom HTML attributes:
Org files can also have special directives to the HTML export back-end. For example, by using #+ATTR_HTML lines to specify new format attributes (such as CSS class, inlined style etc.) to, among others, <a> or <img> tags.
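As a sketch, the following attaches a title and an inline style to a link; the attribute values are arbitrary examples:

```org
#+ATTR_HTML: :title The Org mode website :style color:red;
[[https://orgmode.org]]
```

On export, the attributes are expected to end up on the generated <a> tag.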
Footnotes
Two kinds of footnotes are supported:
- Anonymous footnotes (i.e., where the definition is inlined at the point of reference)
- Named footnotes (which may be referenced multiple times)
An inline footnote is as follows:
Some text[fn::An inline footnote.] and then some more text after the footnote.
Whereas a named footnote:
… is started by a footnote marker in square brackets in column 0, no indentation allowed. It ends at the next footnote definition, headline, or after two consecutive empty lines. The footnote reference is simply the marker in square brackets, inside text. Markers always start with fn:. For example:
The Org website[fn:55] now looks a lot better than it used to.
...
[fn:55] The link is: https://orgmode.org
Figures and captions
An image is a link to an image file that does not have a description part, for example
file:./img/cat.jpg
Equivalently, we may also have:
[[./img/cat.jpg]]
We can also add captions:
#+CAPTION: my caption
[[./img/cat.jpg]]
And customize its styling:
#+ATTR_HTML: :width 300px
#+CAPTION: my caption
[[./img/cat.jpg]]
Tables
Any line with | as the first non-whitespace character is considered part of a table. | is also the column separator. Moreover, a line starting with |- is a horizontal rule. It separates rows explicitly. Rows before the first horizontal rule are header lines.
The width of columns is automatically determined by the table editor. The alignment of a column is determined automatically from the fraction of number-like versus non-number fields in the column.
[…]
To set the width of a column, one field anywhere in the column may contain just the string <N> where N specifies the width as a number of characters.[…]
If you would like to overrule the automatic alignment of number-rich columns to the right and of string-rich columns to the left, you can use <r>, <c> or <l> in a similar fashion. You may also combine alignment and field width like this: <r10>.
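A minimal table sketch with a header row, a horizontal rule and a row of alignment and width cookies. The data is made up, and the width of 10 is an arbitrary choice:

```org
| Item   | Qty | Description             |
|--------+-----+-------------------------|
| <l>    | <r> | <10>                    |
| Apples |   3 | Crisp and slightly tart |
| Pears  |  12 | Soft                    |
```

As far as we are aware, a row that contains only such cookies is dropped from the exported output.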
LaTeX and special symbols
Special symbols, such as Greek letters, can be inserted:
You can use LaTeX-like syntax to insert special symbols (named entities) like \alpha to indicate the Greek letter (α), or \to to indicate an arrow (→)… If you need such a symbol inside a word, terminate it with a pair of curly brackets.[…]
During export, these symbols are transformed into the native format of the exporter back-end. Strings like \alpha are exported as &alpha; in the HTML output…
One can also embed LaTeX (which, by default, when exported to HTML will use MathJax, but can also be configured to transcode math into images):
LaTeX fragments do not need any special marking at all. The following snippets are identified as LaTeX source code:
- Environments of any kind. (When MathJax is used, only the environments recognized by MathJax are processed. When dvipng, dvisvgm, or ImageMagick suite is used to create images, any LaTeX environment is handled.) The only requirement is that the \begin statement appears on a new line, preceded by only whitespace.
- Text within the usual LaTeX math delimiters. To avoid conflicts with currency specifications, single $ characters are only recognized as math delimiters if the enclosed text contains at most two line breaks, is directly attached to the $ characters with no whitespace in between, and if the closing $ is followed by whitespace, punctuation or a dash. For the other delimiters, there is no such restriction, so when in doubt, use \(...\) as inline math delimiters.
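The fragment types described above can be sketched as follows; the formulas are arbitrary examples:

```org
The relation \(E = mc^2\) is inline math, and so is $a^2=b$.

\begin{equation}
x = \sqrt{b}
\end{equation}

Display math can also be written as \[ a = -\sqrt{2} \].
```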
Source code
Source code (what org-mode refers to as “Literal examples”) can be embedded using #+BEGIN_SRC and #+END_SRC delimiters. When this is done, the code will be highlighted based on the configured syntax highlighting in Emacs. A consequence of this fact is that simply by adding syntax highlighting capabilities to your editor (assuming you use Emacs as your editor), one can get syntax highlighting in the exported output (using the htmlize Emacs package for HTML output format). For monospace content, the #+BEGIN_EXAMPLE and #+END_EXAMPLE delimiters can be used instead. Additionally,
Both in example and in src snippets, you can add a -n switch to the end of the #+BEGIN line, to get the lines of the example numbered. The -n takes an optional numeric argument specifying the starting line number of the block. If you use a +n switch, the numbering from the previous numbered snippet is continued in the current one. The +n switch can also take a numeric argument. This adds the value of the argument to the last line of the previous block to determine the starting line number.
There’s also the ability to link to specific lines in the source code as well as the ability to highlight the specific line in the code example when hovering over a reference in the generated HTML.
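The numbering switches can be sketched with two consecutive blocks. The starting number of 20 and the offset of 10 are arbitrary, and the comments inside the blocks state the line numbers we expect in the export:

```org
#+BEGIN_SRC emacs-lisp -n 20
  ;; This exports with line number 20.
  (message "This is line 21")
#+END_SRC

#+BEGIN_SRC emacs-lisp +n 10
  ;; This is listed as line 31.
  (message "This is line 32")
#+END_SRC
```

The second block starts at 31 because the first block ends at line 21 and the +n argument adds 10 to that.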
Comments
Comments can be left on Twitter and Mastodon.
This post was announced by The Weary Travelers blog (@wearyTravlrsBlg) on August 6, 2023.