How to overcome syntactic limitations in org-mode
HTML markup is expressive but verbose, which has given rise to several lightweight markup languages that can be exported to HTML (e.g., the Markdown family of syntaxes, reStructuredText, and AsciiDoc).
Each lightweight markup language trades off expressivity against readability. As such, not every valid HTML construct is directly expressible in every lightweight markup language. When describing tabular information using org syntax, for instance, the following aren’t directly expressible:
- a table cell containing a list
- a table cell containing a nested table
By contrast, tables in reStructuredText can contain nested tables as well as lists:
+------------+------------+----------------------+
| Header 1   | Header 2   | Header 3             |
+============+============+======================+
| body row 1 | column 2   | column 3             |
+------------+------------+----------------------+
| body row 2 | Cells may span columns.           |
+------------+------------+----------------------+
| body row 3 | Cells may  | - Cells              |
+------------+ span rows. | - can contain        |
|            |            |                      |
| body row 4 |            |   - lists            |
|            |            |                      |
|            |            |   - with some nesting|
+------------+------------+----------------------+
Can we selectively make use of reStructuredText’s syntax within an org-mode
file that we intend to export to HTML?
Evaluation of code blocks
org-mode is much more than a markup language, and there’s a principled way of leveraging some of its capabilities to escape this syntactic limitation. Specifically, by defining evaluation of a code block in a given markup language as exporting it to a particular output format, one can overcome the syntactic limitations that org-mode may pose.
In addition to its ability to export to different output formats via exporter back-ends, org-mode also has extensive literate programming capabilities, such as the ability to extract and evaluate code blocks in different languages.
An important feature of Org’s management of source code blocks is the ability to pass variables, functions, and results to one another using a common syntax for source code blocks in any language. Although most literate programming facilities are restricted to one language or another, Org’s language-agnostic approach lets the literate programmer match each programming task with the appropriate computer language and to mix them all together in a single Org document. This interoperability among languages explains why Org’s source code management facility was named Org Babel by its originators, Eric Schulte and Dan Davison.
To define how a code block in a language <lang> is interpreted, we need to define a function named org-babel-execute:<lang>. Additionally, the convention is to define this function in a package named ob-<lang>. (The latter convention is what org-babel-do-load-languages depends on for enabling evaluation support.)
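As a minimal sketch of this convention, consider a made-up language called shout whose "evaluation" simply upcases the block body. The language name, the file name ob-shout.el, and the behaviour are all assumptions for illustration; only the org-babel-execute:<lang> and ob-<lang> naming comes from the text above.

```elisp
;;; ob-shout.el --- Babel support for a made-up language  -*- lexical-binding: t; -*-

(require 'ob)

;; Hypothetical example: `shout' is not a real language.
(defun org-babel-execute:shout (body _params)
  "Evaluate a hypothetical `shout' block by upcasing BODY.
_PARAMS holds the header arguments, which are ignored here."
  (upcase body))

;; The feature name must match the ob-<lang> convention.
(provide 'ob-shout)
```

With this file on the load-path and shout enabled for evaluation, evaluating a #+begin_src shout block containing hello should yield HELLO as its result.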
But what is export, if not simply a particular kind of evaluation?
Exporting as an instance of evaluation
The python3-docutils package provides rst2html4, a command-line utility that exports rst to HTML. Thus all we need to do in org-babel-execute:rst is invoke rst2html4 with the appropriate parameters.
By default, rst2html4 generates the full HTML page. However, this behaviour can be modified by passing in an explicit --template parameter.
--template=<file>       Template file. (UTF-8 encoded, default:
                        "/usr/lib/python3.11/site-packages/docutils/writers/html4css1/template.txt")
By providing the template below (obtained by trial and error on the default template), we are able to instruct rst2html4 to generate only the content of the <body> tag.
%(body_pre_docinfo)s
%(docinfo)s
%(body)s
We will additionally wrap the generated HTML output in a <div> element with custom classes, so that its styling can be configured.
(require 'org-macs)
(require 'ob)
(require 'ob-dot) ;; we reuse `org-babel-expand-body:dot'

;;; completion support during interactive use
(defconst org-babel-header-args:rst
  '((class . :any)
    (cmd . :any))
  "RST-specific header arguments.")

;;; main/essential code
(defvar org-babel-default-header-args:rst
  '((:results . "html")
    (:class . "")
    (:cmd . "rst2html4"))
  "Default arguments to use when evaluating an RST source block.")

(defun org-babel-execute:rst (body params)
  "Define execution of an `rst-mode' block as exporting.
BODY is the `rst-mode' code block and PARAMS are the header arguments.

This function defines an additional header-argument `:class' which
defines additional classes that need to be added to the wrapping
element when exporting to HTML.

Exporting to outputs other than HTML, while possible, isn't yet
implemented."
  (let ((results (split-string (cdr (assq :results params)))))
    (cond
     ((member "html" results)
      (let* ((classes (cdr (assq :class params)))
             (cmd (cdr (assq :cmd params)))
             (template (org-babel-temp-file "rst-" ".txt"))
             (cmdline (format "--template=%s"
                              (org-babel-process-file-name template)))
             (coding-system-for-read 'utf-8)
             (coding-system-for-write 'utf-8)
             (in-file (org-babel-temp-file "rst-" ".rst"))
             (cmdstring (concat cmd " " cmdline " "
                                (org-babel-process-file-name in-file))))
        (with-temp-file template
          (insert "%(body_pre_docinfo)s\n%(docinfo)s\n%(body)s"))
        (with-temp-file in-file
          (insert (org-babel-expand-body:dot body params)))
        (format "<div class='%s-snippet %s'>\n %s </div>"
                cmd classes
                (org-babel-eval cmdstring ""))))
     ((member "latex" results)
      (error "LaTeX export of RST block not yet implemented"))
     (t (error "Result format not supported")))))

(defun org-babel-prep-session:rst (_session _params)
  "Return an error because RST does not support sessions."
  (error "RST does not support sessions"))

(provide 'ob-rst)
Defining ob-rst.el as above, while necessary, isn’t sufficient by itself. We also have to enable code evaluation for rst.
By default, only Emacs Lisp is enabled for evaluation. To enable or disable other languages, customize the org-babel-load-languages variable either through the Emacs customization interface, or by adding code to the init file as shown next.
In this example, evaluation is enabled for Emacs Lisp as well as reStructuredText.
(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (rst . t)))
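Because enabling a language relies on the ob-<lang> package convention, Emacs has to be able to find ob-rst.el before the call above runs. The fragment below is a sketch of one way to arrange that; the directory (a lisp folder inside the Emacs configuration directory) is an assumption, so substitute wherever ob-rst.el was actually saved.

```elisp
;; Assumption: ob-rst.el lives in the "lisp" subdirectory of the Emacs
;; configuration directory.  Adjust the path to match your own setup.
(add-to-list 'load-path (expand-file-name "lisp" user-emacs-directory))

;; Optional sanity check: warn early if the library cannot be found.
(unless (locate-library "ob-rst")
  (message "ob-rst.el is not on `load-path'; rst blocks will not evaluate"))
```

This fragment needs to appear in the init file before the call to org-babel-do-load-languages.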
Using the above, we are able to use rst syntax to define richer tables and have them converted to HTML automatically during export. For instance, this rst-mode snippet in an org file:
#+begin_src rst :exports results :eval yes
+------------+------------+----------------------+
| Header 1   | Header 2   | Header 3             |
+============+============+======================+
| body row 1 | column 2   | column 3             |
+------------+------------+----------------------+
| body row 2 | Cells may span columns.           |
+------------+------------+----------------------+
| body row 3 | Cells may  | - Cells              |
+------------+ span rows. | - can contain        |
|            |            |                      |
| body row 4 |            |   - lists            |
|            |            |                      |
|            |            |   - with some nesting|
+------------+------------+----------------------+
#+end_src
results in the following HTML table being generated:
Header 1 | Header 2 | Header 3 |
---|---|---|
body row 1 | column 2 | column 3 |
body row 2 | Cells may span columns. | |
body row 3 | Cells may span rows. | Cells; can contain (lists; with some nesting) |
body row 4 | | |
Conclusion
When using org-mode as a lightweight markup language, if a syntactic limitation is encountered (one that is not inherent to the output format being exported to), the remedy is straightforward.
- Identify a lightweight markup language more suited to the task.
- Define an org-babel-execute:<lang> function (in ob-<lang>.el) which exports code in language <lang> to the desired output format. Pandoc may be used if a native conversion facility doesn’t exist.
- Enable <lang> via org-babel-do-load-languages. (Step 4: Profit.)
That’s it. That’s the idea.
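To illustrate the Pandoc route, here is a minimal sketch of the same recipe applied to Markdown. It assumes a pandoc executable is on the PATH; the language name markdown, the file name ob-markdown.el, and the default header arguments are choices made for this example, not something prescribed above.

```elisp
;;; ob-markdown.el --- sketch: evaluate Markdown blocks via Pandoc  -*- lexical-binding: t; -*-

(require 'ob)

(defvar org-babel-default-header-args:markdown
  '((:results . "html"))
  "Default header arguments for Markdown blocks (assumed for this sketch).")

(defun org-babel-execute:markdown (body _params)
  "Export the Markdown in BODY to an HTML fragment using Pandoc.
Assumes a `pandoc' executable is available on the PATH."
  ;; `org-babel-eval' feeds BODY to the command on standard input and
  ;; returns whatever the command writes to standard output.
  (org-babel-eval "pandoc --from=markdown --to=html" body))

(provide 'ob-markdown)
```

As with rst, the language still has to be enabled by adding (markdown . t) to org-babel-load-languages.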
The Weary Travelers blog (@wearyTravlrsBlg), September 4, 2023