A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.
For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another application for such languages is data entry in web publishing, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.
The 1986 international standard SGML provided facilities to define and parse lightweight markup languages using grammars and tag implication. The 1998 W3C XML is a profile of SGML that omits these facilities. However, no SGML document type definition (DTD) for any of the languages listed below is known.
Most languages distinguish between markup for lines or blocks and for shorter spans of text, but some only support inline markup.
Some markup languages are tailored for a specific purpose, such as documenting computer code (e.g. POD, RD) or being converted to a certain output format (usually HTML) and nothing else; others are more general in application. This includes whether they are oriented toward textual presentation or toward data serialization.
Presentation-oriented languages include AsciiDoc, atx, BBCode, Creole, Crossmark, Epytext, Haml, JsonML, MakeDoc, Markdown, Org-mode, POD, reStructuredText, RD, Setext, SiSU, SPIP, Xupl, Texy!, Textile, txt2tags, UDO and Wikitext.
Data-serialization-oriented languages include Curl (homoiconic, but also reads JSON; every object serializes), JSON, and YAML.
+ Comparing language features |
Markdown's own syntax does not support class attributes or id attributes; however, since Markdown supports the inclusion of native HTML code, these features can be implemented using direct HTML. (Some extensions may support these features.)
txt2tags' own syntax does not support class attributes or id attributes; however, since txt2tags supports inclusion of native HTML code in tagged areas, these features can be implemented using direct HTML when saving to an HTML target.
LMLs sometimes differ for multi-word markup: some require the markup characters to replace the inter-word spaces (infix). Some languages require a single character as prefix and suffix; others need doubled or even tripled ones, or support both with slightly different meanings, e.g. different levels of emphasis.
+ Comparison of text formatting syntax |
<strong>strongly emphasized</strong> | <em>emphasized text</em> | <code>code</code> |
<b>bold text</b> | <i>italic text</i> | <tt>monospace text</tt> | presentational HTML tags |
AsciiDoc | Can double operators to apply formatting where there is no word boundary. |
Gemtext does not have any inline formatting. Monospaced text (known as preformatted text in the Gemini community) must be enclosed by opening and closing ``` delimiters, each on its own line.
Microsoft Word and Outlook, and accordingly other word processors and mail clients that strive for a similar user experience, support the basic convention of using asterisks for boldface and underscores for italic style. While Word removes the characters, Outlook retains them.
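The asterisk and underscore convention described above can be sketched as a small converter. This is only an illustration, not the behavior of Word, Outlook or any particular LML: the function name `render_emphasis`, the two regular expressions and the choice of `<b>`/`<i>` as output tags are assumptions made for this example.

```python
import html
import re

# Assumed convention for this sketch: *text* is bold, _text_ is italic.
# A marker only counts when it is not glued to a word character on the
# outside and not followed/preceded by whitespace on the inside.
BOLD = re.compile(r"(?<![\w*])\*(?=\S)(.+?)(?<=\S)\*(?![\w*])")
ITALIC = re.compile(r"(?<![\w_])_(?=\S)(.+?)(?<=\S)_(?![\w_])")


def render_emphasis(text: str) -> str:
    """Convert *bold* and _italic_ spans in one line of text to HTML."""
    # Escape HTML special characters first so raw input cannot inject tags.
    escaped = html.escape(text, quote=False)
    escaped = BOLD.sub(r"<b>\1</b>", escaped)
    return ITALIC.sub(r"<i>\1</i>", escaped)
```

Requiring a word boundary outside the markers keeps identifiers such as `snake_case_name` and arithmetic such as `2 * 3 * 4` untouched, which is one reason some languages offer doubled operators for formatting inside words.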
+ Italic type or normal emphasis |
Code ! AsciiDoc !! ATX !! Creole !! Jira !! Markdown !! MediaWiki !! Org-mode !! PmWiki !! reST !! Setext !! Slack !! Textile !! Texy! !! TiddlyWiki !! txt2tags !! WhatsApp |
+ Bold face or strong emphasis |
Code ! AsciiDoc !! ATX !! Creole !! Jira !! Markdown !! MediaWiki !! Org-mode !! PmWiki !! reST !! Setext !! Slack !! Textile !! Texy! !! TiddlyWiki !! txt2tags !! WhatsApp |
+ Underlined or inserted text |
Code ! Jira !! Markdown !! Org-mode !! Setext !! TiddlyWiki !! txt2tags |
AsciiDoc, ATX, Creole, MediaWiki, PmWiki, reST, Slack, Textile, Texy! and WhatsApp do not support dedicated markup for underlining text.
+ Strike-through or deleted text |
Code ! Jira !! Markdown !! Org-mode !! Slack !! TiddlyWiki !! txt2tags !! WhatsApp |
AsciiDoc, ATX, Creole, MediaWiki, PmWiki, reST, Setext, Textile and Texy! do not support dedicated markup for striking through text.
+ Monospaced font, teletype text or code |
Code ! AsciiDoc !! ATX !! Creole !! Gemtext !! Jira !! Markdown !! Org-mode !! PmWiki !! reST !! Slack !! Textile !! Texy! !! TiddlyWiki !! txt2tags !! WhatsApp |
MediaWiki, Setext and Gemtext do not provide lightweight markup for inline code spans.
Most LMLs follow one of two styles for headings, either Setext-like underlines or atx-like line markers ("atx, the true structured text format" by Aaron Swartz, 2002), or they support both.
The first style uses underlines, i.e. repeated characters (e.g. equals =, hyphen - or tilde ~, usually at least two or four times) in the line below the heading text.

    Level 1 Heading
    ===============

    Level 2 Heading
    ---------------

    Level 3 Heading
    ~~~~~~~~~~~~~~~
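A minimal sketch of how underlined headings might be recognized follows. The mapping of `=`, `-` and `~` to levels 1, 2 and 3, the minimum underline length of two characters and the function name `parse_underlined_headings` are assumptions for this example; real languages differ on all of these points.

```python
# Assumed mapping from underline character to heading level.
UNDERLINE_LEVELS = {"=": 1, "-": 2, "~": 3}
# Assumed minimum number of underline characters.
MIN_UNDERLINE = 2


def parse_underlined_headings(text):
    """Return (level, title) pairs for headings underlined on the next line."""
    lines = text.splitlines()
    headings = []
    i = 0
    while i < len(lines):
        title = lines[i].strip()
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        is_underline = (
            len(underline) >= MIN_UNDERLINE
            and underline[0] in UNDERLINE_LEVELS
            and underline == underline[0] * len(underline)
        )
        if title and is_underline:
            headings.append((UNDERLINE_LEVELS[underline[0]], title))
            i += 2  # skip the underline that was just consumed
        else:
            i += 1
    return headings
```

The parser needs one line of lookahead, because a line only becomes a heading once the line below it is known to be an underline.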
+ Underlined heading levels ! Chars ! min (minimum number of characters) |
The second style is based on repeated markers (e.g. hash #, equals = or asterisk *) at the start of the heading itself, where the number of repetitions indicates the (sometimes inverse) heading level. Most languages also support repeating the markers at the end of the line, but whereas some make them mandatory, others do not even expect their numbers to match.

    # Level 1 Heading
    ## Level 2 Heading ##
    ### Level 3 Heading ###
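The prefix style can be sketched with a single regular expression. The example assumes hash markers, a maximum of six levels, mandatory whitespace after the prefix and an optional closing run whose length need not match; these choices and the name `parse_prefixed_heading` are illustrative and do not describe any one language.

```python
import re

# One to six leading hashes, whitespace, the title, then an optional
# closing run of hashes whose length is not compared to the prefix.
PREFIX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")


def parse_prefixed_heading(line):
    """Return (level, title) for a prefix-style heading line, else None."""
    match = PREFIX_HEADING.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)
```

A language that counts levels inversely, or that makes the suffix mandatory, would need a different pattern and a different mapping from marker count to level.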
+ Line prefix (and suffix) headings ! Character ! Suffix ! Levels ! Indentation |
Org-mode supports indentation as a means of indicating the level.
BBCode does not support section headings at all.
POD and Textile choose the HTML convention of numbered heading levels instead.
+ Other heading formats |
Microsoft Word supports auto-formatting paragraphs as headings if they contain no more than a handful of words, have no period at the end, and the user hits the Enter key twice. For lower levels, the user may press the Tab key the corresponding number of times before entering the text, i.e. one through eight tabs for heading levels two through nine.
LMLs that are tailored for special setups, e.g. wikis or code documentation, may automatically generate named anchors (for headings, functions etc.) inside the document, link to related pages (possibly in a different namespace) or provide a textual search for linked keywords.
Most languages employ (double) square or angle brackets to surround links, but hardly any two languages are completely compatible. Many can automatically recognize and parse absolute URLs inside the text without further markup.
+ Hyperlink syntax ! Languages ! Basic syntax !! Text syntax !! Title syntax |
Gemtext links have to be on a line by themselves; they cannot be used inline.
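Automatic recognition of absolute URLs, as mentioned above, might look roughly like the following sketch. The pattern is deliberately simple (only `http` and `https`, with trailing punctuation trimmed), and the function name `autolink` is an assumption for this example; production implementations handle many more cases.

```python
import html
import re

# Simplified pattern: scheme, then any run of characters that are not
# whitespace, angle brackets or quotes.
URL = re.compile(r"https?://[^\s<>\"']+")


def autolink(text):
    """Wrap bare http(s) URLs in anchor tags and escape everything else."""
    parts = []
    pos = 0
    for match in URL.finditer(text):
        # Sentence punctuation directly after a URL is usually not part of it.
        url = match.group(0).rstrip(".,;:!?)")
        parts.append(html.escape(text[pos:match.start()]))
        safe = html.escape(url, quote=True)
        parts.append('<a href="{0}">{0}</a>'.format(safe))
        pos = match.start() + len(url)
    parts.append(html.escape(text[pos:]))
    return "".join(parts)
```

A line-oriented language that only allows links on their own line can skip this scanning step entirely, since every link line is identified by its prefix.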
+ Reference syntax ! Languages ! Text syntax !! Title syntax |
Markdown |
Org-mode's normal link syntax does a text search of the file. Dedicated targets can also be defined by enclosing a name in double angle brackets, e.g. <<target>>.
+ Unordered, bullet list items ! Characters ! nest |
Microsoft Word automatically converts paragraphs that start with an asterisk *, hyphen-minus - or greater-than sign > followed by a space or horizontal tab into bullet list items. It will also start an enumerated list for the digit 1 and the case-insensitive letters a (for alphabetic lists) or i (for roman numerals), if they are followed by a period ., a closing round parenthesis ), a greater-than sign > or a hyphen-minus - and a space or tab; in the case of the round parenthesis, an optional opening one ( before the list marker is also supported.
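The triggers described in the preceding paragraph can be written down as two patterns. This is a sketch of the rules as stated in the text, not of Word's actual implementation; the function name `classify_list_start` and the returned labels are assumptions for this example.

```python
import re

# Bullet triggers: asterisk, hyphen-minus or greater-than sign, then space/tab.
BULLET = re.compile(r"^[*\->][ \t]")
# Enumeration triggers: 1, a or i (letters in either case), either wrapped
# in parentheses or followed by one of . ) > - and then a space or tab.
ORDERED = re.compile(r"^(?:\((1|[aAiI])\)|(1|[aAiI])[.)>\-])[ \t]")

LIST_KINDS = {"1": "decimal", "a": "alphabetic", "i": "roman"}


def classify_list_start(paragraph):
    """Return 'bullet', 'decimal', 'alphabetic', 'roman' or None."""
    if BULLET.match(paragraph):
        return "bullet"
    match = ORDERED.match(paragraph)
    if match is None:
        return None
    marker = (match.group(1) or match.group(2)).lower()
    return LIST_KINDS[marker]
```

Only the first enumerator of each kind starts a list in this sketch, so a paragraph beginning with `2.` is left alone.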
Languages differ on whether digits in numbered list items are optional or mandatory, which kinds of enumerators they understand (e.g. decimal digit 1, roman numerals i or I, alphabetic letters a or A) and whether they preserve explicit values in the output format. Some Markdown dialects, for instance, will respect a start value other than 1, but ignore any other explicit value.
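The behavior attributed to some Markdown dialects, keeping the first explicit number and ignoring the rest, can be illustrated as follows. The item pattern (digits followed by a period or closing parenthesis) and the function name `renumber` are assumptions for this sketch rather than the rules of a specific dialect.

```python
import re

# Assumed item syntax: digits, then "." or ")", then whitespace and the text.
ITEM = re.compile(r"^(\d+)[.)][ \t]+(.*)$")


def renumber(lines):
    """Number items consecutively, starting from the first explicit value."""
    items = []
    start = None
    for line in lines:
        match = ITEM.match(line)
        if match is None:
            continue  # not a list item; ignored in this sketch
        if start is None:
            start = int(match.group(1))  # only the first value is respected
        items.append((start + len(items), match.group(2)))
    return items
```

For example, `renumber(["3. apple", "7. pear", "1. plum"])` yields items numbered 3, 4 and 5: the start value survives, the later explicit values do not.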
+ Ordered, enumerated list items ! Chars ! nest |
Slack assists the user in entering enumerated and bullet lists, but does not actually format them as such, i.e. it just includes a leading digit followed by a period and a space, or a bullet character, in front of a line.
+ Labeled, glossary, definition list syntax ! Languages ! Term being defined !! Definition of the term |