by roundabout, Thursday, 4 April 2024, 11:27:00 (1712230020), pushed by roundabout, Wednesday, 31 July 2024, 06:54:44 (1722408884)
Author identity: vlad <vlad.muntoiu@gmail.com>
af4154b9fc587658b00c9c13852d56c5029166a1
doc/enduser/Formatting messages.md
@@ -0,0 +1,429 @@
Formatting messages
===================
Formatting in 30 seconds
------------------------
To format messages you use special syntax.
* `*italic*` or `_italic_` becomes *italic*.
* `**bold**` or `__bold__` becomes **bold**.
* `~~strikethrough~~` becomes ~~strikethrough~~.
* `++inserted++` becomes ++inserted++.
* `--deleted--` becomes --deleted--.
* `[link](https://wikipedia.org)` becomes [link](https://wikipedia.org).
* `![alt text](url)` becomes an image.
* `` `code` `` becomes `code`.
* Paragraphs are separated by a blank line.
* `~~~` or `~~~language` starts a code block; `~~~` ends it.
* `#` starts a heading. The number of `#`s determines the level.
* `>` starts a blockquote.
* `*` or `<number>. ` starts a list item.
* `;` makes a comment.
* A backslash at the end of a line forces a line break.
Introduction
------------
The roundabout uses a markdown-like syntax to format messages, including
commit messages, comments, descriptions, files and so on.
It is designed to be similar in philosophy to markdown, and markdown
users should find it very familiar (this is why we provisionally use
Markdown's file extension and MIME types, until there is more adoption).
Currently, not all features of markdown (as described in Gruber's initial
release) are supported, but the most common use cases should be covered.
Conversely, some extra features are available, primarily ones useful in
a code review environment. Some special cases are handled differently
compared to Gruber's markdown.
Very importantly, it is *******not******* CommonMark compliant!
There is also no formal specification for the syntax yet; first we need
to stabilise it.
The syntax does not specify a particular output format, but the reference
implementation shipped here outputs HTML.
Philosophy
----------
Like the original markdown, the goal with this syntax is to make it possible
to write _plain_ text that has formatting as a bonus when it's supported.
A document should be readable and publishable even if a rendered version is
not available, in its raw form, without looking like it's been marked up
with tags or formatting instructions.
* **Easy to read:** The syntax should be understandable enough
so that it is readable without rendering it, even for someone not familiar
with this syntax.
* **Easy to write:** The syntax should be easy to write using a
regular keyboard, only in plain text. The renderer will try its best to
understand the human.
* However, it should still be easy to read. The syntax should look like
what it represents, even if that makes it a little harder to write.
* **Predictable:** Edge cases should be avoided as much as
possible, as long as the renderer still understands the writer's intent.
* **Easy (and fast) to render:** A reasonably fast renderer should
be possible to write with a reasonable amount of effort and a few lines of
code in a high-level language.
* **Unambiguous:** There should not be confusion in parsing.
Precedence should be clear, and it should always prioritise the most likely
interpretation.
* **Extensible:** Without breaking forward compatibility, it should
be possible to add new features to the syntax programmatically, including
special interest features like charts, sheet music, or code review features.
* **Semantic:** Available features should never imply altering
the document's presentation, but rather its structure and meaning. Clients
could style all features as they wish, as long as they reasonably make sense.
Allowing the user to add stylesheets is fine, as long as the document itself
doesn't contain any style information, and they're not mandatory.
* **Secure:** The syntax should not allow for arbitrary code execution, or any
other security risk. It should be safe to render untrusted documents.
* **Familiar and traditional:** Users should be able to learn it in a few minutes.
It should be similar to markdown, and follow traditions rooted in plain text
mediums, such as email, Usenet, IRC or typewriters.
Structure and terminology
-------------------------
A document is a sequence of _blocks_. Many blocks, but not all, can contain other
_child_ blocks too. Most blocks can hold content, in which case they can also hold
_inline elements_.
Most inline elements can hold other inlines as well, but they can't hold blocks.
Some elements, both blocks and inlines, can have _attributes_. Generally these are
specified in a clear way in the syntax.
Any sequence of characters is a valid document.
Blocks
------
### Void
A void is just a blank line: one that is empty or contains only spaces.
It is used to force block separation and is not output in the rendered document.
~~~
block 1

block 2
~~~
turns into
block 1
block 2
#### Comment
If a line starts with `;`, it is turned into a void, to allow the writer to
add comments to the document. This is not output in the rendered document,
like any other void. It also separates blocks.
~~~
block 1
; this is a comment
block 2
~~~
turns into
block 1
; this is a comment
block 2
Plain text mediums have no tradition of leaving comments, so the
character `;` was chosen because it is very unlikely to appear at the
start of a line.
The space is not required. The line just has to begin with a semicolon.
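As a rough illustration of the two rules above, here is a minimal Python sketch of how a renderer might recognise voids and use them to separate blocks. The names `is_void` and `split_blocks` are hypothetical and are not taken from the reference implementation.

```python
def is_void(line: str) -> bool:
    """Return True for a void: a blank line, or a comment starting with ';'."""
    # A blank line is empty or contains only spaces.
    if line.strip(" ") == "":
        return True
    # A comment only has to begin with a semicolon; no space is needed after it.
    return line.startswith(";")


def split_blocks(text: str) -> list[list[str]]:
    """Group consecutive non-void lines; voids separate groups and are dropped."""
    groups: list[list[str]] = []
    current: list[str] = []
    for line in text.splitlines():
        if is_void(line):
            if current:
                groups.append(current)
                current = []
        else:
            current.append(line)
    if current:
        groups.append(current)
    return groups
```

With this sketch, `split_blocks("block 1\n; this is a comment\nblock 2")` yields two separate groups and the comment is not part of either.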
### Paragraph
Anything not recognised as another block is considered a paragraph. You can
use voids to separate paragraphs. Within a paragraph, whitespace is collapsed
like in HTML: any sequence of spaces becomes one space, and newlines do not
start a new line, although the rendered output can automatically wrap lines.
To force a line break in a paragraph, end the line with a backslash.
> **Important:**\
> Unlike in standard markdown, two trailing spaces do not force a line break.
> That syntax is often considered confusing: in most editors, trailing spaces
> are invisible.
>
> Instead, use a backslash at the end of the line.
~~~
This is a paragraph.
This is a new line in the same paragraph.
It doesn't wrap, unless you force it to.\
Now it wraps.

This is a new paragraph.
~~~
turns into
This is a paragraph.
This is a new line in the same paragraph.
It doesn't wrap, unless you force it to.\
Now it wraps.
This is a new paragraph.
Paragraphs can't contain other blocks because it wouldn't make sense, but
they can contain inline elements.
When a block that can contain other blocks has plain text, it is
automatically a paragraph, since the parsing is recursive.
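The whitespace and line break rules for paragraphs could be sketched roughly as follows. This is only an illustration: the function name `render_paragraph`, the `<p>` and `<br>` output, and the choice to join source lines with a single space are assumptions, not a description of the reference implementation.

```python
import html
import re


def render_paragraph(lines: list[str]) -> str:
    """Render the source lines of one paragraph as an HTML string."""
    parts: list[str] = []
    for line in lines:
        # A backslash at the end of the line forces a line break.
        hard_break = line.endswith("\\")
        if hard_break:
            line = line[:-1]
        # Collapse any run of whitespace into one space, like HTML does.
        text = re.sub(r"\s+", " ", line).strip()
        parts.append(html.escape(text, quote=False))
        # Assumption: an ordinary newline behaves like a single space.
        parts.append("<br>" if hard_break else " ")
    # Drop the separator that follows the last line.
    body = "".join(parts[:-1])
    return f"<p>{body}</p>"
```

For example, `render_paragraph(["a   b", "c\\", "d"])` gives `<p>a b c<br>d</p>`. Inline elements are deliberately left out of the sketch.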
### Heading
#### ATX-style heading
ATX-style headings are lines which start with one or more `#`, then a space.
The number of `#`s determines the heading level.
* For H1 you would use `#`.
* For H2 you would use `##`.
* For H3 you would use `###`.
* For H4 you would use `####`.
* For H5 you would use `#####`.
* For H6 you would use `######`.
~~~
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6
~~~
turns into
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6
> **Important:**\
> Unlike in standard markdown, a space after the `#` is required.
> This is because it is more readable and less likely to convert
> unintended sequences of `#`s into headings.
> Additionally, markdown had a rarely-used feature that allowed
> closing headers with `#`s, which is not supported here.
#### Setext-style heading
Setext headings are only available for H1 and H2. To create an H1,
underline it with `=`, and to create an H2, underline it with `-`.
~~~
Heading 1
=========
Heading 2
---------
~~~
turns into
Heading 1
=========
Heading 2
---------
The number of underlines doesn't matter; you can even use just one.
However, it looks nicer if you match the width of the heading.
I personally prefer this style where available as it gives more visual
weight to the heading and matches traditions.
All headings are single-line. Headings cannot contain other blocks,
but they can contain inline elements.
### Fence (code block)
A fence is a block that starts and ends with either `` ``` ``
or `~~~`. Generally it is used for program code.
After the opening fence, on the same line, there can be a language
descriptor. Usually it is used to apply syntax highlighting to the
code inside, but it could be used for other purposes.
~~~
```text
This is a code block.
```
~~~
turns into
```text
This is a code block.
```
The language descriptor is optional. If it is not present, the
renderer should not apply syntax highlighting, and the block is
assumed to be plain text.
Obviously, since the fence contains text that should be rendered
specially, it can't contain any other blocks or inlines.
Indented code blocks are currently not supported, but they might
be in the future.
### Blockquote
A blockquote is a block that starts with `>`. It is used to quote
other text, or to set aside text that is not part of the main
content.
A space after the `>` is not required, but recommended for
readability.
Blockquotes can contain other blocks, but not inlines directly. Any
inline content ends up inside a paragraph anyway.
~~~
> This is a blockquote.
> It can contain multiple lines.
> ## And even other blocks.
> > And they can be nested.
> And you can be lazy.
>
> If this isn't what you want, it can have voids too.
~~~
turns into
> This is a blockquote.
> It can contain multiple lines.
> ## And even other blocks.
> > And they can be nested.
> And you can be lazy.
>
> If this isn't what you want, it can have voids too.
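Since blockquotes are parsed recursively, one simple approach is to strip the `>` prefix from each line and feed the result back into the block parser. The sketch below shows only the stripping step. The name `unquote` is hypothetical, and treating a line without `>` as a lazy continuation (as in markdown) is an assumption.

```python
def unquote(lines: list[str]) -> list[str]:
    """Strip one level of '>' quoting so the content can be parsed as blocks."""
    inner: list[str] = []
    for line in lines:
        if line.startswith(">"):
            line = line[1:]
            # The space after '>' is optional, so remove at most one.
            if line.startswith(" "):
                line = line[1:]
        # Assumption: a line without '>' is a lazy continuation, kept as is.
        inner.append(line)
    return inner
```

A nested quote such as `> > And they can be nested.` comes out as `> And they can be nested.`, which the next round of parsing sees as a blockquote again.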
### List
A list is a sequence of list items. List items are represented by lines
that start with `*`, `+`, or `-`, followed by a space, or with a number
followed by a period. The first type creates a bullet (unordered) list,
and the second type creates a numbered (ordered) list. For ordered lists,
the number doesn't matter; it always begins from 1 in HTML.
List items can contain other blocks.
A list item can have continuation lines and contain multiple blocks.
Continuation lines are indented with _two spaces_.
> **Important:**\
> Unlike in standard markdown, a space after the `*`, `+` or `-` is
> required.
> Also unlike in standard markdown, parsing works normally inside
> list items, but continuation lines must be indented with two spaces,
> not four. This is because it looks nicer with unordered lists:
> ~~~
> * This is a list item.
>   This is a continuation line.
> ; this version vs.
> * This is a list item.
>     This is a continuation line.
> ; real markdown
> ~~~
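The list rules can be sketched in Python like this. The name `parse_items` is hypothetical, requiring a space after `<number>.` is an assumption based on the quick reference at the top, and precedence against other blocks (for example a horizontal rule written as `* * *`) is not handled.

```python
import re

# A bullet ('*', '+' or '-') or a number with a period, then a space.
LIST_ITEM = re.compile(r"^(?:([*+-])|(\d+)\.) (.*)$")


def parse_items(lines: list[str]) -> list[tuple[str, list[str]]]:
    """Collect list items as (kind, content_lines) until a non-list line."""
    items: list[tuple[str, list[str]]] = []
    for line in lines:
        match = LIST_ITEM.match(line)
        if match:
            kind = "unordered" if match.group(1) else "ordered"
            items.append((kind, [match.group(3)]))
        elif line.startswith("  ") and items:
            # Continuation lines are indented with two spaces.
            items[-1][1].append(line[2:])
        else:
            break
    return items
```

Each item's `content_lines` could then be parsed as blocks again, which is how a list item ends up containing paragraphs, fences or nested lists.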
### Horizontal rule
A horizontal rule is a line that contains only hyphens, underscores,
or asterisks, and optionally spaces. The length must be at least 3.
~~~
---------------
________
* * *
~~~
turns into
---------------
________
* * *
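A possible check for horizontal rules, as a Python sketch. The text does not say whether different characters may be mixed or whether spaces count towards the length, so this sketch assumes one kind of character per rule and counts only those characters. The name `is_horizontal_rule` is hypothetical.

```python
def is_horizontal_rule(line: str) -> bool:
    """Return True if the line is made of 3+ identical '-', '_' or '*' marks."""
    # Spaces are allowed anywhere, so ignore them.
    marks = line.replace(" ", "")
    if len(marks) < 3:
        return False
    # Assumption: all marks must be the same character.
    return marks[0] in "-_*" and marks == marks[0] * len(marks)
```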
### Summary of blocks
1. **Void:** blank line, or line starting with `;`; never output.
2. **Paragraph:** anything not recognised as another block, can contain
inline elements but not blocks.
3. **Heading:** prefix with `#` or underline, single-line, can contain
inline elements.
4. **Fence:** code block, starts and ends with `~~~` or `` ``` ``,
can contain only text, optionally with a language descriptor.
5. **Blockquote:** starts with `>`, can contain other blocks with recursive
parsing.
6. **List:** starts with `*`, `+`, `-`, or `<number>.`, continuation lines
are indented with two spaces, can contain other blocks with recursive
parsing.
7. **Horizontal rule:** line of at least 3 hyphens, underscores,
or asterisks, optionally with spaces.
Inline elements
---------------
### Text
Technically not an inline element; it is represented by plain text in the
output. Anything not otherwise recognised is text.
### Emphasis
Emphasis is represented by `*` or `_` surrounding the text. When using `_`,
the emphasis must be surrounded by spaces or the start/end of the line.
There can be at most 7 markers on each side, the number of opening markers
must be exactly the same as the number of closing markers,
and `*` and `_` cannot be mixed, unlike in standard markdown. (Maybe this will
change in the future, but we'll see.) Mixing them does sometimes work, but only
when the nesting makes sense.
The number of markers is interpreted as a sum of powers of 2, where 1 stands
for the first level of emphasis, 2 for the second and 4 for the third. If the
number is 4 or more, the third level applies; if the remainder is 2 or more,
the second level applies; and if 1 is still left over, the first level
applies. You can see how they look below:
1. *level 1*
2. **level 2**
3. ***level 1+2***
4. ****level 3****
5. *****level 1+3*****
6. ******level 2+3******
7. *******level 1+2+3*******
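The "sum of powers of 2" rule is just a three-bit mask, which a short Python sketch makes explicit. The function name `emphasis_levels` is hypothetical; how each level is styled is up to the client.

```python
def emphasis_levels(marker_count: int) -> list[int]:
    """Return the emphasis levels switched on by a given number of markers."""
    if not 1 <= marker_count <= 7:
        raise ValueError("there can be between 1 and 7 markers on each side")
    # Bit 0 (value 1) is level 1, bit 1 (value 2) is level 2,
    # bit 2 (value 4) is level 3.
    return [level for level in (1, 2, 3) if marker_count & (1 << (level - 1))]
```

For instance, `emphasis_levels(5)` returns `[1, 3]`, matching the `*****level 1+3*****` entry in the list above.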
### Strikethrough
Strikethrough is represented by `~~` surrounding the text.
### Diff marker
Diff markers are represented by `++` on either side of the text for inserted
text and `--` for deleted text.
### Link
Links are represented by `[text](url)`. The text can contain inline elements.
Title attributes are currently not supported, but they will be.
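To illustrate the link shape, here is a deliberately simple Python sketch that finds `[text](url)` pairs with a regular expression. The name `find_links` is hypothetical, and the pattern is an assumption that ignores harder cases such as brackets inside the link text, so it should not be read as how the reference renderer works.

```python
import re

# '[text](url)': text without ']', url without ')' or whitespace.
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")


def find_links(text: str) -> list[tuple[str, str]]:
    """Return (link_text, url) pairs found in a piece of inline text."""
    return LINK.findall(text)
```

The link text returned here would still need to be parsed for inline elements, since links can contain them.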
#### Image
Images are represented by `![alt text](url)`.
### Code
Code is represented by `` ` `` (a single backtick) surrounding the text.
Using two backticks is not currently supported.
### Summary of inline elements
1. **Text:** anything not recognised as another inline element.
2. **Emphasis:** `*` or `_` surrounding the text, at most 7 markers on each side.
3. **Strikethrough:** `~~` surrounding the text.
4. **Diff marker:** `++` or `--` surrounding the text.
5. **Link:** `[text](url)`, text can contain inline elements.
6. **Image:** `![alt text](url)`.
7. **Code:** `` ` `` surrounding the text.