-.\" $Id: mandoc_char.7,v 1.70 2018/08/08 14:03:27 schwarze Exp $
+.\" $Id: mandoc_char.7,v 1.78 2020/10/31 11:45:16 schwarze Exp $
.\"
.\" Copyright (c) 2003 Jason McIntyre <jmc@openbsd.org>
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
-.\" Copyright (c) 2011, 2013, 2015, 2017 Ingo Schwarze <schwarze@openbsd.org>
+.\" Copyright (c) 2011,2013,2015,2017-2020 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
-.Dd $Mdocdate: August 8 2018 $
+.Dd $Mdocdate: October 31 2020 $
.Dt MANDOC_CHAR 7
.Os
.Sh NAME
names; instead, provide ASCII transcriptions of the names.
.Ss Dashes and Hyphens
In typography there are different types of dashes of various width:
-the hyphen (-),
+the hyphen (\(hy),
the en-dash (\(en),
the em-dash (\(em),
and the mathematical minus sign (\(mi).
lorry-driver
.Ed
.Pp
-If a word on a text input line contains a hyphen, a formatter may decide
-to insert an output line break after the hyphen if that helps filling
-the current output line, but the whole word would overflow the line.
-If it is important that the word is not broken across lines in this
-way, a zero-width space
-.Pq Sq \e&
-can be inserted before or after the hyphen.
-While
-.Xr mandoc 1
-never breaks the output line after hyphens adjacent to a zero-width
-space, after any of the other dash- or hyphen-like characters
-represented by escape sequences, or after hyphens inside words in
-macro arguments, other software may not respect these rules and may
-break the line even in such cases.
-.Pp
-Some
-.Xr roff 7
-implementations contains dictionaries allowing to break the line
-at syllable boundaries even inside words that contain no hyphens.
-Such automatic hyphenation is not supported by
-.Xr mandoc 1 ,
-which only breaks the line at whitespace, and inside words only
-after existing hyphens.
-.Pp
The en-dash is used to separate the two elements of a range,
or can be used the same way as an em-dash.
It should be written as
.Fl T Cm utf8
and
.Fl T Cm html .
-But currently, no practically relevant manual page formatter actually
-requires that subtlety, so in manual pages just write plain
+But currently, no practically relevant manual page formatter requires
+that subtlety, so in manual pages, it is sufficient to write plain
.Sq -
to represent hyphen, minus, and hyphen-minus.
+.Pp
+If a word on a text input line contains a hyphen, a formatter may decide
+to insert an output line break after the hyphen if that helps to fill
+the current output line when the whole word would overflow it.
+If it is important that the word is not broken across lines in this
+way, a zero-width space
+.Pq Sq \e&
+can be inserted before or after the hyphen.
+While
+.Xr mandoc 1
+never breaks the output line after hyphens adjacent to a zero-width
+space, after any of the other dash- or hyphen-like characters
+represented by escape sequences, or after hyphens inside words in
+macro arguments, other software may not respect these rules and may
+break the line even in such cases.
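+.Pp
+As a minimal illustration, using the word from the example above,
+a zero-width space placed directly after the hyphen asks the
+formatter to keep the compound on one output line:
+.Bd -literal -offset indent
+lorry-\e&driver
+.Ed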
+.Pp
+Some
+.Xr roff 7
+implementations contain dictionaries that allow breaking the line
+at syllable boundaries even inside words that contain no hyphens.
+Such automatic hyphenation is not supported by
+.Xr mandoc 1 ,
+which only breaks the line at whitespace, and inside words only
+after existing hyphens.
.Ss Spaces
To separate words in normal text, for indenting and alignment
in literal context, and when none of the following special cases apply,
.Xr roff 7
manual.
.Pp
-Spacing:
+Spaces, non-breaking unless stated otherwise:
.Bl -column "Input" "Description" -offset indent -compact
.It Em Input Ta Em Description
-.It Sq \e\ \& Ta unpaddable non-breaking space
-.It \e\(ti Ta paddable non-breaking space
-.It \e0 Ta unpaddable, breaking digit-width space
+.It Sq \e\ \& Ta unpaddable space
+.It \e\(ti Ta paddable space
+.It \e0 Ta digit-width space
.It \e| Ta one-sixth \e(em narrow space, zero width in nroff mode
.It \e^ Ta one-twelfth \e(em half-narrow space, zero width in nroff
.It \e& Ta zero-width space
+.It \e) Ta zero-width space transparent to end-of-sentence detection
.It \e% Ta zero-width space allowing hyphenation
+.It \e: Ta zero-width space allowing line break
.El
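+.Pp
+As a sketch of how the last entry might be used, a long path name
+can be given optional break points after each slash.
+The path shown here is only a placeholder,
+and other formatters may not support this escape sequence:
+.Bd -literal -offset indent
+/usr/\e:share/\e:example/\e:file
+.Ed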
.Pp
Lines:
.It \e(\(aqI Ta \('I Ta acute I
.It \e(\(aqO Ta \('O Ta acute O
.It \e(\(aqU Ta \('U Ta acute U
+.It \e(\(aqY Ta \('Y Ta acute Y
.It \e(\(aqa Ta \('a Ta acute a
.It \e(\(aqe Ta \('e Ta acute e
.It \e(\(aqi Ta \('i Ta acute i
.It \e(\(aqo Ta \('o Ta acute o
.It \e(\(aqu Ta \('u Ta acute u
+.It \e(\(aqy Ta \('y Ta acute y
.It \e(\(gaA Ta \(`A Ta grave A
.It \e(\(gaE Ta \(`E Ta grave E
.It \e(\(gaI Ta \(`I Ta grave I
and
.Sq \e*[N]
.Pq N-character .
-For details, see the
-.Em Predefined Strings
-subsection of the
-.Xr roff 7
-manual.
.Bl -column "Input" "Rendered" "Description" -offset indent
.It Em Input Ta Em Rendered Ta Em Description
.It \e*(Ba Ta \*(Ba Ta vertical bar
.Xr mandoc 1
also supports the
.Pp
-.Dl \eN\(aq Ns Ar number Ns \(aq
+.Dl \eN\(aq Ns Ar number Ns \(aq and \e[ Ns Cm char Ns Ar number ]
.Pp
-escape sequence, inserting the character
+escape sequences, inserting the character
.Ar number
from the current character set into the output.
Of course, this is inherently non-portable and is already marked
-as deprecated in the Heirloom roff manual.
-For example, do not use \eN\(aq34\(aq, use \e(dq, or even the plain
+as deprecated in the Heirloom roff manual;
+on top of that, the second form is a GNU extension.
+For example, instead of \eN\(aq34\(aq or \e[char34], use \e(dq,
+or even the plain
.Sq \(dq
character where possible.
.Sh COMPATIBILITY