X-Git-Url: https://git.cameronkatri.com/mandoc.git/blobdiff_plain/a8bf4b0fd555d5f2c02407d5e3f97a00d18c6296..dcd90e7955626a38ed974eeba3b599fb534e7b62:/mandoc_char.7?ds=sidebyside diff --git a/mandoc_char.7 b/mandoc_char.7 index 20b9f947..8d835665 100644 --- a/mandoc_char.7 +++ b/mandoc_char.7 @@ -1,6 +1,8 @@ -.\" $Id: mandoc_char.7,v 1.43 2011/04/20 22:50:22 kristaps Exp $ +.\" $Id: mandoc_char.7,v 1.56 2013/12/26 17:23:42 schwarze Exp $ .\" -.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@bsd.lv> +.\" Copyright (c) 2003 Jason McIntyre <jmc@openbsd.org> +.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv> +.\" Copyright (c) 2011 Ingo Schwarze <schwarze@openbsd.org> .\" .\" Permission to use, copy, modify, and distribute this software for any .\" purpose with or without fee is hereby granted, provided that the above @@ -14,26 +16,169 @@ .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. .\" -.Dd $Mdocdate: April 20 2011 $ +.Dd $Mdocdate: December 26 2013 $ .Dt MANDOC_CHAR 7 .Os .Sh NAME .Nm mandoc_char .Nd mandoc special characters .Sh DESCRIPTION -This page documents the special characters and predefined strings accepted by +This page documents the +.Xr roff 7 +escape sequences accepted by .Xr mandoc 1 -to format +to represent special characters in .Xr mdoc 7 and .Xr man 7 documents. .Pp -Both -.Xr mdoc 7 -and -.Xr man 7 -encode special characters with +The rendering depends on the +.Xr mandoc 1 +output mode; in ASCII output, most characters are completely +unintelligible. +For that reason, using any of the special characters documented here, +except those discussed in the +.Sx DESCRIPTION , +is strongly discouraged; they are supported merely for backwards +compatibility with existing documents. +.Pp +In particular, in English manual pages, do not use special-character +escape sequences to represent national language characters in author +names; instead, provide ASCII transcriptions of the names. +.Ss Dashes and Hyphens +In typography there are different types of dashes of various width: +the hyphen (-), +the minus sign (\-), +the en-dash (\(en), +and the em-dash (\(em). +.Pp +Hyphens are used for adjectives; +to separate the two parts of a compound word; +or to separate a word across two successive lines of text. +The hyphen does not need to be escaped: +.Bd -unfilled -offset indent +blue-eyed +lorry-driver +.Ed +.Pp +The mathematical minus sign is used for negative numbers or subtraction. +It should be written as +.Sq \e- : +.Bd -unfilled -offset indent +a = 3 \e- 1; +b = \e-2; +.Ed +.Pp +The en-dash is used to separate the two elements of a range, +or can be used the same way as an em-dash. +It should be written as +.Sq \e(en : +.Bd -unfilled -offset indent +pp. 95\e(en97. +Go away \e(en or else! +.Ed +.Pp +The em-dash can be used to show an interruption +or can be used the same way as colons, semi-colons, or parentheses. +It should be written as +.Sq \e(em : +.Bd -unfilled -offset indent +Three things \e(em apples, oranges, and bananas. +This is not that \e(em rather, this is that. +.Ed +.Pp +Note: +hyphens, minus signs, and en-dashes look identical under normal ASCII output. +Other formats, such as PostScript, render them correctly, +with differing widths. +.Ss Spaces +To separate words in normal text, for indenting and alignment +in literal context, and when none of the following special cases apply, +just use the normal space character +.Pq Sq \ . +.Pp +When filling text, output lines may be broken between words, i.e. at space +characters. +To prevent a line break between two particular words, +use the unpaddable non-breaking space escape sequence +.Pq Sq \e\ \& +instead of the normal space character. +For example, the input string +.Dq number\e\ 1 +will be kept together as +.Dq number\ 1 +on the same output line. +.Pp +On request and macro lines, the normal space character serves as an +argument delimiter. +To include whitespace into arguments, quoting is usually the best choice; +see the MACRO SYNTAX section in +.Xr roff 7 . +In some cases, using the non-breaking space escape sequence +.Pq Sq \e\ \& +may be preferable. +.Pp +To escape macro names and to protect whitespace at the end +of input lines, the zero-width space +.Pq Sq \e& +is often useful. +For example, in +.Xr mdoc 7 , +a normal space character can be displayed in single quotes in either +of the following ways: +.Pp +.Dl .Sq \(dq \(dq +.Dl .Sq \e \e& +.Ss Quotes +On request and macro lines, the double-quote character +.Pq Sq \(dq +is handled specially to allow quoting. +One way to prevent this special handling is by using the +.Sq \e(dq +escape sequence. +.Pp +Note that on text lines, literal double-quote characters can be used +verbatim. +All other quote-like characters can be used verbatim as well, +even on request and macro lines. +.Ss Periods +The period +.Pq Sq \&. +is handled specially at the beginning of an input line, +where it introduces a +.Xr roff 7 +request or a macro, and when appearing alone as a macro argument in +.Xr mdoc 7 . +In such situations, prepend a zero-width space +.Pq Sq \e&. +to make it behave like normal text. +.Pp +Do not use the +.Sq \e. +escape sequence. +It does not prevent special handling of the period. +.Ss Backslashes +To include a literal backslash +.Pq Sq \e +into the output, use the +.Pq Sq \ee +escape sequence. +.Pp +Note that doubling it +.Pq Sq \e\e +is not the right way to output a backslash. +Because +.Xr mandoc 1 +does not implement full +.Xr roff 7 +functionality, it may work with +.Xr mandoc 1 , +but it may have weird effects on complete +.Xr roff 7 +implementations. +.Sh SPECIAL CHARACTERS +Special characters are encoded as .Sq \eX .Pq for a one-character escape , .Sq \e(XX @@ -41,53 +186,26 @@ encode special characters with and .Sq \e[N] .Pq N-character . -One may generalise -.Sq \e(XX -as -.Sq \e[XX] -and -.Sq \eX -as -.Sq \e[X] . -Predefined strings are functionally similar to special characters, using -.Sq \e*X -.Pq for a one-character escape , -.Sq \e*(XX -.Pq two-character , -and -.Sq \e*[N] -.Pq N-character . -One may generalise -.Sq \e*(XX -as -.Sq \e*[XX] -and -.Sq \e*X -as -.Sq \e*[X] . -.Pp -Note that each output mode will have a different rendering of the -characters. -It's guaranteed that each input symbol will correspond to a -(more or less) meaningful output rendering, regardless the mode. -.Sh SPECIAL CHARACTERS -These are the preferred input symbols for producing special characters. +For details, see the +.Em Special Characters +subsection of the +.Xr roff 7 +manual. .Pp Spacing: -.Bl -column -compact -offset indent "Input" "Description" +.Bl -column "Input" "Description" -offset indent -compact .It Em Input Ta Em Description -.It \e~ Ta non-breaking, non-collapsing space -.It \e Ta breaking, non-collapsing n-width space -.It \e^ Ta zero-width space -.It \e% Ta zero-width space +.It Sq \e\ \& Ta unpaddable non-breaking space +.It \e~ Ta paddable non-breaking space +.It \e0 Ta unpaddable, breaking digit-width space +.It \e| Ta one-sixth \e(em narrow space, zero width in nroff mode +.It \e^ Ta one-twelfth \e(em half-narrow space, zero width in nroff .It \e& Ta zero-width space -.It \e| Ta zero-width space -.It \e0 Ta breaking, non-collapsing digit-width space -.It \ec Ta removes any trailing space (if applicable) +.It \e% Ta zero-width space allowing hyphenation .El .Pp Lines: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(ba Ta \(ba Ta bar .It \e(br Ta \(br Ta box rule @@ -99,7 +217,7 @@ Lines: .El .Pp Text markers: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(ci Ta \(ci Ta circle .It \e(bu Ta \(bu Ta bullet @@ -118,7 +236,7 @@ Text markers: .El .Pp Legal symbols: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(co Ta \(co Ta copyright .It \e(rg Ta \(rg Ta registered @@ -126,7 +244,7 @@ Legal symbols: .El .Pp Punctuation: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(em Ta \(em Ta em-dash .It \e(en Ta \(en Ta en-dash @@ -138,7 +256,7 @@ Punctuation: .El .Pp Quotes: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(Bq Ta \(Bq Ta right low double-quote .It \e(bq Ta \(bq Ta right low single-quote @@ -155,7 +273,7 @@ Quotes: .El .Pp Brackets: -.Bl -column -compact -offset indent "xxbracketrightbpx" Rendered Description +.Bl -column "xxbracketrightbpx" Rendered Description -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(lB Ta \(lB Ta left bracket .It \e(rB Ta \(rB Ta right bracket @@ -194,7 +312,7 @@ Brackets: .El .Pp Arrows: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(<- Ta \(<- Ta left arrow .It \e(-> Ta \(-> Ta right arrow @@ -211,7 +329,7 @@ Arrows: .El .Pp Logical: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(AN Ta \(AN Ta logical and .It \e(OR Ta \(OR Ta logical or @@ -226,7 +344,7 @@ Logical: .El .Pp Mathematical: -.Bl -column -compact -offset indent "xxcoproductxx" "Rendered" "Description" +.Bl -column "xxcoproductxx" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(pl Ta \(pl Ta plus .It \e(mi Ta \(mi Ta minus @@ -288,10 +406,13 @@ Mathematical: .It \e(Re Ta \(Re Ta real .It \e(pd Ta \(pd Ta partial differential .It \e(-h Ta \(-h Ta Planck constant over 2\(*p +.It \e[12] Ta \[12] Ta one-half +.It \e[14] Ta \[14] Ta one-fourth +.It \e[34] Ta \[34] Ta three-fourths .El .Pp Ligatures: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(ff Ta \(ff Ta ff ligature .It \e(fi Ta \(fi Ta fi ligature @@ -308,7 +429,7 @@ Ligatures: .El .Pp Accents: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(a" Ta \(a" Ta Hungarian umlaut .It \e(a- Ta \(a- Ta macron @@ -330,7 +451,7 @@ Accents: .El .Pp Accented letters: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e('A Ta \('A Ta acute A .It \e('E Ta \('E Ta acute E @@ -390,7 +511,7 @@ Accented letters: .El .Pp Special letters: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(-D Ta \(-D Ta Eth .It \e(Sd Ta \(Sd Ta eth @@ -401,7 +522,7 @@ Special letters: .El .Pp Currency: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(Do Ta \(Do Ta dollar .It \e(ct Ta \(ct Ta cent @@ -414,7 +535,7 @@ Currency: .El .Pp Units: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(de Ta \(de Ta degree .It \e(%0 Ta \(%0 Ta per-thousand @@ -424,7 +545,7 @@ Units: .El .Pp Greek letters: -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +.Bl -column "Input" "Rendered" "Description" -offset indent -compact .It Em Input Ta Em Rendered Ta Em Description .It \e(*A Ta \(*A Ta Alpha .It \e(*B Ta \(*B Ta Beta @@ -489,7 +610,20 @@ for use, as they differ across implementations. Manuals using these predefined strings are almost certainly not portable. .Pp -.Bl -column -compact -offset indent "Input" "Rendered" "Description" +Their syntax is similar to special characters, using +.Sq \e*X +.Pq for a one-character escape , +.Sq \e*(XX +.Pq two-character , +and +.Sq \e*[N] +.Pq N-character . +For details, see the +.Em Predefined Strings +subsection of the +.Xr roff 7 +manual. +.Bl -column "Input" "Rendered" "Description" -offset indent .It Em Input Ta Em Rendered Ta Em Description .It \e*(Ba Ta \*(Ba Ta vertical bar .It \e*(Ne Ta \*(Ne Ta not equal @@ -520,6 +654,25 @@ portable. .It \e*(Px Ta \*(Px Ta POSIX standard name .It \e*(Ai Ta \*(Ai Ta ANSI standard name .El +.Sh UNICODE CHARACTERS +The escape sequences +.Pp +.Dl \e[uXXXX] and \eC'uXXXX' +.Pp +are interpreted as Unicode codepoints. +The codepoint must be in the range above U+0080 and less than U+10FFFF. +For compatibility, the hexadecimal digits +.Sq A +to +.Sq F +must be given as uppercase characters, +and points must be zero-padded to four characters; if +greater than four characters, no zero padding is allowed. +Unicode surrogates are not allowed. +.\" .Pp +.\" Unicode glyphs attenuate to the +.\" .Sq \&? +.\" character if invalid or not rendered by current output media. .Sh NUMBERED CHARACTERS For backward compatibility with existing manuals, .Xr mandoc 1 @@ -536,12 +689,15 @@ For example, do not use \eN'34', use \e(dq, or even the plain .Sq \(dq character where possible. .Sh COMPATIBILITY -This section documents compatibility between mandoc and other other +This section documents compatibility between mandoc and other troff implementations, at this time limited to GNU troff .Pq Qq groff . .Pp .Bl -dash -compact .It +The \eN\(aq\(aq escape sequence is limited to printable characters; in +groff, it accepts arbitrary character numbers. +.It In .Fl T Ns Cm ascii , the @@ -569,16 +725,19 @@ from mandoc either because they are poorly documented or they have no known representation. .El .Sh SEE ALSO -.Xr mandoc 1 +.Xr mandoc 1 , +.Xr man 7 , +.Xr mdoc 7 , +.Xr roff 7 .Sh AUTHORS The .Nm manual page was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . +.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv . .Sh CAVEATS -The +The predefined string .Sq \e*(Ba -escape mimics the behaviour of the +mimics the behaviour of the .Sq \&| character in .Xr mdoc 7 ;