-.\" $Id: mandoc_char.7,v 1.38 2010/05/09 21:11:22 kristaps Exp $
+.\" $Id: mandoc_char.7,v 1.45 2011/05/15 15:30:33 kristaps Exp $
.\"
.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
-.Dd $Mdocdate: May 9 2010 $
+.Dd $Mdocdate: May 15 2011 $
.Dt MANDOC_CHAR 7
.Os
.Sh NAME
.It \e(ts Ta \(ts Ta sigma terminal
.El
.Sh PREDEFINED STRINGS
-These are not recommended for use, as they differ across
-implementations:
+Predefined strings are inherited from the macro packages of historical
+troff implementations.
+They are
+.Em not recommended
+for use, as they differ across implementations.
+Manuals using these predefined strings are almost certainly not
+portable.
.Pp
.Bl -column -compact -offset indent "Input" "Rendered" "Description"
.It Em Input Ta Em Rendered Ta Em Description
.It \e*(>= Ta \*(>= Ta greater-than-equal
.It \e*(aa Ta \*(aa Ta acute
.It \e*(ga Ta \*(ga Ta grave
+.It \e*(Px Ta \*(Px Ta POSIX standard name
+.It \e*(Ai Ta \*(Ai Ta ANSI standard name
.El
-.Sh COMPATIBILITY
-This section documents compatibility of
-.Nm
-with older or existing versions of
-.Xr groff 1 .
+.Sh UNICODE CHARACTERS
+The escape sequence
.Pp
-The following render differently in
-.Fl T Ns Ar ascii
-output mode:
-.Bd -ragged -offset indent
-\e(ss, \e(nm, \e(nb, \e(nc, \e(ib, \e(ip, \e(pp, \e[sum], \e[product],
-\e[coproduct], \e(gr, \e(-h, \e(a.
-.Ed
+.Dl \e[uXXXX]
.Pp
-The following render differently in
-.Fl T Ns Ar html
-output mode:
-.Bd -ragged -offset indent
-\e(~=, \e(nb, \e(nc
-.Ed
+is interpreted as a Unicode codepoint.
+The codepoint must be in the range above U+0080 and less than U+10FFFF.
+For compatibility, points must be zero-padded to four characters; if
+greater than four characters, no zero padding is allowed.
+Unicode surrogates are not allowed.
+.\" .Pp
+.\" Unicode glyphs attenuate to the
+.\" .Sq \&?
+.\" character if invalid or not rendered by current output media.
+.Sh NUMBERED CHARACTERS
+For backward compatibility with existing manuals,
+.Xr mandoc 1
+also supports the
+.Pp
+.Dl \eN\(aq Ns Ar number Ns \(aq
+.Pp
+escape sequence, inserting the character
+.Ar number
+from the current character set into the output.
+Of course, this is inherently non-portable and is already marked
+as deprecated in the Heirloom roff manual.
+For example, do not use \eN'34', use \e(dq, or even the plain
+.Sq \(dq
+character where possible.
+.Sh COMPATIBILITY
+This section documents compatibility between mandoc and other other
+troff implementations, at this time limited to GNU troff
+.Pq Qq groff .
.Pp
-Finally, the following have been omitted by being poorly documented or
-having no known representation:
-.Bd -ragged -offset indent
-\e[radicalex], \e[sqrtex], \e(ru
-.Ed
+.Bl -dash -compact
+.It
+The \eN\(aq\(aq escape sequence is limited to printable characters; in
+groff, it accepts arbitrary character numbers.
+.It
+In
+.Fl T Ns Cm ascii ,
+the
+\e(ss, \e(nm, \e(nb, \e(nc, \e(ib, \e(ip, \e(pp, \e[sum], \e[product],
+\e[coproduct], \e(gr, \e(\-h, and \e(a. special characters render
+differently between mandoc and groff.
+.It
+In
+.Fl T Ns Cm html
+and
+.Fl T Ns Cm xhtml ,
+the \e(~=, \e(nb, and \e(nc special characters render differently
+between mandoc and groff.
+.It
+The
+.Fl T Ns Cm ps
+and
+.Fl T Ns Cm pdf
+modes format like
+.Fl T Ns Cm ascii
+instead of rendering glyphs as in groff.
+.It
+The \e[radicalex], \e[sqrtex], and \e(ru special characters have been omitted
+from mandoc either because they are poorly documented or they have no
+known representation.
+.El
.Sh SEE ALSO
.Xr mandoc 1
.Sh AUTHORS
The
.Sq \e*(Ba
escape mimics the behaviour of the
-.Sq \*(Ba
+.Sq \&|
character in
.Xr mdoc 7 ;
thus, if you wish to render a vertical bar with no side effects, use