X-Git-Url: https://git.cameronkatri.com/mandoc.git/blobdiff_plain/50d0f3e19ecea1f5f0f60a3d8b71e68755c676e0..edc2864f44502c5d9f45664a5d58f03c4146ca27:/preconv.1

diff --git a/preconv.1 b/preconv.1
index 72970061..30d8fa4d 100644
--- a/preconv.1
+++ b/preconv.1
@@ -1,4 +1,4 @@
-.\" $Id: preconv.1,v 1.1 2011/05/26 12:01:14 kristaps Exp $
+.\" $Id: preconv.1,v 1.5 2011/08/18 08:58:44 kristaps Exp $
 .\"
 .\" Copyright (c) 2011 Kristaps Dzonsons
 .\"
@@ -14,12 +14,12 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: May 26 2011 $
+.Dd $Mdocdate: August 18 2011 $
 .Dt PRECONV 1
 .Os
 .Sh NAME
 .Nm preconv
-.Nd recodes multibyte UNIX manuals as mandoc input
+.Nd recode multibyte UNIX manuals
 .Sh SYNOPSIS
 .Nm preconv
 .Op Fl D Ar enc
@@ -32,23 +32,18 @@ utility recodes multibyte
 .Ux
 manual files into
 .Xr mandoc 1
+.Po
+or other troff system supporting the
+.Sq \e[uNNNN]
+escape sequence
+.Pc
 input.
 Its arguments are as follows:
 .Bl -tag -width Ds
 .It Fl D Ar enc
 The default encoding.
-This is case-insensitive.
-See
-.Sx Algorithm
-and
-.Sx Encodings .
 .It Fl e Ar enc
 The document's encoding.
-This is case-insensitive.
-See
-.Sx Algorithm
-and
-.Sx Encodings .
 .It Ar file
 The input file.
 .El
@@ -58,27 +53,23 @@ If
 is not provided,
 .Nm
 accepts standard input.
-Output is written to standard output.
-Unicode characters in the ASCII range are printed as regular ASCII
-characters; those above this range are printed using the
+See
+.Sx Algorithm
+for encoding choice.
+.Pp
+The recoded input is written to standard output: Unicode characters in
+the ASCII range are printed as regular ASCII characters, while those
+above this range are printed using the
 .Sq \e[uNNNN]
 format documented in
 .Xr mandoc_char 7 .
 .Pp
 If input bytes are improperly formed in the current encoding, they're
 passed unmodified to standard output.
-.Ss Encodings
-The
+For some encodings, such as UTF-8, unrecoverable input sequences will
+cause
 .Nm
-utility accepts the
-.Ar utf\-8 ,
-.Ar us\-ascii ,
-and
-.Ar latin\-1
-encodings as arguments to
-.Fl D Ar enc
-or
-.Fl e Ar enc .
+to stop processing and exit.
 .Ss Algorithm
 An encoding is chosen according to the following steps:
 .Bl -enum
@@ -86,13 +77,41 @@ An encoding is chosen according to the following steps:
 From the argument passed to
 .Fl e Ar enc .
 .It
-If a BOM exists, utf\-8 encoding is selected.
+If a BOM exists, UTF\-8 encoding is selected.
+.It
+From the coding tags parsed from
+.Qq File Variables
+on the first two lines of input.
+A file variable is an input line of the form
+.Pp
+.Dl \%.\e\(dq -*- key: val [; key: val ]* -*-
+.Pp
+A coding tag variable is where
+.Cm key
+is
+.Qq coding
+and
+.Cm val
+is the name of the encoding.
+A typical file variable with a coding tag is
+.Pp
+.Dl \%.\e\(dq -*- mode: troff; coding: utf-8 -*-
 .It
 From the argument passed to
 .Fl D Ar enc .
 .It
 If all else fails, Latin\-1 is used.
 .El
+.Pp
+The
+.Nm
+utility recognises the UTF\-8, us\-ascii, and latin\-1 encodings as
+passed to the
+.Fl e
+and
+.Fl D
+arguments, or as coding tags.
+Encodings are matched case-insensitively.
 .\" .Sh IMPLEMENTATION NOTES
 .\" Not used in OpenBSD.
 .\" .Sh RETURN VALUES
@@ -102,7 +121,12 @@ If all else fails, Latin\-1 is used.
 .\" .Sh FILES
 .Sh EXIT STATUS
 .Ex -std
-.\" .Sh EXAMPLES
+.Sh EXAMPLES
+Explicitly page a UTF\-8 manual
+.Pa foo.1
+in the current locale:
+.Pp
+.Dl $ preconv \-e utf\-8 foo.1 | mandoc -Tlocale | less
 .\" .Sh DIAGNOSTICS
 .\" For sections 1, 4, 6, 7, & 8 only.
 .\" .Sh ERRORS
@@ -130,7 +154,8 @@ utility appeared in May 2011.
 The
 .Nm
 utility was written by
-.An Kristaps Dzonsons Aq kristaps@bsd.lv .
+.An Kristaps Dzonsons ,
+.Mt kristaps@bsd.lv .
 .\" .Sh CAVEATS
 .\" .Sh BUGS
 .\" .Sh SECURITY CONSIDERATIONS
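
Note on the Algorithm section as revised by this patch: the selection order it documents can be sketched as follows. This is a minimal illustration in Python, not preconv's actual code (preconv itself is not written in Python). The names choose_encoding, coding_tag, FILE_VARS and UTF8_BOM are hypothetical, the file-variable parsing is a guess at the simplest reading of the documented line format, and the sketch does not model what preconv does with an unrecognised encoding name.

```python
import re

# Hypothetical names throughout; this mirrors the documented steps only.
UTF8_BOM = b"\xef\xbb\xbf"

# A file variable line has the form:  .\" -*- key: val [; key: val ]* -*-
FILE_VARS = re.compile(rb'^\.\\"\s*-\*-(.*?)-\*-')


def coding_tag(data):
    """Return the 'coding' file variable from the first two lines, or None."""
    for line in data.split(b"\n")[:2]:
        match = FILE_VARS.match(line)
        if match is None:
            continue
        for pair in match.group(1).split(b";"):
            key, sep, val = pair.partition(b":")
            # Assumption: the key is compared exactly against "coding".
            if sep and key.strip() == b"coding":
                return val.strip().decode("ascii", "replace")
    return None


def choose_encoding(data, e_arg=None, d_arg=None):
    """Pick an encoding name following the documented order of precedence."""
    if e_arg:                          # 1. argument to -e
        return e_arg.lower()
    if data.startswith(UTF8_BOM):      # 2. a BOM selects UTF-8
        return "utf-8"
    tag = coding_tag(data)             # 3. coding tag in the file variables
    if tag:
        return tag.lower()
    if d_arg:                          # 4. argument to -D
        return d_arg.lower()
    return "latin-1"                   # 5. fallback


if __name__ == "__main__":
    page = b'.\\" -*- mode: troff; coding: utf-8 -*-\n.Dd August 18 2011\n'
    print(choose_encoding(page, d_arg="us-ascii"))
```

Lower-casing the chosen name reflects the statement that encodings are matched case-insensitively. In this sketch a coding tag overrides -D but never -e, which is the precedence the enumerated list gives.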