Now that markdown output is tested for almost everything, test all

diff --git a/mandoc.1 b/mandoc.1
index 98457b2fb49503854091a100676a5552bde6575f..8d87007df63e9bac4a5176532ee73d247fe79e33 100644
--- a/mandoc.1
+++ b/mandoc.1
@@ -1,4 +1,4 @@
-.\"    $Id: mandoc.1,v 1.168 2017/01/08 00:11:23 schwarze Exp $
+.\"    $Id: mandoc.1,v 1.178 2017/03/08 19:40:59 schwarze Exp $
  .\"
  .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
  .\" Copyright (c) 2012, 2014-2017 Ingo Schwarze <schwarze@openbsd.org>
@@ -15,7 +15,7 @@
  .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  .\"
-.Dd $Mdocdate: January 8 2017 $
+.Dd $Mdocdate: March 8 2017 $
  .Dt MANDOC 1
  .Os
  .Sh NAME
@@ -98,21 +98,29 @@ arguments are
  .Cm iso-8859-1 ,
  and
  .Cm utf-8 .
-If not specified, autodetection uses the first match:
-.Bl -tag -width iso-8859-1
-.It Cm utf-8
-if the first three bytes of the input file
-are the UTF-8 byte order mark (BOM, 0xefbbbf)
-.It Ar encoding
-if the first or second line of the input file matches the
+If not specified, autodetection uses the first match in the following
+list:
+.Bl -enum
+.It
+If the first three bytes of the input file are the UTF-8 byte order
+mark (BOM, 0xefbbbf), input is interpreted as
+.Cm utf-8 .
+.It
+If the first or second line of the input file matches the
  .Sy emacs
  mode line format
  .Pp
  .D1 .\e" -*- Oo ...; Oc coding: Ar encoding ; No -*-
-.It Cm utf-8
-if the first non-ASCII byte in the file introduces a valid UTF-8 sequence
-.It Cm iso-8859-1
-otherwise
+.Pp
+then input is interpreted according to
+.Ar encoding .
+.It
+If the first non-ASCII byte in the file introduces a valid UTF-8
+sequence, input is interpreted as
+.Cm utf-8 .
+.It
+Otherwise, input is interpreted as
+.Cm iso-8859-1 .
  .El
  .It Fl k
  A synonym for
@@ -250,7 +258,7 @@ The
  utility accepts the following
  .Fl T
  arguments, which correspond to output modes:
-.Bl -tag -width "-T locale"
+.Bl -tag -width "-T markdown"
  .It Fl T Cm ascii
  Produce 7-bit ASCII output.
  See
@@ -274,6 +282,12 @@ Produce
  format output.
  See
  .Sx Man Output .
+.It Fl T Cm markdown
+Produce output in
+.Sy markdown
+format.
+See
+.Sx Markdown Output .
  .It Fl T Cm pdf
  Produce PDF output.
  See
@@ -443,6 +457,40 @@ The parser is also run, and as usual, the
  level controls which
  .Sx DIAGNOSTICS
  are displayed before copying the input to the output.
+.Ss Markdown Output
+Translate
+.Xr mdoc 7
+input to the
+.Sy markdown
+format conforming to
+.Lk http://daringfireball.net/projects/markdown/syntax.text\
+ "John Gruber's 2004 specification" .
+The output also almost conforms to the
+.Lk http://commonmark.org/ CommonMark
+specification.
+.Pp
+The character set used for the markdown output is ASCII.
+Non-ASCII characters are encoded as HTML entities.
+Since that is not possible in literal font contexts, because these
+are rendered as code spans and code blocks in the markdown output,
+non-ASCII characters are transliterated to ASCII approximations in
+these contexts.
+.Pp
+Markdown is a very weak markup language, so all semantic markup is
+lost, and even part of the presentational markup may be lost.
+Do not use this as an intermediate step in converting to HTML;
+instead, use
+.Fl T Cm html
+directly.
+.Pp
+The
+.Xr man 7 ,
+.Xr tbl 7 ,
+and
+.Xr eqn 7
+input languages are not supported by
+.Fl T Cm markdown
+output mode.
  .Ss PDF Output
  PDF-1.1 output may be generated by
  .Fl T Cm pdf .
@@ -498,7 +546,15 @@ Use
  to show a human readable representation of the syntax tree.
  It is useful for debugging the source code of manual pages.
  The exact format is subject to change, so don't write parsers for it.
-Each output line shows one syntax tree node.
+.Pp
+The first paragraph shows meta data found in the
+.Xr mdoc 7
+prologue, on the
+.Xr man 7
+.Ic \&TH
+line, or the fallbacks used.
+.Pp
+In the tree dump, each output line shows one syntax tree node.
  Child nodes are indented with respect to their parent node.
  The columns are:
  .Pp
@@ -529,7 +585,26 @@ The input column number (starting at one).
  A closing parenthesis if the node is a closing delimiter.
  .It
  A full stop if the node ends a sentence.
+.It
+BROKEN if the node is a block broken by another block.
+.It
+NOSRC if the node is not in the input file,
+but automatically generated from macros.
+.It
+NOPRT if the node is not supposed to generate output
+for any output format.
+.El
  .El
+.Pp
+The following
+.Fl O
+argument is accepted:
+.Bl -tag -width Ds
+.It Cm noval
+Skip validation and show the unvalidated syntax tree.
+This can help to find out whether a given behaviour is caused by
+the parser or by the validator.
+Meta data is not available in this case.
  .El
  .Sh ENVIRONMENT
  .Bl -tag -width MANPAGER
@@ -843,6 +918,14 @@ The
  .Ic \&Nd
  macro lacks the required argument.
  The title line of the manual will end after the dash.
+.It Sy "description line outside NAME section"
+.Pq mdoc
+An
+.Ic \&Nd
+macro appears outside the NAME section.
+The arguments are printed anyway and the following text is used for
+.Xr apropos 1 ,
+but none of that behaviour is portable.
  .It Sy "sections out of conventional order"
  .Pq mdoc
  A standard section occurs after another section it usually precedes.
@@ -1343,6 +1426,10 @@ it is hard to predict which tab stop position the tab will advance to.
  Whitespace at the end of input lines is almost never semantically
  significant \(em but in the odd case where it might be, it is
  extremely confusing when reviewing and maintaining documents.
+.It Sy "new sentence, new line"
+.Pq mdoc
+A new sentence starts in the middle of a text line.
+Start it on a new input line to help formatters produce correct spacing.
  .It Sy "bad comment style"
  .Pq roff
  Comment lines start with a dot, a backslash, and a double-quote character.
@@ -1818,6 +1905,19 @@ as if they were a text line.
  .Xr mdoc 7 ,
  .Xr roff 7 ,
  .Xr tbl 7
+.Sh HISTORY
+The
+.Nm
+utility first appeared in
+.Ox 4.8 .
+The option
+.Fl I
+appeared in
+.Ox 5.2 ,
+and
+.Fl aCcfhKklMSsw
+in
+.Ox 5.7 .
  .Sh AUTHORS
  .An -nosplit
  The
@@ -1826,12 +1926,3 @@ utility was written by
  .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
  and is maintained by
  .An Ingo Schwarze Aq Mt schwarze@openbsd.org .
-.Sh BUGS
-In
-.Fl T Cm html ,
-the maximum size of an element attribute is determined by
-.Dv BUFSIZ ,
-which is usually 1024 bytes.
-Be aware of this when setting long link
-formats such as
-.Fl O Cm style Ns = Ns Ar really/long/link .
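
The reworded autodetection list in the hunk at @@ -98,21 +98,29 @@ describes a first-match procedure with four rules. The following is a minimal Python sketch of that procedure as the manual text states it. It is not taken from the mandoc sources: the names `detect_encoding` and `_starts_valid_utf8`, and the regular expression used to recognize the emacs mode line, are assumptions made for illustration only.

```python
import re

# Assumed pattern for the emacs mode line described in the manual:
#   .\" -*- [...;] coding: encoding; -*-
# The exact matching rules used by mandoc may differ.
_MODE_LINE = re.compile(
    rb'^\.\\" -\*-(?:.*;)? *coding: *([A-Za-z0-9_-]+) *; *-\*-'
)


def _starts_valid_utf8(data: bytes, i: int) -> bool:
    """Return True if data[i] introduces one complete, valid UTF-8 sequence."""
    lead = data[i]
    if 0xC2 <= lead <= 0xDF:
        extra = 1
    elif 0xE0 <= lead <= 0xEF:
        extra = 2
    elif 0xF0 <= lead <= 0xF4:
        extra = 3
    else:
        return False  # continuation byte or invalid lead byte
    try:
        # Strict decoding rejects truncated, overlong and surrogate sequences.
        data[i:i + 1 + extra].decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def detect_encoding(data: bytes) -> str:
    """Hypothetical helper following the four rules in the manual, in order."""
    # Rule 1: UTF-8 byte order mark (0xefbbbf) at the start of the file.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    # Rule 2: emacs mode line on the first or second input line.
    for line in data.split(b"\n", 2)[:2]:
        match = _MODE_LINE.match(line)
        if match:
            return match.group(1).decode("ascii").lower()
    # Rule 3: the first non-ASCII byte introduces a valid UTF-8 sequence.
    for i, byte in enumerate(data):
        if byte >= 0x80:
            if _starts_valid_utf8(data, i):
                return "utf-8"
            break
    # Rule 4: everything else is treated as iso-8859-1.
    return "iso-8859-1"
```

Because the rules are tried in order, a mode line naming `iso-8859-1` takes precedence over UTF-8 bytes that appear later in the file, while a BOM takes precedence over everything else.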