.\" $Id: mandoc.3,v 1.26 2014/09/03 23:21:47 schwarze Exp $
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: September 3 2014 $
.Nd mandoc macro compiler library
.Fd "#define ASCII_NBRSP"
.Fd "#define ASCII_HYPH"
.Fd "#define ASCII_BREAK"
.Fa "enum mandoclevel wlevel"
.Fa "enum mandocerr errtype"
.Fa "enum mandoclevel level"
.Fa "const char *file"
.Fa "struct mparse *parse"
.Fa "const struct mparse *parse"
.Fa "struct mparse *parse"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Fa "pid_t *child_pid"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Fa "struct mparse *parse"
.Fa "struct mparse *parse"
.Fa "struct mdoc **mdoc"
.Fa "struct man **man"
.Fa "enum mandoclevel"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "pid_t child_pid"
.Fa "const struct mdoc_node *node"
.Ft "const struct mdoc_meta *"
.Fa "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fa "const struct mdoc *mdoc"
.Vt extern const char * const * mdoc_argnames;
.Vt extern const char * const * mdoc_macronames;
.Fa "const struct man_node *node"
.Ft "const struct man_meta *"
.Fa "const struct man *man"
.Ft "const struct mparse *"
.Fa "const struct man *man"
.Ft "const struct man_node *"
.Fa "const struct man *man"
.Vt extern const char * const * man_macronames;
manual into an abstract syntax tree (AST).
manuals are composed of
and may be mixed with
The following describes a general parse sequence:
initiate a parsing sequence with
parse files or file descriptors with
retrieve a parsed syntax tree, if the parse was successful, with
iterate over parse nodes with
free all allocated memory with
This section documents the functions, types, and variables available
with the exception of those documented in
.It Vt "enum mandocerr"
A fatal error, error, or warning message during parsing.
.It Vt "enum mandoclevel"
A classification of an
as regards system operation.
.It Vt "struct mparse"
An opaque pointer to a running parse sequence.
This may be used across parsed input if
is called between parses.
A prototype for a function to handle fatal error, error, and warning
messages emitted by the parser.
Obtain a text-only representation of a
.Vt struct man_node ,
including text contained in its child nodes.
To be used on children of the pointer returned from
When it is no longer needed, the pointer returned from
Obtain the meta-data of a successful
This may only be used on a pointer returned by
Get the parser used for the current output.
Obtain the root node of a successful
This may only be used on a pointer returned by
Obtain a text-only representation of a
.Vt struct mdoc_node ,
including text contained in its child nodes.
To be used on children of the pointer returned from
When it is no longer needed, the pointer returned from
Obtain the meta-data of a successful
This may only be used on a pointer returned by
Obtain the root node of a successful
This may only be used on a pointer returned by
The arguments have the following effect:
.Bl -tag -offset 5n -width inttype
bit is set, only that parser is used.
Otherwise, the document type is automatically detected.
file inclusion requests are always honoured.
Otherwise, if the request is the only content in an input file,
only the file name is remembered, to be returned in the
bit is set, parsing is aborted after the NAME section.
This is for example useful in
to quickly build minimal databases.
.Dv MANDOCLEVEL_FATAL ,
.Dv MANDOCLEVEL_ERROR ,
.Dv MANDOCLEVEL_WARNING .
Messages below the selected level will be suppressed.
A callback function to handle errors and warnings.
A default string for the
macro, overriding the
preprocessor definition and the results of
The same parser may be used for multiple files so long as
is called between parses.
must be called to free the memory allocated by this function.
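.Pp
As an illustration of the level threshold and the message callback described above, the following is a minimal, self-contained sketch and not the library's own code.
The enum, its ordering, the output format, and the names level_passes and format_msg are hypothetical; a real handler would use the enum mandoclevel values and the callback prototype declared by the library header.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's level enum, ordered by severity. */
enum level { LVL_OK = 0, LVL_WARNING, LVL_ERROR, LVL_FATAL };

/* Return 1 if a message at `level` is at or above the threshold `wlevel`. */
int
level_passes(enum level level, enum level wlevel)
{
	return level >= wlevel;
}

/*
 * Format one diagnostic as "file:line:col: LEVEL: msg" into buf.
 * Returns the formatted length, or 0 if the message is suppressed
 * because it is below the threshold (buf is then set to "").
 */
int
format_msg(char *buf, size_t bufsz, enum level wlevel, enum level level,
    const char *file, int line, int col, const char *msg)
{
	static const char *const names[] = {
		"OK", "WARNING", "ERROR", "FATAL"
	};

	if (bufsz == 0)
		return 0;
	if (!level_passes(level, wlevel)) {
		buf[0] = '\0';
		return 0;
	}
	/* snprintf always NUL-terminates when bufsz is non-zero. */
	return snprintf(buf, bufsz, "%s:%d:%d: %s: %s",
	    file, line, col, names[level], msg);
}
```

With a threshold of LVL_ERROR, a warning leaves the buffer empty, while an error or fatal message is formatted for display.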
Free all memory allocated by
.It Fn mparse_getkeep
Acquire the keep buffer.
Must follow a call of
Instruct the parser to retain a copy of its parsed input.
This can be acquired with subsequent
Return a file descriptor open for reading in
If applicable, return the
If non-zero, it should be passed to
after completing the parse sequence.
Parse a file or file descriptor.
is opened for reading.
is assumed to be the name associated with
This may be called multiple times with different parameters; however,
should be invoked between parses.
Reset a parser so that
Obtain the result of a parse.
Only successful parses
returned less than MANDOCLEVEL_FATAL
should invoke this function, in which case one of the three pointers will
.It Fn mparse_strerror
Return a statically-allocated string representation of an error code.
.It Fn mparse_strlevel
Return a statically-allocated string representation of a level code.
that was spawned with
To be called after the parse sequence is complete.
.Dv MANDOCLEVEL_SYSERR
on failure, that is, when
died from a signal or exited with non-zero status.
.It Va man_macronames
The string representation of a man macro as indexed by
The string representation of an mdoc macro argument as indexed by
.Vt "enum mdocargt" .
.It Va mdoc_macronames
The string representation of an mdoc macro as indexed by
.Sh IMPLEMENTATION NOTES
This section consists of structural documentation for
syntax trees and strings.
.Ss Man and Mdoc Strings
Strings may be extracted from mdoc and man meta-data, or from text
nodes (MDOC_TEXT and MAN_TEXT, respectively).
These strings have special non-printing formatting cues embedded in the
text itself, as well as
escapes preserved from input.
Implementing systems will need to handle both situations to produce
In general, strings may be assumed to consist of 7-bit ASCII characters.
The following non-printing characters may be embedded in text strings:
A non-breaking space character.
A breakable zero-width space.
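.Pp
A consumer that only needs plain ASCII output might map the non-printing characters listed above as in the following sketch.
The numeric fallback values, the reading of ASCII_HYPH as a hyphen at which a line may be broken, and the function name plain_text are assumptions made for illustration; a real program should take the three constants from the library header.
Escapes are not handled here.

```c
#include <stddef.h>

/*
 * Fallback values, assumed for illustration only.  When the library
 * header is included first, its own definitions are used instead.
 */
#ifndef ASCII_NBRSP
#define ASCII_NBRSP	31	/* non-breaking space */
#endif
#ifndef ASCII_HYPH
#define ASCII_HYPH	30	/* hyphen at which a line may be broken */
#endif
#ifndef ASCII_BREAK
#define ASCII_BREAK	29	/* breakable zero-width space */
#endif

/*
 * Copy `in` to `out`, replacing the formatting cues with plain ASCII:
 * ASCII_NBRSP becomes a space, ASCII_HYPH becomes '-', and ASCII_BREAK
 * is dropped because it has no width.  Output is truncated to fit.
 * Returns the output length, excluding the terminating NUL.
 */
size_t
plain_text(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return 0;
	for (; *in != '\0' && n + 1 < outsz; in++) {
		switch (*in) {
		case ASCII_NBRSP:
			out[n++] = ' ';
			break;
		case ASCII_HYPH:
			out[n++] = '-';
			break;
		case ASCII_BREAK:
			break;	/* zero width: emit nothing */
		default:
			out[n++] = *in;
			break;
		}
	}
	out[n] = '\0';
	return n;
}
```

A formatter that wraps lines would instead keep track of ASCII_HYPH and ASCII_BREAK positions as permitted break points, and would refuse to break at ASCII_NBRSP.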
Escape characters are also passed verbatim into text strings.
An escape character is a sequence of characters beginning with the
To construct human-readable text, these should be intercepted with
and converted with one of the functions described in
.Ss Man Abstract Syntax Tree
This AST is governed by the ontological rules dictated in
and derives its terminology accordingly.
The AST is composed of
nodes with element, root and text types as declared by the
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data.
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- ELEMENT | TEXT | BLOCK
The only elements capable of nesting other elements are those with
next-line scope as documented in
.Ss Mdoc Abstract Syntax Tree
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
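.Pp
A tree in the normal form above can be traversed depth-first by following first-child and next-sibling links.
The sketch below gathers the strings of all TEXT nodes in document order.
The node layout, the type names and the function collect_text are hypothetical stand-ins chosen for illustration; they are not the library's struct mdoc_node, whose actual field names should be taken from the library header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node types mirroring the non-terminals of the normal form. */
enum ntype { N_ROOT, N_BLOCK, N_HEAD, N_BODY, N_TAIL, N_ELEM, N_TEXT };

/* Hypothetical node layout: first-child and next-sibling links only. */
struct node {
	enum ntype	 type;
	const char	*string;	/* set for N_TEXT nodes only */
	struct node	*child;		/* first child, or NULL */
	struct node	*next;		/* next sibling, or NULL */
};

/*
 * Walk the tree depth-first, in document order, appending the string
 * of every TEXT node to buf with single spaces in between.
 * Zero-length strings, such as those produced by empty lines in
 * literal displays, are skipped.  buf must hold a valid, possibly
 * empty, string on entry, and bufsz must be at least 1.
 */
void
collect_text(const struct node *n, char *buf, size_t bufsz)
{
	size_t used;

	for (; n != NULL; n = n->next) {
		if (n->type == N_TEXT && n->string != NULL &&
		    *n->string != '\0') {
			used = strlen(buf);
			if (used > 0 && used + 1 < bufsz) {
				buf[used++] = ' ';
				buf[used] = '\0';
			}
			/* strncat appends at most this many bytes plus NUL. */
			strncat(buf, n->string, bufsz - used - 1);
		}
		collect_text(n->child, buf, bufsz);
	}
}
```

Siblings are handled by the loop and children by recursion, so the recursion depth is bounded by the nesting depth of the document rather than by its length.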
syntax tree accommodates broken block structures as well.
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged ;
are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
.Xr mandoc_escape 3 ,
.Xr mandoc_malloc 3 ,
library was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .