.\" $Id: mandoc.3,v 1.31 2015/01/15 04:26:40 schwarze Exp $
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010, 2013, 2014, 2015 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: January 15 2015 $
.Nd mandoc macro compiler library
.Fd "#define ASCII_NBRSP"
.Fd "#define ASCII_HYPH"
.Fd "#define ASCII_BREAK"
.Fa "enum mandoclevel wlevel"
.Fa "const struct mchars *mchars"
.Fa "enum mandocerr errtype"
.Fa "enum mandoclevel level"
.Fa "const char *file"
.Fa "struct mparse *parse"
.Fa "const struct mparse *parse"
.Fa "struct mparse *parse"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Fa "struct mparse *parse"
.Fa "struct mparse *parse"
.Fa "struct mdoc **mdoc"
.Fa "struct man **man"
.Fa "enum mandoclevel"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const struct mdoc_node *node"
.Ft "const struct mdoc_meta *"
.Fa "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fa "const struct mdoc *mdoc"
.Vt extern const char * const * mdoc_argnames;
.Vt extern const char * const * mdoc_macronames;
.Fa "const struct man_node *node"
.Ft "const struct man_meta *"
.Fa "const struct man *man"
.Ft "const struct mparse *"
.Fa "const struct man *man"
.Ft "const struct man_node *"
.Fa "const struct man *man"
.Vt extern const char * const * man_macronames;
manual into an abstract syntax tree (AST).
manuals are composed of
and may be mixed with
The following describes a general parse sequence:
initiate a parsing sequence with
retrieve the syntax tree with
iterate over parse nodes with
free all allocated memory with
This section documents the functions, types, and variables available
with the exception of those documented in
.It Vt "enum mandocerr"
An error or warning message during parsing.
.It Vt "enum mandoclevel"
A classification of an
as regards system operation.
.It Vt "struct mchars"
An opaque pointer to a character table.
.It Vt "struct mparse"
An opaque pointer to a running parse sequence.
This may be used across parsed input if
is called between parses.
A prototype for a function to handle error and warning
messages emitted by the parser.
Obtain a text-only representation of a
.Vt struct man_node ,
including text contained in its child nodes.
To be used on children of the pointer returned from
When it is no longer needed, the pointer returned from
Obtain the meta-data of a successful
This may only be used on a pointer returned by
Get the parser used for the current output.
Obtain the root node of a successful
This may only be used on a pointer returned by
Obtain a text-only representation of a
.Vt struct mdoc_node ,
including text contained in its child nodes.
To be used on children of the pointer returned from
When it is no longer needed, the pointer returned from
Obtain the meta-data of a successful
This may only be used on a pointer returned by
Obtain the root node of a successful
This may only be used on a pointer returned by
The arguments have the following effect:
.Bl -tag -offset 5n -width inttype
bit is set, only that parser is used.
Otherwise, the document type is automatically detected.
file inclusion requests are always honoured.
Otherwise, if the request is the only content in an input file,
only the file name is remembered, to be returned in the
bit is set, parsing is aborted after the NAME section.
This is for example useful in
to quickly build minimal databases.
.Dv MANDOCLEVEL_BADARG ,
.Dv MANDOCLEVEL_ERROR ,
.Dv MANDOCLEVEL_WARNING .
Messages below the selected level will be suppressed.
A callback function to handle errors and warnings.
An opaque pointer to a character table obtained from
A default string for the
macro, overriding the
preprocessor definition and the results of
The same parser may be used for multiple files so long as
is called between parses.
must be called to free the memory allocated by this function.
Free all memory allocated by
.It Fn mparse_getkeep
Acquire the keep buffer.
Must follow a call of
Instruct the parser to retain a copy of its parsed input.
This can be acquired with subsequent
Return a file descriptor open for reading in
Parse a file descriptor opened with
Pass the associated filename in
This function may be called multiple times with different parameters; however,
should be invoked between parses.
Reset a parser so that
Obtain the result of a parse.
One of the three pointers will be filled in.
.It Fn mparse_strerror
Return a statically-allocated string representation of an error code.
.It Fn mparse_strlevel
Return a statically-allocated string representation of a level code.
child process that was spawned with
To be called after the parse sequence is complete.
but does no harm in that case, either.
.Dv MANDOCLEVEL_SYSERR
on failure, that is, when
died from a signal or exited with non-zero status.
.It Va man_macronames
The string representation of a man macro as indexed by
The string representation of a mdoc macro argument as indexed by
.Vt "enum mdocargt" .
.It Va mdoc_macronames
The string representation of a mdoc macro as indexed by
.Sh IMPLEMENTATION NOTES
This section consists of structural documentation for
syntax trees and strings.
.Ss Man and Mdoc Strings
Strings may be extracted from mdoc and man meta-data, or from text
nodes (MDOC_TEXT and MAN_TEXT, respectively).
These strings have special non-printing formatting cues embedded in the
text itself, as well as
escapes preserved from input.
Implementing systems will need to handle both situations to produce
In general, strings may be assumed to consist of 7-bit ASCII characters.
The following non-printing characters may be embedded in text strings:
A non-breaking space character.
A breakable zero-width space.
Escape characters are also passed verbatim into text strings.
An escape character is a sequence of characters beginning with the
To construct human-readable text, these should be intercepted with
and converted with one of the functions described in
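.Pp
As a minimal sketch of handling the non-printing cues listed above, the
following hypothetical helper, which is not part of the library, copies a
string while replacing the cues with plain ASCII.
The numeric values given for the three constants are assumptions and
should be taken from the library header in real code.
Treating
.Dv ASCII_HYPH
as a hyphen at which a line may break is likewise an assumption, and
escape sequences are passed through untouched.
.Bd -literal -offset indent
```c
#include <stddef.h>

/*
 * Assumed values, shown only so that the sketch is self-contained.
 * Real code should include the library header instead.
 */
#define ASCII_NBRSP	31	/* non-breaking space */
#define ASCII_HYPH	30	/* assumed: breakable hyphen */
#define ASCII_BREAK	29	/* breakable zero-width space */

/*
 * Hypothetical helper: copy src into dst, writing at most
 * dstsz - 1 bytes plus a terminating NUL, and replace the
 * non-printing cues with plain ASCII.
 * Returns the number of bytes written, not counting the NUL.
 */
size_t
plain_text(char *dst, size_t dstsz, const char *src)
{
	size_t n = 0;

	if (dstsz == 0)
		return 0;
	for (; *src != 0 && n + 1 < dstsz; src++) {
		switch (*src) {
		case ASCII_NBRSP:
			dst[n++] = ' ';
			break;
		case ASCII_HYPH:
			dst[n++] = '-';
			break;
		case ASCII_BREAK:
			/* Zero width: emit nothing. */
			break;
		default:
			dst[n++] = *src;
			break;
		}
	}
	dst[n] = 0;
	return n;
}
```
.Ed
.Pp
A caller would typically pass a destination buffer at least as large as
the source string; output that does not fit is silently truncated.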
.Ss Man Abstract Syntax Tree
This AST is governed by the ontological rules dictated in
and derives its terminology accordingly.
The AST is composed of
nodes with element, root and text types as declared by the
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data.
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- ELEMENT | TEXT | BLOCK
The only elements capable of nesting other elements are those with
next-line scope as documented in
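.Pp
The normal form above suggests a simple depth-first walk.
The following sketch illustrates the idea on a hypothetical, simplified
node structure; the type and field names are assumptions made for the
example and are not the library's own declarations.
It gathers the strings of all text nodes, which is roughly the kind of
result a text-only representation of a subtree would contain.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node types, named after the normal form. */
enum ntype { N_ROOT, N_BLOCK, N_ELEM, N_TEXT };

/* Hypothetical, simplified node; not the library's structure. */
struct node {
	enum ntype	 type;
	const char	*string;	/* set for text nodes only */
	struct node	*child;		/* first child, or NULL */
	struct node	*next;		/* next sibling, or NULL */
};

/*
 * Walk n and its following siblings depth first and append the
 * string of every text node to buf, separated by single spaces.
 * The caller must initialise buf to the empty string and pass
 * 0 for used.  Strings that do not fit are skipped.
 * Returns the number of bytes stored, not counting the NUL.
 */
size_t
collect_text(const struct node *n, char *buf, size_t bufsz, size_t used)
{
	size_t len;

	for (; n != NULL; n = n->next) {
		if (n->type == N_TEXT && n->string != NULL) {
			len = strlen(n->string);
			/* Room for a separator, the text and the NUL. */
			if (used + len + 2 <= bufsz) {
				if (used > 0)
					buf[used++] = ' ';
				memcpy(buf + used, n->string, len);
				used += len;
				buf[used] = 0;
			}
		}
		used = collect_text(n->child, buf, bufsz, used);
	}
	return used;
}
```
.Ed
.Pp
Recursion follows the child pointer and iteration follows the sibling
pointer, so nested elements are visited in document order.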
.Ss Mdoc Abstract Syntax Tree
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
syntax tree accommodates broken block structures as well.
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged ;
are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
.Xr mandoc_escape 3 ,
.Xr mandoc_malloc 3 ,
library was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .