1 .\" $Id: mandoc.3,v 1.32 2015/07/19 06:05:16 schwarze Exp $
3 .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
4 .\" Copyright (c) 2010, 2013, 2014, 2015 Ingo Schwarze <schwarze@openbsd.org>
6 .\" Permission to use, copy, modify, and distribute this software for any
7 .\" purpose with or without fee is hereby granted, provided that the above
8 .\" copyright notice and this permission notice appear in all copies.
10 .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11 .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12 .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13 .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14 .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
18 .Dd $Mdocdate: July 19 2015 $
40 .Nd mandoc macro compiler library
45 .Fd "#define ASCII_NBRSP"
46 .Fd "#define ASCII_HYPH"
47 .Fd "#define ASCII_BREAK"
51 .Fa "enum mandoclevel wlevel"
53 .Fa "const struct mchars *mchars"
58 .Fa "enum mandocerr errtype"
59 .Fa "enum mandoclevel level"
60 .Fa "const char *file"
67 .Fa "struct mparse *parse"
71 .Fa "const struct mparse *parse"
75 .Fa "struct mparse *parse"
77 .Ft "enum mandoclevel"
79 .Fa "struct mparse *parse"
81 .Fa "const char *fname"
83 .Ft "enum mandoclevel"
85 .Fa "struct mparse *parse"
87 .Fa "const char *fname"
91 .Fa "struct mparse *parse"
95 .Fa "struct mparse *parse"
96 .Fa "struct mdoc **mdoc"
97 .Fa "struct man **man"
106 .Fa "enum mandoclevel"
114 .Fa "const struct mdoc_node *node"
116 .Ft "const struct mdoc_meta *"
118 .Fa "const struct mdoc *mdoc"
120 .Ft "const struct mdoc_node *"
122 .Fa "const struct mdoc *mdoc"
124 .Vt extern const char * const * mdoc_argnames;
125 .Vt extern const char * const * mdoc_macronames;
132 .Fa "const struct man_node *node"
134 .Ft "const struct man_meta *"
136 .Fa "const struct man *man"
138 .Ft "const struct mparse *"
140 .Fa "const struct man *man"
142 .Ft "const struct man_node *"
144 .Fa "const struct man *man"
146 .Vt extern const char * const * man_macronames;
152 manual into an abstract syntax tree (AST).
154 manuals are composed of
158 and may be mixed with
165 The following describes a general parse sequence:
168 initiate a parsing sequence with
181 retrieve the syntax tree with
184 iterate over parse nodes with
189 free all allocated memory with
198 This section documents the functions, types, and variables available
201 with the exception of those documented in
207 .It Vt "enum mandocerr"
208 An error or warning message during parsing.
209 .It Vt "enum mandoclevel"
210 A classification of an
212 as regards system operation.
213 .It Vt "struct mchars"
214 An opaque pointer to a character table.
219 .It Vt "struct mparse"
220 An opaque pointer to a running parse sequence.
225 This may be used across parsed input if
227 is called between parses.
229 A prototype for a function to handle error and warning
230 messages emitted by the parser.
235 Obtain a text-only representation of a
236 .Vt struct man_node ,
237 including text contained in its child nodes.
238 To be used on children of the pointer returned from
240 When it is no longer needed, the pointer returned from
245 Obtain the meta-data of a successful
248 This may only be used on a pointer returned by
255 Get the parser used for the current output.
261 Obtain the root node of a successful
264 This may only be used on a pointer returned by
271 Obtain a text-only representation of a
272 .Vt struct mdoc_node ,
273 including text contained in its child nodes.
274 To be used on children of the pointer returned from
276 When it is no longer needed, the pointer returned from
281 Obtain the meta-data of a successful
284 This may only be used on a pointer returned by
291 Obtain the root node of a successful
294 This may only be used on a pointer returned by
302 The arguments have the following effect:
303 .Bl -tag -offset 5n -width inttype
309 bit is set, only that parser is used.
310 Otherwise, the document type is automatically detected.
317 file inclusion requests are always honoured.
318 Otherwise, if the request is the only content in an input file,
319 only the file name is remembered, to be returned in the
326 bit is set, parsing is aborted after the NAME section.
327 This is useful, for example, in
330 to quickly build minimal databases.
333 .Dv MANDOCLEVEL_BADARG ,
334 .Dv MANDOCLEVEL_ERROR ,
336 .Dv MANDOCLEVEL_WARNING .
337 Messages below the selected level will be suppressed.
339 A callback function to handle errors and warnings.
344 An opaque pointer to a character table obtained from
347 A default string for the
350 macro, overriding the
352 preprocessor definition and the results of
356 The same parser may be used for multiple files so long as
358 is called between parses.
360 must be called to free the memory allocated by this function.
366 Free all memory allocated by
372 .It Fn mparse_getkeep
373 Acquire the keep buffer.
374 Must follow a call to
381 Instruct the parser to retain a copy of its parsed input.
382 This can be acquired with subsequent
390 Open the file for reading.
393 does not already end in
395 try again after appending
397 Record whether the file is zipped.
398 Return a file descriptor open for reading in
409 Parse a file descriptor opened with
413 Pass the associated filename in
415 This function may be called multiple times with different parameters; however,
417 should be invoked between parses.
423 Reset a parser so that
431 Obtain the result of a parse.
432 One of the three pointers will be filled in.
437 .It Fn mparse_strerror
438 Return a statically-allocated string representation of an error code.
443 .It Fn mparse_strlevel
444 Return a statically-allocated string representation of a level code.
452 .It Va man_macronames
453 The string representation of a man macro as indexed by
456 The string representation of an mdoc macro argument as indexed by
457 .Vt "enum mdocargt" .
458 .It Va mdoc_macronames
459 The string representation of an mdoc macro as indexed by
462 .Sh IMPLEMENTATION NOTES
463 This section consists of structural documentation for
467 syntax trees and strings.
468 .Ss Man and Mdoc Strings
469 Strings may be extracted from mdoc and man meta-data, or from text
470 nodes (MDOC_TEXT and MAN_TEXT, respectively).
471 These strings have special non-printing formatting cues embedded in the
472 text itself, as well as
474 escapes preserved from input.
475 Implementing systems will need to handle both situations to produce
477 In general, strings may be assumed to consist of 7-bit ASCII characters.
479 The following non-printing characters may be embedded in text strings:
482 A non-breaking space character.
486 A breakable zero-width space.
489 Escape sequences are also passed verbatim into text strings.
490 An escape sequence is a sequence of characters beginning with the
493 To construct human-readable text, these should be intercepted with
495 and converted with one of the functions described in
497 .Ss Man Abstract Syntax Tree
498 This AST is governed by the ontological rules dictated in
500 and derives its terminology accordingly.
502 The AST is composed of
504 nodes with element, root and text types as declared by the
507 Each node also provides its parse point (the
512 fields), its position in the tree (the
518 fields) and some type-specific data.
520 The tree itself is arranged according to the following normal form,
521 where capitalised non-terminals represent nodes.
523 .Bl -tag -width "ELEMENTXX" -compact
527 \(<- ELEMENT | TEXT | BLOCK
540 The only elements capable of nesting other elements are those with
541 next-line scope as documented in
543 .Ss Mdoc Abstract Syntax Tree
544 This AST is governed by the ontological
547 and derives its terminology accordingly.
549 elements described in
551 are described simply as
554 The AST is composed of
556 nodes with block, head, body, element, root and text types as declared
560 Each node also provides its parse point (the
565 fields), its position in the tree (the
572 fields) and some type-specific data, in particular, for nodes generated
573 from macros, the generating macro in the
577 The tree itself is arranged according to the following normal form,
578 where capitalised non-terminals represent nodes.
580 .Bl -tag -width "ELEMENTXX" -compact
584 \(<- BLOCK | ELEMENT | TEXT
586 \(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
592 \(<- mnode* [ENDBODY mnode*]
599 Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
600 the BLOCK production: these refer to punctuation marks.
601 Furthermore, although a TEXT node will generally have a non-zero-length
602 string, in the specific case of
603 .Sq \&.Bd \-literal ,
604 an empty line will produce a zero-length string.
605 Multiple body parts are only found in invocations of
607 where a new body introduces a new phrase.
611 syntax tree accommodates broken block structures as well.
612 The ENDBODY node is available to end the formatting associated
613 with a given block before the physical end of that block.
616 field, is of the BODY
620 as the BLOCK it is ending, and has a
622 field pointing to that BLOCK's BODY node.
623 It is an indirect child of that BODY node
624 and has no children of its own.
626 An ENDBODY node is generated when a block ends while one of its child
627 blocks is still open, as in the following example:
628 .Bd -literal -offset indent
635 This example results in the following block structure:
636 .Bd -literal -offset indent
641 BLOCK Bo, pending -> Ao
646 ENDBODY Ao, pending -> Ao
651 Here, the formatting of the
653 block extends from TEXT ao to TEXT ac,
654 while the formatting of the
656 block extends from TEXT bo to TEXT bc.
657 It renders as follows in
661 .Dl <ao [bo ac> bc] end
663 Support for badly-nested blocks is only provided for backward
664 compatibility with some older
667 Using badly-nested blocks is
668 .Em strongly discouraged ;
675 are unable to render them in any meaningful way.
676 Furthermore, behaviour when encountering badly-nested blocks is not
677 consistent across troff implementations, especially when using multiple
678 levels of badly-nested blocks.
681 .Xr mandoc_escape 3 ,
682 .Xr mandoc_malloc 3 ,
693 library was written by
694 .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .