.\" $Id: mandoc.3,v 1.41 2017/07/04 23:40:01 schwarze Exp $
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010-2017 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: July 4 2017 $
.Nd mandoc macro compiler library
.Fd "#define ASCII_NBRSP"
.Fd "#define ASCII_HYPH"
.Fd "#define ASCII_BREAK"
.Fa "enum mandocerr mmin"
.Fa "enum mandoc_os oe_e"
.Fa "enum mandocerr errtype"
.Fa "enum mandoclevel level"
.Fa "const char *file"
.Fa "struct mparse *parse"
.Fa "const struct mparse *parse"
.Fa "struct mparse *parse"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Ft "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "const char *fname"
.Fa "struct mparse *parse"
.Fa "struct mparse *parse"
.Fa "struct roff_man **man"
.Fa "enum mandoclevel"
.Fa "struct mparse *parse"
.Fa "enum mandoclevel *rc"
.Fa "const struct roff_node *node"
.Vt extern const char * const * mdoc_argnames;
.Vt extern const char * const * mdoc_macronames;
.Fa "struct roff_man *mdoc"
.Vt extern const char * const * man_macronames;
.Ft "const struct mparse *"
.Fa "const struct roff_man *man"
.Fa "struct roff_man *man"
manual into an abstract syntax tree (AST).
manuals are composed of
and may be mixed with
The following describes a general parse sequence:
initiate a parsing sequence with
retrieve the syntax tree with
depending on whether the
member of the returned
if information about the validity of the input is needed, fetch it with
.Fn mparse_updaterc ;
iterate over parse nodes, starting from the
member of the returned
.Vt struct roff_man ;
free all allocated memory with
and go back to step 2 to parse new files.
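.Pp
The node iteration step in the sequence above can be sketched as a
depth-first walk that collects text.
This is a minimal illustration under stated assumptions, not library code:
.Vt struct demo_node
and
.Fn demo_collect
are hypothetical stand-ins, and the real
.Vt struct roff_node
carries more fields than the three modelled here.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for a parse node; NOT the library's
 * struct roff_node.  Only a text payload and two tree links
 * are modelled.
 */
struct demo_node {
	const char		*string; /* text, or NULL for non-text nodes */
	const struct demo_node	*child;  /* first child, or NULL */
	const struct demo_node	*next;   /* next sibling, or NULL */
};

/*
 * Walk the siblings starting at n depth-first, left to right, and
 * append every text payload to buf, separated by single blanks.
 * buf must hold a NUL-terminated string and have room for bufsz
 * bytes (bufsz >= 1); text that does not fit is silently truncated.
 */
static void
demo_collect(const struct demo_node *n, char *buf, size_t bufsz)
{
	assert(buf != NULL && bufsz > 0);
	for (; n != NULL; n = n->next) {
		if (n->string != NULL) {
			if (buf[0] != '\0')
				strncat(buf, " ", bufsz - strlen(buf) - 1);
			strncat(buf, n->string, bufsz - strlen(buf) - 1);
		}
		/* Descend before moving on to the next sibling. */
		demo_collect(n->child, buf, bufsz);
	}
}
```

.Pp
A caller would start with an empty buffer and pass the first node of
interest, for example
.Li demo_collect(root, buf, sizeof(buf)) .
Recursion keeps the sketch short; a production walker might prefer an
explicit loop for very deep trees.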
This section documents the functions, types, and variables available
with the exception of those documented in
.It Vt "enum mandocerr"
An error or warning message during parsing.
.It Vt "enum mandoclevel"
A classification of an
as regards system operation.
See the DIAGNOSTICS section in
regarding the meanings of the levels.
.It Vt "struct mparse"
An opaque pointer to a running parse sequence.
This may be used across parsed input if
is called between parses.
A prototype for a function to handle error and warning
messages emitted by the parser.
Obtain a text-only representation of a
.Vt struct roff_node ,
including text contained in its child nodes.
To be used on children of the
.Vt struct roff_man .
When it is no longer needed, the pointer returned from
Get the parser used for the current output.
parse tree obtained with
parse tree obtained with
The arguments have the following effect:
.Bl -tag -offset 5n -width inttype
bit is set, only that parser is used.
Otherwise, the document type is automatically detected.
file inclusion requests are always honoured.
Otherwise, if the request is the only content in an input file,
only the file name is remembered, to be returned in the
bit is set, parsing is aborted after the NAME section.
This is for example useful in
to quickly build minimal databases.
.Dv MANDOCERR_STYLE ,
.Dv MANDOCERR_WARNING ,
.Dv MANDOCERR_ERROR ,
.Dv MANDOCERR_UNSUPP ,
Messages below the selected level will be suppressed.
A callback function to handle errors and warnings.
If printing of error messages is not desired,
Operating system to check base system conventions for.
.Dv MANDOC_OS_OTHER ,
the system is automatically detected from
A default string for the
macro, overriding the
preprocessor definition and the results of
The same parser may be used for multiple files so long as
is called between parses.
must be called to free the memory allocated by this function.
Free all memory allocated by
.It Fn mparse_getkeep
Acquire the keep buffer.
Must follow a call of
Instruct the parser to retain a copy of its parsed input.
This can be acquired with subsequent
Open the file for reading.
does not already end in
try again after appending
Save the information whether the file is zipped or not.
Return a file descriptor open for reading or -1 on failure.
Parse a file descriptor opened with
Pass the associated filename in
This function may be called multiple times with different parameters; however,
should be invoked between parses.
Reset a parser so that
Obtain the result of a parse.
One of the two pointers will be filled in.
.It Fn mparse_strerror
Return a statically-allocated string representation of an error code.
.It Fn mparse_strlevel
Return a statically-allocated string representation of a level code.
.It Fn mparse_updaterc
If the highest warning or error level that occurred during the current
This is useful after calling
.It Va man_macronames
The string representation of a
The string representation of an
macro argument as indexed by
.Vt "enum mdocargt" .
.It Va mdoc_macronames
The string representation of an
.Sh IMPLEMENTATION NOTES
This section consists of structural documentation for
syntax trees and strings.
.Ss Man and Mdoc Strings
Strings may be extracted from mdoc and man meta-data, or from text
nodes (MDOC_TEXT and MAN_TEXT, respectively).
These strings have special non-printing formatting cues embedded in the
text itself, as well as
escapes preserved from input.
Implementing systems will need to handle both situations to produce
In general, strings may be assumed to consist of 7-bit ASCII characters.
The following non-printing characters may be embedded in text strings:
A non-breaking space character.
A breakable zero-width space.
Escape characters are also passed verbatim into text strings.
An escape character is a sequence of characters beginning with the
To construct human-readable text, these should be intercepted with
and converted with one of the functions described in
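.Pp
Handling of the embedded non-printing characters can be sketched as
follows.
The function
.Fn demo_plain
is hypothetical, and the numeric values 31, 30 and 29 used for
.Dv ASCII_NBRSP ,
.Dv ASCII_HYPH
and
.Dv ASCII_BREAK
are assumptions made only so that the fragment compiles on its own;
the library header remains the authority.
Escape sequences are deliberately left untouched here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fallback values; the library header provides the real ones. */
#ifndef ASCII_NBRSP
#define ASCII_NBRSP	31	/* non-breaking space */
#endif
#ifndef ASCII_HYPH
#define ASCII_HYPH	30	/* hyphen cue */
#endif
#ifndef ASCII_BREAK
#define ASCII_BREAK	29	/* breakable zero-width space */
#endif

/*
 * Copy src into dst for plain-text output: the non-breaking space
 * becomes a blank, the hyphen cue becomes '-', and the zero-width
 * break is dropped.  At most dstsz - 1 characters are written and
 * dst is always NUL-terminated when dstsz > 0.
 * Returns the number of characters written, excluding the NUL.
 */
static size_t
demo_plain(const char *src, char *dst, size_t dstsz)
{
	size_t	 n = 0;

	assert(src != NULL && dst != NULL);
	if (dstsz == 0)
		return 0;
	for (; *src != '\0' && n + 1 < dstsz; src++) {
		switch (*src) {
		case ASCII_NBRSP:
			dst[n++] = ' ';
			break;
		case ASCII_HYPH:
			dst[n++] = '-';
			break;
		case ASCII_BREAK:
			/* Zero width: emit nothing. */
			break;
		default:
			dst[n++] = *src;
			break;
		}
	}
	dst[n] = '\0';
	return n;
}
```

.Pp
A formatter that wraps lines would treat these characters differently,
for instance by refusing to break at
.Dv ASCII_NBRSP
and permitting a break at
.Dv ASCII_BREAK ;
the sketch only covers the simplest case of flat text output.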
.Ss Man Abstract Syntax Tree
This AST is governed by the ontological rules dictated in
and derives its terminology accordingly.
The AST is composed of
nodes with element, root and text types as declared by the
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data.
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- ELEMENT | TEXT | BLOCK
The only elements capable of nesting other elements are those with
next-line scope as documented in
.Ss Mdoc Abstract Syntax Tree
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
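.Pp
The BLOCK production above,
.Li HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]] ,
can be checked against a sequence of child node types with a small
matcher.
The following is only a sketch:
.Vt enum demo_type
and
.Fn demo_block_ok
are hypothetical names, and the library declares its own node type
enumeration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags, limited to what the production mentions. */
enum demo_type {
	DEMO_HEAD,
	DEMO_BODY,
	DEMO_TAIL,
	DEMO_TEXT
};

/*
 * Return 1 if the n child types in kids match
 *	HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
 * and 0 otherwise.  The grammar is deterministic, so a single
 * left-to-right pass without backtracking is sufficient.
 */
static int
demo_block_ok(const enum demo_type *kids, size_t n)
{
	size_t	 i = 0, bodies = 0;

	assert(kids != NULL || n == 0);

	/* Exactly one HEAD, optionally followed by punctuation. */
	if (i >= n || kids[i] != DEMO_HEAD)
		return 0;
	i++;
	if (i < n && kids[i] == DEMO_TEXT)
		i++;

	/* One or more BODY nodes, each optionally followed by TEXT. */
	while (i < n && kids[i] == DEMO_BODY) {
		i++;
		bodies++;
		if (i < n && kids[i] == DEMO_TEXT)
			i++;
	}
	if (bodies == 0)
		return 0;

	/* Optional TAIL, optionally followed by punctuation. */
	if (i < n && kids[i] == DEMO_TAIL) {
		i++;
		if (i < n && kids[i] == DEMO_TEXT)
			i++;
	}

	/* Anything left over violates the production. */
	return i == n;
}
```

.Pp
The matcher looks only at the direct children of a BLOCK; it does not
descend into HEAD, BODY or TAIL nodes and does not model ENDBODY.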
syntax tree accommodates broken block structures as well.
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, like in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged ;
is unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
.Xr mandoc_escape 3 ,
.Xr mandoc_headers 3 ,
.Xr mandoc_malloc 3 ,
library was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
.An Ingo Schwarze Aq Mt schwarze@openbsd.org .