.\" $Id: mdoc.3,v 1.49 2010/08/20 01:02:07 schwarze Exp $
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: August 20 2010 $
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fa "struct regset *regs"
.Fn mdoc_endparse "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Fa "struct mdoc *mdoc"
.Fn mdoc_reset "struct mdoc *mdoc"
library parses lines of
into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
parse each line in a document with
close the parsing session with
operate over the syntax tree returned by
then free all allocated memory with
function may be used in order to reset the parser for another input
section for a simple example.
This section further defines the
available to programmers.
.Sx Abstract Syntax Tree
section documents the output tree.
may use the following types:
An opaque type defined in
Its values are only used privately within the library.
.It Vt struct mdoc_node
.Sx Abstract Syntax Tree
A function callback type defined in
Function descriptions follow:
Allocates a parsing structure.
Returns NULL on failure.
If non-NULL, the pointer must be freed with
Reset the parser for another parse routine.
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
Free all resources of a parser.
The pointer is no longer valid after invocation.
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
is modified by this function.
Signals that the parse is complete.
is called subsequent to
the resulting tree is incomplete.
Returns 0 on failure, 1 on success.
Returns the first node of the parse.
return 0, the tree will be incomplete.
Returns the document's parsed meta-data.
If this information has not yet been supplied or
return 0, the data will be incomplete.
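.Pp
The requirement that an input line carry no trailing newline can be met
by trimming the buffer before it is handed to the line parser.
The following is a minimal sketch that is independent of the library;
the helper name
chomp
is hypothetical and does not appear in the library's interface.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the library: strip at most one
 * trailing newline from buf, whose current string length is len,
 * and return the resulting length.
 */
static size_t
chomp(char *buf, size_t len)
{
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

A caller reading with getline(3) would pass the returned buffer and
length to this helper, then pass the trimmed buffer on for parsing.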
The following variables are also defined:
.It Va mdoc_macronames
An array of string-ified token names.
An array of string-ified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
.It BODY
\(<- mnode* [ENDBODY mnode*]
.It TEXT
\(<- [[:printable:],0x1e]*
.El
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
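.Pp
A tree in this normal form lends itself to a recursive walk over
first-child and next-sibling links.
The following is a minimal sketch under stated assumptions: the
structure and enumeration below are hypothetical stand-ins defined only
for this illustration, not the declarations of
.Vt struct mdoc_node .

```c
#include <stddef.h>

/* Hypothetical node kinds mirroring the types named in the text. */
enum ntype { N_ROOT, N_BLOCK, N_HEAD, N_BODY, N_TAIL, N_ELEM, N_TEXT };

/* Hypothetical node: only the links needed for a walk. */
struct node {
	enum ntype	 type;
	struct node	*child;	/* first child, or NULL */
	struct node	*next;	/* next sibling, or NULL */
};

/*
 * Count the nodes of the given type among n, its following siblings
 * and all of their descendants.
 */
static size_t
count_type(const struct node *n, enum ntype type)
{
	size_t	 total = 0;

	for (; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += count_type(n->child, type);
	}
	return total;
}
```

Called on a root node, which has no siblings, this counts matching
nodes in the whole tree; the same pattern serves for any per-node
operation.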
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
\&.Ao ao
\&.Bo bo
\&.Ac ac
\&.Bc bc
end
.Ed
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
.Ed
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged :
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
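.Pp
Whether a sequence of block openings and closings is badly nested can
be checked with a stack of open blocks.
The following is a minimal sketch that is independent of the library
and does not describe how the parser itself handles such input; the
integer encoding of events and the function name are assumptions made
for illustration.

```c
#include <stddef.h>

#define MAXOPEN 32	/* arbitrary depth limit for this sketch */

/*
 * Hypothetical event stream: a positive value opens a block of that
 * kind and the matching negative value closes it; zero is invalid.
 * Returns 1 if every close matches the innermost open block and all
 * blocks are closed, 0 otherwise (including when a block ends while
 * one of its child blocks is still open).
 */
static int
well_nested(const int *ev, size_t n)
{
	int	 stack[MAXOPEN];
	size_t	 depth = 0, i;

	for (i = 0; i < n; i++) {
		if (ev[i] > 0) {
			if (depth == MAXOPEN)
				return 0;
			stack[depth++] = ev[i];
		} else if (depth == 0 || stack[--depth] != -ev[i])
			return 0;
	}
	return depth == 0;
}
```

With 1 standing for the angle-bracket block and 2 for the square-bracket
block, the sequence 1, 2, \-1, \-2 is rejected, while 1, 2, \-2, \-1 is
accepted.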
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example does not check every return value and does not free memory
upon failure.
.Bd -literal -offset indent
struct regset regs;
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;
bzero(&regs, sizeof(struct regset));
line = 1;
mdoc = mdoc_alloc(&regs, NULL, NULL);
buf = NULL;
alloc_len = 0;

while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
	if (len && buf[len - 1] == '\en')
		buf[len - 1] = '\e0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}
if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");
.Ed
in the source archive for a rigorous reference.
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .