.\" $Id: mdoc.3,v 1.50 2010/10/10 09:47:05 kristaps Exp $
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: October 10 2010 $
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fa "struct regset *regs"
.Fn mdoc_endparse "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Fa "struct mdoc *mdoc"
.Fn mdoc_reset "struct mdoc *mdoc"
library parses lines of
into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
parse each line in a document with
close the parsing session with
operate over the syntax tree returned by
then free all allocated memory with
function may be used in order to reset the parser for another input
Its values are only used privately within the library.
.It Vt struct mdoc_node
.Sx Abstract Syntax Tree
Allocates a parsing structure.
Returns NULL on failure.
If non-NULL, the pointer must be freed with
Reset the parser for another parse routine.
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
Free all resources of a parser.
The pointer is no longer valid after invocation.
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
is modified by this function.
Signals that the parse is complete.
is called subsequent to
the resulting tree is incomplete.
Returns 0 on failure, 1 on success.
Returns the first node of the parse.
return 0, the tree will be incomplete.
Returns the document's parsed meta-data.
If this information has not yet been supplied or
return 0, the data will be incomplete.
.It Va mdoc_macronames
An array of stringified token names.
An array of stringified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
\(<- [[:printable:],0x1e]*
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
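.Pp
As a rough illustration of operating over a tree of this shape, the
following sketch counts nodes of one type with a depth-first walk.
It does not use the library's own declarations: the
.Vt struct demo_node
type, its
.Va child
and
.Va next
fields, the
.Dv DEMO_*
constants and both functions are hypothetical stand-ins, assumed here
only to model a first-child and next-sibling layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types, named after the kinds listed in the text. */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY, DEMO_ELEM, DEMO_TEXT
};

/* Hypothetical stand-in for a tree node: first child plus next sibling. */
struct demo_node {
	enum demo_type	  type;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
};

/* Append child after the last existing child of parent. */
void
demo_append(struct demo_node *parent, struct demo_node *child)
{
	struct demo_node **slot;

	assert(parent != NULL && child != NULL);
	for (slot = &parent->child; *slot != NULL; slot = &(*slot)->next)
		continue;
	*slot = child;
}

/*
 * Count nodes of the given type among n, its following siblings,
 * and all of their descendants.
 */
size_t
demo_count(const struct demo_node *n, enum demo_type type)
{
	size_t count = 0;

	for ( ; n != NULL; n = n->next) {
		if (n->type == type)
			count++;
		count += demo_count(n->child, type);
	}
	return count;
}
```

Recursing into children while iterating over siblings keeps the
recursion depth bounded by the nesting depth of the document rather
than by its length.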
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged :
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
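.Pp
To make the condition concrete, the sketch below flags a block that
ends while a block opened after it is still open, using a small stack
of open block names.
It is an illustration only and is not taken from the library: the
.Vt struct demo_event
type, the
.Dv DEMO_MAXDEPTH
limit and the function name are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAXDEPTH 16	/* hypothetical limit on open blocks */

/* Hypothetical event: a block being opened or closed by name. */
struct demo_event {
	int	    open;	/* 1 opens a block, 0 closes it */
	const char *name;	/* block name, for example "Ao" */
};

/*
 * Return how many blocks are closed while a block opened after them
 * is still open.  Return -1 if a close has no matching open or if
 * more than DEMO_MAXDEPTH blocks are open at once.
 */
int
demo_count_bad_nesting(const struct demo_event *ev, size_t nev)
{
	const char *stack[DEMO_MAXDEPTH];
	size_t depth = 0, i, j;
	int bad = 0;

	assert(ev != NULL || nev == 0);
	for (i = 0; i < nev; i++) {
		if (ev[i].open) {
			if (depth == DEMO_MAXDEPTH)
				return -1;
			stack[depth++] = ev[i].name;
			continue;
		}
		/* Find the innermost open block with this name. */
		for (j = depth; j > 0; j--)
			if (strcmp(stack[j - 1], ev[i].name) == 0)
				break;
		if (j == 0)
			return -1;
		if (j != depth)
			bad++;	/* a child block is still open */
		/* Drop entry j - 1 but keep the blocks opened after it. */
		memmove(&stack[j - 1], &stack[j],
		    (depth - j) * sizeof(stack[0]));
		depth--;
	}
	return bad;
}
```

Opening Ao, opening Bo, closing Ao and then closing Bo yields a count
of 1, whereas closing Bo before Ao yields 0.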
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example neither error-checks nor frees memory upon failure.
.Bd -literal -offset indent
const struct mdoc_node *node;
bzero(&regs, sizeof(struct regset));
mdoc = mdoc_alloc(&regs, NULL, NULL);
while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
    if (len && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
    line++;
}
if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");
To compile this, execute
.D1 % cc main.c libmdoc.a libmandoc.a
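.Pp
The newline handling in the read loop can be isolated into a small
helper, sketched below.
The function name
.Fn demo_chomp
is hypothetical and not part of the library.
Note also that
.Xr getline 3
returns \-1 at end of input, so in a real program its result is best
held in a signed type before being passed on as a length.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Remove at most one trailing newline from buf, which holds len
 * bytes followed by a terminating NUL.  Return the new length.
 */
size_t
demo_chomp(char *buf, size_t len)
{
	assert(buf != NULL);
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

Checking the length before indexing avoids reading
.Li buf[-1]
on an empty line.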
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .