.\" $Id: mdoc.3,v 1.45 2010/06/29 19:20:38 schwarze Exp $
.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: June 29 2010 $
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fa "struct regset *regs"
.Fn mdoc_endparse "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Fa "struct mdoc *mdoc"
.Fn mdoc_reset "struct mdoc *mdoc"
library parses lines of
into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
parse each line in a document with
close the parsing session with
operate over the syntax tree returned by
then free all allocated memory with
function may be used in order to reset the parser for another input
section for a simple example.
This section further defines the
available to programmers.
.Sx Abstract Syntax Tree
section documents the output tree.
may use the following types:
An opaque type defined in
Its values are only used privately within the library.
.It Vt struct mdoc_node
.Sx Abstract Syntax Tree
A function callback type defined in
Function descriptions follow:
Allocates a parsing structure.
arguments are defined in
Returns NULL on failure.
If non-NULL, the pointer must be freed with
Reset the parser for another parse routine.
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
Free all resources of a parser.
The pointer is no longer valid after invocation.
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
is modified by this function.
Signals that the parse is complete.
is called subsequent to
the resulting tree is incomplete.
Returns 0 on failure, 1 on success.
Returns the first node of the parse.
return 0, the tree will be incomplete.
Returns the document's parsed meta-data.
If this information has not yet been supplied or
return 0, the data will be incomplete.
The following variables are also defined:
.It Va mdoc_macronames
An array of string-ified token names.
An array of string-ified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
\(<- [[:printable:],0x1e]*
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
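.Pp
The TEXT production above can be read as a simple per-byte check.
The following is a minimal sketch of that reading, not code from the
library: the helper name text_payload_ok is hypothetical, and it assumes
that "printable" means what isprint(3) reports in the current locale.

```c
#include <assert.h>
#include <ctype.h>

/*
 * Hypothetical helper, not part of the library.
 * Returns 1 if every byte of s is printable or the 0x1e control
 * character, and 0 otherwise.  An empty string is accepted, which
 * matches the zero-length TEXT case described in the text.
 */
static int
text_payload_ok(const char *s)
{
	const unsigned char *p;

	for (p = (const unsigned char *)s; *p != 0; p++)
		if (!isprint(*p) && *p != 0x1e)
			return 0;
	return 1;
}
```

A caller might use such a check to sanity-test strings before treating
them as TEXT payloads; whether the library performs an equivalent check
internally is not stated here.
.Pp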
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
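.Pp
Operating over such a tree usually amounts to a depth-first walk along
first-child and next-sibling links.
The sketch below illustrates the idea on stand-in types: the names
demo_type, demo_node and demo_count are hypothetical, and the struct
holds only the fields the walk needs, so it is not the library's own
node structure.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in node kinds following the node types named in the text.
 * This enum and struct are hypothetical, not the library's own.
 */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY,
	DEMO_TAIL, DEMO_ELEM, DEMO_TEXT
};

struct demo_node {
	enum demo_type	  type;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
};

/*
 * Counts the nodes of the wanted type among n, its later siblings,
 * and all of their descendants, visiting them depth-first.
 */
static size_t
demo_count(const struct demo_node *n, enum demo_type want)
{
	size_t total = 0;

	for (; n != NULL; n = n->next) {
		if (n->type == want)
			total++;
		total += demo_count(n->child, want);
	}
	return total;
}
```

Called on a root node, demo_count(root, DEMO_TEXT) would report how many
TEXT nodes the whole tree holds; the same loop shape serves for printing
or transforming nodes.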
.Ss Badly nested blocks
A special kind of node is available to end the formatting
associated with a given block before the physical end of that block.
Such an ENDBODY node has a non-null
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the Ao block extends from TEXT ao to TEXT ac,
while the formatting of the Bo block extends from TEXT bo to TEXT bc,
rendering like this in
.Dl <ao [bo ac> bc] end
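.Pp
One way a frontend could honour an ENDBODY node is to emit the closing
delimiter of the pending body early and then skip it at the body's
physical end.
The sketch below reproduces the rendering shown above under that
assumption.
All names (demo_kind, demo_node, demo_out, demo_put, demo_render) are
hypothetical, the tree keeps only BODY, ENDBODY and TEXT nodes, and the
spacing rule is a simplification rather than the behaviour of any real
frontend.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_kind { DEMO_BODY, DEMO_ENDBODY, DEMO_TEXT };

/* Hypothetical stand-in node; not the library's node structure. */
struct demo_node {
	enum demo_kind	  kind;
	const char	 *text;		/* TEXT payload, or NULL */
	char		  open;		/* BODY: opening delimiter */
	char		  close;	/* BODY: closing delimiter */
	struct demo_node *pending;	/* ENDBODY: the BODY it ends early */
	int		  ended;	/* BODY: set once closed early */
	struct demo_node *child;
	struct demo_node *next;
};

struct demo_out {
	char	buf[128];
	size_t	len;
	int	glue;	/* 1 right after an opening delimiter */
};

/* Appends s, preceded by a space unless it closes or follows an opener. */
static void
demo_put(struct demo_out *o, const char *s, int is_close)
{
	size_t n = strlen(s);

	if (o->len + n + 2 > sizeof(o->buf))
		return;		/* sketch only: drop output that would overflow */
	if (o->len > 0 && !o->glue && !is_close)
		o->buf[o->len++] = ' ';
	memcpy(o->buf + o->len, s, n + 1);
	o->len += n;
	o->glue = 0;
}

/* Renders n and its later siblings, depth-first. */
static void
demo_render(struct demo_node *n, struct demo_out *o)
{
	char d[2] = { 0, 0 };

	for (; n != NULL; n = n->next) {
		switch (n->kind) {
		case DEMO_TEXT:
			demo_put(o, n->text, 0);
			break;
		case DEMO_ENDBODY:
			/* Close the pending body now, remember that we did. */
			d[0] = n->pending->close;
			demo_put(o, d, 1);
			n->pending->ended = 1;
			break;
		case DEMO_BODY:
			d[0] = n->open;
			demo_put(o, d, 0);
			o->glue = 1;
			demo_render(n->child, o);
			if (!n->ended) {
				d[0] = n->close;
				demo_put(o, d, 1);
			}
			break;
		}
	}
}
```

Building the Ao and Bo bodies of the example by hand, with the ENDBODY
node's pending field pointing at the Ao body, and passing the Ao body to
demo_render fills the buffer with the line shown above.
.Pp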
Support for badly nested blocks is only provided for backward
compatibility with some older
Using them in new code is strongly discouraged:
some frontends, in particular
are unable to render them in any meaningful way,
implementations do not support them, and even for those that do,
the behaviour is not well-defined, in particular when using multiple
levels of badly nested blocks.
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example does not error-check nor free memory upon failure.
.Bd -literal -offset indent
struct regset regs;
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;

bzero(&regs, sizeof(struct regset));
line = 1;
mdoc = mdoc_alloc(&regs, NULL, 0, NULL);
buf = NULL;
alloc_len = 0;

while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
	/* Strip the trailing newline before handing the line over. */
	if (len && buf[len - 1] == '\en')
		buf[len - 1] = '\e0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}

if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");
in the source archive for a rigorous reference.
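.Pp
The line-reading part of the example can be isolated from the library
and tested on its own.
The following is a minimal sketch of that loop using POSIX getline(3);
the names demo_line_fn, demo_feed, demo_stats and demo_collect are
hypothetical, and the callback merely stands in for the per-line parse
call, following the same convention of returning 0 on failure and 1 on
success.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type standing in for the per-line parse call. */
typedef int (*demo_line_fn)(void *arg, int line, char *buf);

/*
 * Reads fp line by line, strips one trailing newline from each line,
 * and hands the result to fn together with a 1-based line number.
 * Returns 1 if every call to fn succeeded, 0 as soon as one fails.
 */
static int
demo_feed(FILE *fp, demo_line_fn fn, void *arg)
{
	char	*buf = NULL;
	size_t	 alloc_len = 0;
	ssize_t	 len;
	int	 line = 1, ok = 1;

	while (ok && (len = getline(&buf, &alloc_len, fp)) >= 0) {
		if (len > 0 && buf[len - 1] == '\n')
			buf[len - 1] = '\0';
		ok = fn(arg, line++, buf);
	}
	free(buf);
	return ok;
}

/* Example callback: records the last line number and checks stripping. */
struct demo_stats {
	int lines;	/* number of the last line seen */
	int clean;	/* stays 1 while no line contains a newline */
};

static int
demo_collect(void *arg, int line, char *buf)
{
	struct demo_stats *st = arg;

	st->lines = line;
	if (strchr(buf, '\n') != NULL)
		st->clean = 0;
	return 1;
}
```

Storing the result of getline(3) in a signed ssize_t matters here: the
function returns -1 at end of input, which an unsigned variable would
never report as negative.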
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .