.\" $Id: mdoc.3,v 1.47 2010/07/04 22:04:04 schwarze Exp $
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: July 4 2010 $
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fa "struct regset *regs"
.Fn mdoc_endparse "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Fa "struct mdoc *mdoc"
.Fn mdoc_reset "struct mdoc *mdoc"
library parses lines of
into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
parse each line in a document with
close the parsing session with
operate over the syntax tree returned by
then free all allocated memory with
function may be used in order to reset the parser for another input
section for a simple example.
This section further defines the
available to programmers.
.Sx Abstract Syntax Tree
section documents the output tree.
may use the following types:
An opaque type defined in
Its values are only used privately within the library.
.It Vt struct mdoc_node
.Sx Abstract Syntax Tree
A function callback type defined in
Function descriptions follow:
Allocates a parsing structure.
arguments are defined in
Returns NULL on failure.
If non-NULL, the pointer must be freed with
Reset the parser for another parse routine.
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
Free all resources of a parser.
The pointer is no longer valid after invocation.
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
is modified by this function.
Signals that the parse is complete.
is called subsequent to
the resulting tree is incomplete.
Returns 0 on failure, 1 on success.
Returns the first node of the parse.
return 0, the tree will be incomplete.
Returns the document's parsed meta-data.
If this information has not yet been supplied or
return 0, the data will be incomplete.
The following variables are also defined:
.It Va mdoc_macronames
An array of stringified token names.
An array of stringified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
\(<- mnode* [ENDBODY mnode*]
\(<- [[:printable:],0x1e]*
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
where a new body introduces a new phrase.
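.Pp
A tree in this normal form lends itself to a depth-first traversal.
The following is a minimal sketch in C and is not the library's own code.
It assumes a hypothetical stand-in type, struct demo_node, with assumed
child and next links and an assumed type enumeration; the real
struct mdoc_node is declared in the library header and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the node kinds named in the normal form. */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY, DEMO_ELEM, DEMO_TEXT
};

/* Hypothetical stand-in node: NOT the library's struct mdoc_node. */
struct demo_node {
	enum demo_type		 type;
	const char		*label;	/* macro name or text string */
	struct demo_node	*child;	/* first child, or NULL */
	struct demo_node	*next;	/* next sibling, or NULL */
};

static const char *const demo_typenames[] = {
	"ROOT", "BLOCK", "HEAD", "BODY", "ELEMENT", "TEXT"
};

/*
 * Print n, its siblings and all their descendants, one node per line,
 * indented two spaces per level.  Returns the number of nodes visited.
 */
static size_t
demo_walk(const struct demo_node *n, int depth, FILE *out)
{
	size_t count = 0;

	for ( ; n != NULL; n = n->next) {
		fprintf(out, "%*s%s %s\n", depth * 2, "",
		    demo_typenames[n->type], n->label);
		count += 1 + demo_walk(n->child, depth + 1, out);
	}
	return count;
}

/* Build a tiny ROOT -> BLOCK -> (HEAD, BODY -> TEXT) tree and walk it. */
static size_t
demo_run(FILE *out)
{
	struct demo_node text = { DEMO_TEXT, "hello", NULL, NULL };
	struct demo_node body = { DEMO_BODY, "Bo", &text, NULL };
	struct demo_node head = { DEMO_HEAD, "Bo", NULL, &body };
	struct demo_node block = { DEMO_BLOCK, "Bo", &head, NULL };
	struct demo_node root = { DEMO_ROOT, "-", &block, NULL };

	return demo_walk(&root, 0, out);
}
```

Calling demo_run(stdout) visits five nodes and prints one line for each.
Recursing into the child and iterating over the siblings keeps the
recursion depth equal to the nesting depth of the document rather than
to its length.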
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
field, is of the BODY
as the BLOCK it is ending, and has a
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Bo, pending -> Ao
ENDBODY Ao, pending -> Ao
Here, the formatting of the
block extends from TEXT ao to TEXT ac,
while the formatting of the
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Dl <ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward
compatibility with some older
Using badly-nested blocks is
.Em strongly discouraged :
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example performs only minimal error checking and does not free memory upon failure.
.Bd -literal -offset indent
const struct mdoc_node *node;
bzero(&regs, sizeof(struct regset));
mdoc = mdoc_alloc(&regs, NULL, 0, NULL);
while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
    if (len && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");
in the source archive for a rigorous reference.
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .
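.Pp
The newline handling in the example can be isolated into a small helper.
The following is a minimal sketch, assuming a hypothetical function name,
demo_chomp, that is not part of the library.
Because getline(3) returns a signed ssize_t and yields \-1 at end of file,
the sketch keeps the length in a signed type.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/*
 * Strip one trailing newline from buf, whose length len is the value
 * returned by getline(3).  Returns the new length.
 * demo_chomp is a hypothetical helper name, not a library function.
 */
static ssize_t
demo_chomp(char *buf, ssize_t len)
{
	if (len > 0 && buf[len - 1] == '\n') {
		buf[len - 1] = '\0';
		len--;
	}
	return len;
}
```

A caller would invoke demo_chomp on each buffer before handing the line
to the parser, which expects input without the trailing newline.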