.\" $Id: mdoc.3,v 1.37 2010/02/17 19:22:01 kristaps Exp $
.\"
.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: February 17 2010 $
.Nd mdoc macro compiler library
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Fn mdoc_alloc "void *data" "int pflags" "const struct mdoc_cb *cb"
.Fn mdoc_reset "struct mdoc *mdoc"
.Fn mdoc_free "struct mdoc *mdoc"
.Fn mdoc_parseln "struct mdoc *mdoc" "int line" "char *buf"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Fn mdoc_endparse "struct mdoc *mdoc"
library parses lines of
mdoc) into an abstract syntax tree (AST).
In general, applications initiate a parsing sequence with
parse each line in a document with
close the parsing session with
operate over the syntax tree returned by
then free all allocated memory with
function may be used in order to reset the parser for another input
section for a full example.
This section further defines the
available to programmers. Following that, the
.Sx Abstract Syntax Tree
section documents the output tree.
may use the following types:
An opaque type defined in
Its values are only used privately within the library.
.It Vt struct mdoc_cb
A set of message callbacks defined in
.It Vt struct mdoc_node
A parsed node. Defined in
.Sx Abstract Syntax Tree
Function descriptions follow:
Allocates a parsing structure. The
pointer is passed to callbacks in
which are documented further in the header file.
arguments are defined in
Returns NULL on failure. If non-NULL, the pointer must be freed with
Reset the parser for another parse routine. After its use,
behaves as if invoked for the first time. If it returns 0, memory could
Free all resources of a parser. The pointer is no longer valid after
Parse a nil-terminated line of input. This line should not contain the
trailing newline. Returns 0 on failure, 1 on success. The input buffer
is modified by this function.
Signals that the parse is complete. Note that if
is called subsequent to
the resulting tree is incomplete. Returns 0 on failure, 1 on success.
Returns the first node of the parse. Note that if
return 0, the tree will be incomplete.
Returns the document's parsed meta-data. If this information has not
return 0, the data will be incomplete.
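.Pp
The line-parsing function described above expects a nil-terminated line
without its trailing newline and modifies the buffer it is given.
A minimal sketch of preparing such a line follows.
The helper name demo_chomp is hypothetical and is not part of the library;
it only assumes that the caller knows the length of the line read.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not provided by the library: if the len bytes
 * in buf end with a newline, overwrite that newline with a terminating
 * nil byte.  Returns the resulting line length.
 */
size_t
demo_chomp(char *buf, size_t len)
{
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

Because the parser modifies its input, a caller that still needs the
original text would have to pass a private copy of the buffer.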
The following variables are also defined:
.It Va mdoc_macronames
An array of string-ified token names.
An array of string-ified token argument names.
.Ss Abstract Syntax Tree
functions produce an abstract syntax tree (AST) describing input in a
regular form. It may be reviewed at any time with
however, if called before
fail, it may be incomplete.
This AST is governed by the ontological
and derives its terminology accordingly.
elements described in
are described simply as
The AST is composed of
nodes with block, head, body, element, root and text types as declared
field. Each node also provides its parse point (the
fields), its position in the tree (the
fields) and some type-specific data.
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Bl -tag -width "ELEMENTXX" -compact
\(<- BLOCK | ELEMENT | TEXT
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
\(<- BODY [TEXT] [TAIL [TEXT]]
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production. These refer to punctuation marks. Furthermore,
although a TEXT node will generally have a non-zero-length string, in
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
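.Pp
The tree shape described above can be sketched without the library.
Everything named demo_ below is a hypothetical stand-in invented for
illustration; the real declarations live in the library header and may
differ.
The sketch links nodes through parent, first-child and next-sibling
pointers and counts nodes of one type by depth-first traversal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node types named in the text. */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY, DEMO_TAIL,
	DEMO_ELEM, DEMO_TEXT
};

/* Hypothetical node: type, parse point, tree position, text data. */
struct demo_node {
	enum demo_type	  type;
	int		  line;		/* parse point: input line */
	int		  pos;		/* parse point: column */
	struct demo_node *parent;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
	const char	 *string;	/* TEXT nodes only; may be "" */
};

/* Link c as the last child of p. */
void
demo_append(struct demo_node *p, struct demo_node *c)
{
	struct demo_node **np;

	assert(p != NULL && c != NULL);
	c->parent = p;
	c->next = NULL;
	for (np = &p->child; *np != NULL; np = &(*np)->next)
		continue;
	*np = c;
}

/*
 * Count nodes of the given type among n, its following siblings and
 * all of their descendants, visiting the tree depth-first.
 */
size_t
demo_count(const struct demo_node *n, enum demo_type type)
{
	size_t total = 0;

	for (; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += demo_count(n->child, type);
	}
	return total;
}
```

Calling demo_count() on a root node with DEMO_TEXT counts every text
node below it, including any whose string has zero length.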
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
This example does not error-check nor free memory upon failure.
.Bd -literal -offset indent
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;

line = 1;
mdoc = mdoc_alloc(NULL, 0, NULL);
buf = NULL;
alloc_len = 0;

while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
	if (len && buf[len - 1] == '\en')
		buf[len - 1] = '\e0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}

if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");
.Ed
utility was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .
macros aren't handled when used to span lines for the
macro family doesn't yet understand version arguments.
If not given a value, the \-offset argument to
should be the width of
should default to width