.\" $Id: mdoc.3,v 1.50 2010/10/10 09:47:05 kristaps Exp $
.\"
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: October 10 2010 $
.Dt MDOC 3
.Os
.Sh NAME
.Nm mdoc ,
.Nm mdoc_alloc ,
.Nm mdoc_endparse ,
.Nm mdoc_free ,
.Nm mdoc_meta ,
.Nm mdoc_node ,
.Nm mdoc_parseln ,
.Nm mdoc_reset
.Nd mdoc macro compiler library
.Sh SYNOPSIS
.In mandoc.h
.In mdoc.h
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft "struct mdoc *"
.Fo mdoc_alloc
.Fa "struct regset *regs"
.Fa "void *data"
.Fa "mandocmsg msgs"
.Fc
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft int
.Fo mdoc_parseln
.Fa "struct mdoc *mdoc"
.Fa "int line"
.Fa "char *buf"
.Fc
.Ft int
.Fn mdoc_reset "struct mdoc *mdoc"
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of
.Xr mdoc 7
input
into an abstract syntax tree (AST).
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used to reset the parser for another input
sequence.
.Ss Types
.Bl -ohang
.It Vt struct mdoc
An opaque type.
Its values are only used privately within the library.
.It Vt struct mdoc_node
A parsed node.
See
.Sx Abstract Syntax Tree
for details.
.El
.Ss Functions
.Bl -ohang
.It Fn mdoc_alloc
Allocates a parsing structure.
The
.Fa data
pointer is passed to
.Fa msgs .
Returns NULL on failure.
If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.It Fn mdoc_reset
Resets the parser for another parse sequence.
After its use,
.Fn mdoc_parseln
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
.It Fn mdoc_free
Frees all resources of a parser.
The pointer is no longer valid after invocation.
.It Fn mdoc_parseln
Parses a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
The input buffer
.Fa buf
is modified by this function.
.It Fn mdoc_endparse
Signals that the parse is complete.
Note that a tree obtained from
.Fn mdoc_node
before
.Fn mdoc_endparse
is called may be incomplete.
Returns 0 on failure, 1 on success.
.It Fn mdoc_node
Returns the first node of the parse.
Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.It Fn mdoc_meta
Returns the document's parsed meta-data.
If this information has not yet been supplied or
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
.Ss Variables
.Bl -ohang
.It Va mdoc_macronames
An array of stringified token names.
.It Va mdoc_argnames
An array of stringified token argument names.
.El
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing input in a
regular form.
It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.Pp
This AST is governed by the ontological
rules dictated in
.Xr mdoc 7
and derives its terminology accordingly.
.Qq In-line
elements described in
.Xr mdoc 7
are referred to simply as
.Qq elements .
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, tail, element, root and text types as declared
by the
.Va type
field.
Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va nchild ,
.Va next
and
.Va prev
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
.Va tok
field.
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode*
.It BODY
\(<- mnode* [ENDBODY mnode*]
.It TAIL
\(<- mnode*
.It TEXT
\(<- [[:printable:],0x1e]*
.El
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
.Sq \&.Bl \-column ,
where a new body introduces a new phrase.
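.Pp
The
.Va child
and
.Va next
fields described above lend themselves to a short recursive traversal.
The following is a minimal sketch rather than library code:
.Vt struct node ,
.Vt enum ntype
and
.Fn count_type
are hypothetical stand-ins that mirror only the
.Va type ,
.Va child
and
.Va next
fields, so that the fragment compiles without
.In mdoc.h .
.Bd -literal -offset indent
#include <stddef.h>

/* Hypothetical stand-in for the node types of struct mdoc_node. */
enum ntype { N_ROOT, N_BLOCK, N_HEAD, N_BODY, N_TAIL, N_ELEM, N_TEXT };

/* Hypothetical stand-in: only the fields needed for a traversal. */
struct node {
    enum ntype   type;   /* kind of node */
    struct node *child;  /* first child, or NULL */
    struct node *next;   /* next sibling, or NULL */
};

/*
 * Count the nodes of the given type among n, its following
 * siblings, and all of their descendants.
 */
int
count_type(const struct node *n, enum ntype type)
{
    int total;

    total = 0;
    for ( ; NULL != n; n = n->next) {
        if (type == n->type)
            total++;
        /* Descend into the children before moving to the sibling. */
        total += count_type(n->child, type);
    }
    return(total);
}
.Ed
.Pp
With the library itself, the same loop would presumably start from the
node returned by
.Fn mdoc_node
once
.Fn mdoc_endparse
has succeeded, reading the corresponding fields of
.Vt struct mdoc_node .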
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
It has a non-null
.Va end
field, is of the BODY
.Va type ,
has the same
.Va tok
as the BLOCK it is ending, and has a
.Va pending
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
.Pp
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
\&.Ao ao
\&.Bo bo ac
\&.Ac bc
\&.Bc end
.Ed
.Pp
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Ao
    HEAD Ao
    BODY Ao
        TEXT ao
        BLOCK Bo, pending -> Ao
            HEAD Bo
            BODY Bo
                TEXT bo
                TEXT ac
                ENDBODY Ao, pending -> Ao
                TEXT bc
TEXT end
.Ed
.Pp
Here, the formatting of the
.Sq \&Ao
block extends from TEXT ao to TEXT ac,
while the formatting of the
.Sq \&Bo
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Fl T Ns Cm ascii
mode:
.Pp
.Dl <ao [bo ac> bc] end
.Pp
Support for badly-nested blocks is only provided for backward
compatibility with some older
.Xr mdoc 7
implementations.
Using badly-nested blocks is
.Em strongly discouraged :
the
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
This example performs only minimal error checking and does not free
memory upon failure.
.Bd -literal -offset indent
struct regset regs;
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;   /* buffer size maintained by getline() */
ssize_t len;        /* signed, so that -1 (EOF or error) is detectable */
int line;

bzero(&regs, sizeof(struct regset));
line = 1;
mdoc = mdoc_alloc(&regs, NULL, NULL);
buf = NULL;
alloc_len = 0;

while ((len = getline(&buf, &alloc_len, stdin)) != -1) {
    /* mdoc_parseln() expects the line without its trailing newline. */
    if (len > 0 && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
    line++;
}
free(buf);

if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
.Ed
.Pp
To compile this, execute
.Pp
.D1 % cc main.c libmdoc.a libmandoc.a
.Pp
where
.Pa main.c
is the example file.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mdoc 7
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .