.\" $Id: mdoc.3,v 1.52 2011/01/01 12:18:37 kristaps Exp $
.\"
.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: January 1 2011 $
.Dt MDOC 3
.Os
.Sh NAME
.Nm mdoc ,
.Nm mdoc_addspan ,
.Nm mdoc_alloc ,
.Nm mdoc_endparse ,
.Nm mdoc_free ,
.Nm mdoc_meta ,
.Nm mdoc_node ,
.Nm mdoc_parseln ,
.Nm mdoc_reset
.Nd mdoc macro compiler library
.Sh SYNOPSIS
.In mandoc.h
.In mdoc.h
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft int
.Fo mdoc_addspan
.Fa "struct mdoc *mdoc"
.Fa "const struct tbl_span *span"
.Fc
.Ft "struct mdoc *"
.Fo mdoc_alloc
.Fa "struct regset *regs"
.Fa "void *data"
.Fa "mandocmsg msgs"
.Fc
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft int
.Fo mdoc_parseln
.Fa "struct mdoc *mdoc"
.Fa "int line"
.Fa "char *buf"
.Fc
.Ft int
.Fn mdoc_reset "struct mdoc *mdoc"
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of
.Xr mdoc 7
input
into an abstract syntax tree (AST).
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used to reset the parser for another input
sequence.
.Ss Types
.Bl -ohang
.It Vt struct mdoc
An opaque type.
Its values are only used privately within the library.
.It Vt struct mdoc_node
A parsed node.
See
.Sx Abstract Syntax Tree
for details.
.El
.Ss Functions
.Bl -ohang
.It Fn mdoc_addspan
Add a table span to the parsing stream.
Returns 0 on failure, 1 on success.
.It Fn mdoc_alloc
Allocates a parsing structure.
The
.Fa data
pointer is passed to
.Fa msgs .
Returns NULL on failure.
If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.It Fn mdoc_reset
Reset the parser for another parse routine.
After its use,
.Fn mdoc_parseln
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
.It Fn mdoc_free
Free all resources of a parser.
The pointer is no longer valid after invocation.
.It Fn mdoc_parseln
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
The input buffer
.Fa buf
is modified by this function.
.It Fn mdoc_endparse
Signals that the parse is complete.
Note that if
.Fn mdoc_node
is called before
.Fn mdoc_endparse ,
the tree it returns is incomplete.
Returns 0 on failure, 1 on success.
.It Fn mdoc_node
Returns the first node of the parse.
Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.It Fn mdoc_meta
Returns the document's parsed meta-data.
If this information has not yet been supplied or
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
.Ss Variables
.Bl -ohang
.It Va mdoc_macronames
An array of string-ified token names.
.It Va mdoc_argnames
An array of string-ified token argument names.
.El
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing input in a
regular form.
It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.Pp
This AST is governed by the ontological
rules dictated in
.Xr mdoc 7
and derives its terminology accordingly.
.Qq In-line
elements described in
.Xr mdoc 7
are described simply as
.Qq elements .
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, element, root and text types as declared
by the
.Va type
field.
Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va nchild ,
.Va next
and
.Va prev
fields) and some type-specific data, in particular, for nodes generated
from macros, the generating macro in the
.Va tok
field.
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode*
.It BODY
\(<- mnode* [ENDBODY mnode*]
.It TAIL
\(<- mnode*
.It TEXT
\(<- [[:printable:],0x1e]*
.El
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production: these refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
Multiple body parts are only found in invocations of
.Sq \&Bl \-column ,
where a new body introduces a new phrase.
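.Pp
A depth-first walk over a tree in this normal form can be sketched as
follows.
This is only an illustration, not library code: the
.Vt struct walk_node
type, the
.Vt enum walk_type
constants and both functions are hypothetical stand-ins that model just the
.Va type ,
.Va parent ,
.Va child
and
.Va next
fields described above.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the node types named in the text. */
enum walk_type {
    WALK_ROOT, WALK_BLOCK, WALK_HEAD, WALK_BODY, WALK_ELEM, WALK_TEXT
};

/* Hypothetical stand-in for struct mdoc_node: tree links only. */
struct walk_node {
    enum walk_type    type;
    struct walk_node *parent;
    struct walk_node *child;   /* first child */
    struct walk_node *next;    /* next sibling */
};

/* Attach c as the last child of p. */
static void
walk_append(struct walk_node *p, struct walk_node *c)
{
    struct walk_node *n;

    assert(p != NULL && c != NULL);
    c->parent = p;
    c->next = NULL;
    if (p->child == NULL) {
        p->child = c;
        return;
    }
    for (n = p->child; n->next != NULL; n = n->next)
        continue;
    n->next = c;
}

/*
 * Count the nodes of one type, depth first: each node is visited
 * before its children, and siblings are visited in list order.
 */
static int
walk_count(const struct walk_node *n, enum walk_type type)
{
    int total = 0;

    for (; n != NULL; n = n->next) {
        if (n->type == type)
            total++;
        total += walk_count(n->child, type);
    }
    return total;
}
```
.Ed
.Pp
Starting from the first node of a tree,
.Fn walk_count
descends through
.Va child
and advances through
.Va next ,
which is the same order in which the productions above list their members.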
.Ss Badly-nested Blocks
The ENDBODY node is available to end the formatting associated
with a given block before the physical end of that block.
It has a non-null
.Va end
field, is of the BODY
.Va type ,
has the same
.Va tok
as the BLOCK it is ending, and has a
.Va pending
field pointing to that BLOCK's BODY node.
It is an indirect child of that BODY node
and has no children of its own.
.Pp
An ENDBODY node is generated when a block ends while one of its child
blocks is still open, as in the following example:
.Bd -literal -offset indent
\&.Ao ao
\&.Bo bo ac
\&.Ac bc
\&.Bc end
.Ed
.Pp
This example results in the following block structure:
.Bd -literal -offset indent
BLOCK Ao
    HEAD Ao
    BODY Ao
        TEXT ao
        BLOCK Bo, pending -> Ao
            HEAD Bo
            BODY Bo
                TEXT bo
                TEXT ac
                ENDBODY Ao, pending -> Ao
                TEXT bc
TEXT end
.Ed
.Pp
Here, the formatting of the
.Sq \&Ao
block extends from TEXT ao to TEXT ac,
while the formatting of the
.Sq \&Bo
block extends from TEXT bo to TEXT bc.
It renders as follows in
.Fl T Ns Cm ascii
mode:
.Pp
.Dl <ao [bo ac> bc] end
.Pp
Support for badly-nested blocks is only provided for backward
compatibility with some older
.Xr mdoc 7
implementations.
Using badly-nested blocks is
.Em strongly discouraged :
the
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml
front-ends are unable to render them in any meaningful way.
Furthermore, behaviour when encountering badly-nested blocks is not
consistent across troff implementations, especially when using multiple
levels of badly-nested blocks.
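.Pp
A front-end that needs to notice badly-nested input might test for ENDBODY
nodes along the following lines.
This is a sketch under stated assumptions: the
.Vt struct eb_node
type and both functions are hypothetical, the
.Va end
field is modelled as a plain integer, and only the properties listed at the
start of this section are checked.
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Hypothetical stand-in for the node types relevant here. */
enum eb_type { EB_BLOCK, EB_HEAD, EB_BODY, EB_TEXT };

/* Hypothetical stand-in for struct mdoc_node. */
struct eb_node {
    enum eb_type    type;
    int             tok;      /* generating macro */
    int             end;      /* assumed non-zero only on ENDBODY */
    struct eb_node *pending;  /* BODY whose formatting ends early */
};

/* An ENDBODY node is of the BODY type and has a non-null end field. */
static int
eb_is_endbody(const struct eb_node *n)
{
    return n != NULL && n->type == EB_BODY && n->end != 0;
}

/*
 * Return the BODY node that n ends early, or NULL if n is not an
 * ENDBODY node or its pending BODY was made by a different macro.
 */
static const struct eb_node *
eb_ended_body(const struct eb_node *n)
{
    if (!eb_is_endbody(n) || n->pending == NULL)
        return NULL;
    return n->pending->tok == n->tok ? n->pending : NULL;
}
```
.Ed
.Pp
In the example above,
.Fn eb_ended_body
applied to the ENDBODY Ao node would yield the BODY Ao node, and NULL for
every other node.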
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
This example performs only minimal error checking and does not free
memory upon failure.
.Bd -literal -offset indent
struct regset regs;
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;

bzero(&regs, sizeof(struct regset));
line = 1;
mdoc = mdoc_alloc(&regs, NULL, NULL);
buf = NULL;
alloc_len = 0;

/* getline() returns -1 at end of file or on error. */
while ((len = getline(&buf, &alloc_len, stdin)) != -1) {
    /* Strip the trailing newline before parsing. */
    if (len > 0 && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
    line++;
}
free(buf);

if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
.Ed
.Pp
To compile this, execute
.Pp
.Dl % cc main.c libmdoc.a libmandoc.a
.Pp
where
.Pa main.c
is the example file.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mdoc 7
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .