.\" $Id: mdoc.3,v 1.7 2009/02/23 09:46:59 kristaps Exp $
.\"
.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@kth.se>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the
.\" above copyright notice and this permission notice appear in all
.\" copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
.\" AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
.\" DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
.\" PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
.\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
.\" PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: February 23 2009 $
.Dt mdoc 3
.Os
.\" SECTION
.Sh NAME
.Nm mdoc_alloc ,
.Nm mdoc_parseln ,
.Nm mdoc_endparse ,
.Nm mdoc_node ,
.Nm mdoc_meta ,
.Nm mdoc_free
.Nd mdoc macro compiler library
.\" SECTION
.Sh SYNOPSIS
.Fd #include <mdoc.h>
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft "struct mdoc *"
.Fn mdoc_alloc "void *data" "const struct mdoc_cb *cb"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft int
.Fn mdoc_parseln "struct mdoc *mdoc" "int line" "char *buf"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "struct mdoc *mdoc"
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.\" SECTION
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of mdoc input into an abstract syntax tree.
.Dq mdoc ,
which is used to format BSD manual pages, is a macro package of the
.Dq roff
language. The
.Nm
library implements only those macros documented in the
.Xr mdoc 7
and
.Xr mdoc.samples 7
manuals.
.\" PARAGRAPH
.Pp
.Nm
is
.Ud
.\" PARAGRAPH
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
See the
.Sx EXAMPLES
section for a full example.
.\" PARAGRAPH
.Pp
This section further defines the
.Sx Types ,
.Sx Functions
and
.Sx Variables
available to programmers. The last sub-section,
.Sx Abstract Syntax Tree ,
documents the output tree.
.\" SUBSECTION
.Ss Types
Both functions (see
.Sx Functions )
and variables (see
.Sx Variables )
may use the following types:
.Bl -ohang
.\" LIST-ITEM
.It Vt struct mdoc
An opaque type defined in
.Pa mdoc.c .
Its values are only used privately within the library.
.\" LIST-ITEM
.It Vt struct mdoc_cb
A set of message callbacks defined in
.Pa mdoc.h .
.\" LIST-ITEM
.It Vt struct mdoc_node
A parsed node. Defined in
.Pa mdoc.h .
See
.Sx Abstract Syntax Tree
for details.
.El
.\" SUBSECTION
.Ss Functions
Function descriptions follow:
.Bl -ohang
.\" LIST-ITEM
.It Fn mdoc_alloc
Allocates a parsing structure. The
.Fa data
pointer is passed to callbacks in
.Fa cb ,
which are documented further in the header file. Returns NULL on
failure. If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.\" LIST-ITEM
.It Fn mdoc_free
Frees all resources of a parser. The pointer is no longer valid after
invocation.
.\" LIST-ITEM
.It Fn mdoc_parseln
Parses a NUL-terminated line of input. This line should not contain the
trailing newline. Returns 0 on failure, 1 on success. The input buffer
.Fa buf
is modified by this function.
.\" LIST-ITEM
.It Fn mdoc_endparse
Signals that the parse is complete. Note that if
.Fn mdoc_node
is called before
.Fn mdoc_endparse ,
the resulting tree is incomplete. Returns 0 on failure, 1 on success.
.\" LIST-ITEM
.It Fn mdoc_node
Returns the first node of the parse. Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.\" LIST-ITEM
.It Fn mdoc_meta
Returns the document's parsed meta-data. If this information has not
yet been supplied, or if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
.\" SUBSECTION
.Ss Variables
The following variables are also defined:
.Bl -ohang
.\" LIST-ITEM
.It Va mdoc_macronames
An array of string-ified token names.
.\" LIST-ITEM
.It Va mdoc_argnames
An array of string-ified token argument names.
.El
.\" SUBSECTION
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing the input
lines in a regular form. It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.\" PARAGRAPH
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, tail, element, root and text types as
declared by the
.Va type
field. Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va next
and
.Va prev
fields) and type-specific data (the
.Va data
field).
.\" PARAGRAPH
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.\" LIST-ITEM
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
.It BLOCK
\(<- BODY [TEXT] [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode+
.It BODY
\(<- mnode+
.It TAIL
\(<- mnode+
.It TEXT
\(<- [[:alpha:]]*
.El
.\" PARAGRAPH
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production. These refer to punctuation marks. Furthermore,
although a TEXT node will generally have a non-zero-length string, in
certain cases, such as
.Dq \&.Bd \-literal ,
an empty line will produce a zero-length string.
.\" PARAGRAPH
.Pp
The rule-of-thumb for mapping node types to macros follows: in-line
elements, such as
.Dq \&.Em foo ,
are classified as ELEMENT nodes, which can only contain text.
Multi-line elements such as
.Dq \&.Sh
are BLOCK elements, where the HEAD constitutes line contents and the
BODY constitutes subsequent lines. In-line elements with matching
pairs, such as
.Dq \&.So
and
.Dq \&.Sc ,
are BLOCK elements with no HEAD node. The only exception to this is
.Dq \&.Eo
and
.Dq \&.Ec ,
which have HEAD and TAIL nodes corresponding to the enclosure string.
TEXT nodes, obviously, constitute text; the ROOT node is the document's
root.
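.\" PARAGRAPH
.Pp
As a minimal sketch of how an application might walk such a tree, the
following counts nodes of one type by following the
.Va child
and
.Va next
links depth-first.
The
.Vt struct demo_node
type, the
.Vt enum demo_type
constants and
.Fn demo_count
are hypothetical stand-ins for illustration and are not part of the
library; real code would operate on
.Vt struct mdoc_node
as declared in
.Pa mdoc.h .
.Bd -literal
```c
#include <stddef.h>

/*
 * Stand-in for struct mdoc_node (hypothetical, for illustration only).
 * The real definition lives in mdoc.h and carries more fields.
 */
enum demo_type {
	DEMO_ROOT,
	DEMO_BLOCK,
	DEMO_HEAD,
	DEMO_BODY,
	DEMO_TAIL,
	DEMO_ELEM,
	DEMO_TEXT
};

struct demo_node {
	enum demo_type	  type;
	struct demo_node *parent;	/* enclosing node, or NULL */
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
};

/*
 * Count nodes of the given type among n, its later siblings and all
 * of their descendants.
 */
static size_t
demo_count(const struct demo_node *n, enum demo_type type)
{
	size_t	 total;

	total = 0;
	for ( ; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		/* Descend into the children before the next sibling. */
		total += demo_count(n->child, type);
	}
	return total;
}
```
.Ed
.Pp
Recursing on
.Va child
and iterating on
.Va next
visits every node once without needing the
.Va parent
or
.Va prev
links.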
.\" SECTION
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
Note that, if the last line of the file isn't newline-terminated, this
will truncate the file's last character (see
.Xr fgetln 3 ) .
Further, this example does not check the return value of
.Fn mdoc_alloc
nor free memory upon failure.
.Bd -literal
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t len;
int line;

line = 1;
mdoc = mdoc_alloc(NULL, NULL);

/* fgetln() returns a line that is not NUL-terminated. */
while ((buf = fgetln(stdin, &len))) {
	/* Overwrite the trailing newline with a terminator. */
	buf[len - 1] = '\\0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}

/* The tree is only complete once the parse has been closed. */
if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
.Ed
.\" SECTION
.Sh SEE ALSO
.Xr groff 1 ,
.Xr mdocml 1 ,
.Xr mdoc 7 ,
.Xr mdoc.samples 7
.\" SECTION
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@kth.se .
.\" SECTION
.Sh BUGS
Bugs, un-implemented macros and incompatibilities are documented in this
section. The baseline for determining whether macro parsing is
.Qq incompatible
is the default
.Xr groff 1
system bundled with
.Ox .
.Pp
Un-implemented: the
.Sq \&Xc
and
.Sq \&Xo
macros aren't handled when used to span lines for the
.Sq \&It
macro. Such usage is specifically discouraged in
.Xr mdoc.samples 7 .
.Pp
Bugs: when
.Sq \&It \-column
is invoked, whitespace is not stripped around
.Sq \&Ta
or tab-character separators.
.Pp
Incompatible: the
.Sq \&At
macro only accepts a single parameter. Furthermore, several macros
.Pf ( Sq \&Pp ,
.Sq \&It ,
and possibly others) accept multiple arguments with a warning.
.Pp
Incompatible: only those macros specified by
.Xr mdoc.samples 7
and
.Xr mdoc 7
for
.Ox
are supported; support for
.Nx
and other
.Bx
systems is in progress.