.\" $Id: mdoc.3,v 1.39 2010/05/25 21:46:48 kristaps Exp $
.\"
.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: May 25 2010 $
.Dt MDOC 3
.Os
.Sh NAME
.Nm mdoc ,
.Nm mdoc_alloc ,
.Nm mdoc_endparse ,
.Nm mdoc_free ,
.Nm mdoc_meta ,
.Nm mdoc_node ,
.Nm mdoc_parseln ,
.Nm mdoc_reset
.Nd mdoc macro compiler library
.Sh SYNOPSIS
.In mandoc.h
.In mdoc.h
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft "struct mdoc *"
.Fn mdoc_alloc "void *data" "int pflags" "mandocmsg msgs"
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft int
.Fn mdoc_parseln "struct mdoc *mdoc" "int line" "char *buf"
.Ft int
.Fn mdoc_reset "struct mdoc *mdoc"
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of
.Xr mdoc 7
input
into an abstract syntax tree (AST).
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used in order to reset the parser for another input
sequence.
See the
.Sx EXAMPLES
section for a simple example.
.Pp
This section further defines the
.Sx Types ,
.Sx Functions
and
.Sx Variables
available to programmers.
Following that, the
.Sx Abstract Syntax Tree
section documents the output tree.
.Ss Types
Both functions (see
.Sx Functions )
and variables (see
.Sx Variables )
may use the following types:
.Bl -ohang
.It Vt struct mdoc
An opaque type defined in
.Pa mdoc.c .
Its values are only used privately within the library.
.It Vt struct mdoc_node
A parsed node.
Defined in
.Pa mdoc.h .
See
.Sx Abstract Syntax Tree
for details.
.It Vt mandocmsg
A function callback type defined in
.Pa mandoc.h .
.El
.Ss Functions
Function descriptions follow:
.Bl -ohang
.It Fn mdoc_alloc
Allocates a parsing structure.
The
.Fa data
pointer is passed to the
.Fa msgs
callback, which is documented further in the header file.
Values for the
.Fa pflags
argument are defined in
.Pa mdoc.h .
Returns NULL on failure.
If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.It Fn mdoc_reset
Resets the parser for another parse sequence.
After its use,
.Fn mdoc_parseln
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
.It Fn mdoc_free
Frees all resources of a parser.
The pointer is no longer valid after invocation.
.It Fn mdoc_parseln
Parses a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
The input buffer
.Fa buf
is modified by this function.
.It Fn mdoc_endparse
Signals that the parse is complete.
Note that a tree obtained from
.Fn mdoc_node
before
.Fn mdoc_endparse
has been called is incomplete.
Returns 0 on failure, 1 on success.
.It Fn mdoc_node
Returns the first node of the parse.
Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.It Fn mdoc_meta
Returns the document's parsed meta-data.
If this information has not yet been supplied or
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
.Ss Variables
The following variables are also defined:
.Bl -ohang
.It Va mdoc_macronames
An array of string-ified token names.
.It Va mdoc_argnames
An array of string-ified token argument names.
.El
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing input in a
regular form.
It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.Pp
This AST is governed by the ontological
rules dictated in
.Xr mdoc 7
and derives its terminology accordingly.
.Qq In-line
elements described in
.Xr mdoc 7
are described simply as
.Qq elements .
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, element, root and text types as declared
by the
.Va type
field.
Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va next
and
.Va prev
fields) and some type-specific data.
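.Pp
As a minimal sketch of how such link fields can be followed, the
fragment below counts the nodes of a subtree and measures the depth of
a node.
It is illustrative only:
.Vt struct demo_node
and the
.Fn demo_append ,
.Fn demo_count
and
.Fn demo_depth
functions are hypothetical stand-ins that are not part of the library,
and the sketch assumes that
.Va child
refers to the first child while
.Va next
and
.Va prev
link siblings.
.Bd -literal -offset indent
#include <stddef.h>

/*
 * Hypothetical stand-in for struct mdoc_node.  Only the four link
 * fields are mirrored; the real structure is declared in mdoc.h.
 * Assumption: child is the first child, next/prev link siblings.
 */
struct demo_node {
    struct demo_node *parent;
    struct demo_node *child;
    struct demo_node *next;
    struct demo_node *prev;
};

/* Attach n as the last child of p, keeping all links consistent. */
static void
demo_append(struct demo_node *p, struct demo_node *n)
{
    struct demo_node *last;

    n->parent = p;
    n->next = NULL;
    if (NULL == p->child) {
        n->prev = NULL;
        p->child = n;
        return;
    }
    for (last = p->child; last->next; last = last->next)
        continue;
    last->next = n;
    n->prev = last;
}

/* Count n and all of its descendants, depth first. */
static size_t
demo_count(const struct demo_node *n)
{
    const struct demo_node *c;
    size_t total;

    if (NULL == n)
        return 0;
    total = 1;
    for (c = n->child; c; c = c->next)
        total += demo_count(c);
    return total;
}

/* Number of parent links between n and the root of its tree. */
static size_t
demo_depth(const struct demo_node *n)
{
    size_t d;

    for (d = 0; n->parent; n = n->parent)
        d++;
    return d;
}
.Ed
.Pp
If the assumption about
.Va child
holds, the same two loops could be applied to a tree returned by
.Fn mdoc_node
once
.Fn mdoc_endparse
has succeeded.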
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
.It BLOCK
\(<- BODY [TEXT] [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode+
.It BODY
\(<- mnode+
.It TAIL
\(<- mnode+
.It TEXT
\(<- [[:printable:],0x1e]*
.El
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production.
These refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
This example does not check every error, nor does it free memory upon
failure.
.Bd -literal -offset indent
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;

line = 1;
mdoc = mdoc_alloc(NULL, 0, NULL);
buf = NULL;
alloc_len = 0;

/* getline() returns -1 at end of input, so len must be signed. */
while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
    /* mdoc_parseln() expects the line without its newline. */
    if (len && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
    line++;
}
free(buf);

if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
.Ed
.Pp
Please see
.Pa main.c
in the source archive for a rigorous reference.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mdoc 7
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .