.\" $Id: mdoc.3,v 1.40 2010/05/25 22:16:59 kristaps Exp $
.\"
.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: May 25 2010 $
.Dt MDOC 3
.Os
.Sh NAME
.Nm mdoc ,
.Nm mdoc_alloc ,
.Nm mdoc_endparse ,
.Nm mdoc_free ,
.Nm mdoc_meta ,
.Nm mdoc_node ,
.Nm mdoc_parseln ,
.Nm mdoc_reset
.Nd mdoc macro compiler library
.Sh SYNOPSIS
.In mandoc.h
.In mdoc.h
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft "struct mdoc *"
.Fn mdoc_alloc "void *data" "int pflags" "mandocmsg msgs"
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft int
.Fn mdoc_parseln "struct mdoc *mdoc" "int line" "char *buf"
.Ft int
.Fn mdoc_reset "struct mdoc *mdoc"
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of
.Xr mdoc 7
input
into an abstract syntax tree (AST).
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used in order to reset the parser for another input
sequence.
See the
.Sx EXAMPLES
section for a simple example.
.Pp
This section further defines the
.Sx Types ,
.Sx Functions
and
.Sx Variables
available to programmers.
Following that, the
.Sx Abstract Syntax Tree
section documents the output tree.
.Ss Types
Both functions (see
.Sx Functions )
and variables (see
.Sx Variables )
may use the following types:
.Bl -ohang
.It Vt struct mdoc
An opaque type defined in
.Pa mdoc.c .
Its values are only used privately within the library.
.It Vt struct mdoc_node
A parsed node.
Defined in
.Pa mdoc.h .
See
.Sx Abstract Syntax Tree
for details.
.It Vt mandocmsg
A function callback type defined in
.Pa mandoc.h .
.El
.Ss Functions
Function descriptions follow:
.Bl -ohang
.It Fn mdoc_alloc
Allocates a parsing structure.
The
.Fa data
pointer is passed to
.Fa msgs .
The
.Fa pflags
arguments are defined in
.Pa mdoc.h .
Returns NULL on failure.
If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.It Fn mdoc_reset
Resets the parser for another parse sequence.
After its use,
.Fn mdoc_parseln
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
.It Fn mdoc_free
Frees all resources of a parser.
The pointer is no longer valid after invocation.
.It Fn mdoc_parseln
Parses a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
The input buffer
.Fa buf
is modified by this function.
.It Fn mdoc_endparse
Signals that the parse is complete.
Note that a tree obtained from
.Fn mdoc_node
before
.Fn mdoc_endparse
has been called is incomplete.
Returns 0 on failure, 1 on success.
.It Fn mdoc_node
Returns the first node of the parse.
Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.It Fn mdoc_meta
Returns the document's parsed meta-data.
If this information has not yet been supplied or
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
.Ss Variables
The following variables are also defined:
.Bl -ohang
.It Va mdoc_macronames
An array of string-ified token names.
.It Va mdoc_argnames
An array of string-ified token argument names.
.El
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing input in a
regular form.
It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.Pp
This AST is governed by the ontological
rules dictated in
.Xr mdoc 7
and derives its terminology accordingly.
.Qq In-line
elements described in
.Xr mdoc 7
are described simply as
.Qq elements .
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, tail, element, root and text types as
declared by the
.Va type
field.
Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va next
and
.Va prev
fields) and some type-specific data.
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
.It BLOCK
\(<- BODY [TEXT] [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode+
.It BODY
\(<- mnode+
.It TAIL
\(<- mnode+
.It TEXT
\(<- [[:printable:],0x1e]*
.El
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production.
These refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
This example neither checks every error nor frees memory upon failure.
.Bd -literal -offset indent
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;
int line;

line = 1;
mdoc = mdoc_alloc(NULL, 0, NULL);
buf = NULL;
alloc_len = 0;

/* getline(3) returns -1 on end-of-file or error. */
while ((len = getline(&buf, &alloc_len, stdin)) != -1) {
    /* Strip the trailing newline, if any. */
    if (len && buf[len - 1] == '\en')
        buf[len - 1] = '\e0';
    if ( ! mdoc_parseln(mdoc, line, buf))
        errx(1, "mdoc_parseln");
    line++;
}
free(buf);

if ( ! mdoc_endparse(mdoc))
    errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
    errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
.Ed
.Pp
Please see
.Pa main.c
in the source archive for a rigorous reference.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mdoc 7
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .