.\" $Id: mdoc.3,v 1.38 2010/05/25 21:38:05 kristaps Exp $
.\"
.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: May 25 2010 $
.Dt MDOC 3
.Os
.Sh NAME
.Nm mdoc_alloc ,
.Nm mdoc_endparse ,
.Nm mdoc_free ,
.Nm mdoc_meta ,
.Nm mdoc_node ,
.Nm mdoc_parseln ,
.Nm mdoc_reset
.Nd mdoc macro compiler library
.Sh SYNOPSIS
.In mandoc.h
.In mdoc.h
.Vt extern const char * const * mdoc_macronames;
.Vt extern const char * const * mdoc_argnames;
.Ft "struct mdoc *"
.Fn mdoc_alloc "void *data" "int pflags" "mandocmsg msgs"
.Ft int
.Fn mdoc_endparse "struct mdoc *mdoc"
.Ft void
.Fn mdoc_free "struct mdoc *mdoc"
.Ft "const struct mdoc_meta *"
.Fn mdoc_meta "const struct mdoc *mdoc"
.Ft "const struct mdoc_node *"
.Fn mdoc_node "const struct mdoc *mdoc"
.Ft int
.Fn mdoc_parseln "struct mdoc *mdoc" "int line" "char *buf"
.Ft int
.Fn mdoc_reset "struct mdoc *mdoc"
.Sh DESCRIPTION
The
.Nm mdoc
library parses lines of
.Xr mdoc 7
input
into an abstract syntax tree (AST).
.Pp
In general, applications initiate a parsing sequence with
.Fn mdoc_alloc ,
parse each line in a document with
.Fn mdoc_parseln ,
close the parsing session with
.Fn mdoc_endparse ,
operate over the syntax tree returned by
.Fn mdoc_node
and
.Fn mdoc_meta ,
then free all allocated memory with
.Fn mdoc_free .
The
.Fn mdoc_reset
function may be used in order to reset the parser for another input
sequence.
See the
.Sx EXAMPLES
section for a simple example.
.Pp
This section further defines the
.Sx Types ,
.Sx Functions
and
.Sx Variables
available to programmers.
Following that, the
.Sx Abstract Syntax Tree
section documents the output tree.
.Ss Types
Both functions (see
.Sx Functions )
and variables (see
.Sx Variables )
may use the following types:
.Bl -ohang
.It Vt struct mdoc
An opaque type defined in
.Pa mdoc.c .
Its values are only used privately within the library.
.It Vt struct mdoc_node
A parsed node.
Defined in
.Pa mdoc.h .
See
.Sx Abstract Syntax Tree
for details.
.It Vt mandocmsg
A function callback type defined in
.Pa mandoc.h .
.El
.Ss Functions
Function descriptions follow:
.Bl -ohang
.It Fn mdoc_alloc
Allocates a parsing structure.
The
.Fa data
pointer is passed to the message callback
.Fa msgs ,
which is documented further in
.Pa mandoc.h .
The flags accepted in
.Fa pflags
are defined in
.Pa mdoc.h .
Returns NULL on failure.
If non-NULL, the pointer must be freed with
.Fn mdoc_free .
.It Fn mdoc_reset
Reset the parser for another parse routine.
After its use,
.Fn mdoc_parseln
behaves as if invoked for the first time.
If it returns 0, memory could not be allocated.
.It Fn mdoc_free
Free all resources of a parser.
The pointer is no longer valid after invocation.
.It Fn mdoc_parseln
Parse a NUL-terminated line of input.
This line should not contain the trailing newline.
Returns 0 on failure, 1 on success.
The input buffer
.Fa buf
is modified by this function.
.It Fn mdoc_endparse
Signals that the parse is complete.
Note that if
.Fn mdoc_node
is called before
.Fn mdoc_endparse ,
the tree it returns is incomplete.
Returns 0 on failure, 1 on success.
.It Fn mdoc_node
Returns the first node of the parse.
Note that if
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the tree will be incomplete.
.It Fn mdoc_meta
Returns the document's parsed meta-data.
If this information has not yet been supplied or
.Fn mdoc_parseln
or
.Fn mdoc_endparse
returns 0, the data will be incomplete.
.El
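.Pp
Since
.Fn mdoc_parseln
expects a line without its trailing newline, callers reading input
line by line usually strip it first.
The following is a minimal sketch of such a step.
The helper name strip_newline is hypothetical and is not part of the
library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not provided by the mdoc library.
 * Removes at most one trailing newline from the NUL-terminated
 * string in buf, in place, and returns the resulting length.
 */
size_t
strip_newline(char *buf)
{
	size_t len;

	assert(buf != NULL);
	len = strlen(buf);
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

The buffer is edited in place, which suits
.Fn mdoc_parseln
because that function modifies its input buffer anyway.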
.Ss Variables
The following variables are also defined:
.Bl -ohang
.It Va mdoc_macronames
An array of string-ified token names.
.It Va mdoc_argnames
An array of string-ified token argument names.
.El
.Ss Abstract Syntax Tree
The
.Nm
functions produce an abstract syntax tree (AST) describing input in a
regular form.
It may be reviewed at any time with
.Fn mdoc_node ;
however, if called before
.Fn mdoc_endparse ,
or after
.Fn mdoc_endparse
or
.Fn mdoc_parseln
fails, it may be incomplete.
.Pp
This AST is governed by the ontological
rules dictated in
.Xr mdoc 7
and derives its terminology accordingly.
.Qq In-line
elements described in
.Xr mdoc 7
are described simply as
.Qq elements .
.Pp
The AST is composed of
.Vt struct mdoc_node
nodes with block, head, body, tail, element, root and text types as
declared by the
.Va type
field.
Each node also provides its parse point (the
.Va line ,
.Va sec ,
and
.Va pos
fields), its position in the tree (the
.Va parent ,
.Va child ,
.Va next
and
.Va prev
fields) and some type-specific data.
.Pp
The tree itself is arranged according to the following normal form,
where capitalised non-terminals represent nodes.
.Pp
.Bl -tag -width "ELEMENTXX" -compact
.It ROOT
\(<- mnode+
.It mnode
\(<- BLOCK | ELEMENT | TEXT
.It BLOCK
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
.It BLOCK
\(<- BODY [TEXT] [TAIL [TEXT]]
.It ELEMENT
\(<- TEXT*
.It HEAD
\(<- mnode+
.It BODY
\(<- mnode+
.It TAIL
\(<- mnode+
.It TEXT
\(<- [[:printable:],0x1e]*
.El
.Pp
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
the BLOCK production.
These refer to punctuation marks.
Furthermore, although a TEXT node will generally have a non-zero-length
string, in the specific case of
.Sq \&.Bd \-literal ,
an empty line will produce a zero-length string.
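.Pp
The tree-position fields described above lend themselves to a simple
recursive traversal.
The sketch below illustrates the idea on a stand-in structure that
carries only the four position fields.
It assumes that child points to a node's first child and that next
and prev link siblings.
The type struct node and the functions count_nodes and node_depth are
hypothetical names chosen for illustration, not declarations from
mdoc.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for struct mdoc_node, holding only the tree-position
 * fields named above.  The real structure has more members; this
 * type is an assumption made for illustration.
 */
struct node {
	struct node *parent;	/* enclosing node, NULL for the root */
	struct node *child;	/* first child, or NULL */
	struct node *next;	/* next sibling, or NULL */
	struct node *prev;	/* previous sibling, or NULL */
};

/* Count n, its following siblings, and all of their descendants. */
size_t
count_nodes(const struct node *n)
{
	size_t total = 0;

	for ( ; n != NULL; n = n->next)
		total += 1 + count_nodes(n->child);
	return total;
}

/* Depth of n in the tree: 0 for the root, 1 for its children. */
size_t
node_depth(const struct node *n)
{
	size_t depth = 0;

	assert(n != NULL);
	while (n->parent != NULL) {
		n = n->parent;
		depth++;
	}
	return depth;
}
```

With the real library, the same loop shape would presumably start
from the node returned by
.Fn mdoc_node
once
.Fn mdoc_endparse
has succeeded.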
.Sh EXAMPLES
The following example reads lines from stdin and parses them, operating
on the finished parse tree with
.Fn parsed .
This example performs only minimal error checking and does not free
memory upon failure.
.Bd -literal -offset indent
struct mdoc *mdoc;
const struct mdoc_node *node;
char *buf;
size_t alloc_len;
ssize_t len;	/* getline() returns -1 at end of input */
int line;

line = 1;
mdoc = mdoc_alloc(NULL, 0, NULL);
buf = NULL;
alloc_len = 0;

while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
	/* mdoc_parseln() expects no trailing newline. */
	if (len && buf[len - 1] == '\en')
		buf[len - 1] = '\e0';
	if ( ! mdoc_parseln(mdoc, line, buf))
		errx(1, "mdoc_parseln");
	line++;
}

if ( ! mdoc_endparse(mdoc))
	errx(1, "mdoc_endparse");
if (NULL == (node = mdoc_node(mdoc)))
	errx(1, "mdoc_node");

parsed(mdoc, node);
mdoc_free(mdoc);
free(buf);
.Ed
.Pp
Please see
.Pa main.c
in the source archive for a rigorous reference.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mdoc 7
.Sh AUTHORS
The
.Nm
library was written by
.An Kristaps Dzonsons Aq kristaps@bsd.lv .