.\" $Id: mandoc_html.3,v 1.5 2017/01/28 22:36:38 schwarze Exp $
.\"
.\" Copyright (c) 2014, 2017 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: January 28 2017 $
.Dt MANDOC_HTML 3
.Os
.Sh NAME
.Nm mandoc_html
.Nd internals of the mandoc HTML formatter
.Sh SYNOPSIS
.In "html.h"
.Ft void
.Fn print_gen_decls "struct html *h"
.Ft void
.Fn print_gen_head "struct html *h"
.Ft struct tag *
.Fo print_otag
.Fa "struct html *h"
.Fa "enum htmltag tag"
.Fa "const char *fmt"
.Fa ...
.Fc
.Ft void
.Fo print_tagq
.Fa "struct html *h"
.Fa "const struct tag *until"
.Fc
.Ft void
.Fo print_stagq
.Fa "struct html *h"
.Fa "const struct tag *suntil"
.Fc
.Ft void
.Fo print_text
.Fa "struct html *h"
.Fa "const char *word"
.Fc
.Sh DESCRIPTION
The mandoc HTML formatter is not a formal library.
However, as it is compiled into more than one program, in particular
.Xr mandoc 1
and
.Xr man.cgi 8 ,
and because it may be security-critical in some contexts,
some documentation is useful to help use it correctly and
to prevent XSS vulnerabilities.
.Pp
The formatter produces HTML output on the standard output.
Since proper escaping is usually required and best taken care of
at one central place, the language-specific formatters
.Po
.Pa *_html.c ,
see
.Sx FILES
.Pc
are not supposed to print directly to
.Dv stdout
using functions like
.Xr printf 3 ,
.Xr putc 3 ,
.Xr puts 3 ,
or
.Xr write 2 .
Instead, they are expected to use the output functions declared in
.Pa html.h
and implemented as part of the main HTML formatting engine in
.Pa html.c .
.Ss Data structures
These structures are declared in
.Pa html.h .
.Bl -tag -width Ds
.It Vt struct html
Internal state of the HTML formatter.
.It Vt struct tag
One entry for the LIFO stack of HTML elements.
Members are
.Fa "enum htmltag tag"
and
.Fa "struct tag *next" .
.El
.Ss Private interface functions
The function
.Fn print_gen_decls
prints the opening
.Ao Pf \&? Ic xml ? Ac
and
.Aq Pf \&! Ic DOCTYPE
declarations required for the current document type.
.Pp
The function
.Fn print_gen_head
prints the opening
.Aq Ic META
and
.Aq Ic LINK
elements for the document
.Aq Ic HEAD ,
using the
.Fa style
member of
.Fa h
unless that is
.Dv NULL .
It uses
.Fn print_otag
which takes care of properly encoding attributes,
which is relevant for the
.Fa style
link in particular.
.Pp
The function
.Fn print_otag
prints the start tag of an HTML element with the name
.Fa tag ,
optionally including the attributes specified by
.Fa fmt .
If
.Fa fmt
is the empty string, no attributes are written.
Each letter of
.Fa fmt
specifies one attribute to write.
Most attributes require one
.Vt char *
argument which becomes the value of the attribute.
The arguments have to be given in the same order as the attribute letters.
If an argument is
.Dv NULL ,
the respective attribute is not written.
.Bl -tag -width 1n -offset indent
.It Cm c
Print a
.Cm class
attribute.
.It Cm h
Print a
.Cm href
attribute.
This attribute letter can optionally be followed by a modifier letter.
If followed by
.Cm R ,
it formats the link as a local one by prefixing a
.Sq #
character.
If followed by
.Cm I ,
it interprets the argument as a header file name
and generates a link using the
.Xr mandoc 1
.Fl O Cm includes
option.
If followed by
.Cm M ,
it takes two arguments instead of one, a manual page name and
section, and formats them as a link to a manual page using the
.Xr mandoc 1
.Fl O Cm man
option.
.It Cm i
Print an
.Cm id
attribute.
.It Cm \&?
Print an arbitrary attribute.
This format letter requires two
.Vt char *
arguments, the attribute name and the value.
The name must not be
.Dv NULL .
.It Cm s
Print a
.Cm style
attribute.
If present, it must be the last format letter.
In contrast to the other format letters, this one does not yet
print the value and does not take an argument.
Instead, the rest of the format string consists of pairs of
argument type letters and style name letters.
.El
.Pp
Argument type letters each require one argument as follows:
.Bl -tag -width 1n -offset indent
.It Cm h
Requires one
.Vt int
argument, interpreted as a horizontal length in units of
.Dv SCALE_EN .
.It Cm s
Requires one
.Vt char *
argument, used as a style value.
.It Cm u
Requires one
.Vt struct roffsu *
argument, used as a length.
.It Cm v
Requires one
.Vt int
argument, interpreted as a vertical length in units of
.Dv SCALE_VS .
.It Cm w
Requires one
.Vt char *
argument, interpreted as an
.Xr mdoc 7 Ns -style
width specifier.
If the argument is
.Dv NULL ,
nothing is printed for this pair.
.It Cm W
Similar to
.Cm w ,
but makes the width negative by multiplying it by \(mi1.
.El
.Pp
Style name letters decide what to do with the preceding argument:
.Bl -tag -width 1n -offset indent
.It Cm b
Set
.Cm margin-bottom
to the given length.
.It Cm h
Set
.Cm height
to the given length.
.It Cm i
Set
.Cm text-indent
to the given length.
.It Cm l
Set
.Cm margin-left
to the given length.
.It Cm t
Set
.Cm margin-top
to the given length.
.It Cm w
Set
.Cm width
to the given length.
.It Cm W
Set
.Cm min-width
to the given length.
.It Cm \&?
The special pair
.Cm s?
requires two
.Vt char *
arguments.
The first is the style name, the second its value.
The style name must not be
.Dv NULL .
.El
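.Pp
The format-letter convention described above can be illustrated
with a small stand-alone sketch.
The names
.Fn toy_append ,
.Fn toy_encode ,
and
.Fn toy_otag
are hypothetical and are not part of mandoc.
The sketch writes into a caller-supplied buffer instead of
.Dv stdout ,
handles only the
.Cm c ,
.Cm h ,
.Cm i ,
and
.Cm \&?
letters, ignores modifier letters and the
.Cm s
style syntax, and escapes a fixed set of four characters,
which is an assumption made for illustration and not a description of
the real encoder.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append src to dst unchanged; silently drop it if it does not fit. */
void
toy_append(char *dst, size_t cap, const char *src)
{
	size_t len = strlen(dst);

	if (len + strlen(src) < cap)
		strcpy(dst + len, src);
}

/* Append src to dst, replacing characters special in HTML. */
void
toy_encode(char *dst, size_t cap, const char *src)
{
	char one[2] = { 0, 0 };

	for (; *src != 0; src++) {
		switch (*src) {
		case '<': toy_append(dst, cap, "&lt;"); break;
		case '>': toy_append(dst, cap, "&gt;"); break;
		case '&': toy_append(dst, cap, "&amp;"); break;
		case '"': toy_append(dst, cap, "&quot;"); break;
		default:
			one[0] = *src;
			toy_append(dst, cap, one);
			break;
		}
	}
}

/* Write a start tag; each letter of fmt selects one attribute. */
void
toy_otag(char *out, size_t cap, const char *name, const char *fmt, ...)
{
	const char dq[2] = { '"', 0 };
	const char *attr, *val;
	va_list ap;

	out[0] = 0;
	toy_append(out, cap, "<");
	toy_append(out, cap, name);
	va_start(ap, fmt);
	for (; *fmt != 0; fmt++) {
		switch (*fmt) {
		case 'c': attr = "class"; break;
		case 'h': attr = "href"; break;
		case 'i': attr = "id"; break;
		case '?': attr = va_arg(ap, const char *); break;
		default: attr = NULL; break;
		}
		if (attr == NULL)
			break;		/* unsupported letter: stop parsing */
		val = va_arg(ap, const char *);
		if (val == NULL)
			continue;	/* NULL argument: skip this attribute */
		toy_append(out, cap, " ");
		toy_append(out, cap, attr);
		toy_append(out, cap, "=");
		toy_append(out, cap, dq);
		toy_encode(out, cap, val);	/* values are always encoded */
		toy_append(out, cap, dq);
	}
	va_end(ap);
	toy_append(out, cap, ">");
}
```
.Ed
.Pp
In this sketch, a call with the format string
.Qq ci ,
a
.Dv NULL
class argument, and a non-NULL id argument
writes only the id attribute, mirroring the rule that a
.Dv NULL
argument suppresses the respective attribute.
Because every value passes through one encoding function,
callers never need to escape attribute values themselves.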
.Pp
.Fn print_otag
uses the private function
.Fn print_encode
to take care of HTML encoding.
If required by the element type, it remembers in
.Fa h
that the element is open.
The function
.Fn print_tagq
is used to close out all open elements up to and including
.Fa until ;
.Fn print_stagq
is a variant to close out all open elements up to but excluding
.Fa suntil .
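.Pp
The difference between closing
.Dq up to and including
and
.Dq up to but excluding
an element can be sketched with a toy LIFO stack.
All names starting with
.Li toy_
are hypothetical; the sketch stores element names as strings,
whereas
.Vt struct tag
stores an
.Vt enum htmltag ,
and it collects output in a buffer instead of printing it.
It is meant as an illustration of the stack discipline only,
not as the mandoc implementation.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_tag {
	const char	*name;	/* element name (assumption: a string) */
	struct toy_tag	*next;	/* next older open element */
};

struct toy_html {
	struct toy_tag	*top;		/* most recently opened element */
	char		 out[256];	/* collected output */
};

/* Open an element, push it onto the stack, and return its entry. */
struct toy_tag *
toy_open(struct toy_html *h, const char *name)
{
	struct toy_tag *t = malloc(sizeof(*t));
	size_t len = strlen(h->out);

	if (t == NULL)
		abort();
	t->name = name;
	t->next = h->top;
	h->top = t;
	snprintf(h->out + len, sizeof(h->out) - len, "<%s>", name);
	return t;
}

/* Pop the top element and write its end tag. */
void
toy_pop(struct toy_html *h)
{
	struct toy_tag *t = h->top;
	size_t len = strlen(h->out);

	h->top = t->next;
	snprintf(h->out + len, sizeof(h->out) - len, "</%s>", t->name);
	free(t);
}

/* Close all open elements up to and including until. */
void
toy_tagq(struct toy_html *h, const struct toy_tag *until)
{
	while (h->top != NULL) {
		int last = (h->top == until);

		toy_pop(h);
		if (last)
			return;
	}
}

/* Close all open elements up to but excluding suntil. */
void
toy_stagq(struct toy_html *h, const struct toy_tag *suntil)
{
	while (h->top != NULL && h->top != suntil)
		toy_pop(h);
}
```
.Ed
.Pp
In this sketch, keeping the pointer returned when an element is opened
and passing it back later is what lets a caller close exactly
the elements opened since then, without tracking them individually.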
.Pp
The function
.Fn print_text
prints HTML element content.
It uses the private function
.Fn print_encode
to take care of HTML encoding.
If the document has requested a non-standard font, for example using a
.Xr roff 7
.Ic \ef
font escape sequence,
.Fn print_text
wraps
.Fa word
in an HTML font selection element using the
.Fn print_otag
and
.Fn print_tagq
functions.
.Pp
The functions
.Fn html_strlen ,
.Fn print_eqn ,
.Fn print_tbl ,
and
.Fn print_tblclose
are not yet documented.
.Sh FILES
.Bl -tag -width mandoc_aux.c -compact
.It Pa main.h
declarations of public functions for use by the main program,
not yet documented
.It Pa html.h
declarations of data types and private functions
for use by language-specific HTML formatters
.It Pa html.c
main HTML formatting engine and utility functions
.It Pa mdoc_html.c
.Xr mdoc 7
HTML formatter
.It Pa man_html.c
.Xr man 7
HTML formatter
.It Pa tbl_html.c
.Xr tbl 7
HTML formatter
.It Pa eqn_html.c
.Xr eqn 7
HTML formatter
.It Pa out.h
declarations of data types and private functions
for shared use by all mandoc formatters,
not yet documented
.It Pa out.c
private functions for shared use by all mandoc formatters
.It Pa mandoc_aux.h
declarations of common mandoc utility functions, see
.Xr mandoc 3
.It Pa mandoc_aux.c
implementation of common mandoc utility functions
.El
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mandoc 3 ,
.Xr man.cgi 8
.Sh AUTHORS
.An -nosplit
The mandoc HTML formatter was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
It is maintained by
.An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
who also wrote this manual.