.\" $Id: mandoc_html.3,v 1.20 2020/03/13 15:32:28 schwarze Exp $
.\"
.\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: March 13 2020 $
.Dt MANDOC_HTML 3
.Os
.Sh NAME
.Nm mandoc_html
.Nd internals of the mandoc HTML formatter
.Sh SYNOPSIS
.In "html.h"
.Ft void
.Fn print_gen_decls "struct html *h"
.Ft void
.Fn print_gen_comment "struct html *h" "struct roff_node *n"
.Ft void
.Fn print_gen_head "struct html *h"
.Ft struct tag *
.Fo print_otag
.Fa "struct html *h"
.Fa "enum htmltag tag"
.Fa "const char *fmt"
.Fa ...
.Fc
.Ft void
.Fo print_tagq
.Fa "struct html *h"
.Fa "const struct tag *until"
.Fc
.Ft void
.Fo print_stagq
.Fa "struct html *h"
.Fa "const struct tag *suntil"
.Fc
.Ft void
.Fo print_text
.Fa "struct html *h"
.Fa "const char *word"
.Fc
.Ft char *
.Fo html_make_id
.Fa "const struct roff_node *n"
.Fa "int unique"
.Fc
.Ft struct tag *
.Fo print_otag_id
.Fa "struct html *h"
.Fa "enum htmltag tag"
.Fa "const char *cattr"
.Fa "struct roff_node *n"
.Fc
.Sh DESCRIPTION
The mandoc HTML formatter is not a formal library.
However, as it is compiled into more than one program, in particular
.Xr mandoc 1
and
.Xr man.cgi 8 ,
and because it may be security-critical in some contexts,
some documentation is useful to help use it correctly and
to prevent XSS vulnerabilities.
.Pp
The formatter produces HTML output on the standard output.
Since proper escaping is usually required and best taken care of
in one central place, the language-specific formatters
.Po
.Pa *_html.c ,
see
.Sx FILES
.Pc
are not supposed to print directly to
.Dv stdout
using functions like
.Xr printf 3 ,
.Xr putc 3 ,
.Xr puts 3 ,
or
.Xr write 2 .
Instead, they are expected to use the output functions declared in
.Pa html.h
and implemented as part of the main HTML formatting engine in
.Pa html.c .
.Ss Data structures
These structures are declared in
.Pa html.h .
.Bl -tag -width Ds
.It Vt struct html
Internal state of the HTML formatter.
.It Vt struct tag
One entry for the LIFO stack of HTML elements.
Members are
.Fa "enum htmltag tag"
and
.Fa "struct tag *next" .
.El
.Ss Private interface functions
The function
.Fn print_gen_decls
prints the opening
.Ao Pf \&? Ic xml ? Ac
and
.Aq Pf \&! Ic DOCTYPE
declarations required for the current document type.
.Pp
The function
.Fn print_gen_comment
prints the leading comments, usually containing a copyright notice
and license, as an HTML comment.
It is intended to be called right after opening the
.Aq Ic HTML
element.
Pass the first
.Dv ROFFT_COMMENT
node in
.Fa n .
.Pp
The function
.Fn print_gen_head
prints the opening
.Aq Ic META
and
.Aq Ic LINK
elements for the document
.Aq Ic HEAD ,
using the
.Fa style
member of
.Fa h
unless that is
.Dv NULL .
It uses
.Fn print_otag
which takes care of properly encoding attributes,
which is relevant for the
.Fa style
link in particular.
.Pp
The function
.Fn print_otag
prints the start tag of an HTML element with the name
.Fa tag ,
optionally including the attributes specified by
.Fa fmt .
If
.Fa fmt
is the empty string, no attributes are written.
Each letter of
.Fa fmt
specifies one attribute to write.
Most attributes require one
.Vt char *
argument which becomes the value of the attribute.
The arguments have to be given in the same order as the attribute letters.
If an argument is
.Dv NULL ,
the respective attribute is not written.
.Bl -tag -width 1n -offset indent
.It Cm c
Print a
.Cm class
attribute.
.It Cm h
Print a
.Cm href
attribute.
This attribute letter can optionally be followed by a modifier letter.
If followed by
.Cm R ,
it formats the link as a local one by prefixing a
.Sq #
character.
If followed by
.Cm I ,
it interprets the argument as a header file name
and generates a link using the
.Xr mandoc 1
.Fl O Cm includes
option.
If followed by
.Cm M ,
it takes two arguments instead of one, a manual page name and
section, and formats them as a link to a manual page using the
.Xr mandoc 1
.Fl O Cm man
option.
.It Cm i
Print an
.Cm id
attribute.
.It Cm \&?
Print an arbitrary attribute.
This format letter requires two
.Vt char *
arguments, the attribute name and the value.
The name must not be
.Dv NULL .
.It Cm s
Print a
.Cm style
attribute.
If present, it must be the last format letter.
It requires two
.Vt char *
arguments.
The first is the name of the style property, the second its value.
The name must not be
.Dv NULL .
The
.Cm s
.Ar fmt
letter can be repeated, each repetition requiring an additional pair of
.Vt char *
arguments.
.El
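.Pp
As a rough illustration of the format-letter convention described above,
the following self-contained C sketch models a start tag writer.
It is not the code in
.Pa html.c :
the names
.Fn toy_cat ,
.Fn toy_encode ,
and
.Fn toy_otag
are hypothetical, output goes to a caller-supplied buffer rather than to
.Dv stdout ,
the element name is passed as a string, only the letters
.Cm c ,
.Cm h ,
.Cm i ,
and
.Cm \&?
are handled, without modifier letters, and the set of encoded bytes
is an assumption.
.Bd -literal -offset indent
```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append src to the string in dst; this toy drops what does not fit. */
static void
toy_cat(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(dst);

    if (len + strlen(src) + 1 <= cap)
        strcpy(dst + len, src);
}

/* Append src to dst, replacing bytes that are special in HTML. */
static void
toy_encode(char *dst, size_t cap, const char *src)
{
    char one[2] = { 0, 0 };

    for (; *src != 0; src++) {
        switch (*src) {
        case '<': toy_cat(dst, cap, "&lt;"); break;
        case '>': toy_cat(dst, cap, "&gt;"); break;
        case '&': toy_cat(dst, cap, "&amp;"); break;
        case '"': toy_cat(dst, cap, "&quot;"); break;
        default:
            one[0] = *src;
            toy_cat(dst, cap, one);
            break;
        }
    }
}

/* Hypothetical model: one attribute per format letter, in order. */
void
toy_otag(char *out, size_t cap, const char *name, const char *fmt, ...)
{
    const char quote[2] = { '"', 0 };
    const char *attr, *val;
    va_list ap;

    snprintf(out, cap, "<%s", name);
    va_start(ap, fmt);
    for (; *fmt != 0; fmt++) {
        switch (*fmt) {
        case 'c': attr = "class"; break;
        case 'h': attr = "href"; break;
        case 'i': attr = "id"; break;
        case '?':   /* the attribute name comes first, never NULL */
            attr = va_arg(ap, const char *);
            break;
        default:    /* letter unknown to this sketch: skip it */
            continue;
        }
        val = va_arg(ap, const char *);
        if (val == NULL)
            continue;   /* NULL value: attribute not written */
        toy_cat(out, cap, " ");
        toy_cat(out, cap, attr);
        toy_cat(out, cap, "=");
        toy_cat(out, cap, quote);
        toy_encode(out, cap, val);  /* values are always encoded */
        toy_cat(out, cap, quote);
    }
    va_end(ap);
    toy_cat(out, cap, ">");
}
```
.Ed
.Pp
Under these assumptions, calling
.Fn toy_otag
with the format string
.Qq ch
and a
.Dv NULL
pointer as the second value writes only the
.Cm class
attribute, and any
.Sq &
in a value is written in encoded form.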
.Pp
.Fn print_otag
uses the private function
.Fn print_encode
to take care of HTML encoding.
If required by the element type, it remembers in
.Fa h
that the element is open.
The function
.Fn print_tagq
is used to close out all open elements up to and including
.Fa until ;
.Fn print_stagq
is a variant to close out all open elements up to but excluding
.Fa suntil .
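.Pp
The difference between closing up to and including an element and
closing up to but excluding it can be sketched with a toy LIFO stack.
This is an illustrative model only, not the implementation in
.Pa html.c :
the types
.Vt struct toy_tag
and
.Vt struct toy_html
and the functions
.Fn toy_open ,
.Fn toy_tagq ,
and
.Fn toy_stagq
are hypothetical, element names are plain strings, and output is
collected in a fixed-size buffer so that it can be inspected.
.Bd -literal -offset indent
```c
#include <stdlib.h>
#include <string.h>

struct toy_tag {
    const char      *name;
    struct toy_tag  *next;      /* element opened before this one */
};

struct toy_html {
    struct toy_tag  *top;       /* most recently opened element */
    char             out[256];  /* collected output */
};

/* Append pre, name and a closing angle bracket if they fit. */
static void
toy_emit(struct toy_html *h, const char *pre, const char *name)
{
    size_t len = strlen(h->out);

    if (len + strlen(pre) + strlen(name) + 2 <= sizeof(h->out)) {
        strcat(h->out, pre);
        strcat(h->out, name);
        strcat(h->out, ">");
    }
}

/* Print a start tag and push the element onto the stack. */
struct toy_tag *
toy_open(struct toy_html *h, const char *name)
{
    struct toy_tag *t = malloc(sizeof(*t));

    if (t == NULL)
        return NULL;
    t->name = name;
    t->next = h->top;
    h->top = t;
    toy_emit(h, "<", name);
    return t;
}

/* Print the end tag of the top element, pop it and free it. */
static void
toy_pop(struct toy_html *h)
{
    struct toy_tag *t = h->top;

    toy_emit(h, "</", t->name);
    h->top = t->next;
    free(t);
}

/* Close everything up to and including until. */
void
toy_tagq(struct toy_html *h, const struct toy_tag *until)
{
    while (h->top != NULL) {
        int last = (h->top == until);

        toy_pop(h);
        if (last)
            break;
    }
}

/* Close everything above suntil, leaving suntil itself open. */
void
toy_stagq(struct toy_html *h, const struct toy_tag *suntil)
{
    while (h->top != NULL && h->top != suntil)
        toy_pop(h);
}
```
.Ed
.Pp
In this model, the pointer returned by
.Fn toy_open
becomes invalid as soon as the element is closed,
so a caller must not use it after passing it to
.Fn toy_tagq .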
.Pp
The function
.Fn print_text
prints HTML element content.
It uses the private function
.Fn print_encode
to take care of HTML encoding.
If the document has requested a non-standard font, for example using a
.Xr roff 7
.Ic \ef
font escape sequence,
.Fn print_text
wraps
.Fa word
in an HTML font selection element using the
.Fn print_otag
and
.Fn print_tagq
functions.
.Pp
The function
.Fn html_make_id
allocates a string to be used for the
.Cm id
attribute of an HTML element and/or as a fragment identifier for a URI in an
.Aq Ic A
element.
If
.Fa n
contains a
.Fa string
attribute, it is used; otherwise, child nodes are used.
If
.Fa n
is an
.Ic \&Sh ,
.Ic \&Ss ,
.Ic \&Sx ,
.Ic SH ,
or
.Ic SS
node, the resulting string is the concatenation of the child strings;
for other node types, only the first child is used.
Bytes not permitted in URI-fragment strings are replaced by underscores.
If any of the children to be used is not a text node,
no string is generated and
.Dv NULL
is returned instead.
If the
.Fa unique
argument is non-zero, deduplication is performed by appending an
underscore and a decimal integer, if necessary.
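.Pp
The byte replacement and deduplication steps can be sketched as follows.
This is a simplified model under stated assumptions, not the code in
.Pa html.c :
the functions
.Fn toy_make_id
and
.Fn toy_unique
and the type
.Vt struct toy_seen
are hypothetical, the input is a plain string rather than a
.Vt struct roff_node ,
the set of bytes kept unchanged is a conservative guess,
the first suffix tried is chosen arbitrarily,
and a small array stands in for the ohash table.
.Bd -literal -offset indent
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 64

struct toy_seen {
    char    *ids[TOY_MAX];  /* strings owned by the table */
    size_t   n;
};

/* Assumed safe set: ASCII letters and digits, '-', '.' and '_'. */
static int
toy_is_safe(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_';
}

/* Return a newly allocated copy of text with unsafe bytes replaced. */
char *
toy_make_id(const char *text)
{
    size_t i, len = strlen(text);
    char *id = malloc(len + 1);

    if (id == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        id[i] = toy_is_safe((unsigned char)text[i]) ? text[i] : '_';
    id[len] = 0;
    return id;
}

static int
toy_is_seen(const struct toy_seen *s, const char *id)
{
    size_t i;

    for (i = 0; i < s->n; i++)
        if (strcmp(s->ids[i], id) == 0)
            return 1;
    return 0;
}

/* Return a string unique within s; the table keeps ownership of it. */
const char *
toy_unique(struct toy_seen *s, const char *id)
{
    size_t cap = strlen(id) + 24;   /* room for '_' and the digits */
    unsigned int suffix;
    char *cand;

    if (s->n == TOY_MAX || (cand = malloc(cap)) == NULL)
        return NULL;
    snprintf(cand, cap, "%s", id);
    /* Starting at 2 is an arbitrary choice made for this sketch. */
    for (suffix = 2; toy_is_seen(s, cand); suffix++)
        snprintf(cand, cap, "%s_%u", id, suffix);
    s->ids[s->n++] = cand;
    return cand;
}
```
.Ed
.Pp
The split into two functions mirrors the ownership rule stated in
.Sx RETURN VALUES :
the string from
.Fn toy_make_id
belongs to the caller, whereas strings from
.Fn toy_unique
stay with the table.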
.Pp
The function
.Fn print_otag_id
opens a
.Fa tag
element of class
.Fa cattr
for the node
.Fa n .
If the flag
.Dv NODE_ID
is set in
.Fa n ,
it attempts to generate an
.Cm id
attribute with
.Fn html_make_id .
If an
.Cm id
attribute is written,
.Fn print_otag_id
also adds an
.Aq Ic A
element of class
.Qq permalink :
outside if
.Fa n
generates a phrasing element, or inside otherwise.
This function is a wrapper around
.Fn html_make_id
and
.Fn print_otag ,
fixing the
.Fa unique
argument to 1 and the
.Fa fmt
arguments to
.Qq chR
and
.Qq ci ,
respectively.
.Pp
The functions
.Fn print_eqn ,
.Fn print_tbl ,
and
.Fn print_tblclose
are not yet documented.
.Sh RETURN VALUES
The functions
.Fn print_otag
and
.Fn print_otag_id
return a pointer to a new element on the stack of HTML elements.
When
.Fn print_otag_id
opens two elements, a pointer to the outer one is returned.
The memory pointed to is owned by the library and is automatically
.Xr free 3 Ns d
when
.Fn print_tagq
is called on it or when
.Fn print_stagq
is called on a parent element.
.Pp
The function
.Fn html_make_id
returns a newly allocated string or
.Dv NULL
if
.Fa n
lacks text data to create the attribute from.
If the
.Fa unique
argument is 0, the caller is responsible for
.Xr free 3 Ns ing
the returned string after using it.
If the
.Fa unique
argument is non-zero, the
.Va id_unique
ohash table is used for deduplication and owns the returned string.
In this case, it will be freed automatically by
.Fn html_reset
or
.Fn html_free .
.Pp
In case of
.Xr malloc 3
failure, these functions do not return but call
.Xr err 3 .
.Sh FILES
.Bl -tag -width mandoc_aux.c -compact
.It Pa main.h
declarations of public functions for use by the main program,
not yet documented
.It Pa html.h
declarations of data types and private functions
for use by language-specific HTML formatters
.It Pa html.c
main HTML formatting engine and utility functions
.It Pa mdoc_html.c
.Xr mdoc 7
HTML formatter
.It Pa man_html.c
.Xr man 7
HTML formatter
.It Pa tbl_html.c
.Xr tbl 7
HTML formatter
.It Pa eqn_html.c
.Xr eqn 7
HTML formatter
.It Pa out.h
declarations of data types and private functions
for shared use by all mandoc formatters,
not yet documented
.It Pa out.c
private functions for shared use by all mandoc formatters
.It Pa mandoc_aux.h
declarations of common mandoc utility functions, see
.Xr mandoc 3
.It Pa mandoc_aux.c
implementation of common mandoc utility functions
.El
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mandoc 3 ,
.Xr man.cgi 8
.Sh AUTHORS
.An -nosplit
The mandoc HTML formatter was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
It is maintained by
.An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
who also wrote this manual.