.\" $Id: mandoc_html.3,v 1.23 2020/04/24 13:13:06 schwarze Exp $
.\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.Dd $Mdocdate: April 24 2020 $
.Nd internals of the mandoc HTML formatter
.Fd #include """mandoc.h"""
.Fd #include """roff.h"""
.Fd #include """out.h"""
.Fd #include """html.h"""
.Fn print_gen_decls "struct html *h"
.Fn print_gen_comment "struct html *h" "struct roff_node *n"
.Fn print_gen_head "struct html *h"
.Fa "enum htmltag tag"
.Fa "const struct tag *until"
.Fa "const struct tag *suntil"
.Fn html_close_paragraph "struct html *h"
.Fa "enum roff_tok tok"
.Fa "enum mandoc_esc font"
.Fa "const char *word"
.Fa "const char *word"
.Fa "struct roff_node *n"
.Fa "const struct roff_node *n"
.Fa "enum htmltag tag"
.Fa "const char *cattr"
.Fa "struct roff_node *n"
.Fn print_endline "struct html *h"
The mandoc HTML formatter is not a formal library.
However, as it is compiled into more than one program, in particular
and because it may be security-critical in some contexts,
some documentation is useful to help use it correctly and
to prevent XSS vulnerabilities.
The formatter produces HTML output on the standard output.
Since proper escaping is usually required and best taken care of
at one central place, the language-specific formatters
are not supposed to print directly to
Instead, they are expected to use the output functions declared in
and implemented as part of the main HTML formatting engine in
These structures are declared in
Internal state of the HTML formatter.
One entry for the LIFO stack of HTML elements.
.Fa "enum htmltag tag"
.Fa "struct tag *next" .
.Ss Private interface functions
.Aq Pf \&! Ic DOCTYPE
.Fn print_gen_comment
prints the leading comments, usually containing a Copyright notice
and license, as an HTML comment.
It is intended to be called right after opening the
elements for the document
which takes care of properly encoding attributes,
which is relevant for the
prints the start tag of an HTML element with the name
optionally including the attributes specified by
is the empty string, no attributes are written.
specifies one attribute to write.
Most attributes require one
argument which becomes the value of the attribute.
The arguments have to be given in the same order as the attribute letters.
the respective attribute is not written.
.Bl -tag -width 1n -offset indent
This attribute letter can optionally be followed by a modifier letter.
it formats the link as a local one by prefixing a
it interprets the argument as a header file name
and generates a link using the
it takes two arguments instead of one, a manual page name and
section, and formats them as a link to a manual page using the
Print an arbitrary attribute.
This format letter requires two
arguments, the attribute name and the value.
If present, it must be the last format letter.
The first is the name of the style property, the second its value.
letter can be repeated, each repetition requiring an additional pair of
uses the private function
to take care of HTML encoding.
If required by the element type, it remembers in
that the element is open.
is used to close out all open elements up to and including
is a variant to close out all open elements up to but excluding
.Fn html_close_paragraph
closes all open elements that establish phrasing context,
thus returning to the innermost flow context.
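.Pp
The two closing variants described above, one closing up to and including
a given stack entry and one closing up to but excluding it, can be
illustrated with a minimal stand-alone sketch.
It is not the code of the formatter: the names
sk_tag, sk_html, sk_open, sk_close_through, and sk_close_before
are hypothetical, element names are plain strings rather than
enum htmltag values, and output goes to a small buffer instead of
the standard output.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the LIFO element stack. */
struct sk_tag {
	const char	*name;	/* element name, for example "div" */
	struct sk_tag	*next;	/* next older (enclosing) entry */
};

/* Hypothetical stand-in for the formatter state. */
struct sk_html {
	struct sk_tag	*top;		/* innermost open element */
	char		 out[256];	/* output captured for inspection */
};

/* Append a start or end tag to the captured output. */
static void
sk_emit(struct sk_html *h, int closing, const char *name)
{
	size_t	 len = strlen(h->out);

	snprintf(h->out + len, sizeof(h->out) - len, "%s%s>",
	    closing ? "</" : "<", name);
}

/* Open an element and remember on the stack that it is open. */
static struct sk_tag *
sk_open(struct sk_html *h, const char *name)
{
	struct sk_tag	*t;

	if ((t = malloc(sizeof(*t))) == NULL)
		abort();
	t->name = name;
	t->next = h->top;
	h->top = t;
	sk_emit(h, 0, name);
	return t;
}

/* Close and free the innermost open element. */
static void
sk_pop(struct sk_html *h)
{
	struct sk_tag	*t = h->top;

	h->top = t->next;
	sk_emit(h, 1, t->name);
	free(t);
}

/* Close all open elements up to and including until. */
static void
sk_close_through(struct sk_html *h, const struct sk_tag *until)
{
	int	 done = 0;

	while (h->top != NULL && !done) {
		done = (h->top == until);	/* test before freeing */
		sk_pop(h);
	}
}

/* Close all open elements up to but excluding suntil. */
static void
sk_close_before(struct sk_html *h, const struct sk_tag *suntil)
{
	while (h->top != NULL && h->top != suntil)
		sk_pop(h);
}
```

In this sketch, a pointer returned by sk_open becomes invalid once the
entry or any enclosing entry has been closed, which mirrors the ownership
rule stated for the real functions under RETURN VALUES.
.Pp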
switches to fill mode if
or to no-fill mode if
Switching from fill mode to no-fill mode closes the current paragraph
Switching in the opposite direction closes the
element, but does not open a new paragraph.
matches the mode that is already active, no elements are closed or opened.
the mode remains as it is.
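.Pp
The mode switching rules above can be modelled as a small state machine.
The following is a sketch under stated assumptions, not the formatter's
implementation: the names sk_mode, sk_state, and sk_fillmode and the
three request values are hypothetical, open elements are tracked as
plain flags, and it is assumed that entering no-fill mode opens the
element that leaving no-fill mode later closes.
Returning the previously active mode is likewise an assumption.

```c
/* Hypothetical request and mode values. */
enum sk_mode {
	SK_NONE,	/* request: leave the mode as it is */
	SK_FILL,	/* fill mode */
	SK_NOFILL	/* no-fill mode */
};

/* Hypothetical subset of the formatter state. */
struct sk_state {
	enum sk_mode	 mode;		/* currently active mode */
	int		 par_open;	/* a paragraph is open */
	int		 block_open;	/* the no-fill element is open */
};

/*
 * Switch modes as requested and report which mode was active
 * before the call.
 */
static enum sk_mode
sk_fillmode(struct sk_state *s, enum sk_mode want)
{
	enum sk_mode	 had = s->mode;

	/* Nothing to do: no elements are closed or opened. */
	if (want == SK_NONE || want == had)
		return had;

	if (want == SK_NOFILL) {
		s->par_open = 0;	/* close the current paragraph */
		s->block_open = 1;	/* assumption: open no-fill element */
	} else {
		s->block_open = 0;	/* close the no-fill element */
		/* Deliberately no new paragraph is opened here. */
	}
	s->mode = want;
	return had;
}
```

A caller that needs to restore the earlier mode can keep the return
value and pass it back in a later call.
.Pp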
.Dv ESCAPE_FONTROMAN ,
.Dv ESCAPE_FONTBOLD ,
.Dv ESCAPE_FONTITALIC ,
for future text output and internally remembers
the font that was active before the change.
.Dv ESCAPE_FONTPREV ,
the current and the previous font are exchanged.
This function only changes the internal state of the
object; no HTML elements are written yet.
Subsequent text output will write font elements when needed.
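.Pp
The bookkeeping described above, remembering the previous font and
exchanging it with the current one on request, fits in a few lines.
This is an illustrative sketch only: sk_font, sk_fonts, and sk_setfont
are hypothetical names, and the enum is a reduced stand-in for the
ESCAPE_FONT* constants.

```c
/* Hypothetical, reduced stand-in for the font escape constants. */
enum sk_font {
	SK_ROMAN,
	SK_BOLD,
	SK_ITALIC,
	SK_PREV		/* request: return to the previous font */
};

/* Hypothetical subset of the formatter state. */
struct sk_fonts {
	enum sk_font	 cur;	/* font for future text output */
	enum sk_font	 prev;	/* font active before the last change */
};

/*
 * Select a font for future output.  Only the state changes;
 * nothing is written at this point.
 */
static void
sk_setfont(struct sk_fonts *f, enum sk_font font)
{
	enum sk_font	 old = f->cur;

	f->cur = (font == SK_PREV) ? f->prev : font;
	f->prev = old;
}
```

Because the old current font always becomes the new previous font,
two consecutive SK_PREV requests in this sketch restore the state
that existed before the first of them.
.Pp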
prints HTML element content.
It uses the private function
to take care of HTML encoding.
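.Pp
To illustrate why encoding is handled at one central place, the
following minimal sketch replaces the four characters that are
significant in HTML text and attribute values by character references.
It is a simplified stand-in, not the private encoding function of the
formatter, which has further duties that are not modelled here; the
name sk_encode and the buffer-based interface are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy in to out, replacing <, >, & and " by character references.
 * At most outsz bytes are written, including the terminating NUL;
 * input that does not fit is dropped rather than overflowing.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
sk_encode(const char *in, char *out, size_t outsz)
{
	size_t	 used = 0;

	if (outsz == 0)
		return 0;
	for (; *in != '\0'; in++) {
		char		 one[2] = { *in, '\0' };
		const char	*rep;
		size_t		 len;

		switch (*in) {
		case '<':
			rep = "&lt;";
			break;
		case '>':
			rep = "&gt;";
			break;
		case '&':
			rep = "&amp;";
			break;
		case '"':
			rep = "&quot;";
			break;
		default:
			rep = one;	/* ordinary byte, copied as is */
			break;
		}
		len = strlen(rep);
		if (used + len >= outsz)
			break;		/* keep room for the NUL */
		memcpy(out + used, rep, len);
		used += len;
	}
	out[used] = '\0';
	return used;
}
```

If every piece of text and every attribute value passes through one
such routine, a language-specific formatter cannot accidentally emit
markup taken from the input document.
.Pp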
If the document has requested a non-standard font, for example using a
font escape sequence,
in an HTML font selection element using the
.Fn print_tagged_text
and yields a segment identifier when passed to
allocates a string to be used for the
attribute of an HTML element and/or as a segment identifier for a URI in an
attribute, it is used; otherwise, child nodes are used.
node, the resulting string is the concatenation of the child strings;
for other node types, only the first child is used.
Bytes not permitted in URI-fragment strings are replaced by underscores.
If any of the children to be used is not a text node,
no string is generated and
argument is non-zero, deduplication is performed by appending an
underscore and a decimal integer, if necessary.
argument is 1, this is assumed to be the first call for this tag
at this location, typically for use by
so the integer is incremented before use.
argument is 2, this is assumed to be the second call for this tag
at this location, typically for use by
so the existing integer, if any, is used without incrementing it.
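.Pp
The sanitizing and deduplication rules above can be sketched as follows.
This is a simplified illustration under several assumptions, not the
formatter's code: sk_ids, sk_make_id, and sk_copy are hypothetical
names, the input is a plain string rather than a syntax tree node,
seen identifiers are kept in a small fixed-size table, the set of
bytes kept unchanged (alphanumerics and "-._~") is chosen for the
example only, and the first occurrence of a string is assumed to
need no suffix, so that suffixes start at 2.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SK_MAXIDS 16	/* arbitrary table size for this sketch */

struct sk_id {
	char	*base;	/* sanitized string without suffix */
	int	 ord;	/* how often it was handed out as "first call" */
};

struct sk_ids {
	struct sk_id	 tab[SK_MAXIDS];
	int		 n;
};

/* Return a heap copy of s; abort on allocation failure. */
static char *
sk_copy(const char *s)
{
	size_t	 sz = strlen(s) + 1;
	char	*cp;

	if ((cp = malloc(sz)) == NULL)
		abort();
	memcpy(cp, s, sz);
	return cp;
}

/*
 * Build an identifier from text.  unique is 0 (no deduplication),
 * 1 (first call at this location) or 2 (second call at this location).
 * The caller frees the result.  Returns NULL if the table is full.
 */
static char *
sk_make_id(struct sk_ids *ids, const char *text, int unique)
{
	char	*base, *cp, *res;
	size_t	 sz;
	int	 i;

	/* Replace bytes outside the assumed safe set by underscores. */
	base = sk_copy(text);
	for (cp = base; *cp != '\0'; cp++)
		if (!isalnum((unsigned char)*cp) &&
		    strchr("-._~", *cp) == NULL)
			*cp = '_';
	if (unique == 0)
		return base;

	/* Look the string up and count it on a first call. */
	for (i = 0; i < ids->n; i++)
		if (strcmp(ids->tab[i].base, base) == 0)
			break;
	if (i == ids->n) {
		if (ids->n == SK_MAXIDS) {
			free(base);
			return NULL;
		}
		ids->tab[i].base = sk_copy(base);
		ids->tab[i].ord = 1;
		ids->n++;
	} else if (unique == 1)
		ids->tab[i].ord++;	/* increment before use */

	/* Append the suffix only if it is needed. */
	if (ids->tab[i].ord == 1)
		return base;
	sz = strlen(base) + 16;		/* room for "_", digits, NUL */
	if ((res = malloc(sz)) == NULL)
		abort();
	snprintf(res, sz, "%s_%d", base, ids->tab[i].ord);
	free(base);
	return res;
}
```

In this sketch, a call with unique set to 2 that follows a call with
unique set to 1 for the same text yields the same string, so an
identifier written in one place and a link written in another refer
to each other.
.Pp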
it attempts to generate an
generates an element that can only occur in phrasing context,
This function is a wrapper around
automatically choosing the
argument appropriately and setting the
makes sure subsequent output starts on a new HTML output line.
If nothing was printed on the current output line yet, it has no effect.
Otherwise, it appends any buffered text to the current output line,
ends the line, and updates the internal state of the
are not yet documented.
return a pointer to a new element on the stack of HTML elements.
opens two elements, a pointer to the outer one is returned.
The memory pointed to is owned by the library and is automatically
is called on it or when
is called on a parent element.
if fill mode was active before the call or
returns a newly allocated string or
lacks text data to create the attribute from.
The caller is responsible for
the returned string after using it.
failure, these functions do not return but call
.Bl -tag -width mandoc_aux.c -compact
declarations of public functions for use by the main program,
declarations of data types and private functions
for use by language-specific HTML formatters
main HTML formatting engine and utility functions
HTML formatter, handling requests like
declarations of data types and private functions
for shared use by all mandoc formatters,
private functions for shared use by all mandoc formatters
declarations of common mandoc utility functions, see
implementation of common mandoc utility functions
The mandoc HTML formatter was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
.An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
who also wrote this manual.