path: root/html.c
Commit message | Author | Age | Files | Lines
* Reset HTML formatter state, in particular the id_unique hash,Ingo Schwarze2019-03-031-7/+18
| | | | | | | | after processing each manual page, such that the next page starts from a clean state and doesn't continue suffix numbering. Issue found while looking at https://github.com/Debian/debiman/issues/48 which was brought up by Orestis Ioannou <oorestisime at github>.
* Wrap .Sh/.SH sections and .Ss/.SS subsections in HTML <section> elementsIngo Schwarze2019-03-011-1/+2
| | | | | | as recommended for accessibility by the HTML 5 standard. Triggered by a similar, but slightly different suggestion from Laura Morales <lauretas at mail dot com>.
* The .UR and .MT blocks in man(7) are represented by <a> elementsIngo Schwarze2019-01-181-42/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | which establish phrasing context, but they can contain paragraph breaks (which is relevant for terminal formatting, so we can't just change the structure of the syntax tree), which are represented by <p> elements and cannot occur inside <a>. Fix this by prematurely closing the <a> element in the HTML formatter. This means that the clickable text in HTML output is shorter than what is represented as the link text in terminal output, but in HTML, it is frankly impossible to have the clickable area of a hyperlink extend across a paragraph break. The difference in presentation is not a major problem, and besides, paragraph breaks inside .UR are rather poor style in the first place. The implementation is quite tricky. Naively closing out the <a> prematurely would result in accessing a stale pointer when later reaching the physical end of the .UR block. So this commit separates visual and structural closing of "struct tag" stack items. Visual closing means that the HTML element is closed but the "struct tag" remains on the stack, to avoid later access to a stale pointer and to avoid closing the same HTML element a second time later. This also needs reference counting of pointers to "struct tag" stack items because often more than one child holds a pointer to the same parent item, and only the outermost child can safely do the physical closing. In the whole corpus of nearly half a million manual pages on man.openbsd.org, this problem occurs in exactly one page: the groff(1) version 1.20.1 manual contained in DragonFly-3.8.2, which contains a formatting error triggering the bug.
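The separation of visual and structural closing described above can be sketched as follows. This is a minimal illustration, not the code from html.c: the names tag_open(), tag_hold(), tag_close_visual(), tag_release(), and the "struct out" buffer are hypothetical, and the sketch handles a single element rather than a full stack of nested elements.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical output buffer, used here instead of stdout. */
struct out {
	char	buf[256];
	size_t	len;
};

/* Hypothetical stack item, loosely modelled on "struct tag". */
struct tag {
	struct tag	*next;		/* next older item on the stack */
	const char	*name;		/* element name, e.g. "a" */
	int		 refcnt;	/* holders of a pointer to this item */
	int		 closed;	/* nonzero once "</name>" was written */
};

static void
emit(struct out *o, const char *s)
{
	size_t n = strlen(s);

	assert(o->len + n < sizeof(o->buf));
	memcpy(o->buf + o->len, s, n + 1);
	o->len += n;
}

/* Write the opening tag and push an item; the opener holds one reference. */
static struct tag *
tag_open(struct tag **stack, struct out *o, const char *name)
{
	struct tag *t = calloc(1, sizeof(*t));

	if (t == NULL)
		abort();
	t->name = name;
	t->refcnt = 1;
	t->next = *stack;
	*stack = t;
	emit(o, "<");
	emit(o, name);
	emit(o, ">");
	return t;
}

/* Another child keeps a pointer to the same item. */
static void
tag_hold(struct tag *t)
{
	t->refcnt++;
}

/* Visual closing: write "</name>" once, but keep the item on the stack. */
static void
tag_close_visual(struct out *o, struct tag *t)
{
	if (t->closed)
		return;
	emit(o, "</");
	emit(o, t->name);
	emit(o, ">");
	t->closed = 1;
}

/* Structural closing: only the last holder unlinks and frees the item. */
static void
tag_release(struct tag **stack, struct out *o, struct tag *t)
{
	struct tag **pp;

	assert(t->refcnt > 0);
	if (--t->refcnt > 0)
		return;
	tag_close_visual(o, t);		/* no-op if already closed visually */
	for (pp = stack; *pp != NULL; pp = &(*pp)->next)
		if (*pp == t) {
			*pp = t->next;
			break;
		}
	free(t);
}
```

In this sketch, a paragraph break inside a link would call tag_close_visual() on the <a> item, and the later end of the .UR block would call tag_release(), which neither writes a second </a> nor touches freed memory.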
* Remove the HTML title= attributes which harmed accessibility andIngo Schwarze2019-01-111-7/+1
| | | | | | | violated the principle of separation of content and presentation. Instead, implement the tooltips purely in CSS. Thanks to John Gardner <gardnerjohng at gmail dot com> for suggesting most of the styling in the new ::before rules.
* Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features)Ingo Schwarze2019-01-071-13/+11
| | | | | | | | | | | | | | | | | | | by the <p> HTML element and use the html_fillmode() mechanism for .Bd -unfilled, just like it was done for man(7) earlier, finally getting rid both of the horrible <div class="Pp"></div> hack and of the worst HTML syntax violations caused by nested displays. Care is needed because in some situations, paragraphs have to remain open across several subsequent macros, whereas in other situations, they must get closed together with a block containing them. Some implementation details include:
|   * Always close paragraphs before emitting HTML flow content.
|   * Let html_close_paragraph() also close <pre> for extra safety.
|   * Drop the old, now unused function print_paragraph().
|   * Minor adjustments in the top-level man(7) node formatter for symmetry.
|   * Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
|   * Bugfix: give up on .Op semantic markup for now, see the comment.
* Finally, represent the man(7) .PP and .HP macros by the naturalIngo Schwarze2019-01-061-2/+17
| | | | | | | | | | | choice, which is the <p> HTML element. On top of the previous fill-mode improvements, the key to making this possible is to automatically close the <p> when required: before headers, subsequent paragraphs, lists, indented blocks, synopsis blocks, tbl(7) blocks, and before blocks using no-fill mode. In man(7) documents, represent the .sp request by a blank line in no-fill mode and in the same way as .PP in fill mode.
* Now that the NODE_NOFILL flag in the syntax tree is accurate,Ingo Schwarze2019-01-051-1/+34
| | | | | | | | | | use it in the man(7) HTML formatter rather than keeping fill mode state locally, resulting in massive simplification (minus 40 LOC). Move the html_fillmode() state handler function to the html.c module such that both the man(7) and the roff(7) formatter (and in the future, also the mdoc(7) formatter) can use it. Give it a query mode, to be invoked with TOKEN_NONE.
* Yet another round of improvements to manual font selection.Ingo Schwarze2018-12-161-4/+3
| | | | | | | | | Unify handling of \f and .ft. Support \f4 (bold+italic). Support ".ft BI" and ".ft CW" for terminal output. Support the .ft request in HTML output. Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP. In regress.pl, only strip leading whitespace in math mode.
* Several improvements to escape sequence handling.Ingo Schwarze2018-12-151-4/+6
| | | | | | | | | | | | | | | | | | | | | | |
|   * Add the missing special character \_ (underscore).
|   * Partial implementations of \a (leader character) and \E (uninterpreted escape character).
|   * Parse and ignore \r (reverse line feed).
|   * Add a WARNING message about undefined escape sequences.
|   * Add an UNSUPP message about unsupported escape sequences.
|   * Mark \! and \? (transparent throughput) and \O (suppress output) as unsupported.
|   * Treat the various variants of zero-width spaces as one-byte escape sequences rather than as special characters, to avoid defining bogus forms with square brackets.
|   * For special characters with one-byte names, do not define bogus forms with square brackets, except for \[-], which is valid.
|   * In the form with square brackets, undefined special characters do not fall back to printing the name verbatim, not even for one-byte names.
|   * Starting a special character name with a blank is an error.
|   * Undefined escape sequences never abort formatting of the input string, not even in HTML output mode.
|   * Document the newly handled escapes, and a few that were missing.
|   * Regression tests for most of the above.
* HTML syntax audit: render \p as <br/>, not as <div>.Ingo Schwarze2018-12-041-5/+2
| | | | It can occur anywhere, in particular in phrasing context.
* Support more than one style attribute on the same HTML element.Ingo Schwarze2018-11-261-16/+25
| | | | | In fact, this is already required when a table uses non-default horizontal and vertical alignment in the same cell.
* When a font escape appears in the middle of a string,Ingo Schwarze2018-11-231-2/+5
| | | | | make sure it doesn't cause output of bogus whitespace. Fixing a bug reported by Pali dot Rohar at gmail dot com.
* Implement the \f(CW and \f(CR (constant width font) escape sequencesIngo Schwarze2018-10-251-1/+11
| | | | | | | | | for HTML output. Somewhat relevant because pod2man(1) relies on this. Missing feature reported by Pali dot Rohar at gmail dot com. Note that constant width font was already correctly selected before this when required by semantic markup. Only attempting physical markup with the low-level escape sequence was ineffective.
* Add an option -T html -O toc to add a brief table of contents nearIngo Schwarze2018-10-021-1/+3
| | | | | the top of HTML pages containing at least two non-standard sections. Suggested by Adam Kalisz and discussed with kristaps@ during EuroBSDCon 2018.
* Support a second argument to -O man,Ingo Schwarze2018-10-021-3/+19
| | | | | | selecting the format according to local existence of the file. Suggested by kristaps@ during EuroBSDCon 2018. Written on the train Frankfurt-Karlsruhe returning from EuroBSDCon.
* Implement the \*(.T predefined string (interpolate device name)Ingo Schwarze2018-08-161-1/+4
| | | | | by allowing the preprocessor to pass it through to the formatters. Used for example by the groff_char(7) manual page.
* Delete substantial amounts of codeIngo Schwarze2018-06-251-139/+18
| | | | now that we no longer use variable style= attributes.
* Delete support for style=margin-left attributes, which are no longer used.Ingo Schwarze2018-06-251-11/+3
|
* Delete support for style=width attributes, which are no longer used.Ingo Schwarze2018-06-251-41/+1
|
* Do not write <colgroup> elements. Their only purpose is to enforceIngo Schwarze2018-06-251-3/+1
| | | | | | author-specified column widths, which can harm responsive design and provide no real benefit: HTML rendering engines usually do just fine automatically selecting appropriate column widths.
* Delete support for the style=text-indent attribute, which is no longer used.Ingo Schwarze2018-06-251-4/+1
|
* Revert previous: style=height is still used by roff_html.c, and itIngo Schwarze2018-06-181-1/+4
| | | | | doesn't actually harm responsive design, so keep it for now. Bug reported in de.comp.os.unix.bsd via naddy@, thanks.
* delete support for the HTML style=height property, which is no longer usedIngo Schwarze2018-06-101-4/+1
|
* Delete support for the print_otag(sw) * and - modifiers,Ingo Schwarze2018-05-291-12/+1
| | | | which are no longer used because we write fewer style= attributes.
* URL-fragment strings can only contain certain characters.Ingo Schwarze2018-05-281-3/+9
| | | | | Fixing HTML syntax violations e.g. in pf.conf(5) and ifconfig(8) reported by Anton Lazarov <lists at wrant dot com>.
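A minimal sketch of such a restriction, assuming an allow-list modelled on the URL-fragment code points of the WHATWG URL standard. The function name sanitize_fragment(), the exact allow-list, and the choice of '_' as the replacement character are assumptions for illustration and are not taken from html.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Replace, in place, every byte of "id" that is neither an ASCII
 * alphanumeric (in the default "C" locale) nor contained in the
 * allow-list below with an underscore.
 * The allow-list is an assumption, not copied from html.c.
 */
static void
sanitize_fragment(char *id)
{
	static const char allowed[] = "!$&'()*+,-./:;=?@_~";
	char *cp;

	assert(id != NULL);
	for (cp = id; *cp != '\0'; cp++)
		if (!isalnum((unsigned char)*cp) &&
		    strchr(allowed, *cp) == NULL)
			*cp = '_';
}
```

With this sketch, a key such as "block in <all>" would become "block_in__all_", while "foo.bar-1" would pass through unchanged.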
* Do not write duplicate id= attributes, they violate HTML syntax.Ingo Schwarze2018-05-251-5/+45
| | | | | Append suffixes for disambiguation. Issue first reported by Jakub Klinkovsky <j dot l dot k at gmx dot com> (Arch Linux).
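The suffix scheme can be sketched roughly as below. This is an illustration under stated assumptions: the linear table, the names id_table_reset() and id_table_unique(), and the "_2", "_3" suffix format are hypothetical, whereas the formatter itself is described in the log as using a hash (id_unique). The reset function corresponds to starting each manual page from a clean state, so that suffix numbering does not continue across pages.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDS	64
#define ID_LEN	64

/* Illustrative state only; a hash table would replace this array. */
struct id_table {
	char	names[MAX_IDS][ID_LEN];
	int	counts[MAX_IDS];
	int	used;
};

/* Forget all ids, so the next page starts numbering afresh. */
static void
id_table_reset(struct id_table *t)
{
	t->used = 0;
}

/*
 * Write a page-unique id for "name" into "out".
 * The first occurrence is used verbatim; later ones get "_2", "_3", ...
 * Returns 0 on success, -1 if the table is full or a buffer is too small.
 */
static int
id_table_unique(struct id_table *t, const char *name, char *out, size_t outsz)
{
	int i, len;

	assert(t != NULL && name != NULL && out != NULL);
	for (i = 0; i < t->used; i++)
		if (strcmp(t->names[i], name) == 0)
			break;
	if (i == t->used) {
		if (t->used == MAX_IDS || strlen(name) >= ID_LEN)
			return -1;
		strcpy(t->names[i], name);
		t->counts[i] = 0;
		t->used++;
	}
	t->counts[i]++;
	if (t->counts[i] == 1)
		len = snprintf(out, outsz, "%s", name);
	else
		len = snprintf(out, outsz, "%s_%d", name, t->counts[i]);
	return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

One limitation of this sketch: a generated id such as "NAME_2" could still collide with a literal key of the same spelling, which a production implementation would need to rule out.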
* Use <span> for .Ms rather than <b>; discussed with John Gardner.Ingo Schwarze2018-05-211-1/+3
|
* Use <span> for .Ad rather than <i>; also suggested by John Gardner.Ingo Schwarze2018-05-211-2/+2
|
* Use <span> rather than abusing <i> for .Pa;Ingo Schwarze2018-05-211-1/+3
| | | | suggested by John Gardner <gardnerjohng at gmail dot com>.
* Fix a long-standing issue:Ingo Schwarze2018-05-091-1/+4
| | | | | | | Some macros (Nd, Oo) can contain blocks but are rendered as elements that can only contain phrasing content, resulting in invalid HTML nesting. Switch them to <div>. Also move the related "display: inline" style from the HTML to the CSS. Reminded during a conversation with John Gardner.
* Eliminate the class=It-* attributes.Ingo Schwarze2018-05-081-1/+5
| | | | | Cleaner HTML, more idiomatic CSS, and minus 30 lines of C code. Suggested by John Gardner <gardnerjohng at gmail dot com>.
* Switch the emitted HTML element from <b> to <code> for the fixedIngo Schwarze2018-05-081-1/+7
| | | | | | | | syntax element macros .Nm, .Fl, .Cm, .Ic, .In, .Fd, .Fn, and .Cd. Adjust both the internal and external style sheets such that rendering remains unchanged in typical browsers. Based on feedback from John Gardner <gardnerjohng at gmail dot com>.
* skip printing the embedded style sheet if an external style is referencedIngo Schwarze2018-05-011-6/+7
|
* preserve comments before .Dd and .TH (typically Copyright and license)Ingo Schwarze2018-04-131-2/+28
| | | | | in full HTML output, but not with -Ofragment, e.g. in man.cgi(8); suggested by Thomas Klausner <wiz at NetBSD>
* fix a NULL pointer access on deroff() failure;Ingo Schwarze2017-09-061-1/+3
| | | | | could be triggered with '.SS ""'; reported by Michael <Stapelberg at debian>
* In .Bl -tag and -hang, do not print a margin-left style attributeIngo Schwarze2017-07-151-14/+18
| | | | | | for each individual item if the -width argument matches the default of 6n. Suggested by Steffen Nurpmeso <steffen at sdaoden dot eu> on <groff at GNU dot org> in April 2017.
* Fix an assertion failure triggered by print_otag("sw+-l", NULL).Ingo Schwarze2017-07-141-2/+7
| | | | | | Even though we skip the style when the argument is NULL, we must still consume the options. Not found with afl(1), but during manual testing of the previous patch...
* Handle .Bl -compact via CSS rather than writing individual styleIngo Schwarze2017-07-141-12/+1
| | | | | | | | attributes into .It blocks; suggested by Steffen Nurpmeso <steffen at sdaoden dot eu> on <groff at GNU dot org> in April 2017. Delete margin-bottom and margin-top style names and the 'v' argument letter from print_otag() because they are no longer used.
* print HTML character references as 4+ digits hexadecimal, like Unicode;Ingo Schwarze2017-07-141-4/+4
| | | | from bentley@, tweaked by me
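The "4+ digits" format matches the usual U+XXXX notation and can be produced with a printf precision. The helper below is a hypothetical sketch, not the function used in html.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the code point "cp" as an HTML numeric
 * character reference with at least four hexadecimal digits.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if the buffer is too small.
 */
static int
format_charref(char *buf, size_t bufsz, unsigned int cp)
{
	int len;

	assert(buf != NULL && bufsz > 0);
	/* The precision ".4" zero-pads to a minimum of four digits. */
	len = snprintf(buf, bufsz, "&#x%.4X;", cp);
	return (len < 0 || (size_t)len >= bufsz) ? -1 : len;
}
```

For example, U+00E9 would be written as "&#x00E9;", while a code point outside the Basic Multilingual Plane such as U+1F600 simply uses five digits.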
* Write text boxes as <mi>, <mn>, or <mo> as appropriate,Ingo Schwarze2017-06-231-1/+2
| | | | | and write fontstyle or fontweight attributes where required. Missing features reported by bentley@.
* implement the roff(7) \p (break output line) escape sequenceIngo Schwarze2017-06-141-8/+27
|
* make the internal a2roffsu() interface more powerful by returningIngo Schwarze2017-06-081-2/+5
| | | | | a pointer to the end of the parsed data, making it easier to parse subsequent bytes
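The benefit of returning an end pointer is the same as with strtod(3): the caller can continue parsing right where the number stopped. A rough sketch, assuming a hypothetical parse_scaled() function and a simplified set of unit letters; this is not the actual a2roffsu() signature or its unit table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scaled value: a number plus a one-letter unit. */
struct scaled {
	double	value;
	char	unit;
};

/*
 * Parse a number with an optional unit letter from "src" into "dst".
 * If no unit letter follows, "def_unit" is used.
 * Returns a pointer to the first unparsed byte, or NULL on failure.
 * The accepted unit letters are a simplified assumption.
 */
static const char *
parse_scaled(const char *src, struct scaled *dst, char def_unit)
{
	char *end;

	assert(src != NULL && dst != NULL);
	dst->value = strtod(src, &end);
	if (end == src)
		return NULL;		/* no digits at all */
	if (*end != '\0' && strchr("cimnpPuv", *end) != NULL)
		dst->unit = *end++;	/* consume the unit letter */
	else
		dst->unit = def_unit;
	return end;
}
```

A caller handling an expression such as "2.5i+1n" could then parse the first term, inspect the returned pointer to find the "+", and call the function again on the remainder.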
* Tweak previous: tb@ noticed that some browser/font combinationsIngo Schwarze2017-05-141-2/+2
| | | | | | have so amazingly wide bold fonts (for the same nominal font size) that adding 15% to the column width still isn't sufficient to make text reliably fit, so go for 20%.
* Make the tag column in .Bl -tag lists wider:Ingo Schwarze2017-05-121-5/+11
| | | | 1. I forgot about the 2n padding between tag and body. 2. The factor 1.1 was too small for bold font, make it *1.15 + 1n. Ugliness spotted by tb@.
* Minimal support for deep linking into man(7) pages.Ingo Schwarze2017-03-151-2/+25
| | | | | As the man(7) language does not provide semantic markup, only .SH, .SS, and .UR become anchors for now.
* Slightly increase widths calculated from string lengths (mainlyIngo Schwarze2017-03-141-1/+3
| | | | | | | | for .Bl -tag lists and SYNOPSIS .Nm blocks), such that the text still fits even if it is printed in bold font. This is an ugly band-aid, but implementing font-dependent width measurements would be a major project and even more difficult for HTML than for PostScript. Issue reported by Jan Stary <hans at stare dot cz>.
* Print title="..." in addition to id="..." attributes for macro keysIngo Schwarze2017-03-131-6/+13
| | | | | that can be searched for by apropos(1), such that you see the semantic function in a tooltip when hovering with the mouse.
* mark up .Ar, .Fa, .Va, .Ft, and .Vt with <var> rather than <i>;Ingo Schwarze2017-02-051-1/+2
| | | | suggested by bentley@ long ago, but needed lots of cleanup first
* for .Rs, use <cite>Ingo Schwarze2017-02-051-1/+2
|
* Improve <table> syntax:Ingo Schwarze2017-02-051-2/+2
| | | | | | The <col> element can only appear inside <colgroup>, so use <colgroup>. The <tbody> element is optional and useless, so don't use it. Even if we ever needed <thead> or <tfoot>, <tbody> would still be optional and useless; besides, we will likely never need <thead> or <tfoot>, simply because our languages don't support such functionality.