Ingo Schwarze [Fri, 18 Jan 2019 14:36:21 +0000 (14:36 +0000)]
The .UR and .MT blocks in man(7) are represented by <a> elements
which establish phrasing context, but they can contain paragraph
breaks (which is relevant for terminal formatting, so we can't just
change the structure of the syntax tree), which are represented
by <p> elements and cannot occur inside <a>.
Fix this by prematurely closing the <a> element in the HTML formatter.
This means that the clickable text in HTML output is shorter than
what is represented as the link text in terminal output, but in
HTML, it is frankly impossible to have the clickable area of a
hyperlink extend across a paragraph break. The difference in
presentation is not a major problem, and besides, paragraph breaks
inside .UR are rather poor style in the first place.
The implementation is quite tricky. Naively closing out the <a>
prematurely would result in accessing a stale pointer when later
reaching the physical end of the .UR block. So this commit separates
visual and structural closing of "struct tag" stack items. Visual
closing means that the HTML element is closed but the "struct tag"
remains on the stack, to avoid later access to a stale pointer and
to avoid closing the same HTML element a second time later.
This also needs reference counting of pointers to "struct tag" stack
items because often more than one child holds a pointer to the same
parent item, and only the outermost child can safely do the physical
closing.
In the whole corpus of nearly half a million manual pages on
man.openbsd.org, this problem occurs in exactly one page: the
groff(1) version 1.20.1 manual contained in DragonFly-3.8.2, which
contains a formatting error triggering the bug.
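The split between visual and structural closing can be pictured with the following minimal sketch. It is not the code from the commit: the field names refcnt and closed, the global stack, and the three functions are assumptions made for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the formatter's "struct tag" stack item. */
struct tag {
	struct tag	*next;    /* next item down the stack */
	const char	*name;    /* HTML element name */
	int		 refcnt;  /* pointers still held by children */
	int		 closed;  /* end tag was already written */
};

struct tag *stack;  /* top of the tag stack */

/* Open an element and push it onto the stack. */
struct tag *
tag_open(const char *name)
{
	struct tag *t;

	if ((t = calloc(1, sizeof(*t))) == NULL)
		abort();
	t->name = name;
	t->next = stack;
	stack = t;
	printf("<%s>", name);
	return t;
}

/* Visual closing: write the end tag once, keep the stack item. */
void
tag_close_visual(struct tag *t)
{
	if (t->closed == 0) {
		printf("</%s>", t->name);
		t->closed = 1;
	}
}

/* Structural closing: pop the item unless it is still referenced. */
void
tag_close_struct(struct tag *t)
{
	tag_close_visual(t);
	if (t->refcnt > 0 || stack != t)
		return;
	stack = t->next;
	free(t);
}
```

In this sketch, a second attempt to close the same element prints nothing, and the item is only freed once no child holds a pointer to it.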
Ingo Schwarze [Thu, 17 Jan 2019 08:14:38 +0000 (08:14 +0000)]
Delete several entries that were already fixed.
The two entries about dashes, hyphens, and minus signs are no longer
relevant because we decided on a policy that is now documented.
Ingo Schwarze [Tue, 15 Jan 2019 12:16:18 +0000 (12:16 +0000)]
In PostScript and PDF output, one AFM unit is not nearly enough
inter-word spacing, let's try again with 250 AFM units.
Regression caused during my recent term_flushln() reorg in rev. 1.278,
reported by brynet@ (sorry and many thanks for reporting).
Ingo Schwarze [Fri, 11 Jan 2019 17:04:44 +0000 (17:04 +0000)]
Improve error reporting when a file given on the command line
cannot be opened:
* Mention the filename.
* Report the errno for the file itself, not the one with .gz appended.
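A minimal sketch of the second point, assuming a helper that tries the file first and the compressed variant afterwards. The names open_with_gz() and report_open_failure() and the message layout are hypothetical and not taken from mandoc.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: try "file", then "file.gz".
 * On failure, return -1 with errno describing "file" itself.
 */
int
open_with_gz(const char *file)
{
	char	 gz[1024];
	int	 fd, save_errno;

	if ((fd = open(file, O_RDONLY)) != -1)
		return fd;
	save_errno = errno;	/* remember why "file" failed */

	if ((size_t)snprintf(gz, sizeof(gz), "%s.gz", file) < sizeof(gz) &&
	    (fd = open(gz, O_RDONLY)) != -1)
		return fd;

	errno = save_errno;	/* discard the errno of "file.gz" */
	return -1;
}

/* Report the failure, mentioning the filename. */
void
report_open_failure(const char *file)
{
	fprintf(stderr, "example: %s: %s\n", file, strerror(errno));
}
```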
Ingo Schwarze [Fri, 11 Jan 2019 12:56:42 +0000 (12:56 +0000)]
Remove the HTML title= attributes which harmed accessibility and
violated the principle of separation of content and presentation.
Instead, implement the tooltips purely in CSS.
Thanks to John Gardner <gardnerjohng at gmail dot com> for
suggesting most of the styling in the new ::before rules.
Ingo Schwarze [Thu, 10 Jan 2019 07:40:10 +0000 (07:40 +0000)]
After years of gnashing of teeth, i finally found a way to avoid
having to write empty list elements for non-compact .Bl -tag lists:
1. Add margin-bottom to the <dd>.
Note that margin-top on the <dt> doesn't work because it would put
a short <dt> lower than the <dd>; margin-bottom on the <dt> doesn't
work because it would put vertical space before the <dd> for a long
<dt>; and margin-top on the <dd> doesn't work because it would put
a short <dt> higher than the <dd>. Only margin-bottom on the <dd>
has none of these adverse effects.
2. Of course, margin-bottom on the <dd> fails to take care of the
vertical spacing before the first list element, so implement that
separately by margin-top on the <dl>.
Ingo Schwarze [Thu, 10 Jan 2019 06:29:00 +0000 (06:29 +0000)]
Initializers for file-scope static variables should be compile-time
constants, and while stderr is a compile-time constant in OpenBSD,
Kelvin Sherlock <ksherlock at gmail dot com> reports that it isn't
on some other systems, for example on FreeBSD or Linux.
So do the initialization by calling mandoc_msg_setoutfile()
from main() instead.
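The pattern looks roughly like this. The sketch uses the hypothetical names msg_setoutfile() and msg_print() rather than the real mandoc functions, and only illustrates why the initializer moves into a function call.

```c
#include <stdio.h>

/*
 * Not portable: "static FILE *outfile = stderr;" compiles only where
 * stderr is a constant expression.  Start with NULL instead.
 */
static FILE *outfile = NULL;

/* Hypothetical analogue of mandoc_msg_setoutfile(). */
void
msg_setoutfile(FILE *fp)
{
	outfile = fp;
}

/* Print a message; fail if the output file was never configured. */
int
msg_print(const char *text)
{
	if (outfile == NULL)
		return -1;
	return fprintf(outfile, "%s\n", text);
}
```

A main() function would call msg_setoutfile(stderr) before anything can emit a message.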
Ingo Schwarze [Mon, 7 Jan 2019 07:26:29 +0000 (07:26 +0000)]
Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features)
by the <p> HTML element and use the html_fillmode() mechanism
for .Bd -unfilled, just like it was done for man(7) earlier, finally
getting rid both of the horrible <div class="Pp"></div> hack and
of the worst HTML syntax violations caused by nested displays.
Care is needed because in some situations, paragraphs have to remain
open across several subsequent macros, whereas in other situations,
they must get closed together with a block containing them.
Some implementation details include:
* Always close paragraphs before emitting HTML flow content.
* Let html_close_paragraph() also close <pre> for extra safety.
* Drop the old, now unused function print_paragraph().
* Minor adjustments in the top-level man(7) node formatter for symmetry.
* Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
* Bugfix: give up on .Op semantic markup for now, see the comment.
Ingo Schwarze [Sun, 6 Jan 2019 04:55:09 +0000 (04:55 +0000)]
Finally, represent the man(7) .PP and .HP macros by the natural
choice, which is the <p> HTML element. On top of the previous
fill-mode improvements, the key to making this possible is to
automatically close the <p> when required: before headers, subsequent
paragraphs, lists, indented blocks, synopsis blocks, tbl(7) blocks,
and before blocks using no-fill mode.
In man(7) documents, represent the .sp request by a blank line in
no-fill mode and in the same way as .PP in fill mode.
Ingo Schwarze [Sat, 5 Jan 2019 21:55:11 +0000 (21:55 +0000)]
In no-fill mode, avoid bogus blank lines in two situations:
1. After the last child; the parent will take care of the line break.
2. At the .YS macro; the end of the preceding .SY already broke the line.
Ingo Schwarze [Sat, 5 Jan 2019 20:04:50 +0000 (20:04 +0000)]
Slowly start doing more HTML output tests, in this case for the
interaction of .nf and .RS, related to man_macro.c rev. 1.106.
HTML regression testing is tricky because it is extremely prone to
over-testing, i.e. unintentional testing for volatile formatting
details which are irrelevant for deciding whether the HTML output
is good or bad. Minor changes to the formatter - which is still
heavily under development - might result in the necessity to
repeatedly adjust many test cases.
Then again, HTML syntax rules are so complicated that without
regression testing, the risk is simply too high that later changes
will re-introduce issues that were already fixed earlier. Let's
just try to design the tests very carefully in such a way that
the *.out_html files contain nothing that is likely to change, and
defer testing in cases where the HTML output is not yet clean enough
to allow designing tests in such a way.
Ingo Schwarze [Sat, 5 Jan 2019 18:59:46 +0000 (18:59 +0000)]
In HTML output, man(7) .RS blocks get formatted as <div class="Bd-indent">,
and i can see no reasonable alternative: they do indeed represent indented
displays. They certainly require flow context and make no sense in phrasing
context. Consequently, they have to suspend no-fill mode during their head,
in just the same way as other paragraph-type macros do it.
This fixes HTML syntax errors that resulted from .nf followed by .RS.
Ingo Schwarze [Sat, 5 Jan 2019 09:46:34 +0000 (09:46 +0000)]
minor cleanup, no functional change:
* delete one irrelevant FIXME; no more fixed lengths in HTML, please
* simplify some conditions
* avoid testing pointers as truth values, use "!= NULL"
* sort some declarations
* delete some pointless blank lines
Ingo Schwarze [Sat, 5 Jan 2019 09:14:44 +0000 (09:14 +0000)]
Now that the NODE_NOFILL flag in the syntax tree is accurate,
use it in the man(7) HTML formatter rather than keeping fill mode
state locally, resulting in massive simplification (minus 40 LOC).
Move the html_fillmode() state handler function to the html.c module
such that both the man(7) and the roff(7) formatter (and in the future,
also the mdoc(7) formatter) can use it. Give it a query mode, to be
invoked with TOKEN_NONE.
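A state handler with a query mode could be shaped as in the sketch below. The enum, the global variable, and the use of <pre> for no-fill mode are simplifying assumptions; MODE_NONE merely plays the role that TOKEN_NONE has in the commit.

```c
#include <stdio.h>

/* Hypothetical request tokens; MODE_NONE stands in for TOKEN_NONE. */
enum mode { MODE_NONE, MODE_FILL, MODE_NOFILL };

static enum mode current = MODE_FILL;

/*
 * Switch to the wanted fill mode and return the previous one.
 * With MODE_NONE, nothing changes: the call is a pure query.
 */
enum mode
fillmode(enum mode want)
{
	enum mode had = current;

	if (want == MODE_NONE || want == had)
		return had;
	puts(want == MODE_NOFILL ? "<pre>" : "</pre>");
	current = want;
	return had;
}
```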
Ingo Schwarze [Sat, 5 Jan 2019 01:29:32 +0000 (01:29 +0000)]
minor cleanup, no functional change:
* in node type switches, explicitly handle all types, sort them,
and abort() on those that cannot occur
* avoid testing pointers as truth values, use "!= NULL"
* avoid testing "constant == variable", use "variable == constant"
* prefer sizeof(var) over sizeof(type)
* delete one duplicate function
* sort some declarations
* delete some useless blank lines
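For illustration, the conventions listed above might look as follows in a small function. The node types and both functions are made up for this sketch and do not come from the mandoc sources.

```c
#include <stdlib.h>
#include <string.h>

enum ntype { NT_ELEM, NT_ROOT, NT_TEXT };  /* hypothetical node types */

struct node {
	enum ntype	 type;
	const char	*text;
};

/* Return the text length of a node, or 0 for nodes without text. */
size_t
node_textlen(const struct node *n)
{
	if (n == NULL)			/* not "if (!n)" */
		return 0;
	switch (n->type) {		/* every type handled, sorted */
	case NT_ELEM:
		return 0;
	case NT_ROOT:
		abort();		/* assumed never to get here */
	case NT_TEXT:
		if (n->text == NULL)	/* not "NULL == n->text" */
			return 0;
		return strlen(n->text);
	}
	return 0;
}

/* Allocate a zeroed node of the given type. */
struct node *
node_alloc(enum ntype type)
{
	struct node *n;

	n = calloc(1, sizeof(*n));	/* not sizeof(struct node) */
	if (n != NULL)
		n->type = type;
	return n;
}
```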
Ingo Schwarze [Sat, 5 Jan 2019 00:36:50 +0000 (00:36 +0000)]
Some high-level block macros have an effect similar to temporarily
suspending no-fill mode during their head. Model this with an
additional roff parser state flag ROFF_NONOFILL. That is much
simpler than it would be to save and restore the ROFF_NOFILL flag
itself, in particular since the latter can be switched (with lasting
effect) by the .nf and .fi requests even while its effect is
temporarily suspended.
This commit does not change formatting yet, but prepares for future
formatting simplifications and improvements.
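The idea of a separate suspension flag can be sketched like this. The flag names follow the commit message, but the bit values and the helper functions are assumptions for illustration.

```c
/* Flag names follow the commit message; the values are assumptions. */
#define ROFF_NOFILL	(1 << 0)  /* set by .nf, cleared by .fi */
#define ROFF_NONOFILL	(1 << 1)  /* no-fill temporarily suspended */

/* No-fill mode takes effect only while it is not suspended. */
int
nofill_active(int flags)
{
	return (flags & (ROFF_NOFILL | ROFF_NONOFILL)) == ROFF_NOFILL;
}

/* Entering and leaving a block head never touches ROFF_NOFILL. */
int head_enter(int flags) { return flags | ROFF_NONOFILL; }
int head_leave(int flags) { return flags & ~ROFF_NONOFILL; }

/* The .nf and .fi requests keep working while suspended. */
int req_nf(int flags) { return flags | ROFF_NOFILL; }
int req_fi(int flags) { return flags & ~ROFF_NOFILL; }
```

Because the two bits are independent, a .fi request seen inside a head has a lasting effect, and nothing needs to be saved or restored when the head ends.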
Ingo Schwarze [Fri, 4 Jan 2019 04:04:14 +0000 (04:04 +0000)]
Test interaction of low-level roff(7) filling requests with .Bd in general
and filling in .Bd -centered in particular; related to mdoc_term.c rev. 1.372.
Ingo Schwarze [Fri, 4 Jan 2019 03:39:01 +0000 (03:39 +0000)]
Two functional improvements to filling in terminal output.
1. Fully support no-fill mode in mdoc(7), even when invoked with
low-level roff(7) .nf requests. As a side effect, this substantially
simplifies the implementation of .Bd -unfilled and .Bd -literal.
2. Let .Bd -centered fill its text, using the new TERMP_CENTER flag.
That finally fixes the long-standing bug that it used to operate in
no-fill mode, which was known to be wrong for at least five years.
This also simplifies the implementation of .Bd -centered considerably.
Ingo Schwarze [Fri, 4 Jan 2019 03:21:02 +0000 (03:21 +0000)]
Implement centering and adjustment to the right margin directly in
the terminal filling routine, controlled by new flags TERMP_CENTER
and TERMP_RIGHT.
This became possible with the recent term_flushln() rewrite.
No functional change yet, but to be used by upcoming commits.
Ingo Schwarze [Fri, 4 Jan 2019 03:17:36 +0000 (03:17 +0000)]
Oops, i forgot to adjust this file to the changes in roff.h rev. 1.67.
Provide a handler for the new .nf and .fi roff(7) request nodes,
avoiding a potential crash, and correctly restore the former fill
mode at .Ed even when there was .nf or .fi inside the block.
Ingo Schwarze [Thu, 3 Jan 2019 19:59:55 +0000 (19:59 +0000)]
Rewrite the line filling function for terminal output yet again.
This function has always been among the most complicated parts of
mandoc, and it repeatedly needed substantial functional enhancements.
The present rewrite is required to prepare for the implementation
of simultaneous filling and centering of output lines.
The previous implementation looked at each word in turn and printed
it to the output stream as soon as it was found to still fit on the
current output line. Obviously, that approach neither allows
centering nor adjustment to the right margin.
The new implementation first decides which part of the paragraph
to put onto the current output line, also measuring the display
width of that part, even if that part consists of multiple words
including intervening whitespace. This will allow moving the whole
output line to the right as desired before printing it, for example
to center it or to adjust it to the right margin.
The function is split into three parts, each much shorter, solving a
better defined task, much easier to understand and better commented:
1. the steering function term_flushln() looping over output lines;
2. the calculation function term_fill() looping over input characters;
3. and the output function term_field() looping over printed characters.
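The benefit of measuring first and printing afterwards can be seen in a much simplified sketch. It assumes one display column per byte and breaks only at blanks; the function names are hypothetical and unrelated to the real term_flushln(), term_fill(), and term_field().

```c
#include <stdio.h>
#include <string.h>

/*
 * Calculation: decide how many bytes of "text" go onto one output
 * line of at most "maxw" columns.  Stores the display width without
 * trailing blanks in *width and returns the number of bytes consumed,
 * including the blanks that follow the line.
 */
size_t
fill_line(const char *text, size_t maxw, size_t *width)
{
	size_t pos, end, wlen;

	pos = end = 0;
	for (;;) {
		pos += strspn(text + pos, " ");   /* skip blanks */
		wlen = strcspn(text + pos, " ");  /* measure next word */
		if (wlen == 0)
			break;                     /* no more words */
		if (end > 0 && pos + wlen > maxw)
			break;                     /* word does not fit */
		pos += wlen;
		end = pos;
	}
	*width = end;
	return end + strspn(text + end, " ");
}

/* Output: print one line, shifted right to center it in maxw columns. */
void
print_centered(const char *text, size_t width, size_t maxw)
{
	size_t pad = width < maxw ? (maxw - width) / 2 : 0;

	printf("%*s%.*s\n", (int)pad, "", (int)width, text);
}

/* Steering: loop over output lines until the text is used up. */
void
flush_centered(const char *text, size_t maxw)
{
	size_t used, width;

	while (*text != '\0') {
		used = fill_line(text, maxw, &width);
		print_centered(text, width, maxw);
		text += used;
	}
}
```

Since the width of the whole line is known before anything is printed, the padding can be computed up front, which a word-by-word printer cannot do.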
Ingo Schwarze [Tue, 1 Jan 2019 08:18:11 +0000 (08:18 +0000)]
Support taking the -O tag value from apropos(1) key=value search terms;
feature improvement suggested by kn@.
While here, also make "-O value" work from standard input.
OK kn@
Ingo Schwarze [Tue, 1 Jan 2019 07:42:04 +0000 (07:42 +0000)]
Correctly set the ROFF_NOFILL parser flag for .Bd .Ed .Sh, such
that children and later siblings get correct NODE_NOFILL assignments.
This doesn't change rendering yet but prepares for future rendering
improvements.
Ingo Schwarze [Tue, 1 Jan 2019 03:45:29 +0000 (03:45 +0000)]
Now that .nf and .fi are implemented in the roff(7) parser and formatters
rather than in the man(7) parser and formatters, document them in the
roff(7) manual, where they belong, rather than in the man(7) manual.
Mention that they imply an output line break, and mention which macros
imply these requests.
Ingo Schwarze [Mon, 31 Dec 2018 11:01:37 +0000 (11:01 +0000)]
Cleanup, minus 25 LOC, no functional change:
Delete the complicated mechanism keeping fill mode state locally in
the man(7) HTML formatter. Instead, use the state stored in the nodes.
Ingo Schwarze [Mon, 31 Dec 2018 10:35:56 +0000 (10:35 +0000)]
Cleanup, no functional change:
Stop trying to keep fill mode state locally in the mdoc HTML formatter,
rely on the state stored in the nodes instead.
Note that the .Bd -literal code is buggy. Nested literal displays
result in nested <pre> elements, which violates HTML syntax.
But i'm not yet fixing bugs in this commit, i'm merely deleting
code which has no effect.
Ingo Schwarze [Mon, 31 Dec 2018 10:04:39 +0000 (10:04 +0000)]
Cleanup, no functional change:
Since the man(7) and roff(7) validators no longer use the parser
state flag ROFF_NOFILL, we can finally get rid of the function
man_state(), resulting in a better separation of parsing and validation.
Ingo Schwarze [Mon, 31 Dec 2018 08:38:21 +0000 (08:38 +0000)]
Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.
Ingo Schwarze [Mon, 31 Dec 2018 08:18:12 +0000 (08:18 +0000)]
Store the fill mode with a new flag NODE_NOFILL in every node,
like it is already done with NODE_SYNPRETTY, such that the fill
mode becomes more directly available to the formatters.
Not used yet, but will be used by upcoming commits.
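Conceptually, the parser state is copied into each node when the node is created, roughly as in this sketch. The structs, the bit values, and node_alloc() are assumptions; only the two flag names come from the commit messages.

```c
#include <stdlib.h>

#define ROFF_NOFILL	(1 << 0)  /* parser state; value is an assumption */
#define NODE_NOFILL	(1 << 0)  /* node flag; value is an assumption */

struct parser { int flags; };  /* hypothetical parser state */
struct node   { int flags; };  /* hypothetical syntax tree node */

/* Create a node, recording the fill mode in effect right now. */
struct node *
node_alloc(const struct parser *p)
{
	struct node *n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	if (p->flags & ROFF_NOFILL)
		n->flags |= NODE_NOFILL;
	return n;
}
```

A formatter can then test the flag on the node it is looking at, even long after the parser state has changed.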
Ingo Schwarze [Mon, 31 Dec 2018 08:03:46 +0000 (08:03 +0000)]
For .EX and .EE, set the fill mode parser state directly in the
macro parsing function, in the same way as the roff parser already
does it for the .nf and .fi requests. This is a preparation for
getting rid of the ugly function man_state() later on.
Ingo Schwarze [Mon, 31 Dec 2018 07:46:07 +0000 (07:46 +0000)]
Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.
Ingo Schwarze [Mon, 31 Dec 2018 07:08:12 +0000 (07:08 +0000)]
Move parsing of the .nf and .fi (fill mode) requests from the man(7)
parser to the roff(7) parser. As a side effect, .nf and .fi are
now also parsed in mdoc(7) input, though the mdoc(7) formatters
still ignore most of their effect.
Ingo Schwarze [Mon, 31 Dec 2018 04:55:46 +0000 (04:55 +0000)]
Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.
Ingo Schwarze [Sun, 30 Dec 2018 00:49:54 +0000 (00:49 +0000)]
Cleanup, no functional change:
The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.
Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.
This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.
Ingo Schwarze [Fri, 28 Dec 2018 00:15:11 +0000 (00:15 +0000)]
add some notes about using col(1) and ul(1) to process the ascii markup
since these may not be commonly known utilities;
idea from and joint work with tedu@
Ingo Schwarze [Sun, 23 Dec 2018 22:03:32 +0000 (22:03 +0000)]
Finally, stop abusing .Ss and .Sx to mark up macros, use .Ic instead
since these are clearly commands in a domain-specific language. As
a nice side effect, the resulting list allows including the synopsis
for each macro in the item head, reducing some repetitive verbiage.
Ingo Schwarze [Sun, 23 Dec 2018 16:55:34 +0000 (16:55 +0000)]
Simplify and clarify instructions for .Ql, and deprecate .Li.
The macros .Ql, .Dl, and .Bd -literal leave no room for any
valid use case for .Li whatsoever.
General direction discussed with jmc@.
Ingo Schwarze [Fri, 21 Dec 2018 17:15:18 +0000 (17:15 +0000)]
Rename mandoc_getarg() to roff_getarg() and pass it the roff parser
struct as an argument such that after copy-in, it can call roff_expand()
once again, which used to be called roff_res() before this. This
fixes a subtle low-level roff(7) parsing bug reported by Fabio
Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7)
manual page, because that page used an escaped escape sequence in
a macro argument.
To expand escaped escape sequences in quoted mdoc(7) arguments, too,
stop bypassing the call to roff_getarg() in mdoc_argv.c, function args()
for this case. This does not solve the case of escaped escape sequences
in quoted .Bl -column phrases yet.
Because roff_expand() can make the string longer, roff_getarg() can no
longer operate in-place but needs to malloc(3) the returned string.
In the high-level parsers, free(3) that string after processing it.
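Why in-place operation stops working can be shown with a toy expansion step. The placeholder "\V" and the function expand_arg() are invented for this sketch and are not real roff(7) syntax or mandoc API; the point is only that the result may be longer than the input, so it has to live in newly allocated memory that the caller frees.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical expansion step: replace each "\V" (backslash, V) in
 * "arg" by "value".  The result can be longer than the input, so it
 * is returned in freshly allocated memory; the caller must free(3) it.
 */
char *
expand_arg(const char *arg, const char *value)
{
	const char	*cp;
	char		*res, *dst;
	size_t		 n, vlen;

	vlen = strlen(value);
	n = 0;
	for (cp = arg; (cp = strstr(cp, "\\V")) != NULL; cp += 2)
		n++;			/* count the placeholders */

	/* Upper bound: every placeholder grows by at most vlen bytes. */
	if ((res = malloc(strlen(arg) + n * vlen + 1)) == NULL)
		return NULL;

	for (dst = res; *arg != '\0'; ) {
		if (arg[0] == '\\' && arg[1] == 'V') {
			memcpy(dst, value, vlen);
			dst += vlen;
			arg += 2;
		} else
			*dst++ = *arg++;
	}
	*dst = '\0';
	return res;
}
```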
Ingo Schwarze [Thu, 20 Dec 2018 21:30:32 +0000 (21:30 +0000)]
Move the full responsibility for reporting open(2) errors from
mparse_open() to the caller. That is better because only the caller
knows its preferred reporting method and format and only the caller
has access to all the data that should be included - like the column
number in .so processing or the current manpath in makewhatis(8).
Moving the mandoc_msg() call out is possible because the caller can
call strerror(3) just as easily as mparse_open() can.
Move mandoc_msg_setinfilename() closer to the parsing of the file
contents, to avoid problems *with* the file (like non-existence,
lack of permissions, etc.) getting misreported as problems *in*
the file.
Fix the column number reported for .so failure:
let it point to the beginning of the filename.
Taken together, this prevents makewhatis(8) from spewing confusing
messages about .so failures to stderr, a bug reported by
Raf Czlonka <rczlonka at gmail dot com> on ports@.
It also prevents mandoc(1) from issuing *two* messages for every
single .so failure.
Ingo Schwarze [Thu, 20 Dec 2018 18:24:12 +0000 (18:24 +0000)]
Explain what the fields in mandoc messages mean,
rather than merely specifying the message syntax.
Gap in documentation found while looking at a bug
report from Raf Czlonka <rczlonka at gmail dot com>.
Ingo Schwarze [Thu, 20 Dec 2018 03:41:54 +0000 (03:41 +0000)]
Bugfix:
When after a \\, \t, or \a, another \t or \a had to be resolved
in copy mode within the same argument, the argument got corrupted.
Found while working on a loosely related bug report
from Fabio Scotoni <fabio at esse dot ch>.
Ingo Schwarze [Tue, 18 Dec 2018 22:00:02 +0000 (22:00 +0000)]
As a first step towards making roff_res() callable from mandoc_getarg(),
move the function mandoc_getarg() from mandoc.c to roff.c. It was
misplaced in mandoc.c in the first place; that file is intended for
utilities needed both by parsers and by formatters, while reading
macro arguments in copy mode is purely a task of the roff(7) parser.
Needed as a preliminary for an upcoming bugfix.
No code change.
Ingo Schwarze [Sun, 16 Dec 2018 02:21:00 +0000 (02:21 +0000)]
The .HP macro was deprecated by groff, and that makes sense
because it serves no real purpose and works poorly with HTML.
While here, describe the section argument of .TH,
clarify the syntax display of .TP, and polish some wordings.
Ingo Schwarze [Sun, 16 Dec 2018 00:17:02 +0000 (00:17 +0000)]
Yet another round of improvements to manual font selection.
Unify handling of \f and .ft.
Support \f4 (bold+italic).
Support ".ft BI" and ".ft CW" for terminal output.
Support the .ft request in HTML output.
Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP.
In regress.pl, only strip leading whitespace in math mode.
Ingo Schwarze [Sat, 15 Dec 2018 19:30:25 +0000 (19:30 +0000)]
Several improvements to escape sequence handling.
* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
Ingo Schwarze [Fri, 14 Dec 2018 06:33:14 +0000 (06:33 +0000)]
Cleanup, no functional change:
Now that message handling is properly encapsulated,
remove struct mparse pointers from four structs (roff, roff_man,
tbl_node, eqn_node) and from the argument lists of five functions
(roff_alloc, roff_man_alloc, mandoc_getarg, tbl_alloc, eqn_alloc).
Except for being passed to the main program as an opaque object,
it now only occurs in read.c, as it should, and not across 15 files
like in the past.
Ingo Schwarze [Fri, 14 Dec 2018 05:18:02 +0000 (05:18 +0000)]
Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.
Ingo Schwarze [Fri, 14 Dec 2018 02:16:21 +0000 (02:16 +0000)]
Fold mparse_parse_buffer() into mparse_readfd(), making the code
considerably more readable. This is possible now that i finally
deleted mparse_readmem() from mandoc portable - an unused function
that never existed in OpenBSD.
This cleanup already made me find a minor bug: after a recursive
parse, restoring the line number of the parent file was forgotten.
This is fixed now.
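The kind of bug described here can be reproduced in a toy model, assuming a single line counter shared between the parent file and an included file. The struct and the function are hypothetical and far simpler than mparse_readfd().

```c
/* Hypothetical parser state shared by parent and included files. */
struct parse {
	int line;   /* current input line number */
	int total;  /* lines read so far, in all files */
};

/*
 * Read "nlines" lines.  When reaching line "incl_at", recurse into
 * an included file of "incl_lines" lines (0 means no inclusion).
 */
void
parse_file(struct parse *p, int nlines, int incl_at, int incl_lines)
{
	int save_line;

	for (p->line = 1; p->line <= nlines; p->line++) {
		p->total++;
		if (p->line != incl_at)
			continue;
		save_line = p->line;	/* remember the parent position */
		parse_file(p, incl_lines, 0, 0);
		p->line = save_line;	/* the step that was forgotten */
	}
}
```

Without the restore step, the parent in this model would continue with the line number left behind by the included file.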
Ingo Schwarze [Fri, 14 Dec 2018 01:24:49 +0000 (01:24 +0000)]
Delete the function mparse_readmem() that has been unused for almost a
decade but regularly makes maintenance harder. Mandoc is not a
general-purpose library, and being as pluggable as possible is not
among the goals of the project.
Ingo Schwarze [Fri, 14 Dec 2018 01:18:25 +0000 (01:18 +0000)]
Major cleanup; may imply minor changes in edge cases of error reporting.
Finally, drop support for the run-time configurable mandocmsg()
callback. It was over-engineered from the start, never used for
anything in a decade, and repeatedly caused maintenance headaches.
Consolidate reporting infrastructure into two files, mandoc.h and
mandoc_msg.c, mopping up the bits and pieces that were scattered
around main.c, read.c, mandoc_parse.h, libmandoc.h, the prototypes
of four parsing-related functions, and both parser structs.
Ingo Schwarze [Thu, 13 Dec 2018 11:55:46 +0000 (11:55 +0000)]
Cleanup, no functional change:
Split the top level parser interface out of the utility header
mandoc.h, into a new header mandoc_parse.h, for use in the main
program and in the main parser only.
Move enum mandoc_os into roff.h because struct roff_man is the
place where it is stored.
This allows removal of mandoc.h from seven files in low-level
parsers and in formatters.
Ingo Schwarze [Thu, 13 Dec 2018 07:28:27 +0000 (07:28 +0000)]
Cleanup, no functional change:
Finally merge the pointless file st.in into st.c.
Nobody should do operating system dependent changes to standards:
By definition, standards are the same for every operating system.
While here, libmdoc.h no longer requires mdoc.h.
Ingo Schwarze [Thu, 13 Dec 2018 06:18:20 +0000 (06:18 +0000)]
Cleanup, no functional change:
Move the roffhash_*() functions from roff.h to roff_int.h
because they are only intended for use by parsers,
neither by main programs nor by formatters.
Ingo Schwarze [Thu, 13 Dec 2018 05:23:37 +0000 (05:23 +0000)]
Cleanup, no functional change:
No need to expose the eqn(7) syntax tree data structures everywhere.
Move them to their own include file, "eqn.h".
While here, delete the unused enum eqn_pilet.
Ingo Schwarze [Thu, 13 Dec 2018 03:40:13 +0000 (03:40 +0000)]
Cleanup, no functional change:
In libroff.h, nothing was left except the eqn(7) parser interface, which
isn't really part of the roff(7) parser, so rename it to eqn_parse.h.
While here, move struct eqn_def to eqn.c because that's the only
file using it, and let eqn_box_free() and eqn_free() handle NULL.
Ingo Schwarze [Wed, 12 Dec 2018 21:54:35 +0000 (21:54 +0000)]
Cleanup, no functional change:
No need to expose the tbl(7) syntax tree data structures everywhere.
Move them to their own include file, "tbl.h", and improve comments.
Ingo Schwarze [Tue, 4 Dec 2018 05:21:04 +0000 (05:21 +0000)]
Make sure all borders in a table are drawn in the same color.
Required because browsers tend to have inconsistent defaults:
For example, Firefox 62.0.2 sets border-color for tbody, but not for table,
and Pali Rohar reports that Chrome sets it for td, but not for tr or tbody.
The td part is from Pali Rohar, the tbody and tr parts from me.
Ingo Schwarze [Tue, 4 Dec 2018 03:28:58 +0000 (03:28 +0000)]
During validation, drop .br before a text line starting with a
blank, rather than teaching each formatter individually to ignore
the .br in such situations. That's simpler and also results in
better diagnostics.
Mark Harris <mark dot hsj at gmail dot com> reported
that -T html got confused in particular.
Ingo Schwarze [Tue, 4 Dec 2018 02:53:51 +0000 (02:53 +0000)]
Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.
Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.
In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.
Ingo Schwarze [Mon, 3 Dec 2018 21:00:10 +0000 (21:00 +0000)]
In the validators, translate obsolete macro aliases (Lp, Ot, LP, P)
to the standard forms (Pp, Ft, PP) up front, such that later code
does not need to look for the obsolete versions.
This reduces the risk of incomplete handling.
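Translating up front amounts to a small mapping applied once per node, as in this sketch. The enum and the function name are hypothetical; the macro names and their standard forms are the ones listed in the commit message.

```c
/* Hypothetical token enum; macro names are from the commit message. */
enum tok { TOK_Ft, TOK_LP, TOK_Lp, TOK_Ot, TOK_P, TOK_PP, TOK_Pp };

/* Map obsolete aliases to their standard forms, leave others alone. */
enum tok
canonical_tok(enum tok t)
{
	switch (t) {
	case TOK_Lp:
		return TOK_Pp;
	case TOK_Ot:
		return TOK_Ft;
	case TOK_LP:
	case TOK_P:
		return TOK_PP;
	default:
		return t;
	}
}
```

Code running after this step only has to test for Pp, Ft, and PP.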
Ingo Schwarze [Mon, 3 Dec 2018 16:18:02 +0000 (16:18 +0000)]
Render .br as <br/>, not as an empty <div>.
The element <br/> was already employed for many other purposes,
so there is nothing wrong with using it.
Also, it is safer because <br/> is permitted in phrasing content,
whereas <div> is only allowed in flow content.
This is the first part of the HTML syntax audit which i wanted
to do for a long time. Reminded by a loosely related bug report
from Mark Harris <mark dot hsj at gmail dot com>.
Examples of where this caused HTML nesting syntax errors:
* in man(7) code between .nf and .fi
* in mdoc(7) code between .Bd -unfilled and .Ed
* in mdoc(7) code between .Ql Xo and .Xc
* in mdoc(7) code between .Rs and .Re
Ingo Schwarze [Thu, 29 Nov 2018 23:08:13 +0000 (23:08 +0000)]
Do not draw horizontal lines through vertical spans
which are requested in the data section rather than in the layout.
Mini-feature found in misc/pfm(1).
Ingo Schwarze [Thu, 29 Nov 2018 21:40:53 +0000 (21:40 +0000)]
Now that it is better understood how borders work,
rewrite tbl_hrule() in a simpler way.
Fix several bugs in the process.
No more special flags, just use the existing TBL_OPT_* from mandoc.h.
Reduce the number of tracked rows from three to two, which is more logical:
one above the line and one below is sufficient to figure out crossings.
No more magic quirks, all conditions are readily comprehensible now.
Add comments.
Ingo Schwarze [Thu, 29 Nov 2018 01:55:02 +0000 (01:55 +0000)]
Better handle automatic column width assignments in the presence of
horizontal spans, by implementing a moderately difficult iterative
algorithm. The benefit is that spans containing long text no longer
cause an excessive width of their starting column.
The result is likely not optimal, in particular in the presence
of many spans overlapping in complicated ways nor when spans
interact with equalizing or maximizing columns. But i doubt the
practical usefulness of making this more complicated.
Issue originally reported in synaptics(4), which now looks better,
by tedu@ three years ago, and reminded by Pali Rohar this summer.
Ingo Schwarze [Wed, 28 Nov 2018 14:23:06 +0000 (14:23 +0000)]
Bugfix: never set termp->enc to the ambiguous value TERMENC_LOCALE,
but instead set it to TERMENC_UTF8 or TERMENC_ASCII.
Makes tbl(7) box drawing work under -T locale (that is, by default
when LC_CTYPE is defined appropriately).
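Resolving the ambiguous value once, early on, could look like the sketch below. The TERMENC_* names come from the commit message; the enum values, the function resolve_enc(), and the idea of passing in the codeset string (as nl_langinfo(CODESET) would return it) are assumptions.

```c
#include <string.h>

/* Names from the commit message; the enum values are assumptions. */
enum termenc { TERMENC_ASCII, TERMENC_LOCALE, TERMENC_UTF8 };

/*
 * Resolve the requested encoding to a concrete one.  "codeset" is
 * what nl_langinfo(CODESET) would report for the current locale.
 */
enum termenc
resolve_enc(enum termenc want, const char *codeset)
{
	if (want != TERMENC_LOCALE)
		return want;
	if (strcmp(codeset, "UTF-8") == 0)
		return TERMENC_UTF8;
	return TERMENC_ASCII;
}
```

Later code, such as box drawing, then only needs to distinguish two values instead of three.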
Ingo Schwarze [Mon, 26 Nov 2018 21:06:02 +0000 (21:06 +0000)]
Implement tbl(7) lines in -T html output,
as far as they are on the edges of table cells
rather than going through the middle of cells:
* the box, doublebox, and allbox options;
* the | and || layout modifiers;
* and the _ and = data lines;
- but not yet _ and = in individual layout and data cells.
Missing feature reported by Pali dot Rohar at gmail dot com.
Ingo Schwarze [Mon, 26 Nov 2018 17:44:34 +0000 (17:44 +0000)]
When a conditional block is closed by putting "\}" on a text line
by itself (which is somewhat unusual but not invalid; most authors
use the empty macro line ".\}" instead), agree more closely with
groff and do not produce a double space in the output.
Quirk reported by millert@.
While here, tweak the rest of the function body of roff_cond_text()
to more closely match roff_cond_sub(). The subtly different handling
could make people (including myself) wonder whether there is any
point in being different. Testing shows there is not.
Ingo Schwarze [Mon, 26 Nov 2018 17:11:11 +0000 (17:11 +0000)]
Mark Harris pointed out that people might have doubts whether all files
contained in the mandoc toolkit are "code and documentation", and whether
this is of any consequence for licensing, so clarify.
Ingo Schwarze [Mon, 26 Nov 2018 15:02:38 +0000 (15:02 +0000)]
Place mandoc.css into the public domain.
The reason for doing this rather than using the ISC license
is that i guess that in some contexts, a requirement to preserve
a Copyright and license header might be inconvenient, and i really
don't care at all how people use it.
What matters is that they do use it, or something similar - attempts
to use mandoc without any CSS are a constant source of grief and
bogus bug reports because HTML without CSS doesn't look very good:
the more structural and semantic and the less presentational and
old-fashioned the HTML, the more so.
Thanks to Mark Harris <mark dot hsj at gmail dot com> for pointing out
that the permissions on this particular file were unclear.
Ingo Schwarze [Mon, 26 Nov 2018 01:51:46 +0000 (01:51 +0000)]
Simplify writing of tbl(7) cells by using the new feature of passing
a NULL pointer for the value of a style attribute, in which case
the attribute is omitted from the HTML element.
Minus 12 lines of ugly and repetitive code, no functional change.
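The convention of skipping NULL values can be sketched with a small formatting helper. format_style() and its calling convention are invented for this example and differ from the real HTML output functions; the sketch assumes a buffer of at least two bytes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write a style="..." attribute into buf from name/value pairs that
 * end with a NULL name.  Pairs with a NULL value are skipped; if none
 * remain, the attribute is omitted entirely.  Requires sz >= 2.
 */
char *
format_style(char *buf, size_t sz, const char *const *pairs)
{
	size_t	 len, limit;
	int	 n;

	len = 0;
	limit = sz - 1;		/* reserve room for the closing quote */
	buf[0] = '\0';
	for (; pairs[0] != NULL; pairs += 2) {
		if (pairs[1] == NULL)
			continue;	/* omit this property */
		n = snprintf(buf + len, limit - len, "%s%s: %s;",
		    len == 0 ? " style=\"" : " ", pairs[0], pairs[1]);
		if (n < 0 || (size_t)n >= limit - len) {
			buf[len] = '\0';	/* drop the partial pair */
			break;
		}
		len += (size_t)n;
	}
	if (len > 0) {
		buf[len] = '"';
		buf[len + 1] = '\0';
	}
	return buf;
}
```

A caller can pass every property it knows about and simply leave the value NULL where the default applies, instead of branching on each combination.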
Ingo Schwarze [Mon, 26 Nov 2018 01:38:23 +0000 (01:38 +0000)]
Support more than one style attribute on the same HTML element.
In fact, this is already required when a table uses non-default
horizontal and vertical alignment in the same cell.
Ingo Schwarze [Sat, 24 Nov 2018 23:03:18 +0000 (23:03 +0000)]
Implement horizontal and vertical alignment of tbl(7) cell content
in -T html output. This does not handle spanned cells yet.
Missing feature reported by Pali dot Rohar at gmail dot com.
Ingo Schwarze [Fri, 23 Nov 2018 19:17:05 +0000 (19:17 +0000)]
When a font escape appears in the middle of a string,
make sure it doesn't cause output of bogus whitespace.
Fixing a bug reported by Pali dot Rohar at gmail dot com.