Ingo Schwarze [Wed, 6 Mar 2019 10:18:58 +0000 (10:18 +0000)]
autoconfiguration test whether less(1) supports the -T option;
needed for Alpine Linux because it uses busybox less(1) by default;
based on a patch from Daniel Sabogal explained to me by Natanael Copa
Ingo Schwarze [Mon, 4 Mar 2019 18:15:06 +0000 (18:15 +0000)]
For TIOCGWINSZ, #include <termios.h> rather than <sys/termios.h>
like almost all other userland programs. This also improves
portability: for example, it looks like <sys/termios.h> does not
work on FreeBSD, or at least bapt@ made the same change over there.
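
A minimal sketch of the include pattern described here, assuming a POSIX-like system where ioctl(2) is declared in <sys/ioctl.h>. The helper name term_columns() and its fallback parameter are made up for illustration and are not taken from mandoc:

```c
#include <sys/ioctl.h>
#include <assert.h>
#include <termios.h>	/* rather than <sys/termios.h> */

/*
 * Hypothetical helper: return the width in columns of the
 * terminal open on fd, or the fallback if the width is unknown.
 */
int
term_columns(int fd, int fallback)
{
	struct winsize	 ws;

	assert(fallback > 0);
	if (ioctl(fd, TIOCGWINSZ, &ws) == -1 || ws.ws_col == 0)
		return fallback;
	return ws.ws_col;
}
```

Calling term_columns(STDOUT_FILENO, 80) would yield the real width on a terminal and 80 when output goes to a pipe or file.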
Ingo Schwarze [Mon, 4 Mar 2019 13:01:57 +0000 (13:01 +0000)]
When the -S option is given to man(1) and the requested manual page
name is not found and the requested architecture is unknown, complain
about the architecture rather than about the manual page name:
$ man -S vax cpu
man: Unknown architecture "vax".
$ man -S sparc64 foobar
man: No entry for foobar in the manual.
Friendlier error message suggested by jmc@, who also OK'ed the patch.
Ingo Schwarze [Mon, 4 Mar 2019 11:40:09 +0000 (11:40 +0000)]
Fix the last straggler where the struct roff_node "line" member
was abused to detect an input line break;
instead, use the NODE_LINE flag to improve robustness.
Ingo Schwarze [Sun, 3 Mar 2019 13:02:11 +0000 (13:02 +0000)]
Reset HTML formatter state, in particular the id_unique hash,
after processing each manual page, such that the next page
starts from a clean state and doesn't continue suffix numbering.
Issue found while looking at https://github.com/Debian/debiman/issues/48
which was brought up by Orestis Ioannou <oorestisime at github>.
Ingo Schwarze [Sat, 2 Mar 2019 22:04:40 +0000 (22:04 +0000)]
Do not open a subsection for each and every macro.
Instead, use a tagged list and the canonical .Ic macro
as it is natural for such purposes.
While here, also delete heaps of needless escaping.
Ingo Schwarze [Sat, 2 Mar 2019 16:30:53 +0000 (16:30 +0000)]
Represent multiple subsequent .IP blocks having a consistent
head argument of *, \-, or \(bu as <ul> rather than as <dl>,
using a bit of heuristics.
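
The heuristic could look roughly like the following sketch. It assumes that "consistent" means every head in the run is the same bullet marker; both function names are hypothetical and the real formatter may decide differently:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is this .IP head argument a bullet marker? */
int
ip_head_is_bullet(const char *head)
{
	assert(head != NULL);
	return strcmp(head, "*") == 0 ||
	    strcmp(head, "\\-") == 0 ||
	    strcmp(head, "\\(bu") == 0;
}

/*
 * Hypothetical heuristic: a run of n subsequent .IP heads is
 * treated as an unordered list if all heads are the same marker.
 */
int
ip_run_is_ul(const char *const *heads, size_t n)
{
	size_t	 i;

	if (n == 0 || !ip_head_is_bullet(heads[0]))
		return 0;
	for (i = 1; i < n; i++)
		if (strcmp(heads[i], heads[0]) != 0)
			return 0;
	return 1;
}
```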
Basic idea suggested by Dagfinn Ilmari Mannsaker <ilmari at github>
in https://github.com/Debian/debiman/issues/67 and independently by
<Pali dot Rohar at gmail dot com> on <discuss at mandoc dot bsd dot lv>.
Ingo Schwarze [Fri, 1 Mar 2019 10:57:17 +0000 (10:57 +0000)]
Wrap .Sh/.SH sections and .Ss/.SS subsections in HTML <section> elements
as recommended for accessibility by the HTML 5 standard.
Triggered by a similar, but slightly different suggestion
from Laura Morales <lauretas at mail dot com>.
Ingo Schwarze [Thu, 28 Feb 2019 16:36:13 +0000 (16:36 +0000)]
Format multiple subsequent .IP or multiple subsequent .TP/.TQ
as a single <dl> list rather than opening a new list for each item;
feature suggested by Pali dot Rohar at gmail dot com.
Ingo Schwarze [Sat, 23 Feb 2019 18:53:54 +0000 (18:53 +0000)]
Explain the ASCII rendering of single quotes because that repeatedly
caused confusion in the past. People plainly do not expect that
there are limits to the compatibility between Unicode and ASCII,
but there are.
The information belongs here and not in mandoc_char(7) because
it explains how the specific output device (-T ascii) works and
because it has nothing to do with the question of how characters
are represented on the input side.
Ingo Schwarze [Sat, 9 Feb 2019 21:02:47 +0000 (21:02 +0000)]
The horizontal line in a data cell containing only "_" or "="
connects to the horizontally adjacent vertical line or cell;
fixing a bug reported by bentley@.
Ingo Schwarze [Wed, 6 Feb 2019 22:18:59 +0000 (22:18 +0000)]
Remove the misleading statement ".No takes no arguments".
In fact, it works very similarly to .Em and .Sy.
Triggered by a question from Kurt Mosiejczuk <kurt at cranky dot work>.
Ingo Schwarze [Wed, 6 Feb 2019 21:11:43 +0000 (21:11 +0000)]
Let roff_getname() end the roff identifier at a tab character
and audit all its callers to check whether termination is handled correctly.
Resulting improvements:
* An escape or tab ending the macro name in a macro invocation
is discarded, and argument processing is started after it.
* An escape or tab ending a name in ".if d" and ".if r" is preserved.
* An escape ending a name in ".ds" causes the whole request to be ignored.
* A tab ending a name in ".ds" becomes part of the string.
* An escape or tab ending a name in ".rm"
causes the rest of the line to be ignored.
* An escape or tab ending the first name in ".als", ".rn", or ".nr"
causes the whole request to be ignored.
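
As a rough illustration of where an identifier ends, here is a simplified sketch. It is not the real roff_getname(): the name name_length() is hypothetical, and treating every backslash as the start of an escape sequence is a simplifying assumption:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return the length of the identifier at
 * the start of s.  The name ends at a space, at a tab, at a
 * backslash introducing an escape, or at the end of the string.
 */
size_t
name_length(const char *s)
{
	size_t	 len;

	assert(s != NULL);
	for (len = 0; s[len] != '\0'; len++)
		if (s[len] == ' ' || s[len] == '\t' || s[len] == '\\')
			break;
	return len;
}
```

What the caller does with the terminating character (discard it, preserve it, or ignore the whole request) is then a separate decision for each request, as listed above.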
Kurt Jaeger <pi at FreeBSD> made me aware of
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235456#c0
and in that bug report, comment 0 item (3) is a special case
of this class of issues.
Yes, the "mh" manual pages are no doubt among the worst on the planet.
Ingo Schwarze [Thu, 31 Jan 2019 23:00:23 +0000 (23:00 +0000)]
Relax overzealous PATH_INFO validation.
URIs like https://man.openbsd.org/OpenBSD-2.2/cat1/cat.0
are still required to work because they result from apropos searches for
old releases (up to 5.0) which used to install preformatted manual pages.
Regression reported by jj@.
Ingo Schwarze [Thu, 31 Jan 2019 16:31:55 +0000 (16:31 +0000)]
Since resetting of offsets works quite differently in man(7) and mdoc(7),
test table centering in an mdoc(7) document as well.
Related to tbl_term.c rev. 1.67.
Ingo Schwarze [Thu, 31 Jan 2019 16:06:22 +0000 (16:06 +0000)]
Fix tbl(7) centering in mdoc(7) documents.
Since resetting of offsets works quite differently in the mdoc(7)
and man(7) formatters, the tbl(7) formatter needs to save the global
offset on entry and restore it on exit. The additional indentation
needed for table centering has to be added to its own offset variable
and applied to each line of the table, rather than only to the first.
Ingo Schwarze [Fri, 18 Jan 2019 14:36:21 +0000 (14:36 +0000)]
The .UR and .MT blocks in man(7) are represented by <a> elements
which establish phrasing context, but they can contain paragraph
breaks (which is relevant for terminal formatting, so we can't just
change the structure of the syntax tree), which are represented
by <p> elements and cannot occur inside <a>.
Fix this by prematurely closing the <a> element in the HTML formatter.
This means that the clickable text in HTML output is shorter than
what is represented as the link text in terminal output, but in
HTML, it is frankly impossible to have the clickable area of a
hyperlink extend across a paragraph break. The difference in
presentation is not a major problem, and besides, paragraph breaks
inside .UR are rather poor style in the first place.
The implementation is quite tricky. Naively closing out the <a>
prematurely would result in accessing a stale pointer when later
reaching the physical end of the .UR block. So this commit separates
visual and structural closing of "struct tag" stack items. Visual
closing means that the HTML element is closed but the "struct tag"
remains on the stack, to avoid later access to a stale pointer and
to avoid closing the same HTML element a second time later.
This also needs reference counting of pointers to "struct tag" stack
items because often more than one child holds a pointer to the same
parent item, and only the outermost child can safely do the physical
closing.
In the whole corpus of nearly half a million manual pages on
man.openbsd.org, this problem occurs in exactly one page: the
groff(1) version 1.20.1 manual contained in DragonFly-3.8.2, which
contains a formatting error triggering the bug.
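
The separation of visual and structural closing can be sketched as follows. This is a heavily simplified model under assumed names: the struct layout, the functions, and the counter are all hypothetical, and the real "struct tag" stack in the HTML formatter is more involved:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an HTML tag stack item. */
struct tag {
	int	 refcnt;	/* holders of a pointer to this item */
	int	 closed;	/* nonzero once closed visually */
};

static int	 close_tags_written;	/* closing tags emitted so far */

struct tag *
tag_open(int holders)
{
	struct tag	*t;

	assert(holders > 0);
	if ((t = malloc(sizeof(*t))) == NULL)
		abort();
	t->refcnt = holders;
	t->closed = 0;
	return t;
}

/* Visual closing: emit the closing tag at most once, keep the item. */
void
tag_close_visual(struct tag *t)
{
	assert(t != NULL);
	if (t->closed == 0) {
		close_tags_written++;
		t->closed = 1;
	}
}

/*
 * Structural closing: drop one reference.  Only the last holder
 * frees the item.  Returns 1 if the item was freed, 0 otherwise.
 */
int
tag_release(struct tag *t)
{
	tag_close_visual(t);
	if (--t->refcnt > 0)
		return 0;
	free(t);
	return 1;
}
```

With two holders, an early tag_close_visual() followed by two tag_release() calls emits exactly one closing tag and frees the item exactly once, so no holder ever touches a stale pointer.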
Ingo Schwarze [Thu, 17 Jan 2019 08:14:38 +0000 (08:14 +0000)]
Delete several entries that were already fixed.
The two entries about dashes, hyphens, and minus signs are no longer
relevant because we decided on a policy that is now documented.
Ingo Schwarze [Tue, 15 Jan 2019 12:16:18 +0000 (12:16 +0000)]
In PostScript and PDF output, one AFM unit is not nearly enough
inter-word spacing; let's try again with 250 AFM units.
Regression caused during my recent term_flushln() reorg in rev. 1.278,
reported by brynet@ (sorry and many thanks for reporting).
Ingo Schwarze [Fri, 11 Jan 2019 17:04:44 +0000 (17:04 +0000)]
Improve error reporting when a file given on the command line
cannot be opened:
* Mention the filename.
* Report the errno for the file itself, not the one with .gz appended.
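
A minimal sketch of this reporting order, assuming the file is tried first and the .gz variant second. The function name open_or_gz(), the buffer size, and the message format are illustrative assumptions, not mandoc's actual code:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: try to open file, then file.gz.
 * On failure, return -1, store in *err the errno from the first
 * attempt (the file the user asked for), and report it together
 * with the file name.
 */
int
open_or_gz(const char *file, int *err)
{
	char	 gz[1024];
	int	 fd;

	assert(file != NULL && err != NULL);
	if ((fd = open(file, O_RDONLY)) != -1)
		return fd;
	*err = errno;	/* save before the second attempt changes it */

	if ((size_t)snprintf(gz, sizeof(gz), "%s.gz", file) < sizeof(gz) &&
	    (fd = open(gz, O_RDONLY)) != -1)
		return fd;

	fprintf(stderr, "%s: %s\n", file, strerror(*err));
	return -1;
}
```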
Ingo Schwarze [Fri, 11 Jan 2019 12:56:42 +0000 (12:56 +0000)]
Remove the HTML title= attributes which harmed accessibility and
violated the principle of separation of content and presentation.
Instead, implement the tooltips purely in CSS.
Thanks to John Gardner <gardnerjohng at gmail dot com> for
suggesting most of the styling in the new ::before rules.
Ingo Schwarze [Thu, 10 Jan 2019 07:40:10 +0000 (07:40 +0000)]
After years of gnashing of teeth, i finally found a way to avoid
having to write empty list elements for non-compact .Bl -tag lists:
1. Add margin-bottom to the <dd>.
Note that margin-top on the <dt> doesn't work because it would put
a short <dt> lower than the <dd>; margin-bottom on the <dt> doesn't
work because it would put vertical space before the <dd> for a long
<dt>; and margin-top on the <dd> doesn't work because it would put
a short <dt> higher than the <dd>. Only margin-bottom on the <dd>
has none of these adverse effects.
2. Of course, margin-bottom on the <dd> fails to take care of the
vertical spacing before the first list element, so implement that
separately by margin-top on the <dl>.
Ingo Schwarze [Thu, 10 Jan 2019 06:29:00 +0000 (06:29 +0000)]
Initializers for file-scope static variables should be compile-time
constants, and while stderr is a compile-time constant in OpenBSD,
Kelvin Sherlock <ksherlock at gmail dot com> reports that it isn't
on some other systems, for example on FreeBSD or Linux.
So do the initialization by calling mandoc_msg_setoutfile()
from main() instead.
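
The portable pattern looks roughly like this. The names msg_setoutfile() and msg_getoutfile() are hypothetical stand-ins modeled on the function mentioned above:

```c
#include <assert.h>
#include <stdio.h>

/*
 * "static FILE *outfile = stderr;" is not portable because
 * stderr need not be a constant expression.  Initialize with
 * NULL and assign at run time instead.
 */
static FILE	*outfile = NULL;

/* Hypothetical setter, intended to be called early in main(). */
void
msg_setoutfile(FILE *fp)
{
	assert(fp != NULL);
	outfile = fp;
}

/* Hypothetical getter, used here only to inspect the state. */
FILE *
msg_getoutfile(void)
{
	return outfile;
}
```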
Ingo Schwarze [Mon, 7 Jan 2019 07:26:29 +0000 (07:26 +0000)]
Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features)
by the <p> HTML element and use the html_fillmode() mechanism
for .Bd -unfilled, just like it was done for man(7) earlier, finally
getting rid of both the horrible <div class="Pp"></div> hack and
the worst HTML syntax violations caused by nested displays.
Care is needed because in some situations, paragraphs have to remain
open across several subsequent macros, whereas in other situations,
they must get closed together with a block containing them.
Some implementation details include:
* Always close paragraphs before emitting HTML flow content.
* Let html_close_paragraph() also close <pre> for extra safety.
* Drop the old, now unused function print_paragraph().
* Minor adjustments in the top-level man(7) node formatter for symmetry.
* Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
* Bugfix: give up on .Op semantic markup for now, see the comment.
Ingo Schwarze [Sun, 6 Jan 2019 04:55:09 +0000 (04:55 +0000)]
Finally, represent the man(7) .PP and .HP macros by the natural
choice, which is the <p> HTML element. On top of the previous
fill-mode improvements, the key to making this possible is to
automatically close the <p> when required: before headers, subsequent
paragraphs, lists, indented blocks, synopsis blocks, tbl(7) blocks,
and before blocks using no-fill mode.
In man(7) documents, represent the .sp request by a blank line in
no-fill mode and in the same way as .PP in fill mode.
Ingo Schwarze [Sat, 5 Jan 2019 21:55:11 +0000 (21:55 +0000)]
In no-fill mode, avoid bogus blank lines in two situations:
1. After the last child; the parent will take care of the line break.
2. At the .YS macro; the end of the preceding .SY already broke the line.
Ingo Schwarze [Sat, 5 Jan 2019 20:04:50 +0000 (20:04 +0000)]
Slowly start doing more HTML output tests, in this case for the
interaction of .nf and .RS, related to man_macro.c rev. 1.106.
HTML regression testing is tricky because it is extremely prone to
over-testing, i.e. unintentional testing for volatile formatting
details which are irrelevant for deciding whether the HTML output
is good or bad. Minor changes to the formatter - which is still
heavily under development - might result in the necessity to
repeatedly adjust many test cases.
Then again, HTML syntax rules are so complicated that without
regression testing, the risk is simply too high that later changes
will re-introduce issues that were already fixed earlier. Let's
just try to design the tests very carefully in such a way that
the *.out_html files contain nothing that is likely to change, and
defer testing in cases where the HTML output is not yet clean enough
to allow designing tests in such a way.
Ingo Schwarze [Sat, 5 Jan 2019 18:59:46 +0000 (18:59 +0000)]
In HTML output, man(7) .RS blocks get formatted as <div class="Bd-indent">,
and i can see no reasonable alternative: they do indeed represent indented
displays. They certainly require flow context and make no sense in phrasing
context. Consequently, they have to suspend no-fill mode during their head,
in just the same way as other paragraph-type macros do it.
This fixes HTML syntax errors that resulted from .nf followed by .RS.
Ingo Schwarze [Sat, 5 Jan 2019 09:46:34 +0000 (09:46 +0000)]
minor cleanup, no functional change:
* delete one irrelevant FIXME; no more fixed lengths in HTML, please
* simplify some conditions
* avoid testing pointers as truth values, use "!= NULL"
* sort some declarations
* delete some pointless blank lines
Ingo Schwarze [Sat, 5 Jan 2019 09:14:44 +0000 (09:14 +0000)]
Now that the NODE_NOFILL flag in the syntax tree is accurate,
use it in the man(7) HTML formatter rather than keeping fill mode
state locally, resulting in massive simplification (minus 40 LOC).
Move the html_fillmode() state handler function to the html.c module
such that both the man(7) and the roff(7) formatter (and in the future,
also the mdoc(7) formatter) can use it. Give it a query mode, to be
invoked with TOKEN_NONE.
Ingo Schwarze [Sat, 5 Jan 2019 01:29:32 +0000 (01:29 +0000)]
minor cleanup, no functional change:
* in node type switches, explicitly handle all types, sort them,
and abort() on those that cannot occur
* avoid testing pointers as truth values, use "!= NULL"
* avoid testing "constant == variable", use "variable == constant"
* prefer sizeof(var) over sizeof(type)
* delete one duplicate function
* sort some declarations
* delete some useless blank lines
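
For illustration, three of these conventions applied to a made-up example (struct item and both functions are hypothetical and unrelated to the mandoc sources):

```c
#include <assert.h>
#include <stdlib.h>

struct item {			/* hypothetical example type */
	const char	*name;
	int		 count;
};

struct item *
item_alloc(const char *name)
{
	struct item	*it;

	/* sizeof(var): stays correct if the type of "it" changes */
	it = calloc(1, sizeof(*it));
	if (it == NULL)			/* rather than "if (!it)" */
		return NULL;
	if (name != NULL)		/* rather than "if (name)" */
		it->name = name;
	return it;
}

int
item_is_empty(const struct item *it)
{
	assert(it != NULL);
	return it->count == 0;		/* rather than "0 == it->count" */
}
```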
Ingo Schwarze [Sat, 5 Jan 2019 00:36:50 +0000 (00:36 +0000)]
Some high-level block macros have an effect similar to temporarily
suspending no-fill mode during their head. Model this with an
additional roff parser state flag ROFF_NONOFILL. That is much
simpler than it would be to save and restore the ROFF_NOFILL flag
itself, in particular since the latter can be switched (with lasting
effect) by the .nf and .fi requests even while its effect is
temporarily suspended.
This commit does not change formatting yet, but prepares for future
formatting simplifications and improvements.
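
A sketch of how the two flags might combine. The bit values and the function nofill_active() are assumptions made for illustration; only the flag names come from the text above:

```c
#include <assert.h>

/* Hypothetical bit values; only the flag names are from the text. */
#define ROFF_NOFILL	(1 << 0)	/* no-fill mode was requested */
#define ROFF_NONOFILL	(1 << 1)	/* no-fill temporarily suspended */

/* No-fill mode is in effect only if requested and not suspended. */
int
nofill_active(int flags)
{
	/* This sketch knows no other flag bits. */
	assert((flags & ~(ROFF_NOFILL | ROFF_NONOFILL)) == 0);
	return (flags & ROFF_NOFILL) != 0 &&
	    (flags & ROFF_NONOFILL) == 0;
}
```

Because the suspension lives in its own bit, a .fi or .nf request seen while ROFF_NONOFILL is set can still switch ROFF_NOFILL with lasting effect, and nothing needs to be saved or restored.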
Ingo Schwarze [Fri, 4 Jan 2019 04:04:14 +0000 (04:04 +0000)]
Test interaction of low-level roff(7) filling requests with .Bd in general
and filling in .Bd -centered in particular; related to mdoc_term.c rev. 1.372.
Ingo Schwarze [Fri, 4 Jan 2019 03:39:01 +0000 (03:39 +0000)]
Two functional improvements to filling in terminal output.
1. Fully support no-fill mode in mdoc(7), even when invoked with
low-level roff(7) .nf requests. As a side effect, this substantially
simplifies the implementation of .Bd -unfilled and .Bd -literal.
2. Let .Bd -centered fill its text, using the new TERMP_CENTER flag.
That finally fixes the long-standing bug that it used to operate in
no-fill mode, which was known to be wrong for at least five years.
This also simplifies the implementation of .Bd -centered considerably.
Ingo Schwarze [Fri, 4 Jan 2019 03:21:02 +0000 (03:21 +0000)]
Implement centering and adjustment to the right margin directly in
the terminal filling routine, controlled by new flags TERMP_CENTER
and TERMP_RIGHT.
This became possible with the recent term_flushln() rewrite.
No functional change yet, but to be used by upcoming commits.
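
Once the display width of a complete output line is known, the shift to the right is simple arithmetic. The following sketch assumes hypothetical bit values and a hypothetical function line_padding(); it also assumes that at most one of the two flags is set:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values; only the flag names are from the text. */
#define TERMP_CENTER	(1 << 0)
#define TERMP_RIGHT	(1 << 1)

/*
 * Return the number of blanks to print before a text of the
 * given display width on an output line of the given width.
 */
size_t
line_padding(size_t linewidth, size_t textwidth, int flags)
{
	size_t	 slack;

	/* Assumption of this sketch: the flags exclude each other. */
	assert(!((flags & TERMP_CENTER) && (flags & TERMP_RIGHT)));
	if (textwidth >= linewidth)
		return 0;
	slack = linewidth - textwidth;
	if (flags & TERMP_CENTER)
		return slack / 2;
	if (flags & TERMP_RIGHT)
		return slack;
	return 0;
}
```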
Ingo Schwarze [Fri, 4 Jan 2019 03:17:36 +0000 (03:17 +0000)]
Oops, i forgot to adjust this file to the changes in roff.h rev. 1.67.
Provide a handler for the new .nf and .fi roff(7) request nodes,
avoiding a potential crash, and correctly restore the former fill
mode at .Ed even when there was .nf or .fi inside the block.
Ingo Schwarze [Thu, 3 Jan 2019 19:59:55 +0000 (19:59 +0000)]
Rewrite the line filling function for terminal output yet again.
This function has always been among the most complicated parts of
mandoc, and it repeatedly needed substantial functional enhancements.
The present rewrite is required to prepare for the implementation
of simultaneous filling and centering of output lines.
The previous implementation looked at each word in turn and printed
it to the output stream as soon as it was found to still fit on the
current output line. Obviously, that approach neither allows
centering nor adjustment to the right margin.
The new implementation first decides which part of the paragraph
to put onto the current output line, also measuring the display
width of that part, even if that part consists of multiple words
including intervening whitespace. This will allow moving the whole
output line to the right as desired before printing it, for example
to center it or to adjust it to the right margin.
The function is split into three parts, each much shorter, solving a
better defined task, much easier to understand and better commented:
1. the steering function term_flushln() looping over output lines;
2. the calculation function term_fill() looping over input characters;
3. and the output function term_field() looping over printed characters.
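
The measuring step can be illustrated with a much simplified sketch. It is not the real term_fill(): the name fill_measure() is hypothetical, and it assumes that every byte occupies one display column and that blanks are the only word separators:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: find the longest prefix of whole words
 * of text that fits into maxwidth columns.  Return the number
 * of bytes to print (without trailing blanks) and store in
 * *next the offset where the following output line starts.
 */
size_t
fill_measure(const char *text, size_t maxwidth, size_t *next)
{
	size_t	 pos;	/* scanning position */
	size_t	 end;	/* end of the last word that fits */

	assert(text != NULL && next != NULL);
	end = 0;
	pos = 0;
	for (;;) {
		/* Skip the blanks before the next word. */
		while (text[pos] == ' ')
			pos++;
		if (text[pos] == '\0')
			break;
		/* Find the end of that word. */
		while (text[pos] != ' ' && text[pos] != '\0')
			pos++;
		/* Stop if it does not fit, unless it is the first. */
		if (pos > maxwidth && end > 0)
			break;
		end = pos;
	}
	/* The next output line starts at the next non-blank. */
	pos = end;
	while (text[pos] == ' ')
		pos++;
	*next = pos;
	return end;
}
```

Since the width of the whole line is known before anything is printed, the caller is free to add padding in front of it, which the former word-by-word approach could not do.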
Ingo Schwarze [Tue, 1 Jan 2019 08:18:11 +0000 (08:18 +0000)]
Support taking the -O tag value from apropos(1) key=value search terms;
feature improvement suggested by kn@.
While here, also make "-O value" work from standard input.
OK kn@
Ingo Schwarze [Tue, 1 Jan 2019 07:42:04 +0000 (07:42 +0000)]
Correctly set the ROFF_NOFILL parser flag for .Bd .Ed .Sh, such
that children and later siblings get correct NODE_NOFILL assignments.
This doesn't change rendering yet but prepares for future rendering
improvements.
Ingo Schwarze [Tue, 1 Jan 2019 03:45:29 +0000 (03:45 +0000)]
Now that .nf and .fi are implemented in the roff(7) parser and formatters
rather than in the man(7) parser and formatters, document them in the
roff(7) manual, where they belong, rather than in the man(7) manual.
Mention that they imply an output line break, and mention which macros
imply these requests.
Ingo Schwarze [Mon, 31 Dec 2018 11:01:37 +0000 (11:01 +0000)]
Cleanup, minus 25 LOC, no functional change:
Delete the complicated mechanism keeping fill mode state locally in
the man(7) HTML formatter. Instead, use the state stored in the nodes.
Ingo Schwarze [Mon, 31 Dec 2018 10:35:56 +0000 (10:35 +0000)]
Cleanup, no functional change:
Stop trying to keep fill mode state locally in the mdoc HTML formatter,
rely on the state stored in the nodes instead.
Note that the .Bd -literal code is buggy. Nested literal displays
result in nested <pre> elements, which violates HTML syntax.
But i'm not yet fixing bugs in this commit, i'm merely deleting
code which has no effect.
Ingo Schwarze [Mon, 31 Dec 2018 10:04:39 +0000 (10:04 +0000)]
Cleanup, no functional change:
Since the man(7) and roff(7) validators no longer use the parser
state flag ROFF_NOFILL, we can finally get rid of the function
man_state(), resulting in a better separation of parsing and validation.
Ingo Schwarze [Mon, 31 Dec 2018 08:38:21 +0000 (08:38 +0000)]
Use the new flag NODE_NOFILL in the validators, which is sometimes
simpler and always more robust. In particular, move the nesting
warnings for .EX and .EE from man_state(), where they were misplaced,
to the man(7) validator.
Ingo Schwarze [Mon, 31 Dec 2018 08:18:12 +0000 (08:18 +0000)]
Store the fill mode with a new flag NODE_NOFILL in every node,
like it is already done with NODE_SYNPRETTY, such that the fill
mode becomes more directly available to the formatters.
Not used yet, but will be used by upcoming commits.
Ingo Schwarze [Mon, 31 Dec 2018 08:03:46 +0000 (08:03 +0000)]
For .EX and .EE, set the fill mode parser state directly in the
macro parsing function, in the same way as the roff parser already
does it for the .nf and .fi requests. This is a preparation for
getting rid of the ugly function man_state() later on.
Ingo Schwarze [Mon, 31 Dec 2018 07:46:07 +0000 (07:46 +0000)]
Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too,
instead of the old MDOC_LITERAL, which was an alias for the
former MAN_LITERAL.
Ingo Schwarze [Mon, 31 Dec 2018 07:08:12 +0000 (07:08 +0000)]
Move parsing of the .nf and .fi (fill mode) requests from the man(7)
parser to the roff(7) parser. As a side effect, .nf and .fi are
now also parsed in mdoc(7) input, though the mdoc(7) formatters
still ignore most of their effect.
Ingo Schwarze [Mon, 31 Dec 2018 04:55:46 +0000 (04:55 +0000)]
Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.
Ingo Schwarze [Sun, 30 Dec 2018 00:49:54 +0000 (00:49 +0000)]
Cleanup, no functional change:
The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.
Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.
This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.
Ingo Schwarze [Fri, 28 Dec 2018 00:15:11 +0000 (00:15 +0000)]
add some notes about using col(1) and ul(1) to process the ascii markup
since these may not be commonly known utilities;
idea from and joint work with tedu@
Ingo Schwarze [Sun, 23 Dec 2018 22:03:32 +0000 (22:03 +0000)]
Finally, stop abusing .Ss and .Sx to mark up macros, use .Ic instead
since these are clearly commands in a domain-specific language. As
a nice side effect, the resulting list allows including the synopsis
for each macro in the item head, reducing some repetitive verbiage.
Ingo Schwarze [Sun, 23 Dec 2018 16:55:34 +0000 (16:55 +0000)]
Simplify and clarify instructions for .Ql, and deprecate .Li.
The macros .Ql, .Dl, and .Bd -literal leave no room for any
valid use case for .Li whatsoever.
General direction discussed with jmc@.
Ingo Schwarze [Fri, 21 Dec 2018 17:15:18 +0000 (17:15 +0000)]
Rename mandoc_getarg() to roff_getarg() and pass it the roff parser
struct as an argument such that after copy-in, it can call roff_expand()
once again, which used to be called roff_res() before this. This
fixes a subtle low-level roff(7) parsing bug reported by Fabio
Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7)
manual page, because that page used an escaped escape sequence in
a macro argument.
To expand escaped escape sequences in quoted mdoc(7) arguments, too,
stop bypassing the call to roff_getarg() in mdoc_argv.c, function args()
for this case. This does not solve the case of escaped escape sequences
in quoted .Bl -column phrases yet.
Because roff_expand() can make the string longer, roff_getarg() can no
longer operate in-place but needs to malloc(3) the returned string.
In the high-level parsers, free(3) that string after processing it.
Ingo Schwarze [Thu, 20 Dec 2018 21:30:32 +0000 (21:30 +0000)]
Move the full responsibility for reporting open(2) errors from
mparse_open() to the caller. That is better because only the caller
knows its preferred reporting method and format and only the caller
has access to all the data that should be included - like the column
number in .so processing or the current manpath in makewhatis(8).
Moving the mandoc_msg() call out is possible because the caller can
call strerror(3) just as easily as mparse_open() can.
Move mandoc_msg_setinfilename() closer to the parsing of the file
contents, to avoid problems *with* the file (like non-existence,
lack of permissions, etc.) getting misreported as problems *in*
the file.
Fix the column number reported for .so failure:
let it point to the beginning of the filename.
Taken together, this prevents makewhatis(8) from spewing confusing
messages about .so failures to stderr, a bug reported by
Raf Czlonka <rczlonka at gmail dot com> on ports@.
It also prevents mandoc(1) from issuing *two* messages for every
single .so failure.
Ingo Schwarze [Thu, 20 Dec 2018 18:24:12 +0000 (18:24 +0000)]
Explain what the fields in mandoc messages mean,
rather than merely specifying the message syntax.
Gap in documentation found while looking at a bug
report from Raf Czlonka <rczlonka at gmail dot com>.
Ingo Schwarze [Thu, 20 Dec 2018 03:41:54 +0000 (03:41 +0000)]
Bugfix:
When after a \\, \t, or \a, another \t or \a had to be resolved
in copy mode within the same argument, the argument got corrupted.
Found while working on a loosely related bug report
from Fabio Scotoni <fabio at esse dot ch>.
Ingo Schwarze [Tue, 18 Dec 2018 22:00:02 +0000 (22:00 +0000)]
As a first step towards making roff_res() callable from mandoc_getarg(),
move the function mandoc_getarg() from mandoc.c to roff.c. It was
misplaced in mandoc.c in the first place; that file is intended for
utilities needed both by parsers and by formatters, while reading
macro arguments in copy mode is purely a task of the roff(7) parser.
Needed as a preliminary for an upcoming bugfix.
No code change.
Ingo Schwarze [Sun, 16 Dec 2018 02:21:00 +0000 (02:21 +0000)]
The .HP macro was deprecated by groff, and that makes sense
because it serves no real purpose and works poorly with HTML.
While here, describe the section argument of .TH,
clarify the syntax display of .TP, and polish some wordings.
Ingo Schwarze [Sun, 16 Dec 2018 00:17:02 +0000 (00:17 +0000)]
Yet another round of improvements to manual font selection.
Unify handling of \f and .ft.
Support \f4 (bold+italic).
Support ".ft BI" and ".ft CW" for terminal output.
Support the .ft request in HTML output.
Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP.
In regress.pl, only strip leading whitespace in math mode.
Ingo Schwarze [Sat, 15 Dec 2018 19:30:25 +0000 (19:30 +0000)]
Several improvements to escape sequence handling.
* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
Ingo Schwarze [Fri, 14 Dec 2018 06:33:14 +0000 (06:33 +0000)]
Cleanup, no functional change:
Now that message handling is properly encapsulated,
remove struct mparse pointers from four structs (roff, roff_man,
tbl_node, eqn_node) and from the argument lists of five functions
(roff_alloc, roff_man_alloc, mandoc_getarg, tbl_alloc, eqn_alloc).
Except for being passed to the main program as an opaque object,
it now only occurs in read.c, as it should, and not across 15 files
like in the past.
Ingo Schwarze [Fri, 14 Dec 2018 05:18:02 +0000 (05:18 +0000)]
Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.
Ingo Schwarze [Fri, 14 Dec 2018 02:16:21 +0000 (02:16 +0000)]
Fold mparse_parse_buffer() into mparse_readfd(), making the code
considerably more readable. This is possible now that i finally
deleted mparse_readmem() from mandoc portable - an unused function
that never existed in OpenBSD.
This cleanup already made me find a minor bug: after a recursive
parse, restoring the line number of the parent file was forgotten.
This is fixed now.
Ingo Schwarze [Fri, 14 Dec 2018 01:24:49 +0000 (01:24 +0000)]
Delete the function mparse_readmem() that has been unused for almost a
decade but regularly makes maintenance harder. Mandoc is not a
general-purpose library, and being as pluggable as possible is not
among the goals of the project.
Ingo Schwarze [Fri, 14 Dec 2018 01:18:25 +0000 (01:18 +0000)]
Major cleanup; may imply minor changes in edge cases of error reporting.
Finally, drop support for the run-time configurable mandocmsg()
callback. It was over-engineered from the start, never used for
anything in a decade, and repeatedly caused maintenance headaches.
Consolidate reporting infrastructure into two files, mandoc.h and
mandoc_msg.c, mopping up the bits and pieces that were scattered
around main.c, read.c, mandoc_parse.h, libmandoc.h, the prototypes
of four parsing-related functions, and both parser structs.
Ingo Schwarze [Thu, 13 Dec 2018 11:55:46 +0000 (11:55 +0000)]
Cleanup, no functional change:
Split the top level parser interface out of the utility header
mandoc.h, into a new header mandoc_parse.h, for use in the main
program and in the main parser only.
Move enum mandoc_os into roff.h because struct roff_man is the
place where it is stored.
This allows removal of mandoc.h from seven files in low-level
parsers and in formatters.