Ingo Schwarze [Sat, 11 Feb 2017 14:11:17 +0000 (14:11 +0000)]
Do not prematurely close .Nd containing a broken child.
Fixes tree corruption leading to NULL dereference
in insane cases like .Oo Oo .Nd .Pq Oc .Oc Oc
found by tb@ with afl(1).
Ingo Schwarze [Sat, 11 Feb 2017 13:24:12 +0000 (13:24 +0000)]
Do not prematurely mark intermediate blocks as broken while scanning
backwards. Only do so when a block is found that is actually broken.
Logic error found while investigating crashes reported by tb@.
Ingo Schwarze [Fri, 10 Feb 2017 22:19:18 +0000 (22:19 +0000)]
For child macros of block-end macros, scan backwards for pending
breakers only if the parent of the block is not already closed. While
the scanning is needed in cases like ".Ac Bo" for broken Ao, it is
useless and crashy in cases like ".Ac Bc" for non-broken Ao.
This fixes a NULL pointer dereference that tb@ found with afl(1).
Ingo Schwarze [Fri, 10 Feb 2017 16:20:34 +0000 (16:20 +0000)]
In the SYNOPSIS, .Nm blocks can get broken if one of their children
gets broken. In that case, mark them as BROKEN and ENDED and make
sure they get closed out together with the child.
Fixes tree corruption leading to a NULL dereference found by tb@
with afl(1) in: .Sh SYNOPSIS .Bl .Oo .Nm .Bk .Oc .It (where .Bk is
the child and .Oo is the breaker).
A simpler form of the same corruption (without crash) is visible in:
.Sh SYNOPSIS .Ao .Nm .Bo .Ac .Bc text
where the text ended up inside the .Nm (child .Bo, breaker .Ao).
Ingo Schwarze [Thu, 9 Feb 2017 20:53:33 +0000 (20:53 +0000)]
same as mandocdb.c rev. 1.196:
for portability, use (char *)NULL in execlp(3) as discussed on tech@
OpenBSD (didn't blow up anywhere yet, but better safe than sorry)
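As an illustration of the portability point, here is a minimal sketch, not taken from mandoc itself. The helper name run_true() is hypothetical, and the sketch assumes a true(1) utility is available in PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from mandoc: run true(1) via execlp(3)
 * and return its exit status, or -1 on failure.
 */
int
run_true(void)
{
	pid_t	 pid;
	int	 status;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		/*
		 * The list terminator must be a null pointer of type
		 * char *.  A bare NULL may expand to a plain integer 0,
		 * which can have the wrong type and size when passed
		 * through a variadic argument list.
		 */
		execlp("true", "true", (char *)NULL);
		_exit(127);	/* only reached if execlp(3) failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```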
Ingo Schwarze [Thu, 9 Feb 2017 18:46:44 +0000 (18:46 +0000)]
Illumos doesn't have O_DIRECTORY. Work around that for now, may
fix it better after the 1.14.1 release. Portability issue reported
by Sevan Janiyan <venture37 at geeklan dot co dot uk>.
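One common way to cope with a missing O_DIRECTORY is sketched below. This is an assumption for illustration and not necessarily what the commit does. The helper name open_dir() is hypothetical.

```c
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Assumption: where O_DIRECTORY is missing, define it as a no-op. */
#ifndef O_DIRECTORY
#define O_DIRECTORY 0
#endif

/*
 * Hypothetical helper: open a directory read-only.
 * Returns a file descriptor, or -1 with errno set.
 */
int
open_dir(const char *path)
{
	struct stat	 sb;
	int		 fd;

	if ((fd = open(path, O_RDONLY | O_DIRECTORY)) == -1)
		return -1;
	/* Without kernel support, verify the file type by hand. */
	if (fstat(fd, &sb) == -1) {
		close(fd);
		return -1;
	}
	if (!S_ISDIR(sb.st_mode)) {
		close(fd);
		errno = ENOTDIR;
		return -1;
	}
	return fd;
}
```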
Ingo Schwarze [Mon, 6 Feb 2017 03:44:58 +0000 (03:44 +0000)]
The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.
While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.
Ingo Schwarze [Sun, 5 Feb 2017 18:15:39 +0000 (18:15 +0000)]
Improve <table> syntax:
The <col> element can only appear inside <colgroup>, so use <colgroup>.
The <tbody> element is optional and useless, so don't use it.
Even if we ever needed <thead> or <tfoot>, <tbody> would still be
optional and useless; besides, we will likely never need <thead> or <tfoot>,
simply because our languages don't support such functionality.
Ingo Schwarze [Sat, 4 Feb 2017 11:58:09 +0000 (11:58 +0000)]
Do not fix the default indent for all subsequent files; some may use
a different macro language and hence require a different indent.
You can see the effect with "man -a 1 host hostname".
Ingo Schwarze [Fri, 3 Feb 2017 18:18:23 +0000 (18:18 +0000)]
Minor cleanup, no functional change:
We always have a roff parser, so mparse_free() does not need to check
for existence before freeing it.
Also arrange code in struct mparse, mparse_reset(), and mparse_free()
in the same order for readability.
Ingo Schwarze [Fri, 3 Feb 2017 17:56:59 +0000 (17:56 +0000)]
If an application parses multiple files with mparse_readfd(3) but
without using mparse_open(3) to open the files, and if one of the
files includes a gzip'ed file with .so, then the gzip flag remains
set and the next main file will be expected to be gzip'ed.
Fix this by clearing the gzip flag in mparse_reset(3).
Bug found and patch provided by Michael <Stapelberg at debian dot org>.
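The shape of the fix can be sketched with a toy structure. The names struct toy_parse and toy_reset() are hypothetical stand-ins and do not reproduce the real struct mparse or mparse_reset(3).

```c
/* Hypothetical, simplified stand-in for the parser state. */
struct toy_parse {
	int	 gzip;	/* nonzero while reading compressed input */
	int	 line;	/* current input line number */
};

/*
 * Reset all per-file state before parsing the next main file.
 * Leaving gzip set here is the kind of bug described above:
 * the next file would wrongly be treated as compressed.
 */
void
toy_reset(struct toy_parse *p)
{
	p->line = 0;
	p->gzip = 0;
}
```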
Ingo Schwarze [Mon, 30 Jan 2017 20:24:02 +0000 (20:24 +0000)]
Rework fill mode handling for -man -Thtml.
Basically, open <pre> whenever printing text in no-fill mode and it is
not already open, and close it whenever printing something that cannot
be inside <pre>.
This fixes a crash reported by Michael <Stapelberg at debian dot org>
in the French Linux chroot(2) manual and also improves rendering
for OpenBSD pages like DPMSGetTimeouts(3) and GLwDrawingArea(3).
These changes also permitted retiring struct mhtml.
Ingo Schwarze [Sat, 28 Jan 2017 23:30:08 +0000 (23:30 +0000)]
Add a warning "new sentence, new line".
This does not attempt to pinpoint each and every offender, but
instead tries very hard to avoid false positives: Currently, there
are only two false positives in the whole OpenBSD base system.
Only do this in mdoc(7), not in man(7), because manuals written
in man(7) typically have much worse problems than this.
OK jmc@ on a previous version of the patch
Ingo Schwarze [Sat, 28 Jan 2017 18:43:00 +0000 (18:43 +0000)]
.Bl -column with zero columns is legal, so don't segfault on it.
Bug introduced in rev. 1.248 triggered for example in gssapi(3),
analyzed and reported by Michael <Stapelberg at debian dot org>.
Simplify the code a bit more while here.
Ingo Schwarze [Thu, 26 Jan 2017 18:28:18 +0000 (18:28 +0000)]
Fix -man -Thtml formatting after .nf (which has nothing to do
with "literal", by the way, it means "no fill"):
* Use <pre> such that whitespace is preserved.
* Preserve line breaks.
* For font alternating macros, avoid node recursion which required
scary juggling with the fill state. Instead, simply print the text
children directly.
Missing feature first noticed by kristaps@ in 2011,
then again reported by afresh1@ in 2016,
and finally reported here: https://github.com/Debian/debiman/issues/21 ,
which i only found because of Shane Kerr's comment here:
https://plus.google.com/110314300533310775053/posts/H1eaw9Yskoc
Ingo Schwarze [Wed, 25 Jan 2017 02:14:43 +0000 (02:14 +0000)]
Improve HTML formatting of .Bl -tag.
In particular, when using the style sheet, put the body on the same
line as the head for short heads, or on the next line for long
heads, in a way that preserves both correct indentation and correct
vertical spacing with and without -compact, and with one or more
heads per body (hi, Zaphod) - eight use cases so far - and with and
without -tag, and with and without -offset, 32 use cases grand total.
Using many ideas from zhuk@, from <David dot Dahlberg at fkie dot
fraunhofer dot de>, and from Benny Lofgren <bl dash lists at lofgren
dot biz>, and a few of my own.
This is an excellent demonstration that CSS is an extremely hostile
language, much more trapful and much harder to use than, say, C.
When matthew@ reported this in July 2014 (!), it was already a known
issue, and i no longer remember for how long. My first serious
attempt at fixing it (in November 2015) failed miserably. I'd love
to see simplifications of both the generated HTML code and of the
style sheet, but without breaking any of the 32 use cases, please.
Ingo Schwarze [Thu, 19 Jan 2017 01:00:14 +0000 (01:00 +0000)]
Implement line breaking of the generated HTML code at space characters
in filled text. This does not affect HTML semantics, but makes the
HTML code even more human-readable.
While here,
- collapse multiple consecutive space characters in filled text
- and insert a blank between style entries.
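Collapsing consecutive spaces can be sketched as follows. This is a stand-alone illustration with a hypothetical helper, collapse_spaces(), and not the code from the HTML formatter.

```c
#include <stddef.h>

/*
 * Hypothetical helper: collapse each run of space characters
 * into a single space, in place.  Returns the new length.
 */
size_t
collapse_spaces(char *s)
{
	char	*src, *dst;

	for (src = dst = s; *src != '\0'; src++) {
		/* Skip a space if the previous kept character is a space. */
		if (*src == ' ' && dst > s && dst[-1] == ' ')
			continue;
		*dst++ = *src;
	}
	*dst = '\0';
	return (size_t)(dst - s);
}
```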
Ingo Schwarze [Wed, 18 Jan 2017 19:22:21 +0000 (19:22 +0000)]
Make HTML output more human readable by overhauling line break logic
around tags and by introducing some simple indentation.
No change of HTML semantics intended.
Ingo Schwarze [Tue, 17 Jan 2017 15:32:43 +0000 (15:32 +0000)]
Completely delete the buf field of struct html and all the buf*()
interfaces. Such a static buffer was a bad idea in the first place,
causing unfixable truncation that was only prevented by triggering
an assertion failure. Instead, let the small number of remaining
users allocate and free their own, temporary dynamic buffers,
or for the case of .Xr and .In, pass the original data to be
assembled in print_otag().
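The general pattern of a temporary dynamic buffer, as opposed to a fixed-size static one, might look like the sketch below. The helper make_label() is hypothetical and does not correspond to any function in the mandoc sources.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build "name(sec)" in a freshly allocated
 * buffer of exactly the required size, so nothing is truncated.
 * The caller must free(3) the result.  Returns NULL on failure.
 */
char *
make_label(const char *name, const char *sec)
{
	char	*buf;
	int	 len;

	/* First pass: measure the length without writing anything. */
	len = snprintf(NULL, 0, "%s(%s)", name, sec);
	if (len < 0)
		return NULL;
	if ((buf = malloc((size_t)len + 1)) == NULL)
		return NULL;
	/* Second pass: format into the buffer of the measured size. */
	snprintf(buf, (size_t)len + 1, "%s(%s)", name, sec);
	return buf;
}
```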
Ingo Schwarze [Sun, 15 Jan 2017 15:28:55 +0000 (15:28 +0000)]
When looking up macro values while the macro tables are being built
in makewhatis(8), use ohash rather than linear searches.
This was identified as the main makewhatis(8) performance bottleneck
by Baptiste Daroussin <bapt at FreeBSD>, who also suggested part
of the improved algorithm.
This reduces the run time of "makewhatis /usr/share/man" from eleven
to five seconds on my notebook. Note that the changed code is not
used in apropos(1), so don't expect speedups there.
While here, sort macro values asciibetically, to improve reproducibility -
which still isn't perfect, but getting better.
Ingo Schwarze [Thu, 12 Jan 2017 18:02:20 +0000 (18:02 +0000)]
Skipping all escape sequences at the beginning of strings in deroff()
was too aggressive. There are strings that legitimately begin with
an escape sequence. Only skip leading escape sequences representing
whitespace.
Ingo Schwarze [Thu, 12 Jan 2017 15:45:05 +0000 (15:45 +0000)]
Put compiler arguments that may contain -l at the end; according to
the people at Alpine Linux, gcc 6 seems to fail when it's at the
beginning. From Daniel Sabogal via http://git.alpinelinux.org.
Ingo Schwarze [Wed, 11 Jan 2017 17:39:53 +0000 (17:39 +0000)]
Do text production for .Bt, .Ex, .Rv, .Ud at the validation stage
rather than in the formatters. Use NODE_NOSRC flag for .Lb and
NODE_NOSRC and NODE_NOPRT for .St. Results in a more rigorous
syntax tree and in 135 lines less code.
This work was triggered by a question from Abhinav Upadhyay <er dot
abhinav dot upadhyay at gmail dot com> (NetBSD) on discuss@.
Ingo Schwarze [Tue, 10 Jan 2017 21:59:47 +0000 (21:59 +0000)]
For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.
Ingo Schwarze [Tue, 10 Jan 2017 12:53:07 +0000 (12:53 +0000)]
Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.
These will help to make handling of text production macros more rigorous.
Ingo Schwarze [Mon, 9 Jan 2017 17:49:57 +0000 (17:49 +0000)]
Use stdout rather than stdin for controlling the terminal
such that "cat foo.mdoc | man -l" works.
Issue reported by Christian Neukirchen <chneukirchen at gmail dot com>
and also tested by him on Void Linux with both glibc and musl.
The patch makes sense to millert@.
Ingo Schwarze [Mon, 9 Jan 2017 12:48:58 +0000 (12:48 +0000)]
The .No macro is not supposed to produce fixed-width font, it is not
the same as .Li, so don't use <code>.
Bug reported by <Anton dot Lindqvist at gmail dot com> on tech@.
Ingo Schwarze [Mon, 9 Jan 2017 01:37:03 +0000 (01:37 +0000)]
Warnings and errors that occur during mdoc_validate()
or during man_validate() have to affect the mandoc(1) EXIT STATUS.
Many thanks to <Yuri dot Pankov at gmail dot com> (illumos developer)
for reporting this regression.
Ingo Schwarze [Sun, 8 Jan 2017 22:51:55 +0000 (22:51 +0000)]
Indentation must be measured in units of the surrounding text,
not in units of the contained text. Consequently, "display"
and "lit" class tags must not be on the same element: First,
"display" must set up the indentation, still using the outer
units, and only after that, "lit" may change the font.
This fixes .Bd -literal which got the wrong indentation.
Bug reported by tb@.
Ingo Schwarze [Sun, 8 Jan 2017 02:01:17 +0000 (02:01 +0000)]
Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though i have been thinking about it for years,
i still can't suggest anything better.
The false positives are annoying.
Ingo Schwarze [Sun, 8 Jan 2017 00:11:23 +0000 (00:11 +0000)]
Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd