Ingo Schwarze [Wed, 22 Feb 2017 08:54:41 +0000 (08:54 +0000)]
Handle an odd edge case where .It is preceded by .Sm.
NULL dereference in man.cgi reported by Gabriel Guzman <gabe at
guzman dash nunez dot com> on misc@.
Ingo Schwarze [Fri, 17 Feb 2017 19:15:41 +0000 (19:15 +0000)]
Use typographic quotes rather than '"' for .Rs %T (no change for -Tascii
output, of course). Patch from bentley@ in November 2014. This can be
committed now because groff merged Anthony's patch yesterday.
Ingo Schwarze [Fri, 17 Feb 2017 18:28:06 +0000 (18:28 +0000)]
Many people have been complaining for a long time that ``...'' looks
ugly in -Tascii output. For that reason, bentley@ submitted patches
to render "..." instead to groff in November 2014 (yes, more than
two years ago). Carsten Kunze yesterday merged them for the upcoming
groff-1.22.4 release. Yay!
Consequently, do the same in mandoc: Render \(Lq and \(Rq (which
are used for .Do, .Dq, .Lb, and .St) as '"' in -Tascii output.
All other output modes including -Tutf8 remain unchanged.
Ingo Schwarze [Fri, 17 Feb 2017 14:40:28 +0000 (14:40 +0000)]
Make the directory explicit where source files are located.
This is simple and seems to help the NetBSD build infrastructure.
From Christos Zoulas <christos at NetBSD>.
Ingo Schwarze [Fri, 17 Feb 2017 03:03:03 +0000 (03:03 +0000)]
Fix a read buffer overrun that copied random data from memory into
text nodes when a string passed to deroff() ended in a backslash
and the byte after the terminating NUL was non-NUL, found by tb@
with afl(1).
Invalid bytes so copied with the high bit set could later sometimes
trigger another out of bounds read access to static memory in
roff_strdup(), so add an assertion there to abort safely in case
of similar data corruption.
Ingo Schwarze [Thu, 16 Feb 2017 14:38:12 +0000 (14:38 +0000)]
Surprisingly, groff does not support scaling units in .Bl -column
column width specifiers, so stop supporting them, too.
As a side effect, this fixes an assertion failure that tb@ found
with afl(1), triggered by: .Bl -column -4n
Ingo Schwarze [Thu, 16 Feb 2017 10:56:07 +0000 (10:56 +0000)]
Fix rev. 1.280: -O syntax is different in default apropos(1) output
mode and in other output modes, so do not error out prematurely.
Also sort local variables in main() while here.
Ingo Schwarze [Thu, 16 Feb 2017 09:47:31 +0000 (09:47 +0000)]
Fix block scoping error if an explicit block is broken by two
implicit blocks (.Aq Bq Po .Pc) that left the outer breaker open
and could in exceptional cases, like between .Bl and .It, cause
tree corruption leading to NULL dereference.
Found by tb@ with afl(1).
While here, do not mark intermediate ENDBODY markers as broken.
Ingo Schwarze [Wed, 15 Feb 2017 15:58:46 +0000 (15:58 +0000)]
Style improvement, no functional change.
As reported by Yuri Pankov, some versions of GCC whine that "tmp"
might be used uninitialized in fts_open(3). Clearly, that cannot
actually happen, but explicitly setting it to NULL is safer anyway.
While here, rename the badly named variable "tmp" and make the
inner "if" easier to understand.
Ingo Schwarze [Wed, 15 Feb 2017 14:10:08 +0000 (14:10 +0000)]
Fix previous: I forgot that i had to change the convention how
a node is marked as "not a macro" when unifying the parsers.
Confirmed to work by Sevan Janiyan.
Ingo Schwarze [Sat, 11 Feb 2017 21:49:50 +0000 (21:49 +0000)]
Do not read one element past the end of the static const termacts array.
Bug found by Sevan Janiyan <venture37 at geeklan dot co dot uk>
who ran the OpenBSD mandoc test suite on Ubuntu on POWER8 (sic!)
and reported that mdoc/Sh/before.in failed in -Tman mode.
If that isn't power testing, i don't know...
Ingo Schwarze [Sat, 11 Feb 2017 17:53:33 +0000 (17:53 +0000)]
Disable three UTF-8 tests that expose bugs in wcwidth(3) in the
native C libraries of illumos, Oracle Solaris 11, and SunOS 5.10.
While it is useful to catch wcwidth(3) regressions on OpenBSD, the
purpose of the *portable* mandoc regression suite is not to check
the C library of the host system; that would just hide genuine
mandoc portability issues in the noise. The remaining UTF-8 tests
are still sufficient to establish that mandoc does the right thing.
Issues reported by Sevan Janiyan <venture37 at geeklan dot co dot uk>
after testing on OmniOS.
Ingo Schwarze [Sat, 11 Feb 2017 15:47:16 +0000 (15:47 +0000)]
Never look for broken blocks inside blocks that are already closed.
Fixes the last the of tree corruptions sometimes causing NULL dereference
reported by tb@; this one triggered in cases like: .Bl -column .It Pq Ta
Ingo Schwarze [Sat, 11 Feb 2017 14:11:17 +0000 (14:11 +0000)]
Do not prematurely close .Nd containing a broken child.
Fixes tree corruption leading to NULL dereference
in insane cases like .Oo Oo .Nd .Pq Oc .Oc Oc
found by tb@ with afl(1).
Ingo Schwarze [Sat, 11 Feb 2017 13:24:12 +0000 (13:24 +0000)]
Do not prematurely mark intermediate blocks as broken while scanning
backwards. Only do so when a block is found that is actually broken.
Logic error found while investigating crashes reported by tb@.
Ingo Schwarze [Fri, 10 Feb 2017 22:19:18 +0000 (22:19 +0000)]
For child macros of block-end macros, only scan backwards for pending
breakers unless the parent of the block is already closed. While
the scanning is needed in cases like ".Ac Bo" for broken Ao, it is
useless and crashy in cases like ".Ac Bc" for non-broken Ao.
This fixes a NULL pointer dereference that tb@ found with afl(1).
Ingo Schwarze [Fri, 10 Feb 2017 16:20:34 +0000 (16:20 +0000)]
In the SYNOPSIS, .Nm blocks can get broken if one of their children
gets broken. In that case, mark them as BROKEN and ENDED and make
sure they get closed out together with the child.
Fixes tree corruption leeding to a NULL dereference found by tb@
with afl(1) in: .Sh SYNOPSIS .Bl .Oo .Nm .Bk .Oc .It (where .Bk is
the child and .Oo is the breaker).
A simpler form of the same corruption (without crash) is visible in:
.Sh SYNOPSIS .Ao .Nm .Bo .Ac .Bc text
where the text ended up inside the .Nm (child .Bo, breaker .Ao).
Ingo Schwarze [Thu, 9 Feb 2017 20:53:33 +0000 (20:53 +0000)]
same as mandocdb.c rev. 1.196:
for portability, use (char *)NULL in execlp(3) as discussed on tech@
OpenBSD (didn't blow up anywhere yet, but better safe than sorry)
Ingo Schwarze [Thu, 9 Feb 2017 18:46:44 +0000 (18:46 +0000)]
Illumos doesn't have O_DIRECTORY. Work around that for now, may
fix it better after the 1.14.1 release. Portability issue reported
by Sevan Janiyan <venture37 at geeklan dot co dot uk>.
Ingo Schwarze [Mon, 6 Feb 2017 03:44:58 +0000 (03:44 +0000)]
The .Nm macro does not only use the default name when it has no
argument, but also when the first argument is a child macro.
Arcane issue found in the FreeBSD cxgbetool(8) manual that Baptiste
Daroussin <bapt at FreeBSD> sent me long ago for a different reason.
While solving this, switch to the new technique of doing text
production in the validator, reducing code duplication in the
formatters, which also makes -Ttree output clearer.
Ingo Schwarze [Sun, 5 Feb 2017 18:15:39 +0000 (18:15 +0000)]
Improve <table> syntax:
The <col> element can only appear inside <colgroup>, so use <colgroup>.
The <tbody> element is optional and useless, so don't use it.
Even if we would ever need <thead> or <tfoot>, <tbody> would still be
optional and useless; besides, we will likely never need <thead> or <tfoot>,
simply because our languages don't support such functionality.
Ingo Schwarze [Sat, 4 Feb 2017 11:58:09 +0000 (11:58 +0000)]
Do not fix the default indent for all subsequent files; some may use
a different macro language and hence require a different indent.
You can see the effect with "man -a 1 host hostname".
Ingo Schwarze [Fri, 3 Feb 2017 18:18:23 +0000 (18:18 +0000)]
Minor cleanup, no functional change:
We always have a roff parser, so mparse_free() does not need to check
for existence before freeing it.
Also arrange code in struct mparse, mparse_reset(), and mparse_free()
in the same order for readability.
Ingo Schwarze [Fri, 3 Feb 2017 17:56:59 +0000 (17:56 +0000)]
If an application parses multiple files with mparse_readfd(3) but
without using mparse_open(3) to open the files, and if one of the
files includes a gzip'ed file with .so, then the gzip flag remains
set and the next main file will be expected to be gzip'ed.
Fix this by clearing the gzip flag in mparse_reset(3).
Bug found and patch provided by Michael <Stapelberg at debian dot org>.
Ingo Schwarze [Mon, 30 Jan 2017 20:24:02 +0000 (20:24 +0000)]
Rework fill mode handling for -man -Thtml.
Basically, open <pre> whenever printing text in no-fill mode and it is
not already open, and close it whenever printing something that cannot
be inside <pre>.
This fixes a crash reported by Michael <Stapelberg at debian dot org>
in the French Linux chroot(2) manual and also improves rendering
for OpenBSD pages like DPMSGetTimeouts(3) and GLwDrawingArea(3).
These changes also permitted retiring struct mhtml.
Ingo Schwarze [Sat, 28 Jan 2017 23:30:08 +0000 (23:30 +0000)]
Add a warning "new sentence, new line".
This does not attempt to pinpoint each and every offender, but
instead tries very hard to avoid false positives: Currently, there
are only two false positives in the whole OpenBSD base system.
Only do this in mdoc(7), not in man(7), because manuals written
in man(7) typically have much worse problems than this.
OK jmc@ on a previous version of the patch
Ingo Schwarze [Sat, 28 Jan 2017 18:43:00 +0000 (18:43 +0000)]
.Bl -column with zero columns is legal, so don't segfalt on it.
Bug introduced in rev. 1.248 triggered for example in gssapi(3),
analyzed and reported by Michael <Stapelberg at debian dot org>.
Simplify the code a bit more while here.
Ingo Schwarze [Thu, 26 Jan 2017 18:28:18 +0000 (18:28 +0000)]
Fix -man -Thtml formatting after .nf (which has nothing to do
with "literal", by the way, it means "no fill"):
* Use <pre> such that whitespace is preserved.
* Preserve lines breaks.
* For font alternating macros, avoid node recursion which required
scary juggling with the fill state. Instead, simply print the text
children directly.
Missing feature first noticed by kristaps@ in 2011,
the again reported by afresh1@ in 2016,
and finally reported here: https://github.com/Debian/debiman/issues/21 ,
which i only found because of Shane Kerr's comment here:
https://plus.google.com/110314300533310775053/posts/H1eaw9Yskoc
Ingo Schwarze [Wed, 25 Jan 2017 02:14:43 +0000 (02:14 +0000)]
Improve HTML formatting of .Bl -tag.
In particular, when using the style sheet, put the body on the same
line as the head for short heads, or on the next line for long
heads, in a way that preserves both correct indentation and correct
vertical spacing with and without -compact, and with one or more
heads per body (hi, Zaphod) - eight use cases so far - and with and
without -tag, and with and without -offset, 32 use cases grand total.
Using many ideas from zhuk@, from <David dot Dahlberg at fkie dot
fraunhofer dot de>, and from Benny Lofgren <bl dash lists at lofgren
dot biz>, and a few of my own.
This is an excellent demonstration that CSS is an extremely hostile
language, much more trapful and much harder to use than, say, C.
When matthew@ reported this in July 2014 (!), it was already a known
issue, and i no longer remember for how long. My first serious
attempt at fixing it (in November 2015) failed miserably. I'd love
to see simplifications of both the generated HTML code and of the
style sheet, but without breaking any of the 32 use cases, please.