Ingo Schwarze [Wed, 7 Jan 2015 12:19:46 +0000 (12:19 +0000)]
Bugfix: When the invocation of a user-defined macro follows a roff
conditional request on the same input line, don't skip the first few
bytes of its content.
Ingo Schwarze [Sat, 3 Jan 2015 12:55:25 +0000 (12:55 +0000)]
Fix a potential NULL pointer access in an error message after waitpid()
failure; found using detailed information provided by Ulrich Spoerlein
<uqs at FreeBSD> about FreeBSD Coverity CID 1261304.
Ingo Schwarze [Sat, 3 Jan 2015 00:59:13 +0000 (00:59 +0000)]
Given the excessively technical description in the old mdoc_samples(7)
manual and its successor groff_mdoc(7), I always considered .Ql as
purely physical markup, but it turns out that describing it better allows
giving it a semantic meaning (in-line literal display) that doesn't
contradict existing usage. One less physical, one more semantic
macro, yay!
Found in a discussion with Steffen Nurpmeso <sdaoden at yandex dot com>.
Ingo Schwarze [Fri, 2 Jan 2015 17:02:19 +0000 (17:02 +0000)]
Explicitly set the *data member of struct ohash_info to NULL.
It is never dereferenced, but it gets copied around, which worries
static analysis tools and might also confuse human auditors.
FreeBSD Coverity CID 1261298, 1261299, 1261300, reported by
Pedro Giffuni and Ulrich Spörlein <pfg@ and uqs@ at FreeBSD>.
Ingo Schwarze [Thu, 1 Jan 2015 19:28:49 +0000 (19:28 +0000)]
Fix a buffer overrun triggered by a trailing backslash at EOF in
an unclosed conditional body. If the memory contained the byte
sequence "\}" after the end of the buffer before the next NUL, this
could even write beyond the end of the buffer, specifically '&' to
the location of the '}'. Found by jsg@ with afl.
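A minimal sketch of the kind of scan involved, not the actual mandoc code:
when looking for the closing "\}" of a conditional body, a backslash that is
the last byte of the buffer must end the scan instead of being treated as the
start of a two-byte escape. The function name find_block_close() is
hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return a pointer to the first "\}" in the
 * NUL-terminated string s, or NULL if there is none.
 * Other escapes are skipped two bytes at a time, but never
 * across the terminating NUL.
 */
const char *
find_block_close(const char *s)
{
	while (*s != '\0') {
		if (*s != '\\') {
			s++;
			continue;
		}
		if (s[1] == '\0')
			return NULL;	/* trailing backslash at end of input */
		if (s[1] == '}')
			return s;
		s += 2;			/* skip the escaped character */
	}
	return NULL;
}
```

Because the check for s[1] happens before the pointer advances, the scan never
reads or writes past the end of the buffer.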
Ingo Schwarze [Thu, 1 Jan 2015 13:20:38 +0000 (13:20 +0000)]
If man(1) only has one single argument, always interpret it as a name,
never as a section. Who would have thought that people call their
manual pages 7z(1), 9c(1), 9p(1), and 9p(3)...
Patch from Sebastien Marie <semarie dash openbsd at latrappe dot fr>.
Ingo Schwarze [Wed, 31 Dec 2014 16:52:39 +0000 (16:52 +0000)]
When showing more than one formatted manual page, insert horizontal lines
between pages. Suggested by Theo Buehler <theo at math dot ethz dot ch>.
Even in UTF-8 output mode, do not use fancy line drawing characters, so
that you can easily use /^--- to skip to the next manual in your pager.
Ingo Schwarze [Tue, 30 Dec 2014 20:41:00 +0000 (20:41 +0000)]
When a file is given on the command line, actually exists, and its name
relative to the respective manual tree is longer than PATH_MAX, do not
leak the memory allocated to hold the name. Not sure that can actually
happen, but better safe than sorry.
FreeBSD Coverity Scan CID 1261303, reported by Pedro Giffuni <pfg@>.
Ingo Schwarze [Sun, 28 Dec 2014 15:23:33 +0000 (15:23 +0000)]
Improve documentation of the header/footer macros .Dt, .Os, .TH:
* State the defaults for .Os and the fourth .TH argument.
* Sync the section titles, and stop advertising obscure sections that
aren't actually fully supported and certainly not recommended for use.
Ingo Schwarze [Sun, 28 Dec 2014 14:42:27 +0000 (14:42 +0000)]
mdoc(7) already uses the mandoc(1) -Ios argument in the footer line
when .Os has no argument, so do the same for man(7) when .TH has fewer
than four arguments; there is no reason to treat both differently.
Issue found following a question from Thomas Klausner <wiz at NetBSD>.
Ingo Schwarze [Thu, 25 Dec 2014 17:23:32 +0000 (17:23 +0000)]
Reduce memory and time consumption on certain malformed input files
by limiting the length of expanded input lines during the
(usually recursive) expansion of user-defined strings.
Resource hogging found by jsg@ with afl.
Ingo Schwarze [Wed, 24 Dec 2014 23:32:42 +0000 (23:32 +0000)]
Support negative indentations for mdoc(7) displays and lists.
Not exactly recommended for use, rather for groff compatibility.
While here, introduce similar SHRT_MAX limits as in man(7),
fixing a few cases of infinite output found by jsg@ with afl.
Ingo Schwarze [Wed, 24 Dec 2014 18:04:10 +0000 (18:04 +0000)]
For .RS, we need to record how much we actually indented
because negative indents can get truncated, in which case we no longer
know how to restore the original indent at the end of the block.
This also solves another case of effectively infinite output found
by jsg@ with afl, triggered by very large negative indents.
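The idea can be sketched as follows, under stated assumptions: each nesting
level remembers the indent that was really applied after truncation, and
leaving the block undoes exactly that amount. The structure, the function
names, and the fixed nesting limit are hypothetical and not taken from the
mandoc sources; the requested value is assumed to have been range-checked
already.

```c
#include <stddef.h>

#define RS_DEPTH_MAX 16		/* hypothetical nesting limit for this sketch */

struct indent_state {
	int	 offset;			/* current left margin */
	int	 applied[RS_DEPTH_MAX];		/* indent really applied per level */
	size_t	 depth;				/* number of open blocks */
};

/* Enter a block.  Returns 0 on success, -1 if nested too deeply. */
int
rs_enter(struct indent_state *st, int requested)
{
	int	 applied;

	if (st->depth >= RS_DEPTH_MAX)
		return -1;
	/* A negative indent cannot move the margin left of column 0. */
	applied = requested;
	if (st->offset + applied < 0)
		applied = -st->offset;
	st->offset += applied;
	st->applied[st->depth++] = applied;
	return 0;
}

/* Leave a block, restoring the margin that was in effect before it. */
void
rs_leave(struct indent_state *st)
{
	if (st->depth == 0)
		return;
	st->offset -= st->applied[--st->depth];
}
```

Subtracting the requested value instead of the applied one would leave the
margin too far to the right after a truncated negative indent.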
Ingo Schwarze [Wed, 24 Dec 2014 15:38:55 +0000 (15:38 +0000)]
Prevent unsigned integer underflow when a number is too wide
for a table cell with an "nz" layout specification,
causing essentially infinite output as found by jsg@ with afl.
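As an illustration of this class of bug (a sketch, not the mandoc code):
subtracting a larger unsigned value from a smaller one wraps around to a huge
number, which a formatter may then emit as padding. The helper name
cell_padding() is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of padding characters needed to fit
 * text of length len into a column of the given width.
 * Computing width - len unconditionally in an unsigned type would
 * wrap around when len > width, so compare first.
 */
size_t
cell_padding(size_t width, size_t len)
{
	return len < width ? width - len : 0;
}
```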
Ingo Schwarze [Wed, 24 Dec 2014 09:58:35 +0000 (09:58 +0000)]
When a man(7) document contains unreasonably large numbers for
indentations or paragraph distances, large output may be generated,
which is practically the same as an endless loop; found by jsg@
with afl.
Reject such unreasonably large numbers beyond arbitrary limits
similar to those used by groff (max. 65 blank lines between paragraphs
and max. SHRT_MAX characters per output line) and fall back to
defaults when exceeded. Having the limits behave in exactly the
same way is not relevant.
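A minimal sketch of such a range check, assuming the two limits named above.
The function names, the default paragraph distance of 1, and the treatment of
negative paragraph distances as out of range are assumptions made for
illustration only.

```c
#include <limits.h>

#define MAX_PAR_DISTANCE	65	/* limit named in the text above */
#define DEFAULT_PAR_DISTANCE	1	/* assumed default, for illustration */

/*
 * Hypothetical: return the requested number of blank lines between
 * paragraphs, or the default if the request is out of range.
 */
int
checked_par_distance(long requested)
{
	if (requested < 0 || requested > MAX_PAR_DISTANCE)
		return DEFAULT_PAR_DISTANCE;
	return (int)requested;
}

/*
 * Hypothetical: return the requested indentation, or the given
 * fallback if its magnitude exceeds SHRT_MAX.
 */
int
checked_indent(long requested, int fallback)
{
	if (requested < -SHRT_MAX || requested > SHRT_MAX)
		return fallback;
	return (int)requested;
}
```

Falling back to the default instead of clamping to the limit keeps the output
of a broken document short.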
Ingo Schwarze [Tue, 23 Dec 2014 06:16:46 +0000 (06:16 +0000)]
Fix vertical scaling. Obviously, nobody ever had a serious look at this.
Basic units, centimeters, points, ens, ems, and the rounding algorithm
were all wrong; only inches, pica, and the default vertical span worked.
Ingo Schwarze [Tue, 23 Dec 2014 03:28:01 +0000 (03:28 +0000)]
In a2roffsu(), do not parse the number twice.
Gets rid of 25 lines of code and one static buffer.
No functional change for numbers shorter than BUFSIZ characters.
Ingo Schwarze [Mon, 22 Dec 2014 23:27:32 +0000 (23:27 +0000)]
The code already pays attention not to close the same block twice.
Similarly, avoid having the same block break two other blocks.
In some situations, this could lead to an endless loop in rew_sub()
found by jsg@ with afl.
Minimal example: .Po Ao Pc Bo Pc Ac Bc
Ingo Schwarze [Sun, 21 Dec 2014 14:49:28 +0000 (14:49 +0000)]
Use -m for macro set selection in mandoc(1) mode only, not in man(1)
and apropos(1) mode. While here, put a space character between
options and option arguments in error messages.
Both reported by Alessandro DE LAURENZIS <just22 dot adl at gmail dot com>.
Ingo Schwarze [Sat, 20 Dec 2014 02:26:57 +0000 (02:26 +0000)]
Fix two issues causing a class of assertion failures found by jsg@ with afl.
1) rew_sub(): Make sure REWIND_MORE is acted upon even when followed by
REWIND_NONE. This prevents .It from ending up inside other children of .Bl.
2) blk_exp_close(): Only allow extension of .Bl when it has at least
one .It. Otherwise, a broken child block could be moved in front of
the .Bl, effectively resulting in a .Bl that ended before it began.
Ingo Schwarze [Fri, 19 Dec 2014 17:12:04 +0000 (17:12 +0000)]
Enforcing an arbitrary, implementation-dependent, undocumented limit
by calling assert() when valid user input exceeds it is a bad idea.
Allocate the terminal font stack dynamically instead of crashing
above 10 entries. Issue found by jsg@ with afl.
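A sketch of a dynamically growing stack, assuming integer font identifiers.
The type and function names are hypothetical and the growth policy (doubling,
starting at 8) is an arbitrary choice for illustration, not what the terminal
formatter necessarily does.

```c
#include <stdint.h>
#include <stdlib.h>

struct font_stack {
	int	*fonts;		/* stack entries, e.g. font identifiers */
	size_t	 len;		/* entries in use */
	size_t	 cap;		/* entries allocated */
};

/*
 * Push a font, growing the array as needed.
 * Returns 0 on success, -1 if memory cannot be allocated.
 */
int
font_push(struct font_stack *fs, int font)
{
	int	*p;
	size_t	 ncap;

	if (fs->len == fs->cap) {
		ncap = fs->cap ? fs->cap * 2 : 8;
		if (ncap > SIZE_MAX / sizeof(*p))
			return -1;
		p = realloc(fs->fonts, ncap * sizeof(*p));
		if (p == NULL)
			return -1;
		fs->fonts = p;
		fs->cap = ncap;
	}
	fs->fonts[fs->len++] = font;
	return 0;
}

/* Pop the top font into *font.  Returns 0, or -1 if the stack is empty. */
int
font_pop(struct font_stack *fs, int *font)
{
	if (fs->len == 0)
		return -1;
	*font = fs->fonts[--fs->len];
	return 0;
}
```

With this shape, deep nesting in valid input costs memory instead of
triggering an assertion.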
Ingo Schwarze [Fri, 19 Dec 2014 04:58:35 +0000 (04:58 +0000)]
Rewrite the low-level UTF-8 parser from scratch.
It accepted invalid byte sequences like 0xc080-c1bf, 0xe08080-e09fbf,
0xeda080-edbfbf, and 0xf0808080-f08fbfbf, produced valid roff Unicode
escape sequences from them, and the algorithm contained strong
defenses against any attempt to fix it.
This cures an assertion failure in the terminal formatter caused
by sneaking in ASCII 0x08 (backspace) by "encoding" it as an (invalid)
multibyte UTF-8 sequence, found by jsg@ with afl.
As a bonus, the new algorithm also reduces the code in the function
by about 20%.
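For illustration, a strict decoder can reject exactly the kinds of sequences
listed above by checking the decoded value against the minimum for its length
(overlong forms) and against the surrogate range. This is a sketch under
those assumptions, not the parser from the commit; utf8_decode() is a
hypothetical name.

```c
#include <stddef.h>

/*
 * Decode one UTF-8 sequence from buf[0..len-1].
 * On success, store the code point in *cp and return the number of
 * bytes consumed; return 0 for invalid or truncated input.
 */
size_t
utf8_decode(const unsigned char *buf, size_t len, unsigned int *cp)
{
	unsigned int	 c, min;
	size_t		 i, n;

	if (len == 0)
		return 0;
	c = buf[0];
	if (c < 0x80) {
		*cp = c;
		return 1;
	} else if ((c & 0xe0) == 0xc0) {
		n = 2; min = 0x80; c &= 0x1f;
	} else if ((c & 0xf0) == 0xe0) {
		n = 3; min = 0x800; c &= 0x0f;
	} else if ((c & 0xf8) == 0xf0) {
		n = 4; min = 0x10000; c &= 0x07;
	} else
		return 0;	/* stray continuation or invalid lead byte */
	if (len < n)
		return 0;	/* truncated sequence */
	for (i = 1; i < n; i++) {
		if ((buf[i] & 0xc0) != 0x80)
			return 0;	/* not a continuation byte */
		c = (c << 6) | (buf[i] & 0x3f);
	}
	if (c < min)
		return 0;	/* overlong encoding, e.g. 0xc0 0x88 */
	if (c >= 0xd800 && c <= 0xdfff)
		return 0;	/* UTF-16 surrogate, e.g. 0xed 0xa0 0x80 */
	if (c > 0x10ffff)
		return 0;	/* beyond the Unicode range */
	*cp = c;
	return n;
}
```

In this sketch, a backspace smuggled in as the two bytes 0xc0 0x88 decodes to
a value below the two-byte minimum and is rejected.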
Ingo Schwarze [Thu, 18 Dec 2014 20:15:56 +0000 (20:15 +0000)]
Only keep leading .Sm inside a list when it immediately precedes
the first .It. Otherwise, move it out together with whatever
follows. Fixing an assertion failure found by jsg@ with afl.
Ingo Schwarze [Thu, 18 Dec 2014 19:23:41 +0000 (19:23 +0000)]
When the head of a list item is extended with a partial explicit
macro (for example .Xo) and never closed again, the item ends up
without a body block. This can even happen for list types that
usually don't have heads in the first place. So even in this
case, check for the existence of the body before accessing it.
NULL pointer access found by jsg@ with afl.
Ingo Schwarze [Thu, 18 Dec 2014 03:10:11 +0000 (03:10 +0000)]
The code is already careful to not add items to lists that were
already closed. In this respect, also consider lists closed
that have broken another block, their closure pending until the
end of the broken block. This avoids syntax tree corruption
leading to a NULL pointer access found by jsg@ with afl.
Ingo Schwarze [Wed, 17 Dec 2014 18:45:35 +0000 (18:45 +0000)]
Be a bit more lenient in what to accept for section names given
as the first man(1) command line argument without -s:
Accept digits like "1", "2"; digit+letter like "3p", "1X"; and "n".
Issue reported by Svyatoslav Mishyn <juef at openmailbox dot org> (Crux Linux).
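The accepted forms can be sketched as a small predicate. This only mirrors
the three patterns listed above; the function name looks_like_section() is
hypothetical and the real check may differ in detail.

```c
#include <ctype.h>

/*
 * Hypothetical predicate: does arg look like a section name?
 * Accepts a single digit ("1"), a digit followed by one letter
 * ("3p", "1X"), and the single letter "n".
 */
int
looks_like_section(const char *arg)
{
	if (arg[0] == 'n' && arg[1] == '\0')
		return 1;
	if (!isdigit((unsigned char)arg[0]))
		return 0;
	if (arg[1] == '\0')
		return 1;
	return isalpha((unsigned char)arg[1]) && arg[2] == '\0';
}
```

Note that a page name like "7z" also matches the digit+letter pattern, which
is why the later change of 1 Jan 2015 (listed above) always treats a single
argument as a name.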
Ingo Schwarze [Tue, 16 Dec 2014 19:50:03 +0000 (19:50 +0000)]
correct -Tutf8 and -Thtml rendering of \(~=
and change the name of \(-~ to \(|= to agree with groff;
difference found by Carsten dot Kunze at arcor dot de
Ingo Schwarze [Tue, 16 Dec 2014 17:26:00 +0000 (17:26 +0000)]
Explicit block closure macros clobber next-line block head scope,
just like explicit block macros themselves.
Fixing an assertion failure jsg@ found with afl.
Ingo Schwarze [Tue, 16 Dec 2014 03:53:43 +0000 (03:53 +0000)]
When a string comparison condition contains no mismatching character
but ends without the final delimiter, the parse point was advanced
one character too far and the invalid pointer returned to the
caller of roff_parseln(). Later use could potentially advance
the pointer even further and maybe even write to it.
Fixing a buffer overrun found by jsg@ with afl (the most severe so far).
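A sketch of the pointer handling, assuming a condition of the form
<d>s1<d>s2<d> with an arbitrary delimiter character d. The function name
cond_strcmp() is hypothetical, and whether an unterminated comparison should
evaluate to true is not settled here; the sketch simply compares what it
found.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical: evaluate a string comparison starting at *pos.
 * Returns 1 if both strings are equal, 0 otherwise, and leaves
 * *pos after the consumed input, never beyond the terminating NUL.
 */
int
cond_strcmp(const char **pos)
{
	const char	*p, *s1, *s2;
	size_t		 len1, len2;
	char		 delim;

	p = *pos;
	if ((delim = *p) == '\0')
		return 0;
	s1 = ++p;
	while (*p != '\0' && *p != delim)
		p++;
	len1 = (size_t)(p - s1);
	if (*p == '\0') {
		*pos = p;	/* second string is missing entirely */
		return 0;
	}
	s2 = ++p;
	while (*p != '\0' && *p != delim)
		p++;
	len2 = (size_t)(p - s2);
	if (*p != '\0')
		p++;		/* skip the final delimiter only if present */
	*pos = p;
	return len1 == len2 && memcmp(s1, s2, len1) == 0;
}
```

The important step is the last conditional increment: when the final
delimiter is missing, the parse point stays on the NUL byte.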
Ingo Schwarze [Tue, 16 Dec 2014 01:22:59 +0000 (01:22 +0000)]
When a numerical condition errors out after consuming at least one
character of input, treat it as false, do not retry it as a string
comparison condition. This also fixes a read buffer overrun that
happened when the numerical condition advanced to the end of the
input line before erroring out, found by jsg@ with afl.
Ingo Schwarze [Mon, 15 Dec 2014 23:43:26 +0000 (23:43 +0000)]
Empty conditions count as false.
When negated, they still count as false.
Found when investigating crashes jsg@ found with afl.
Not completely fixing the crashes yet.
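The rule can be sketched with a toy evaluator. It is an assumption of this
sketch that only the single-letter conditions "n" (true) and "t" (false) are
handled; eval_cond() is a hypothetical name and real roff conditions are much
richer.

```c
/*
 * Hypothetical evaluator for a tiny subset of roff conditions:
 * "n" is true, anything else is false, and a leading "!" negates.
 * An empty condition is false whether or not it is negated.
 */
int
eval_cond(const char *cond)
{
	int	 negate, result;

	negate = 0;
	if (*cond == '!') {
		negate = 1;
		cond++;
	}
	if (*cond == '\0')
		return 0;	/* empty: false, even when negated */
	result = (*cond == 'n');
	return negate ? !result : result;
}
```

The empty check has to come before the negation is applied; otherwise "!"
alone would evaluate to true.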
Ingo Schwarze [Mon, 15 Dec 2014 18:05:57 +0000 (18:05 +0000)]
Let "man n open" do the same as "man -s n open" again, that is,
show the open(n) Tcl manual, as documented in man(1). Issue reported
by Svyatoslav Mishyn <juef at openmailbox dot org> (Crux Linux).
Ingo Schwarze [Thu, 11 Dec 2014 18:20:07 +0000 (18:20 +0000)]
Make this work on illumos:
* define MAX()
* ignore O_DIRECTORY if it isn't defined
* garbage collect two unused variables
Issues reported and fix tested by wiz@NetBSD.
Ingo Schwarze [Tue, 9 Dec 2014 07:29:42 +0000 (07:29 +0000)]
Integrate the makewhatis binary into the mandoc binary
just like we do it on OpenBSD. Smaller and neater.
While here, let ./configure set INSTALL_TARGETS.
Ingo Schwarze [Tue, 9 Dec 2014 06:11:35 +0000 (06:11 +0000)]
Install "man" as a hardlink to "mandoc" during db-install.
Install man(1) manual in db-install, not base-install.
Get rid of the useless variables BASEBIN, DBBIN, CGIBIN.
Ingo Schwarze [Fri, 5 Dec 2014 14:26:40 +0000 (14:26 +0000)]
Render text before, not after accumulating flag bits, so that flags
for different representations of the same string end up in the same
database entry. Improves name classification for 500 manuals.
Ingo Schwarze [Thu, 4 Dec 2014 02:05:42 +0000 (02:05 +0000)]
fix handling of roff requests having a default scale other than "n",
in particular .sp which uses "v", when the scale is not specified;
cures groff-mandoc differences in about a dozen Xenocara manuals
Ingo Schwarze [Thu, 4 Dec 2014 01:33:42 +0000 (01:33 +0000)]
Ignore macros that never produce any text when deciding whether
vertical whitespace is needed before a section or subsection.
Cures groff-mandoc differences in more than 300 manuals,
mostly Xenocara, some curses, a few GNU.
Ingo Schwarze [Tue, 2 Dec 2014 11:31:51 +0000 (11:31 +0000)]
Switch the default output mode from -Tascii to -Tlocale.
This doesn't change anything unless LC_CTYPE is set,
but it helps when running with LC_CTYPE=something.UTF-8.
OK tedu@ and earlier positive feedback from:
bentley@ deraadt@ naddy@ stsp@ uqs@freebsd wiz@netbsd
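One common way for a program to find out whether the user's LC_CTYPE selects
UTF-8 is sketched below, using the POSIX setlocale(3) and nl_langinfo(3)
interfaces. This is not necessarily how mandoc decides; the function name
locale_is_utf8() is hypothetical.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical check: return 1 if the character encoding selected
 * by the environment (LC_ALL, LC_CTYPE, LANG) is UTF-8, else 0.
 */
int
locale_is_utf8(void)
{
	/* Adopt the locale from the environment for LC_CTYPE only. */
	if (setlocale(LC_CTYPE, "") == NULL)
		return 0;
	return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

When none of these variables is set, the "C" locale stays in effect and the
check returns 0, which matches the statement that nothing changes unless
LC_CTYPE is set.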
Ingo Schwarze [Tue, 2 Dec 2014 10:08:06 +0000 (10:08 +0000)]
Fix the implementation and documentation of \c (continue text input line).
In particular, make it work in no-fill mode, too.
Reminded by Carsten dot Kunze at arcor dot de (Heirloom roff).
Ingo Schwarze [Mon, 1 Dec 2014 04:34:06 +0000 (04:34 +0000)]
The header libmandoc.h is part of the internal parser interface,
but html.c is not part of the parser at all, so it cannot include
that header, and actually, it doesn't need it.
Found while auditing includes after Theo's recent *.h commit.
Ingo Schwarze [Mon, 1 Dec 2014 04:14:14 +0000 (04:14 +0000)]
The file read.c is part of the parser, so it cannot include main.h,
which is not part of the parser. Besides, the parser *does* modify
the input buffer, so marking it "const" in the mparse_readmem()
interface is an outright lie. Fix all this by killing the const,
the UNCONST, and the bogus inclusion.
Ingo Schwarze [Sun, 30 Nov 2014 05:29:00 +0000 (05:29 +0000)]
Multiple fixes with respect to .Pf:
* The first argument of .Pf is not parsed.
* Normal delimiter handling does not apply to the first argument of .Pf.
* Warn if nothing follows a prefix (inspired by groff_mdoc(7)).
* In that case, do not suppress spacing.
Ingo Schwarze [Sat, 29 Nov 2014 03:37:44 +0000 (03:37 +0000)]
Provide a helper function macro_or_word() and use it to prune the
same chunk of argument parsing code out of five of the eight callback
functions. The other three have too much special handling to
participate.
As a bonus, let lookup() and mdoc_args() deal with line macros and
retire the lookup_raw() helper and the mdoc_zargs() internal interface
function.
No functional change, minus 40 lines of code.
Ingo Schwarze [Fri, 28 Nov 2014 23:21:32 +0000 (23:21 +0000)]
Fold the loop around mdoc_argv() into the function itself,
it was the same in all four cases. As a bonus, get rid
of one enum type that was used for internal communication.
No functional change, minus 40 lines of code.
Ingo Schwarze [Fri, 28 Nov 2014 18:57:31 +0000 (18:57 +0000)]
AT&T is unlikely to release a new version of Research UNIX any time soon.
So, it's pointless to make adding version strings easy for downstream.
One source file less to maintain.
Ingo Schwarze [Fri, 28 Nov 2014 18:36:35 +0000 (18:36 +0000)]
Retire support for CSRG supplementary document titles. These are
long obsolete and were never written in mdoc(7) in the first place.
Removes 100 lines from source files.
Ingo Schwarze [Fri, 28 Nov 2014 18:09:01 +0000 (18:09 +0000)]
Drop useless architecture table. Validating architecture names
is a job for makewhatis(8)/mandoc.db(5), not for the parser.
Removes 150 lines from source files and 4k (1%) from the binary.
Bloat found by deraadt@.