Ingo Schwarze [Tue, 10 Feb 2015 17:47:45 +0000 (17:47 +0000)]
Be more careful to not generate empty .In, .St, and .Xr nodes.
That could happen when their first argument was another called macro,
causing a NULL pointer access in .St validation found by jsg@ with afl.
Make in_line_argn() easier to understand by using one state
variable rather than two.
Ingo Schwarze [Tue, 10 Feb 2015 11:03:13 +0000 (11:03 +0000)]
Do not read past the end of the buffer if an "f" layout font modifier
is followed by the end of the input line instead of a font specifier.
Found by jsg@ with afl, test case #591.
While here, improve functionality as well:
* There is no "r" font modifier.
* Font specifiers (as opposed to font modifiers) are case sensitive.
* One-character font specifiers require trailing whitespace.
* Ignore parenthesized and two-letter font specifiers.
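
The bounds check described above can be sketched as follows. This is a
minimal illustration, not the actual tbl(7) layout parser: the helper
name parse_font() and its signature are hypothetical, and the other
rules in the list (case sensitivity, trailing whitespace, ignored
specifiers) are deliberately not modelled.

```c
#include <stddef.h>

/*
 * Hypothetical helper: p[*pos] is the byte right after an "f"
 * modifier and len is the length of the input line.
 * Returns the font specifier character, or '\0' if the line
 * ends before any specifier is found.
 */
static char
parse_font(const char *p, size_t len, size_t *pos)
{
	/* Skip blanks, but never look at or beyond p[len]. */
	while (*pos < len && (p[*pos] == ' ' || p[*pos] == '\t'))
		(*pos)++;

	/* "f" followed by the end of the line: nothing to read. */
	if (*pos >= len)
		return '\0';

	return p[(*pos)++];
}
```

The point of the sketch is that every read of p[*pos] is guarded by a
comparison against len first.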
Ingo Schwarze [Sat, 7 Feb 2015 16:42:33 +0000 (16:42 +0000)]
Closing a block validates it, which may end up deleting it,
so if we are in a loop over blocks, cleanly restart the loop
rather than risking use after free; found by jsg@ with afl.
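
A minimal sketch of the restart pattern, assuming a singly linked list
of blocks. The struct and the functions blk_close() and close_all() are
hypothetical stand-ins, not the mdoc(7) parser code.

```c
#include <stdlib.h>

/* Hypothetical block node, for illustration only. */
struct blk {
	struct blk	*next;
	int		 open;	/* still needs to be closed */
	int		 empty;	/* validation deletes empty blocks */
};

/*
 * Close one block.  Validation may unlink and free it.
 * Returns 1 if the block was deleted, 0 otherwise.
 */
static int
blk_close(struct blk **head, struct blk *b)
{
	struct blk	**pp;

	b->open = 0;
	if (!b->empty)
		return 0;
	for (pp = head; *pp != b; pp = &(*pp)->next)
		continue;
	*pp = b->next;
	free(b);
	return 1;
}

/* Close all open blocks, restarting whenever the list changed. */
static void
close_all(struct blk **head)
{
	struct blk	*b;

restart:
	for (b = *head; b != NULL; b = b->next) {
		if (!b->open)
			continue;
		if (blk_close(head, b))
			goto restart;	/* b is gone: do not read b->next */
	}
}
```

Each call to blk_close() either clears the open flag or removes the
node, so the restarts are bounded by the number of blocks.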
Ingo Schwarze [Fri, 6 Feb 2015 07:13:14 +0000 (07:13 +0000)]
Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.
Ingo Schwarze [Tue, 3 Feb 2015 21:16:02 +0000 (21:16 +0000)]
Enable the integrated man(1) even when database support is disabled,
using the file system lookup fallback code, also reducing the number
of preprocessor conditional directives.
Hopefully, it will make some small Linux distros happy.
Ingo Schwarze [Tue, 3 Feb 2015 01:14:12 +0000 (01:14 +0000)]
Finally delete the kitchensink functions rew_sub() and rew_dohalt().
They were a maintenance and auditing nightmare because if you changed
one bit in there, stuff tended to break at seemingly unrelated places.
No functional change except getting rid of one bogus error message,
but minus 80 lines of code.
Ingo Schwarze [Mon, 2 Feb 2015 19:23:23 +0000 (19:23 +0000)]
Simplify and reindent make_pending(). No functional change
except that some error messages become less confusing.
Now the function is almost readable (but still requires
nineteen lines of comments for fourteen lines of code).
Ingo Schwarze [Mon, 2 Feb 2015 18:26:32 +0000 (18:26 +0000)]
Simplify: Do not call rew_dohalt() from make_pending(),
the calling macro handler already found the breaking block.
No functional change except tiny variations in error messages.
Ingo Schwarze [Mon, 2 Feb 2015 15:02:49 +0000 (15:02 +0000)]
Get rid of all remaining calls to rew_sub() where the target block
is known. This only leaves three that do actual searching.
No functional change, minus 30 lines of code.
Ingo Schwarze [Mon, 2 Feb 2015 04:26:44 +0000 (04:26 +0000)]
Get rid of all calls to rew_sub() in blk_exp_close(); only ten calls
remain in other functions. As a bonus, this fixes an assertion failure
jsg@ found some time ago with afl (test case 982) and improves minor
details in error reporting.
Ingo Schwarze [Mon, 2 Feb 2015 04:04:45 +0000 (04:04 +0000)]
When a full block macro gets closed out by a mismatching
block closure macro it calls, do not attempt to open its body.
This can for example happen for (nonsensical) constructions like
.Fo
.Nm Fc
in the SYNOPSIS. Fixing an assertion failure jsg@ found with afl
some time ago (test case number 731).
Ingo Schwarze [Sun, 1 Feb 2015 17:30:45 +0000 (17:30 +0000)]
Simplify blk_part_exp(), no functional change.
* Replace calls to rew_sub() with rew_last() - two fewer out of 18.
* No need to keep track of the body, it's always opened right after
the head and never used for anything in this function.
Ingo Schwarze [Sun, 1 Feb 2015 16:47:39 +0000 (16:47 +0000)]
The function rew_sub() tries to rewind any and all kinds of blocks
and elements under any and all circumstances, even handling some
bad block nesting now and then. Unsurprisingly, this ends up
in excessive complexity and has caused many bugs in the past.
Start to slowly disentangle this mess by replacing calls to rew_sub()
immediately following mdoc_head_alloc() by the much simpler rew_last().
Gets rid of the first two rew_sub() calls out of twenty.
No functional change.
Ingo Schwarze [Sat, 31 Jan 2015 00:12:41 +0000 (00:12 +0000)]
Use relative offsets instead of absolute pointers for the terminal
font stack. The latter fail after the stack is grown with realloc().
Fixing an assertion failure found by jsg@ with afl some time ago
(test case number 51).
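
Why offsets survive realloc() while pointers do not can be shown with a
small growable stack. This is a sketch under assumptions: struct
font_stack, font_push(), and font_top() are hypothetical names and do
not reproduce the terminal formatter's data structures.

```c
#include <stdlib.h>

/* Hypothetical growable stack, for illustration only. */
struct font_stack {
	int	*buf;	/* heap storage; may move when grown */
	size_t	 len;	/* entries in use; the top is buf[len - 1] */
	size_t	 cap;	/* entries allocated */
};

/* Push a font.  Returns 0 on success, -1 if allocation fails. */
static int
font_push(struct font_stack *s, int font)
{
	int	*nbuf;
	size_t	 ncap;

	if (s->len == s->cap) {
		ncap = s->cap ? 2 * s->cap : 4;
		nbuf = realloc(s->buf, ncap * sizeof(*nbuf));
		if (nbuf == NULL)
			return -1;
		/* Any pointer saved into the old buffer is now stale. */
		s->buf = nbuf;
		s->cap = ncap;
	}
	s->buf[s->len++] = font;
	return 0;
}

/* Current font, resolved through the offset on every call. */
static int
font_top(const struct font_stack *s)
{
	return s->len ? s->buf[s->len - 1] : -1;
}
```

Because the top of the stack is stored as an index relative to buf, it
stays valid no matter where realloc() places the storage.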
Ingo Schwarze [Fri, 30 Jan 2015 17:32:16 +0000 (17:32 +0000)]
Delete the redundant tbl span flags, just inspect the actual data
where needed, which is less fragile.
This fixes a subtle NULL pointer access to tp->tbl.cols:
Due to a bug in the man(7) parser, the first span of a table can
end up in a .TP head, in which case tblcalc() was never called.
Found by jsg@ with afl.
Ingo Schwarze [Fri, 30 Jan 2015 02:09:04 +0000 (02:09 +0000)]
Auditing the tbl(7) code for more NULL pointer accesses, I came up
empty-handed; so this is just KNF and some code simplifications,
no functional change.
Ingo Schwarze [Thu, 29 Jan 2015 00:33:57 +0000 (00:33 +0000)]
Radical cleanup of COMPATIBILITY sections:
Remove lots of lies, dozens of irrelevant implementation details,
and all references to groff versions older than 1.17. Move relevant
information to the pages where it belongs, and out of mandoc(1) in
particular. Add some missing general remarks to roff(7), where it
fits the character and purpose of the page much better.
Ingo Schwarze [Wed, 28 Jan 2015 21:11:53 +0000 (21:11 +0000)]
Clean up eqn(7) error handling:
* When "define" fails, do not drop the whole equation.
* Free memory after "undef".
* Use standard mandoc error types instead of rolling our own.
* Delete obfuscating EQN_MSG() macro.
* Add function prototypes while here.
Ingo Schwarze [Wed, 28 Jan 2015 17:32:07 +0000 (17:32 +0000)]
* Polish tbl(7) error reporting.
* Do not print out macro names in tbl(7) data blocks.
* Like with GNU tbl, let empty tables cause a blank line.
* Avoid producing empty tables in -Tman.
Ingo Schwarze [Wed, 28 Jan 2015 15:03:45 +0000 (15:03 +0000)]
For now, it can't be helped that mandoc tbl(7) ignores high-level macros,
but stop throwing away their arguments. This fixes information loss in a
handful of Xenocara manuals, at the price of a small amount of formatting
noise creeping through.
Ingo Schwarze [Tue, 27 Jan 2015 05:21:44 +0000 (05:21 +0000)]
Multiple parser and formatter fixes for line drawing in tbl(7).
* Allow mixing vertical line bars with the layout options
of the preceding layout cell.
* Correctly combine box options with layout lines.
* Correctly print vertical lines in data rows, with the right spacing.
* Correctly print cross markers and left and right ends of
horizontal lines even if vertical lines differ above and below.
* Avoid the bogus error message "no table data cells"
when a table data section starts with a horizontal line.
No increase in code size.
Ingo Schwarze [Mon, 26 Jan 2015 18:42:30 +0000 (18:42 +0000)]
Rework tbl(7) layout parsing:
* Continue parsing even if part of the input is invalid.
* Do not require whitespace between cell specifications.
* Allow tabs as well as blanks between modifiers.
* Mark the 'm' modifier as unsupported.
* Parse and ignore the 'p' and 'v' modifiers.
* Better warning and error messages.
* Get rid of a static buffer.
Improved functionality but minus 50 lines of code.
Ingo Schwarze [Mon, 26 Jan 2015 13:03:48 +0000 (13:03 +0000)]
More improvements regarding tbl(7) options.
* Treat "allbox" as an alias for "box" for now.
* Parse and ignore the GNU tbl "nowarn" option.
* For separation, allow spaces, tabs, and commas only.
* Mark eqn(7) within tbl(7) as unsupported.
* Simplify the option table.
* Improve and sort documentation.
Ingo Schwarze [Mon, 26 Jan 2015 00:57:22 +0000 (00:57 +0000)]
Improve (or rather, rewrite) tbl(7) option parsing.
* Allow the layout to start after the semicolon on the options line.
* Ignore leading commas.
* Option arguments cannot contain closing parentheses.
* Avoid needless UNSUPP messages.
* Better ERROR reporting.
* Delete unused "linesize" field in struct tbl_opts.
* No need for static buffers.
* Garbage collect one almost empty wrapper function.
Improved functionality, but minus 40 lines of code.
Ingo Schwarze [Sat, 24 Jan 2015 02:41:49 +0000 (02:41 +0000)]
Strangely, ignoring the roff(7) .na request was implemented in the man(7)
parser. Simplify the code by moving it into the roff(7) parser, also
making it work for mdoc(7).
Ingo Schwarze [Fri, 23 Jan 2015 20:18:40 +0000 (20:18 +0000)]
While ignoring the .ta (set tab stops) and .ti (temp indent) requests
is sometimes harmless, it often causes seriously ugly output,
so flag these requests as unsupported rather than ignoring them.
Discussed with naddy@.
Ingo Schwarze [Fri, 23 Jan 2015 14:21:01 +0000 (14:21 +0000)]
Let .Aq/.Ao/.Ac print "<>" instead of the normal "\(la\(ra"
when the only child is .Mt, not when the preceding node is .An,
to improve robustness. Triggered by a question from Svyatoslav
Mishyn <juef at openmailbox dot org> (Crux Linux).
Ingo Schwarze [Fri, 23 Jan 2015 00:42:00 +0000 (00:42 +0000)]
Wonders of roff(7): Integer numbers in numerical expressions can carry
scaling units, and some manuals (e.g. in devel/grcs) actually use that,
so let's support it. Missing feature reported by naddy@.
Ingo Schwarze [Thu, 22 Jan 2015 22:51:43 +0000 (22:51 +0000)]
Slightly improve \w width measurements:
Count special characters with the same width as ASCII characters
and treat all other escape sequences as if they had a width of 0.
Certainly not perfect, but a bit better.
For example, GNU RCS ci(1) needs this; reported by naddy@.
Ingo Schwarze [Thu, 22 Jan 2015 21:38:16 +0000 (21:38 +0000)]
Traditional roff(7) explicitly allows certain control characters
in the input stream (SOH, STX, ETX, ENQ, ACK, BEL, BS) for specific
purposes (leaders, backspace, delimiters, .tr), but making sure
these don't leak through to the output is tricky, so mark them as
unsupported for now.
Ingo Schwarze [Thu, 22 Jan 2015 19:26:50 +0000 (19:26 +0000)]
Don't let a failing mparse_open() clobber the filename pointer;
fixes error message content and a use after free
for .so with non-existent target when -Wall or -Tlint is given.
Ingo Schwarze [Wed, 21 Jan 2015 20:33:25 +0000 (20:33 +0000)]
Rudimentary implementation of the roff(7) \o escape sequence (overstrike).
This is of some relevance because the pod2man(1) preamble abuses it
for the Icelandic letter Thorn, instead of simply using \(TP and \(Tp.
Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).
Ingo Schwarze [Wed, 21 Jan 2015 19:40:54 +0000 (19:40 +0000)]
Improve overstriking. When overstriking a wider character with a
narrower one, center the latter horizontally. After a group of
characters printed in the same position, advance by the width of
the widest one among them.
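
The arithmetic can be sketched as below. As a simplifying assumption,
every character of the group is centered relative to the widest one,
which may differ in detail from what the formatter does;
overstrike_layout() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Given the widths w[0..n-1] of the characters printed in the
 * same position, store each character's horizontal offset in
 * off[] and return the amount to advance after the group.
 */
static size_t
overstrike_layout(const size_t *w, size_t n, size_t *off)
{
	size_t	 i, widest;

	widest = 0;
	for (i = 0; i < n; i++)
		if (w[i] > widest)
			widest = w[i];

	/* Center narrower characters within the widest one. */
	for (i = 0; i < n; i++)
		off[i] = (widest - w[i]) / 2;

	return widest;
}
```

For widths 10, 4, and 6, the offsets come out as 0, 3, and 2, and the
position advances by 10.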
Ingo Schwarze [Tue, 20 Jan 2015 21:16:51 +0000 (21:16 +0000)]
Split the -Werror message level into -Werror (broken manual, probably
using mandoc is better than using groff) and -Wunsupp (manual using
unsupported low-level roff(7) feature, probably using groff is better
than using mandoc). Once this feature is complete, it is intended
to help porting, making the decision whether to USE_GROFF easier.
As a first step, distinguish four classes of roff(7) requests:
1. Supported (currently 24 requests)
2. Currently ignored because unimportant (120) -> no message
3. Ignored for good because insecure (14) -> -Werror
4. Currently unsupported (68) -> these trigger the new -Wunsupp messages
Ingo Schwarze [Tue, 20 Jan 2015 18:21:18 +0000 (18:21 +0000)]
Make the man(1) and apropos(1) options -s and -S much less expensive:
Do not append an SQL clause looking into the large "keys" table.
Instead, filter the result of the SQL query in buildnames() where
equivalent data from the much smaller "mlinks" table is already
available for free.
This is relevant because man(1) uses the equivalent of "-S ${MACHINE}"
by default since main.c rev. 1.216, to make sure that manuals for
the current architecture are shown. With many ports installed, this
patch can speed up man(1) by a factor of more than a hundred.
Slowness reported by Theo Buehler <theo at math dot ethz dot ch>, thanks!
Ingo Schwarze [Fri, 16 Jan 2015 21:15:05 +0000 (21:15 +0000)]
Let man(1) show manuals for the current architecture by default,
and support the MACHINE environment variable as documented in man(1).
Missing feature reported by pascal@.
Ingo Schwarze [Fri, 16 Jan 2015 16:53:49 +0000 (16:53 +0000)]
Parse and ignore .IX (generate index entry) macros because pod2man(1)
emits them, by default without defining them, relying on the roff(7)
quirk that undefined macros have no effect.
Ingo Schwarze [Thu, 15 Jan 2015 04:26:39 +0000 (04:26 +0000)]
Fatal errors no longer exist.
If a file can be opened, mandoc will produce some output;
at worst, the output may be almost empty.
Simplifies error handling and frees a message type for future use.
Ingo Schwarze [Wed, 14 Jan 2015 22:02:49 +0000 (22:02 +0000)]
To get rid of SYSERR entries in enum mandocerr, downgrade problems with
missing and unreadable files from SYSERR to ERROR.
Needed for upcoming work.
As a bonus, this minimally simplifies code and documentation.
Ingo Schwarze [Wed, 14 Jan 2015 17:49:15 +0000 (17:49 +0000)]
Simplify handling of system errors: just exit(3).
We already do the same for malloc(3) failure.
There is no virtue in trying to survive failure of fork(2) and the like.
Ingo Schwarze [Wed, 7 Jan 2015 12:19:46 +0000 (12:19 +0000)]
Bugfix: When the invocation of a user-defined macro follows a roff
conditional request on the same input line, don't skip the first few
bytes of its content.
Ingo Schwarze [Sat, 3 Jan 2015 12:55:25 +0000 (12:55 +0000)]
Fix a potential NULL pointer access in an error message after waitpid()
failure; found using detailed information provided by Ulrich Spoerlein
<uqs at FreeBSD> about FreeBSD Coverity CID 1261304.
Ingo Schwarze [Sat, 3 Jan 2015 00:59:13 +0000 (00:59 +0000)]
Given the excessively technical description in the old mdoc_samples(7)
manual and its successor groff_mdoc(7), I always considered .Ql as
purely physical markup, but it turns out that describing it better allows
giving it a semantic meaning (in-line literal display) that doesn't
contradict existing usage. One less physical, one more semantic
macro, yay!
Found in a discussion with Steffen Nurpmeso <sdaoden at yandex dot com>.
Ingo Schwarze [Fri, 2 Jan 2015 17:02:19 +0000 (17:02 +0000)]
Explicitly set the *data member of struct ohash_info to NULL.
It is never dereferenced, but it gets copied around, which worries
static analysis tools and might also confuse human auditors.
FreeBSD Coverity CID 1261298, 1261299, 1261300, reported by
Pedro Giffuni and Ulrich Spörlein <pfg@ and uqs@ at FreeBSD>.
Ingo Schwarze [Thu, 1 Jan 2015 19:28:49 +0000 (19:28 +0000)]
Fix a buffer overrun triggered by a trailing backslash at EOF in
an unclosed conditional body. If the memory contained the byte
sequence "\}" after the end of the buffer before the next NUL, this
could even write beyond the end of the buffer, specifically '&' to
the location of the '}'. Found by jsg@ with afl.
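
The class of bug can be illustrated with a scanner that skips escaped
characters while searching for the closing "\}". This is a sketch, not
the roff(7) parser code, and find_close() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Return a pointer to the "\}" that closes a conditional body
 * in the NUL-terminated string p, or NULL if there is none.
 */
static const char *
find_close(const char *p)
{
	while (*p != '\0') {
		if (*p != '\\') {
			p++;
			continue;
		}
		/*
		 * A backslash may be the last byte before the NUL.
		 * Check before skipping the escaped character, or
		 * the scan continues into memory after the buffer.
		 */
		if (p[1] == '\0')
			return NULL;
		if (p[1] == '}')
			return p;
		p += 2;
	}
	return NULL;
}
```

Without the check for p[1] == '\0', the unconditional p += 2 would step
over the terminating NUL and match a "\}" that merely happens to follow
the buffer in memory.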
Ingo Schwarze [Thu, 1 Jan 2015 13:20:38 +0000 (13:20 +0000)]
If man(1) only has one single argument, always interpret it as a name,
never as a section. Who would have thought that people call their
manual pages 7z(1), 9c(1), 9p(1), and 9p(3)...
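
A sketch of the rule, assuming a deliberately crude section test. Both
functions are hypothetical, and looks_like_section() is only a
placeholder for whatever check man(1) applies when more than one
argument is given.

```c
#include <ctype.h>

/* Placeholder heuristic: sections such as "1" or "3p" start with a digit. */
static int
looks_like_section(const char *arg)
{
	return isdigit((unsigned char)arg[0]) != 0;
}

/*
 * Return the index of the first argument to treat as a name:
 * 1 if argv[0] is taken as a section, 0 otherwise.
 */
static int
first_name_index(int argc, const char *const argv[])
{
	/* A single argument is always a name, even "7z" or "9p". */
	if (argc < 2)
		return 0;
	return looks_like_section(argv[0]) ? 1 : 0;
}
```

With this rule, "man 7z" looks up a page named 7z, while "man 1 ls"
still treats "1" as the section.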
Patch from Sebastien Marie <semarie dash openbsd at latrappe dot fr>.
Ingo Schwarze [Wed, 31 Dec 2014 16:52:39 +0000 (16:52 +0000)]
When showing more than one formatted manual page, insert horizontal lines
between pages. Suggested by Theo Buehler <theo at math dot ethz dot ch>.
Even in UTF-8 output mode, do not use fancy line drawing characters, so
that you can easily use /^--- to skip to the next manual in your pager.