It turns out the man(7) parser suffers from unintelligible handling
of block rewinding, just like the mdoc(7) parser did.
First step in getting rid of rew_scope():
Replace the only call where the target block is known.
This commit is analogous to mdoc_macro.c rev. 1.167.
One down, three to go.
No need to hardcode /usr/bin/ as the path to more(1); helps portability.
We don't hardcode the paths to gunzip(1) and cmp(1) either.
Discussed with ajacoutot@.
Third step towards parser unification:
Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta.
Written on the train from London to Exeter on the way to p2k15.
Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.
First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.
Let man(1) and apropos(1) work even when the current directory
is unusable: Only change back to the current directory when the
directory was changed before and the next path is relative.
This is now more similar to what makewhatis(8) does.
Issue reported by espie@.
Ingo Schwarze [Mon, 30 Mar 2015 16:06:14 +0000 (16:06 +0000)]
Escape punctuation characters that have a different meaning in -Tpdf.
~, `, and ' get translated to non-ASCII characters by most troff
implementations when generating PostScript/PDF output. When the
original ASCII character is meant, it needs to be manually escaped.
Ingo Schwarze [Fri, 27 Mar 2015 16:36:31 +0000 (16:36 +0000)]
Modernize documentation by inserting blanks between option letters
and option arguments, except for -m because "-m an" and "-m andoc"
look just too weird. Of course, the traditional form without the
blank will continue to work.
Ingo Schwarze [Fri, 27 Mar 2015 00:57:28 +0000 (00:57 +0000)]
Document that certain stand-alone accents need escaping in rare cases to
prevent them from being converted to Unicode replacements in PDF output.
Issue found by bentley@, OK jmc@ bentley@.
Ingo Schwarze [Fri, 27 Mar 2015 00:18:14 +0000 (00:18 +0000)]
Add man.conf(5). After adding some additional functionality,
one of the next steps will be to use it in addition to manpath(1)
rather than as an alternative to it.
Ingo Schwarze [Thu, 26 Mar 2015 22:42:32 +0000 (22:42 +0000)]
Add a new directive "manpath path"
to replace the legacy "_whatdb path/whatis.db".
Keep _whatdb support for backward compat, for now.
Discussed with many, jmc@ and ajacoutot@ agree with the general direction.
Ingo Schwarze [Fri, 20 Mar 2015 15:25:12 +0000 (15:25 +0000)]
Patch from Christian Neukirchen <chneukirchen at gmail dot com>:
He reports that on some platforms, it is not possible to use the
same va_list twice. So use va_copy(3) for additional safety.
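A minimal sketch of the va_copy(3) pattern, not the patched mandoc code:
format_alloc() is a hypothetical helper that walks its argument list
twice, once to measure and once to write, so the first pass runs on a copy.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from mandoc: format into a freshly
 * allocated string.  The argument list is traversed twice, so the
 * measuring pass uses a copy and leaves ap untouched.
 */
char *
format_alloc(const char *fmt, ...)
{
	va_list	 ap, ap2;
	char	*buf;
	int	 len;

	va_start(ap, fmt);
	va_copy(ap2, ap);	/* keep ap pristine for the second pass */
	len = vsnprintf(NULL, 0, fmt, ap2);
	va_end(ap2);

	if (len < 0 || (buf = malloc((size_t)len + 1)) == NULL) {
		va_end(ap);
		return NULL;
	}
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}
```

Every va_copy(3) needs its own va_end(3), just like va_start(3).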
Ingo Schwarze [Fri, 20 Mar 2015 12:54:22 +0000 (12:54 +0000)]
Simplify by almost halving the number of macro flags:
1. MAN_EXPLICIT was used iff fp == blk_exp, so just test fp.
2. MAN_FSCOPED was used only for TP, so just test for TP.
3. MAN_NOCLOSE was completely unused.
No functional change.
Ingo Schwarze [Thu, 19 Mar 2015 14:57:29 +0000 (14:57 +0000)]
Compat glue needed for Solaris 9 and 10.
Thanks to Sevan Janiyan <venture37 at geeklan dot co dot uk> for
reporting the Solaris 10 issues, to Jan Holzhueter <jh at opencsw
dot org> for some additional insight, and to OpenCSW in general for
providing me with a Solaris 9/10/11 testing environment.
Ingo Schwarze [Wed, 18 Mar 2015 19:29:48 +0000 (19:29 +0000)]
We always use FTS_NOCHDIR, so delete the directory changing code.
This not only simplifies matters, but also helps operating systems
lacking dirfd(3), for example Solaris 10. Solaris dirfd issue
reported by Sevan Janiyan <venture37 at geeklan dot co dot uk>.
Ingo Schwarze [Tue, 17 Mar 2015 07:33:07 +0000 (07:33 +0000)]
When the user exits the pager before the pager has drained all input
from man(1), man(1) dies from SIGPIPE. Exiting man(1) is fine in this
case, generating more output would be pointless, but without handling
SIGPIPE, the exit code from man(1) was wrong and csh(1) printed an
ugly message "Broken pipe". Fix this by handling SIGPIPE explicitly.
Issue noticed by deraadt@.
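One possible way to handle SIGPIPE explicitly, shown as a sketch only;
the commit message does not say which approach was taken.
write_to_pager() is a hypothetical helper that ignores the signal, so
write(2) fails with EPIPE and the caller can exit with a sane status.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to the pager without dying from SIGPIPE.
 * Returns 0 on success, 1 if the reader went away, -1 on other errors.
 */
int
write_to_pager(int fd, const char *buf, size_t len)
{
	ssize_t	 n;

	signal(SIGPIPE, SIG_IGN);	/* turn the signal into EPIPE */
	while (len > 0) {
		n = write(fd, buf, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return errno == EPIPE ? 1 : -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller seeing the return value 1 can stop formatting and exit quietly.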
Ingo Schwarze [Sun, 15 Mar 2015 16:53:41 +0000 (16:53 +0000)]
Avoid off-by-one read access to the termacts array, which could
sometimes result in missing line breaks before subsection headers.
Found by carsten dot kunze at arcor dot de on SuSE 13.2.
Ingo Schwarze [Fri, 13 Mar 2015 20:20:07 +0000 (20:20 +0000)]
Remove the first comma from constructs like ", and," and ", or,":
You can use "and" and "or" to join sentence clauses,
and you can use commas, but using both hinders reading;
patch from jmc@.
Ingo Schwarze [Fri, 13 Mar 2015 00:19:41 +0000 (00:19 +0000)]
Fix hardlink detection on platforms having padding in struct inodev,
typically 64bit platforms. This was basically broken since forever.
Not only is the padding used, but it was used uninitialized.
Problem reported by jmc@.
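A sketch of the usual remedy for this class of bug, under assumptions:
the struct layout and both helper names below are illustrative, not
copied from mandoc. Zeroing the whole object before filling in the
members makes the padding bytes deterministic in practice, so a bytewise
comparison or hash sees equal keys as equal.

```c
#include <string.h>
#include <sys/types.h>

/* Assumed layout; many 64-bit platforms insert padding here. */
struct inodev {
	ino_t	 st_ino;
	dev_t	 st_dev;
};

/* Hypothetical helper: build a key that is safe to compare bytewise. */
void
inodev_init(struct inodev *key, ino_t ino, dev_t dev)
{
	memset(key, 0, sizeof(*key));	/* clear padding as well */
	key->st_ino = ino;
	key->st_dev = dev;
}

/* Hypothetical helper: same file iff all bytes of the keys match. */
int
inodev_same(const struct inodev *a, const struct inodev *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```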
Ingo Schwarze [Wed, 11 Mar 2015 13:15:44 +0000 (13:15 +0000)]
When manpath(1) is available, enable HAVE_MANPATH even when building
without database support. Required now that we have man(1) even
without database support.
Ingo Schwarze [Tue, 10 Mar 2015 13:50:03 +0000 (13:50 +0000)]
We can keep track of the pager PID without additional complexity.
No functional change for now, but more robust in case anybody should
ever add additional child processes.
Ingo Schwarze [Tue, 10 Mar 2015 03:02:28 +0000 (03:02 +0000)]
Fix a regression caused in rev. 1.212, reported by kristaps@:
When using a pager and the first manual shown is gzip'ed,
the gunzip(1) process ended up as a child of the pager process
such that the man(1) process couldn't wait for it, preventing
proper display of the manual.
Solve this by making the pager a child of the man(1) process
(instead of the other way round), which requires being a bit
more careful about properly closing file descriptors after use
and waiting for the pager before exiting man(1).
Ingo Schwarze [Fri, 6 Mar 2015 15:48:52 +0000 (15:48 +0000)]
Fix vertical spacing at the beginning of tables.
man(7) always prints a blank line, mdoc(7) doesn't.
Problem in mdoc(7) reported by kristaps@.
mdoc(7) part of the patch tested by kristaps@.
Ingo Schwarze [Fri, 6 Mar 2015 11:03:03 +0000 (11:03 +0000)]
Flush the line preceding a table before clearing the right margin,
such that that line isn't output with unlimited width.
Problem reported and fix OK by kristaps@.
Ingo Schwarze [Mon, 2 Mar 2015 14:50:17 +0000 (14:50 +0000)]
If a non-gz manual is read after a gzipped manual, refrain
from throwing a bogus error "wait: No child processes".
As reported by Baptiste Daroussin <bapt at FreeBSD dot org>,
clearing the state variable curp->child after use was forgotten.
Ingo Schwarze [Fri, 27 Feb 2015 16:22:09 +0000 (16:22 +0000)]
When makewhatis(8) scans a tree, ignore trailing garbage on filenames.
This is relevant because some ports install files like man1/xsel.1x,
as reported by patrick keshishian <pkeshish at gmail dot com> on misc@.
We can probably improve functionality and simplify the code by ignoring
file name extensions altogether; we already know the section number from
the name of the directory. But so close to lock, i'm keeping the fix
minimal.
Ingo Schwarze [Fri, 27 Feb 2015 16:02:10 +0000 (16:02 +0000)]
When man(1) and apropos(1) look for a file man1/foo.1 but it's unavailable,
fall back to glob(man1/foo.*), which is more like what old man(1) did.
Do this both for file names from the database and for fs_lookup().
This is relevant because some ports install files like man1/xset.1x.
Regression reported by patrick keshishian <pkeshish at gmail dot com>.
Ingo Schwarze [Fri, 20 Feb 2015 23:55:10 +0000 (23:55 +0000)]
For selecting a two-digit font size, support the historic syntax \s12
in addition to the classic syntax \s(12, the modern syntax \s[12],
and the alternative syntax \s'12'. The historic syntax only works
for the font sizes 10-39.
Real-world usage found by naddy@ in plan9/rc.
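To illustrate the four syntaxes, here is a sketch of a parser for the
argument following \s. parse_size() is a hypothetical helper, not the
mandoc implementation. It ignores relative sizes such as \s+2 and
assumes that in the historic form a leading 1, 2, or 3 consumes a second
digit, while any other digit stands alone.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse the argument following "\s".
 * Returns the number of bytes consumed and stores the size in *sz,
 * or returns 0 if the argument is malformed.
 */
size_t
parse_size(const char *p, int *sz)
{
	const char	*start = p;
	char		 term;

	*sz = 0;
	if (*p == '(') {		/* classic: \s(12 */
		p++;
		if (!isdigit((unsigned char)p[0]) ||
		    !isdigit((unsigned char)p[1]))
			return 0;
		*sz = (p[0] - '0') * 10 + (p[1] - '0');
		return (size_t)(p + 2 - start);
	}
	if (*p == '[' || *p == '\'') {	/* modern \s[12], alternative \s'12' */
		term = *p == '[' ? ']' : '\'';
		p++;
		if (!isdigit((unsigned char)*p))
			return 0;
		while (isdigit((unsigned char)*p) && *sz < 10000)
			*sz = *sz * 10 + (*p++ - '0');
		if (*p != term)
			return 0;
		return (size_t)(p + 1 - start);
	}
	if (!isdigit((unsigned char)*p))
		return 0;
	*sz = *p++ - '0';
	/* historic: only 10-39 can be written with two bare digits */
	if (*sz >= 1 && *sz <= 3 && isdigit((unsigned char)*p))
		*sz = *sz * 10 + (*p++ - '0');
	return (size_t)(p - start);
}
```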
Ingo Schwarze [Fri, 20 Feb 2015 22:40:38 +0000 (22:40 +0000)]
Completely delete all carriage return characters from the input.
No change to messages about them (ignore them right before line feeds,
report errors elsewhere).
naddy@ found a manual in the wild containing lots of these (ysm(1)),
and i can't imagine a situation where dropping them could be problematic.
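Dropping the characters can be done in place with two indices. The
following is a sketch only; strip_cr() is a hypothetical name and the
actual code may well be structured differently.

```c
#include <stddef.h>

/*
 * Hypothetical helper: delete every carriage return from buf in place
 * and return the new length.  Other bytes keep their relative order.
 */
size_t
strip_cr(char *buf, size_t len)
{
	size_t	 in, out;

	for (in = out = 0; in < len; in++)
		if (buf[in] != '\r')
			buf[out++] = buf[in];
	return out;
}
```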
Ingo Schwarze [Tue, 17 Feb 2015 20:37:16 +0000 (20:37 +0000)]
Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering
of .Do/.Dc, .Dq, .Lb, and .St untouched.
Reduces groff-mandoc differences in OpenBSD base by about 7%.
Reminded of the issue by naddy@.
Ingo Schwarze [Tue, 17 Feb 2015 18:09:14 +0000 (18:09 +0000)]
Cope with another one of the many kinds of DocBook stupidity:
Instead of just using .br, DocBook sometimes fiddles with the
utterly unportable internal register \n[an-break-flag] that is
only available in the GNU implementation of man(7) and then arms
an input line trap to call the equally unportable internal macro
.an-trap that, in the GNU implementation, inspects that variable;
all the world is GNU, isn't it?
Since naddy@ reports that quite a few ports manuals suffer from
this insanity, let's just translate it to the intended .br.
Ingo Schwarze [Tue, 17 Feb 2015 17:16:52 +0000 (17:16 +0000)]
Let .it accept numerical expressions, not just numerical constants.
For .it, ignore scaling units in roff_getnum().
Inside parentheses, skip whitespace after a sign in roff_getnum().
Parse and ignore unary plus in roff_getnum().
As a bonus, get rid of the only call to mandoc_strntoi() in roff.c.
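The three roff_getnum() changes can be pictured with the following
sketch. It is not the real function: getnum(), both flag names, and the
set of scaling unit letters are assumptions made for illustration.

```c
#include <ctype.h>
#include <string.h>

#define GETNUM_SCALE	(1 << 0)	/* hypothetical: skip a scaling unit */
#define GETNUM_PAREN	(1 << 1)	/* hypothetical: inside parentheses */

/*
 * Parse an optionally signed decimal integer at v[*pos].
 * On success, store it in *res, advance *pos, and return 1.
 * Digits beyond the ninth are left unparsed in this sketch.
 */
int
getnum(const char *v, int *pos, int *res, int flags)
{
	int	 p = *pos, n = 0, neg = 0, first;

	if (v[p] == '-' || v[p] == '+') {	/* unary plus is a no-op */
		neg = v[p++] == '-';
		if (flags & GETNUM_PAREN)
			while (isspace((unsigned char)v[p]))
				p++;
	}
	first = p;
	while (isdigit((unsigned char)v[p]) && n < 100000000)
		n = n * 10 + (v[p++] - '0');
	if (p == first)
		return 0;		/* no digits: leave *pos alone */
	if ((flags & GETNUM_SCALE) && v[p] != '\0' &&
	    strchr("cfiMmnPpsuv", v[p]) != NULL)
		p++;			/* unit letter is skipped, not applied */
	*res = neg ? -n : n;
	*pos = p;
	return 1;
}
```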
Ingo Schwarze [Mon, 16 Feb 2015 16:23:54 +0000 (16:23 +0000)]
Delete the -V option. It serves no purpose but keeps confusing people.
Keeping track of the versions of installed software is the job of
the package manager, not of the individual binaries. If individual
binaries include version numbers, that tends to goad people into
writing broken configuration tests that inspect version numbers
instead of properly testing for features.
Ingo Schwarze [Sun, 15 Feb 2015 17:57:45 +0000 (17:57 +0000)]
Tweak the wording to avoid the possible misunderstanding that .In
could only be used in the SYNOPSIS section. It is fine anywhere.
Issue noticed by bentley@.
Ingo Schwarze [Thu, 12 Feb 2015 13:54:50 +0000 (13:54 +0000)]
After almost five years and 99 revisions, mdoc_macro.c rev. 1.182
finally fixed the four issues explained in the mdoc_macro.c rev. 1.83
commit message.
Ingo Schwarze [Thu, 12 Feb 2015 13:00:52 +0000 (13:00 +0000)]
Do not confuse .Bl -column lists that just broke another block
with newly opened .Bl -column lists;
fixing an assertion failure jsg@ found with afl:
test case #481, Bl It Bl -column It Bd El text text El
Ingo Schwarze [Thu, 12 Feb 2015 12:24:33 +0000 (12:24 +0000)]
Delete the mdoc_node.pending pointer and the function calculating
it, make_pending(), which was the most difficult function of the
whole mdoc(7) parser. After almost five years of maintaining this
hellhole, i just noticed the pointer isn't needed after all.
Blocks are always rewound in the reverse order they were opened;
that even holds for broken blocks. Consequently, it is sufficient
to just mark broken blocks with the flag MDOC_BROKEN and breaking
blocks with the flag MDOC_ENDED. When rewinding, instead of iterating
the pending pointers, just iterate from each broken block to its
parents, rewinding all that are MDOC_ENDED and stopping after
processing the first ancestor that is not MDOC_BROKEN. For ENDBODY
markers, use the mdoc_node.body pointer in place of the former
mdoc_node.pending.
This also fixes an assertion failure found by jsg@ with afl,
test case #467 (Bo Bl It Bd Bc It), where (surprise surprise)
the pending pointer got corrupted.
Improved functionality, minus one function, minus one struct field,
minus 50 lines of code.
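The ancestor walk can be sketched as follows. This is an illustration
under assumptions, not the parser code: struct blk_node, the flag
values, and rewind_ancestors() are all made up here, and "rewinding" is
reduced to setting a marker.

```c
#include <stddef.h>

#define MDOC_BROKEN	(1 << 0)	/* assumed value: a broken block */
#define MDOC_ENDED	(1 << 1)	/* assumed value: a breaking block */

/* Hypothetical, heavily reduced stand-in for a syntax tree node. */
struct blk_node {
	struct blk_node	*parent;
	int		 flags;
	int		 closed;	/* set once the block is rewound */
};

/*
 * Starting from a broken block that was just rewound, walk up the
 * parent chain, rewind every ancestor that is MDOC_ENDED, and stop
 * after the first ancestor that is not MDOC_BROKEN.
 * Returns the number of ancestors rewound.
 */
int
rewind_ancestors(struct blk_node *n)
{
	int	 count = 0;

	for (n = n->parent; n != NULL; n = n->parent) {
		if (n->flags & MDOC_ENDED) {
			n->closed = 1;
			count++;
		}
		if ((n->flags & MDOC_BROKEN) == 0)
			break;
	}
	return count;
}
```

No per-node pending pointer is needed because the parent chain already
visits the blocks in the reverse order they were opened.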
Ingo Schwarze [Tue, 10 Feb 2015 17:47:45 +0000 (17:47 +0000)]
Be more careful to not generate empty .In, .St, and .Xr nodes.
That could happen when their first argument was another called macro,
causing a NULL pointer access in .St validation found by jsg@ with afl.
Make in_line_argn() easier to understand by using one state
variable rather than two.
Ingo Schwarze [Tue, 10 Feb 2015 11:03:13 +0000 (11:03 +0000)]
Do not read past the end of the buffer if an "f" layout font modifier
is followed by the end of the input line instead of a font specifier.
Found by jsg@ with afl, test case #591.
While here, improve functionality as well:
* There is no "r" font modifier.
* Font specifiers (as opposed to font modifiers) are case sensitive.
* One-character font specifiers require trailing whitespace.
* Ignore parenthesized and two-letter font specifiers.
Ingo Schwarze [Sat, 7 Feb 2015 16:42:33 +0000 (16:42 +0000)]
Closing a block validates it, which may end up deleting it,
so if we are in a loop over blocks, cleanly restart the loop
rather than risking use after free; found by jsg@ with afl.
Ingo Schwarze [Fri, 6 Feb 2015 07:13:14 +0000 (07:13 +0000)]
Delete the legacy generic warning type MANDOCERR_ARGCWARN,
replacing the last instances by more specific warnings.
Improved functionality, minus 50 lines of code.
Ingo Schwarze [Tue, 3 Feb 2015 21:16:02 +0000 (21:16 +0000)]
Enable the integrated man(1) even when database support is disabled,
using the file system lookup fallback code, also reducing the number
of preprocessor conditional directives.
Hopefully, it will make some small Linux distros happy.
Ingo Schwarze [Tue, 3 Feb 2015 01:14:12 +0000 (01:14 +0000)]
Finally delete the kitchensink functions rew_sub() and rew_dohalt().
They were a maintenance and auditing nightmare because if you changed
one bit in there, stuff tended to break at seemingly unrelated places.
No functional change except getting rid of one bogus error message,
but minus 80 lines of code.