Ingo Schwarze [Sat, 29 Aug 2015 15:28:13 +0000 (15:28 +0000)]
Including <ohash.h> requires including <stdint.h> before,
and "config.h" was missing as well.
Patch from Svyatoslav Mishyn <juef and openmailbox dot org>, Crux Linux.
Remove the hack of scrolling forward and backward with +G1G that
many (jmc@, millert@, espie@, deraadt@) considered revolting.
Instead, when using a pager, since we are using a temporary file
for tags anyway, use another temporary file for the formatted
page(s), as suggested by millert@ and similar to what the traditional
BSD man(1) did, except that we use only one single temporary output
file rather than one for each formatted manual page, such that
searching (both with / and :t) works across all the displayed files.
Simplify and make tag_put() more efficient by integrating tag_get()
into it and by only handling NUL-terminated strings.
Minus 25 lines of code, no functional change.
When creation of the temporary tags file fails, call the pager
without the -T option, because otherwise the pager won't even start.
Fixing a bug reported by jca@.
While here, shorten the code by two lines
and delete one internal interface function.
Do not fork and exec gunzip(1), just link with libz instead.
As discussed with deraadt@, that's cleaner and will help tame(2).
Something like this was also suggested earlier by bapt at FreeBSD.
Minus 50 lines of code, deleting one interface function (mparse_wait),
no functional change intended.
Insist that manual page file name extensions must begin with a digit,
lest pkg.conf(5) be shown when pkg(5) is asked for;
issue reported by Michael Reed <m dot reed at mykolab dot com>.
Initial, still somewhat experimental implementation to leverage
less(1) -T and :t ctags(1)-like functionality to jump to the
definitions of various terms inside manual pages.
To be polished in the tree, so bear with me and report issues.
Technically, if less(1) is used as a pager, information is collected
by the mdoc(7) terminal formatter, first stored using the ohash
library, then ultimately written to a temporary file which is passed
to less via -T. No change intended for other output formatters or
when running without a pager.
Based on an idea from Kristaps using feedback from many, in particular
phessler@ nicm@ millert@ halex@ doug@ kspillner@ deraadt@.
Fix the "depend" target and regenerate Makefile.depend:
* do not process the test-*.c files, they are not built via make
* add the missing compat_stringlist.c and soelim.c
* read.c now uses roff_int.h
* roff.c no longer uses libmdoc.h
Ingo Schwarze [Thu, 7 May 2015 12:08:13 +0000 (12:08 +0000)]
Do not let the -m option or MANPATH with leading, trailing, or double
colon override the default manpath, let them add to the default manpath.
Only override the default manpath by the -M option, by MANPATH without
leading, trailing, or double colon, or by "manpath" in man.conf(5).
Problem reported by Jan Stary <hans at stare dot cz>.
Patch OK'ed by millert@.
Ingo Schwarze [Fri, 1 May 2015 16:58:33 +0000 (16:58 +0000)]
mdoc_valid_post() may indirectly call roff_node_unlink() which may
set ROFF_NEXT_CHILD, which is desirable for the final call to
mdoc_valid_post() - in case the target itself gets deleted, the
parse point may need this adjustment - but not for the intermediate
calls - if intermediate nodes get deleted, that mustn't clobber the
parse point. So move setting ROFF_NEXT_SIBLING to the proper place
in rew_last().
This fixes the assertion failure in jsg@'s afl test case 108/Apr27.
Ingo Schwarze [Fri, 1 May 2015 16:02:47 +0000 (16:02 +0000)]
Setting the "last" member of struct roff_node was done at an extremely
weird place. Move it to the obviously correct place.
Surprisingly, this didn't cause any misformatting in the test suite
or in any base system manuals, but i cannot believe the code was
really correct for all conceivable input, and it would be very hard
to verify. At the very least, it cannot have worked for man(7).
Ingo Schwarze [Fri, 1 May 2015 15:27:54 +0000 (15:27 +0000)]
Minor bug fix: When .Pp rewinds .Nm, rewind the whole block,
not just the body. In some unusual edge cases, this caused
the .Pp to become a sibling of the .Nm body inside the .Nm block.
If a block body gets broken, that's no good reason to extend the
scope of the end macro. Instead, only keep the tail scope open if
the end macro macro calls an explicit macro and actually breaks
that. This corrects syntax tree structure and fixes an assertion
found by jsg@ with afl (test case 098/Apr27).
Replace the kludge for the \z escape sequence by an actual
implementation. As a side effect, minus ten lines of code.
As another side effect, this also fixes the assertion failure that
used to be triggered by "\z\o'ab'c" at the beginning of an output
line, found by jsg@ with afl (test case 022/Apr27).
Do not mark a block with the MDOC_BROKEN flag if it merely contains
a mismatching explicit end macro without actually being broken.
Avoids a subsequent upward search for the non-existent breaker
ending up in a NULL pointer access; afl test case 005/Apr27 from jsg@.
When the last line of a table layout turns out to be empty, it is deleted.
Do not just free the struct tbl_row but also make sure that no pointer
to it remains. Fixing a use after free found by jsg@ with afl.
Unify mdoc_deroff() and man_deroff() into a common function deroff().
No functional change except that for mdoc(7), it now skips leading
escape sequences just like it already did for man(7).
Escape sequences rarely occur in mdoc(7) code and if they do,
skipping them is an improvement in this context.
Minus 30 lines of code.
Avoid out-of-bounds read access before the beginning of the
mdoc_macros[] array. This sometimes prevented proper warnings
about text nodes preceding the first section header.
More than one data field may follow T} on the same input line.
Issue found by Christian Neukirchen <chneukirchen at gmail dot com>
in the socket(2) manual on Linux.
Also fixes major rendering bugs (including partial loss of content)
in XkbChangeControls(3), XkbFreeClientMap(3), XkbGetMap(3),
XkbKeyNumGroups(3), and XkbSetMap(3).
If an explicit line break request (.br or .sp) occurs within an .HP block,
the next line doesn't hang, but is simply indented.
Issue found by Christian Neukirchen <chneukirchen at gmail dot com>
in the dmsetup(8) manual on Linux.
This patch also improves the indentation of XDGA(3) and XrmGetResource(3).
Unify trickier node handling functions.
* man_elem_alloc() -> roff_elem_alloc()
* man_block_alloc() -> roff_block_alloc()
The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for
now because they need to do mdoc(7)-specific argument processing.
Decouple the token code for "no request or macro" from the individual
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.
Unify node handling functions:
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.
Delete the wrapper functions mdoc_meta(), man_meta(), mdoc_node(),
man_node() from the mandoc(3) semi-public interface and the internal
wrapper functions print_mdoc() and print_man() from the HTML formatters.
Minus 60 lines of code, no functional change.
Unify {mdoc,man}_{alloc,reset,free}() into roff_man_{alloc,reset,free}().
Minus 80 lines of code, no functional change.
Written on the train from Koeln to Wolfsburg returning from p2k15.
Move mdoc_hash_init() and man_hash_init() to libmandoc.h
and call them from mparse_alloc() and choose_parser(),
preparing unified allocation of struct roff_man.
Profit from the unified struct roff_man and reduce the number of
arguments of mparse_result() by one. No functional change.
Written on the ICE Bruxelles-Koeln on the way back from p2k15.
Replace the structs mdoc and man by a unified struct roff_man.
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.
On a new RS nesting level, the saved width starts from the default
width, not from the saved width of the previous level.
Improves xterm(1) and XSetEventQueueOwner(3); found in transcode_filter(1).
Use the default width for .RS without arguments.
Reduces groff-mandoc differences in base and Xenocara by about 4%.
Found while looking at wpa_supplicant(8).
If a partial explicit block extending to the next input line follows
the end macro of a broken block, put all of it into the breaking block.
Needed for example by mutella(1).
Reduce code duplication, no functional change:
Both partial and full implicit blocks can break explicit blocks.
Put the code to handle both cases into a common function.
Arguments to end macros of broken partial explicit blocks
must go inside the breaking block. For example, in
.It Ic cmd Oo
.Ar optional_arg Oc Ar mandatory_arg
the mandatory_arg is still inside the .It block.
Used for example by mutella(1).
Give man(7) section and subsection headers hanging indentation.
Reduces groff-mandoc differences in base by about 2.5% due to
various Perl manuals having long section titles.
Quirk found in argtable2(3).
Rounding rules for horizontal scaling widths are more complicated.
There is a first rounding to basic units on the input side.
After that, rounding rules differ between requests and macros.
Requests round to the nearest possible character position.
Macros round to the next character position to the left.
Implement that by changing the return value of term_hspan()
to basic units and leaving the second scaling and rounding stage
to the formatters instead of doing it in the terminal handler.
Don't allow breaking the output line after hyphens following escape
sequences. Improves tic(1), sxpm(1), and a few Perl manuals.
Quirk found by naddy@ in milter-greylist(8).
Fix a quirk with respect to empty .HP.
Found while writing a regression test for man_macro.c rev. 1.66.
Incidentally, this brings rendering of XFreeEventData(3) closer to groff.
Vastly simplify man(7) block unwinding, similar to mdoc_macro.c 1.171.
Drop one enum type, two static functions, 70 lines of code.
Also fixes the mpeg_encode(1) manual reported broken by naddy@.
It turns out the man(7) parser suffers from unintelligible handling
of block rewinding, just like then mdoc(7) parser did.
First step in getting rid of rew_scope():
Replace the only call where the target block is known.
This commit is analogous to mdoc_macro.c rev. 1.167.
One down, three to go.
No need to hardcode /usr/bin/ as the path to more(1); helps portability.
We don't hardcode the paths to gunzip(1) and cmp(1) either.
Discussed with ajacoutot@.
Third step towards parser unification:
Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta.
Written of the train from London to Exeter on the way to p2k15.
Second step towards parser unification:
Replace struct mdoc_node and struct man_node by a unified struct roff_node.
To be able to use the tok member for both mdoc(7) and man(7) without
defining all the macros in roff.h, sacrifice a tiny bit of type safety
and make tok an int rather than an enum.
Almost mechanical, no functional change.
Written on the Eurostar from Bruxelles to London on the way to p2k15.
First step towards parser unification:
Replace enum mdoc_type and enum man_type by a unified enum roff_type.
Almost mechanical, no functional change.
Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.
Let man(1) and apropos(1) work even when the current directory
is unusable: Only change back to the current directory when the
directory was changed before and the next path is relative.
This is now more similar to what makewhatis(8) does.
Issue reported by espie@.
Ingo Schwarze [Mon, 30 Mar 2015 16:06:14 +0000 (16:06 +0000)]
Escape punctuation characters that have a different meaning in -Tpdf.
~, `, and ' get translated to non-ASCII characters by most troff
implementations when generating PostScript/PDF output. When the
original ASCII character is meant, it needs to be manually escaped.
Ingo Schwarze [Fri, 27 Mar 2015 16:36:31 +0000 (16:36 +0000)]
Modernize documentation by inserting blanks between option letters
and option arguments, except for -m because "-m an" and "-m andoc"
look just too weird. Of course, the traditional form without the
blank will continue to work.
Ingo Schwarze [Fri, 27 Mar 2015 00:57:28 +0000 (00:57 +0000)]
Document that certain stand-alone accents need escaping in rare cases to
prevent them from being converted to Unicode replacements in PDF output.
Issue found by bentley@, OK jmc@ bentley@.
Ingo Schwarze [Fri, 27 Mar 2015 00:18:14 +0000 (00:18 +0000)]
Add man.conf(5). After adding some additional functionality,
one of the next steps will be to use it in addition to manpath(1)
rather than as an alternative to it.
Ingo Schwarze [Thu, 26 Mar 2015 22:42:32 +0000 (22:42 +0000)]
Add a new directive "manpath path"
to replace the legacy "_whatdb path/whatis.db".
Keep _whatdb support for backward compat, for now.
Discussed with many, jmc@ and ajacoutot@ agree with the general direction.
Ingo Schwarze [Fri, 20 Mar 2015 15:25:12 +0000 (15:25 +0000)]
Patch from Christian Neukirchen <chneukirchen at gmail dot com>:
He reports that on some platforms, it is not possible to use the
same va_list twice. So use va_copy(3) for additional safety.
Ingo Schwarze [Fri, 20 Mar 2015 12:54:22 +0000 (12:54 +0000)]
Simplify by almost halving the number of macro flags:
1. MAN_EXPLICIT was used iff fp == blk_exp, so just test fp.
2. MAN_FSCOPED was used only for TP, so just test for TP.
3. MAN_NOCLOSE was completely unused.
No functional change.
Ingo Schwarze [Thu, 19 Mar 2015 14:57:29 +0000 (14:57 +0000)]
Compat glue needed for Solaris 9 and 10.
Thanks to Sevan Janiyan <venture37 at geeklan dot co dot uk> for
reporting the Solaris 10 issues, to Jan Holzhueter <jh at opencsw
dot org> for some additional insight, and to OpenCSW in general for
providing me with a Solaris 9/10/11 testing environment.
Ingo Schwarze [Wed, 18 Mar 2015 19:29:48 +0000 (19:29 +0000)]
We always use FTS_NOCHDIR, so delete the directory changing code.
This not only simplifies matters, but also helps operating systems
lacking dirfd(3), for example Solaris 10. Solaris dirfd issue
reported by Sevan Janiyan <venture37 at geeklan dot co dot uk>.