Two minor improvements:
1. If mktemp(3) fails, do not overwrite the errno because
all errors mktemp(3) might return are also valid for mkdtemp(3).
2. If mkdir(2) fails, always put back the Xes, even if
the error is fatal and the function is about to return NULL.
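
The two points above can be sketched roughly as follows. This is not the actual compat code: fill_template() is a hypothetical stand-in for mktemp(3) so that the sketch does not depend on platform differences, and the retry limit of 100 is an arbitrary assumption.

```c
#include <sys/stat.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for mktemp(3): replace the nx trailing Xes
 * with random letters.  Fails with EINVAL for short templates.
 * rand(3) is good enough for a sketch, not for production use.
 */
static char *
fill_template(char *path, size_t nx)
{
	static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
	char	*cp;

	if (nx < 6) {
		errno = EINVAL;
		return NULL;
	}
	for (cp = path + strlen(path) - nx; *cp != '\0'; cp++)
		*cp = letters[rand() % 26];
	return path;
}

char *
sketch_mkdtemp(char *path)
{
	char	*start;
	size_t	 nx;
	int	 tries, saved_errno;

	assert(path != NULL);

	/* Locate the trailing run of Xes so it can be restored later. */
	start = path + strlen(path);
	nx = 0;
	while (start > path && start[-1] == 'X') {
		start--;
		nx++;
	}

	for (tries = 0; tries < 100; tries++) {
		/* Point 1: keep the errno set by the name generator. */
		if (fill_template(path, nx) == NULL)
			return NULL;
		if (mkdir(path, 0700) == 0)
			return path;
		/* Point 2: put back the Xes before deciding to give up. */
		saved_errno = errno;
		memset(start, 'X', nx);
		errno = saved_errno;
		if (errno != EEXIST)
			return NULL;
	}
	errno = EEXIST;
	return NULL;
}
```

With this ordering, a caller that gets NULL back still holds the template it passed in and can inspect errno to see why directory creation failed.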
Quirk-compatibility with GNU tbl(1):
With the "nospaces" option, skip space characters before and after "T{",
in addition to skipping those at the beginning and end of data cells.
Minor issue reported by <Oliver dot Corff at email dot de>.
In a tbl(7) having the "nospaces" option, skip space characters
not only at the end of data cells, but also after "T}",
aligning the behaviour of the parser with GNU tbl(1).
Issue reported by <Oliver dot Corff at email dot de>.
In HTML output, in cells with an "n" (number) layout, pad numbers
on the right side with UTF-8 punctuation and figure spaces such
that numbers in different tbl(7) rows align at the decimal point.
The exact HTML output format was suggested
by <Oliver dot Corff at email dot de>;
the implementation in C is mine.
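
A minimal sketch of the padding idea, assuming that a missing decimal point is replaced by one U+2008 PUNCTUATION SPACE and each missing fractional digit by one U+2007 FIGURE SPACE. The function name pad_number() and its interface are hypothetical and the real HTML formatter may well differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIGURE_SPACE	"\xe2\x80\x87"	/* U+2007, as wide as a digit */
#define PUNCT_SPACE	"\xe2\x80\x88"	/* U+2008, as wide as a period */

/*
 * Copy num into buf and pad it on the right such that the decimal
 * point lines up with numbers having up to maxfrac fractional digits.
 * Returns 0 on success or -1 if buf is too small.
 */
int
pad_number(const char *num, size_t maxfrac, char *buf, size_t bufsz)
{
	const char	*dot;
	size_t		 frac, missing, need, i;

	assert(num != NULL && buf != NULL);

	dot = strchr(num, '.');
	frac = dot == NULL ? 0 : strlen(dot + 1);
	missing = frac < maxfrac ? maxfrac - frac : 0;

	/* Each of the two space characters takes three bytes in UTF-8. */
	need = strlen(num) + (dot == NULL ? 3 : 0) + 3 * missing + 1;
	if (need > bufsz)
		return -1;

	strcpy(buf, num);
	if (dot == NULL)
		strcat(buf, PUNCT_SPACE);
	for (i = 0; i < missing; i++)
		strcat(buf, FIGURE_SPACE);
	return 0;
}
```

When the cells of a column are right-aligned, padding "42" and "3.1" to two fractional digits in this way puts their decimal positions in the same place as that of "2.75".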
If the layout or data of an individual cell in a tbl(7) contains
only "_", "-", or "=", requesting a horizontal line to be drawn
across the middle of the cell, print <hr/> in that cell in HTML
output.
That is arguably slightly ugly because HTML 5 regards <hr/> as
semantic markup, meaning "thematic break". If somebody knows
a better way to render a horizontal line across the middle of a
table cell with pure HTML and CSS, and without implying a specific
meaning, please tell me.
Missing feature reported by <Oliver dot Corff at email dot de>.
Fix an infinite loop that could occur during some cases of horizontally
overlapping horizontal spans. One span would calculate a desired
target width and start preparations for applying it to some columns,
then the other span would overwrite the target width with a different
value and also start preparations for applying that one to some
columns, which could sometimes confuse the code doing the final
distribution to the point of not doing anything at all before
entering the next iteration.
Fix this by making sure the distribution is done one step at a time
rather than allowing multiple steps to conflict.
Specifically, always do the smallest useful step first. This change
also simplifies the code. For example, the local "colwidth" array
is no longer needed.
Note that the algorithm still differs from the one implemented in
GNU tbl(1), which appears to not even try to harmonize column widths
but seems to simply distribute the same amount to all constituent
columns, no matter whether their intrinsic width is narrow or wide.
Adopting a GNU-compatible algorithm might allow further simplification
in addition to yielding even more similar output, but i do not want
to implement any major changes of the algorithm at this time.
The infinite loop was reported by <Oliver dot Corff at email dot de>.
Correctly calculate required column widths for tables containing
cells that horizontally span columns which contain "n" (number)
formatted cells on other rows. This requires updating total column
widths from "n" formatted cells before starting width distribution
from the spanning cells to their constituent columns.
during prioritization for man(1), correctly extract the section name
from the file name extension of gzipped manual page files; bug found
on Alpine Linux by Soeren Tempel <soeren at soeren hyphen tempel dot net>,
who also tested this patch
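
For illustration, extracting the section from a possibly gzipped file name could look like the sketch below. The function name section_from_filename() is hypothetical, and only the ".gz" suffix is considered here; that is an assumption, not a statement about which suffixes man(1) handles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Store the section name found in the file name extension into sec,
 * ignoring a trailing ".gz".  Returns 0 on success or -1 if there is
 * no extension or if it does not fit into sec.
 */
int
section_from_filename(const char *file, char *sec, size_t secsz)
{
	const char	*end, *dot;
	size_t		 len;

	assert(file != NULL && sec != NULL);

	/* Pretend the name ends before the compression suffix. */
	len = strlen(file);
	end = file + len;
	if (len > 3 && strcmp(end - 3, ".gz") == 0)
		end -= 3;

	/* Walk back to the last dot within the last path component. */
	for (dot = end; dot > file; dot--)
		if (dot[-1] == '.' || dot[-1] == '/')
			break;
	if (dot == file || dot[-1] != '.' || dot == end)
		return -1;

	len = (size_t)(end - dot);
	if (len >= secsz)
		return -1;
	memcpy(sec, dot, len);
	sec[len] = '\0';
	return 0;
}
```

Without the suffix handling, "man3/strlen.3.gz" would yield "gz" as the section, which is the kind of mistake that can disturb prioritization.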
The official designation by AT&T was "UNIX/32V", so use that in the output.
That also makes sense because "system/architecture" is a widespread
convention to refer to the port of an operating system to a specific
architecture, in this case 32V (32bit DEC VAX).
The former wording "Version 32V AT&T UNIX" was misleading
because 32V is not a version number.
Even though UNIX/32V was not officially designated as Version 7 by AT&T,
prepend "Version 7" because it was in fact a straightforward port of
Version 7 AT&T UNIX. That makes it easier to understand for 21st
century readers of manual pages.
Suggested by nabijaczleweli at nabijaczleweli dot xyz.
Same change as in GNU troff commit 21d30728.
OK G dot Branden dot Robinson at gmail dot com (gbranden@ in groff)
In the fallback code to look for manual pages without using mandoc.db(5),
accept files "man<one-digit-section>/<name>.<full-section>"
in addition to the already supported "man<full-section>/name.[01-9]*".
Needed for example on Alpine Linux which puts its Perl manuals
into "man3/<name>.3pm" and the POSIX manuals into "man3/<name>.3p".
While here, allow the glob(3) at the end of fs_lookup() to add multiple
matches to the result set. This improves man -w output and may also
help some cases of plain man(1), allowing main() to prioritize properly
rather than fs_lookup() picking a random match.
Issue reported and patch tested
by Soeren Tempel <soeren at soeren hyphen tempel dot net>.
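
The additional path form can be sketched as below, assuming that the one-digit section is simply the first character of the full section name. The helper alt_manpage_path() is hypothetical and is not part of fs_lookup().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "man<one-digit-section>/<name>.<full-section>" in buf.
 * Returns 0 on success or -1 if sec is empty or buf is too small.
 */
int
alt_manpage_path(const char *sec, const char *name,
    char *buf, size_t bufsz)
{
	int	 n;

	assert(sec != NULL && name != NULL && buf != NULL);

	if (sec[0] == '\0')
		return -1;
	n = snprintf(buf, bufsz, "man%c/%s.%s", sec[0], name, sec);
	if (n < 0 || (size_t)n >= bufsz)
		return -1;
	return 0;
}
```

For section "3pm" and name "strict", this yields "man3/strict.3pm", matching the Alpine Linux layout described above.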
Ingo Schwarze [Thu, 19 Aug 2021 16:55:31 +0000 (16:55 +0000)]
do not crash when a manpath directory contains a symbolic link
that points to a directory rather than to a regular file;
bug reported by Lukas Epple <sternenseemann at systemli dot org>,
and my patch also tested by him on NixOS
Ingo Schwarze [Tue, 10 Aug 2021 12:55:03 +0000 (12:55 +0000)]
Support two-character font names (BI, CW, CR, CB, CI)
in the tbl(7) layout font modifier.
Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use
the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead,
which simplifies and unifies some code.
While here, also support CB and CI in roff(7) \f escape sequences
and in roff(7) .ft requests for all output modes. Using those is
certainly not recommended because portability is limited even with
groff, but supporting them makes some existing third-party manual
pages look better, in particular in HTML output mode.
Bug-compatible with groff as far as i'm aware, except that i consider
font names starting with the '\n' (ASCII 0x0a line feed) character
so insane that i decided to not support them.
Missing feature reported by nabijaczleweli dot xyz in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002.
I used none of the code from the initial patch submitted by
nabijaczleweli, but some of their ideas.
Final patch tested by them, too.
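
As an illustration only, mapping font names to a single enumeration might look like the sketch below. The enum sketch_font and the function font_from_name() are hypothetical stand-ins and do not reproduce the ESCAPE_FONT* members from mandoc.h; treating CW as a synonym for CR is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical font enumeration, for this sketch only. */
enum sketch_font {
	FONT_UNKNOWN,
	FONT_ROMAN,
	FONT_BOLD,
	FONT_ITALIC,
	FONT_BI,	/* bold italic */
	FONT_CR,	/* constant width, regular */
	FONT_CB,	/* constant width, bold */
	FONT_CI		/* constant width, italic */
};

enum sketch_font
font_from_name(const char *name)
{
	static const struct {
		const char	*name;
		enum sketch_font font;
	} map[] = {
		{ "R",  FONT_ROMAN },
		{ "B",  FONT_BOLD },
		{ "I",  FONT_ITALIC },
		{ "BI", FONT_BI },
		{ "CW", FONT_CR },	/* assumed synonym for CR */
		{ "CR", FONT_CR },
		{ "CB", FONT_CB },
		{ "CI", FONT_CI }
	};
	size_t	 i;

	assert(name != NULL);

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (strcmp(name, map[i].name) == 0)
			return map[i].font;
	return FONT_UNKNOWN;
}
```

Using one enumeration for both tbl(7) layout modifiers and roff(7) font escapes is what lets the formatters share code instead of checking separate bold and italic flags.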
Ingo Schwarze [Sat, 7 Aug 2021 13:02:10 +0000 (13:02 +0000)]
Rename the compile-time configuration variable $HOMEBREWDIR to
$READ_ALLOWED_PATH, allow it to contain more than one directory,
and explain how to use it for NixOS and for GNU Guix Linux.
Feature improvement based on observations, input, and earlier patches
from Lukas Epple <sternenseemann at systemli dot org>, and final
patch also tested by Lukas.
Improve the description of .Fl in multiple respects and in particular
improve the .Fl examples, including better demonstrating long options.
Prompted by a question from espie@.
Feedback and OK jmc@.
This combination is somewhat rare because few libraries expose so many
global variables that they need a list to enumerate them, but when the
idiom does occur, tagging the variable names is generally useful.
For example, this helps awk(1), dc(1), make(1), rc.subr(8), ...
Missing feature reported and patch reviewed, tested, and OK'ed by kn@.
The mandoc(1) manual already mentions that -T man output mode
supports neither tbl(7) nor eqn(7) input.
If an input file contains such code anyway, tell the user
rather than failing an assert(3)ion.
Fixing a crash reported by Bjarni Ingi Gislason <bjarniig at rhi dot hi dot is>
in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=901636 which the
Debian maintainer of mandoc, Michael at Stapelberg dot ch, forwarded to me.
Ingo Schwarze [Mon, 28 Jun 2021 19:50:15 +0000 (19:50 +0000)]
In terminal output of man(7) documents, stop printing two extra blank
lines before the NAME section and before the page footer. While these
blank lines had a long tradition, they didn't really serve any purpose
and merely wasted screen real estate. Besides, this makes output from
man(7) more similar to output from mdoc(7).
This commit keeps mandoc compatible with groff-current,
where G. Branden Robinson committed the same change
on June 16 (groff commit 2278d6ed).
Ingo Schwarze [Wed, 2 Jun 2021 18:28:19 +0000 (18:28 +0000)]
In -W style mode, check .Xr links along the full manpath because
that is more useful for validating manuals of non-base software.
Nothing changes in -W all mode: by default for -T lint, we still
assume we want to check base system conventions, including usually
not wanting to link to non-base manual pages.
The use case, a partial idea how to handle it, and a preliminary
patch were originally presented by kn@, then refined by me.
Final patch tested and OK'ed by kn@.
Ingo Schwarze [Wed, 2 Jun 2021 17:51:38 +0000 (17:51 +0000)]
In revision 1.157 of cgi.c, a meta viewport element was added to
the HTML output. Let `mandoc -Thtml' behave the same, making the
generated HTML a bit more pleasant to view on a mobile device.
Patch from anton@.
Ingo Schwarze [Wed, 2 Jun 2021 16:38:29 +0000 (16:38 +0000)]
Cleanup:
1. Move invalid two-byte sequences after valid ones
and make their descriptions easier to understand.
2. Replace the wrong and confusing expression "middle byte"
with the correct term "start byte".
3. Add test lines for U+EFFFF and U+F0000.
4. Replace the unhelpful word "strange" with more descriptive terms.
Arguably, nothing about this (or maybe everything?) is strange.
Ingo Schwarze [Tue, 18 May 2021 13:22:43 +0000 (13:22 +0000)]
When looking for column separators on tbl(7) data lines, properly skip
escape sequences; do not misinterpret bytes from the middle of escape
sequence names or arguments as column separators.
Bug reported and patch tested by Oliver dot Corff at email dot de.
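
The idea can be sketched as below. The sketch is deliberately simplified: it only knows the escape forms \c, \(xx and \[name], plus \f, \* and \n followed by a name in one of those three forms, whereas real roff(7) input has many more escape sequences. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Skip a name of the form "x", "(xx", or "[name]". */
static const char *
skip_name(const char *cp)
{
	int	 i;

	if (*cp == '(') {
		cp++;
		for (i = 0; i < 2 && *cp != '\0'; i++)
			cp++;
	} else if (*cp == '[') {
		while (*cp != '\0' && *cp != ']')
			cp++;
		if (*cp == ']')
			cp++;
	} else if (*cp != '\0')
		cp++;
	return cp;
}

/*
 * Return a pointer to the first separator character on the line
 * that is not part of an escape sequence, or NULL if there is none.
 */
const char *
find_separator(const char *line, char sep)
{
	const char	*cp;

	assert(line != NULL);

	cp = line;
	while (*cp != '\0') {
		if (*cp == sep)
			return cp;
		if (*cp != '\\') {
			cp++;
			continue;
		}
		cp++;	/* Skip the backslash. */
		if (*cp == 'f' || *cp == '*' || *cp == 'n')
			cp++;	/* These escapes take a name argument. */
		cp = skip_name(cp);
	}
	return NULL;
}
```

For example, with a colon as the separator, the data line "\(:a:b" contains one column separator only: the first colon belongs to the name of the special character \(:a and must not split the cell.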
Ingo Schwarze [Sun, 16 May 2021 23:18:35 +0000 (23:18 +0000)]
Implement the layout specification "a" (left justify with 1em indentation)
in HTML output mode; before this patch, the indentation was missing.
Terminal output already supported the "a" specifier since 2010.
Issue reported and patch tested by Oliver dot Corff at email dot de.
Ingo Schwarze [Sun, 16 May 2021 18:11:20 +0000 (18:11 +0000)]
implement the tbl(7) layout modifiers "b" (bold) and "i" (italic)
in HTML output mode, similar to tbl_term.c, function tbl_word();
issue reported by Oliver dot Corff at email dot de
Ingo Schwarze [Sat, 15 May 2021 17:19:04 +0000 (17:19 +0000)]
When looking for the last layout row used, we need to look at the layout
row used for the previous data line containing data, not at the previous
data line outright, which might be a horizontal ruler. If it is, do not
restart from the first layout row but still proceed to the next data row,
which may have been just read from T&.
Bug originally reported by Oliver dot Corff at email dot de
on groff at gnu dot org:
https://lists.gnu.org/archive/html/groff/2021-03/msg00003.html
and forwarded to me by bentley@.
Ingo Schwarze [Tue, 30 Mar 2021 19:26:20 +0000 (19:26 +0000)]
In HTML output, correctly render .Bd -unfilled in proportionally-spaced
font, rather than with the monospace font appropriate for .Bd -literal.
This fixes a minibug reported by anton@.
Implemented by no longer relying on the typical browser default of
"pre { font-family: monospace }" but instead letting <pre> elements
inherit the font family from their parent, then adding an explicit CSS .Li
class only for those displays where the manual page author requested it
by using the -literal option on the .Bd macro.
Ingo Schwarze [Tue, 30 Mar 2021 17:16:55 +0000 (17:16 +0000)]
Append the .html suffix to temporary files so that browsers can recognise them.
Occasionally one might read a manual page in a web browser, e.g.
"MANPAGER=firefox man -T html jq"; however, temporary files created for
pagers lack file extensions and most web browsers are unable to detect a
file's content type without one.
Special case mandoc(1)'s HTML output format by appending the ".html" suffix
to file names such that browsers will actually render HTML as such instead
of showing it as plain text.
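
A minimal sketch of creating such a file, assuming that mkstemps(3) is available (it is a BSD and glibc extension, not part of POSIX). The function name open_pager_tempfile() and the template "/tmp/man.XXXXXXXXXX" are assumptions made for this example.

```c
#define _DEFAULT_SOURCE		/* for mkstemps(3) on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a temporary file for the pager and store its name in path.
 * For HTML output, the name ends in ".html".
 * Returns an open file descriptor or -1 on failure.
 */
int
open_pager_tempfile(int want_html, char *path, size_t pathsz)
{
	int	 n;

	assert(path != NULL);

	n = snprintf(path, pathsz, "/tmp/man.XXXXXXXXXX%s",
	    want_html ? ".html" : "");
	if (n < 0 || (size_t)n >= pathsz)
		return -1;

	/* The second argument is the length of the suffix to keep. */
	return want_html ? mkstemps(path, 5) : mkstemp(path);
}
```

The caller remains responsible for closing the descriptor and for unlinking the file once the pager has exited.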
Ingo Schwarze [Mon, 21 Dec 2020 15:13:09 +0000 (15:13 +0000)]
Rename syntax test of the \O escape sequence (suppress output groff
extension; mandoc only implements syntax checking but ignores the
sequence) to please Bill Gates and didickman@: avoid path names that
only differ by case, like o.in vs. O.in.
Ingo Schwarze [Sat, 31 Oct 2020 11:45:16 +0000 (11:45 +0000)]
Delete a sentence pointing to "the Predefined Strings subsection
of the roff(7) manual." Such a subsection does not exist, and i
do not see why it should. Predefined strings are an obsolete
feature of macro packages, not a feature of the roff language.
Ingo Schwarze [Fri, 30 Oct 2020 21:34:30 +0000 (21:34 +0000)]
Finally get rid of the "overflow: auto" property of ".Bl-tag > dd"
which has long been known to cause ugly and pointless scroll bars.
Matthew Martin <phy1729 at gmail dot com>
helpfully explained the following two points to me:
1. What we need to do here is establish a new block formatting
context such that the first line of the <dd> content moves down
rather than to the right if the preceding <dt> is wide.
2. A comprehensive list of methods
to establish block formatting context is available in:
https://developer.mozilla.org/en-US/docs/Web/Guide/CSS/Block_formatting_context
In that list, i found that "column-count: 1" does the job.
It is part of CSS Multi-column Layout Level 1.
While that is still in Working Draft status according to
https://www.w3.org/Style/CSS/current-work ,
it is fully supported by all browsers according to
https://developer.mozilla.org/en-US/docs/Web/CSS/column-count ,
probably because it was already part of the second draft of this
standard almost 20 years ago: WD-css3-multicol-20010118.
Ingo Schwarze [Fri, 30 Oct 2020 13:24:33 +0000 (13:24 +0000)]
Promote section headers that can be used unmodified as fragment
identifiers from TAG_WEAK to TAG_STRONG,
such that for example ...#DESCRIPTION always works.
Suggested by Aman Verma on the discuss@ list.
Ingo Schwarze [Wed, 28 Oct 2020 15:31:37 +0000 (15:31 +0000)]
Improve the HISTORY and AUTHORS sections, using information
received from Douglas McIlroy in private mail:
https://manpages.bsd.lv/history/mcilroy_26_10_2020.txt
Ingo Schwarze [Sun, 25 Oct 2020 18:28:23 +0000 (18:28 +0000)]
The GNU tbl(1) program contained in the groff package internally
uses roff(7) tabulator settings to implement tables, and it used
to leak the changed tabulator settings from tables to the subsequent
roff(7) code. In mandoc/tbl_term.c rev. 1.54 (June 17, 2017), code
was added to be bug-compatible with groff.
In commit d0e03cf6 (Oct 20, 2020), GNU tbl(1) changed behaviour
to save the tabulator settings before starting a table and restore
them afterwards. Adjust mandoc for compatibility.
Since mandoc implements tables without using roff(7) tabulator
settings, saving and restoring tabulator settings is not needed in
mandoc. Simply deleting the code that changed tabulator settings
by reverting tbl_term.c rev. 1.54 is sufficient in mandoc.
Also adjust the desired output of the regression tests
to match the new behaviour of both groff and mandoc.
Ingo Schwarze [Sat, 24 Oct 2020 22:57:39 +0000 (22:57 +0000)]
Treat \*[.T] in the same way as \*(.T rather than calling abort(3).
Bug found because the groff-current manual pages started using the
variant form of this predefined string.
Ingo Schwarze [Fri, 16 Oct 2020 17:22:43 +0000 (17:22 +0000)]
In HTML output, avoid printing a newline right after <pre>
and right before </pre> because that resulted in vertical
whitespace not requested by the manual page author.
Formatting bug reported by
Aman Verma <amanraoverma plus vim at gmail dot com> on discuss@.
Element next-line scopes can nest. Consequently, even when closing
one element next-line scope, the MAN_ELINE flag must not yet be
cleared if the parent macro is another element macro having next-line
scope, or an assertion failure is caused if all this is wrapped in
another macro that has block next-line scope, for example .TP.
Bug found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
Do not abuse assert(3) to react to absurd input; the only purpose of
assert(3) is to catch internal inconsistencies in the program itself.
Issue found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
Instead, just cut down unreasonably wide spacing requested by the document
to a narrower width.
After .ti, there are many reasons why the offset may change, so setting
it back later requires a guard against underflow, or subsequent assertions
may fail.
Issue found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
Fix two issues with .po (page offset) formatting:
1. Truncate excessive offsets to a width reasonable in the context
of manual pages instead of printing excessively long lines
and sometimes causing assertion failures;
found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
2. Remember both the requested and the applied page offset; otherwise,
subtracting an excessive width, then adding it again, would end up
with an incorrectly large offset.
While here, simplify the code by reverting the previous offset up front,
and also add some comments to make the general ideas easier to understand.
If .ti had an excessive argument, using it was attempted, in some
cases resulting in an assertion failure. Instead, truncate the
temporary indent to a width reasonable in a manual page.
I found the issue in an afl run
that was performed by Jan Schreiber <jes at posteo dot de>.
Do not indent by SIZE_MAX/2 when .ce occurs inside explicit no-fill mode.
While here, drop two unused arguments from the function term_field();
the related work was already done by term_fill() before this commit.
I found the bug in an afl run
that was performed by Jan Schreiber <jes at posteo dot de>.
Ignore unreasonably large spacing modifiers in tbl layouts.
Jan Schreiber <jes at posteo dot de> ran afl on mandoc and it turned
out mandoc tried to use spacing modifiers so large that they would
trigger assertion failures in term_ascii.c, function locale_advance().
Ingo Schwarze [Thu, 27 Aug 2020 15:55:34 +0000 (15:55 +0000)]
Remove a lie reported by Jamie Landeg-Jones <jamie at catflap dot org>:
The times when -T man may have expanded .so requests are long gone,
nor would such a feature be useful. Use soelim(1) if you need that.
Ingo Schwarze [Thu, 27 Aug 2020 14:59:47 +0000 (14:59 +0000)]
Fix a regression caused by the insertion of two new tokens,
which unintentionally made the -O tag= argument mandatory,
breaking commands like "man -akO tag Ic=ulimit".
Noticed while answering questions from Ian Ropers.
Ingo Schwarze [Thu, 27 Aug 2020 14:28:11 +0000 (14:28 +0000)]
Make it more explicit that the statement "-O tag does not work with less(1)"
only applies to -T html output mode, and why. Of course, -O tag works
just fine with less(1) in the -T ascii and -T utf8 output modes.
Potential for confusion pointed out by Ian Ropers.
Ingo Schwarze [Thu, 27 Aug 2020 12:59:02 +0000 (12:59 +0000)]
Avoid artifacts in the most common case of closing conditional blocks
when no arguments follow the closing brace, \}.
For example, the line "'br\}" contained in the pod2man(1) preamble
would throw a bogus "escaped character not allowed in a name" error.
This issue was originally reported by Chris Bennett on ports@,
and afresh1@ noticed it came from the pod2man(1) preamble.
Ingo Schwarze [Mon, 3 Aug 2020 11:02:57 +0000 (11:02 +0000)]
Put the code handling \} into a new function roff_cond_checkend()
and call that function not only from both places where copies
existed - when processing text lines and when processing request/macro
lines in conditional block scope - but also when closing a macro
definition request, such that this construction works:
This fixes a bug reported by John Gardner <gardnerjohng at gmail dot com>.
While here, avoid a confusing decrement of the line scope counter
in roffnode_cleanscope() for conditional blocks that do not have
line scope in the first place (no functional change for this part).
Also improve validation of an internal invariant in roff_cblock()
and polish some comments.
Switch the default pager from "more -s" to "less".
POSIX explicitly allows using a different default pager if that is
documented. Nowadays, the pager provided in most operating systems
is less(1). Our man(1) implementation uses less(1) features that
traditional more(1) did not provide, in particular tagging. Besides,
as noted by deraadt@, the user interface of less(1) is slightly
more refined and preferable over the user interface of more(1).
This switch was originally suggested by Ian Ropers.
In ./configure, test whether less(1) is available. If not, fall
back to more(1). In ./configure.local, support overriding the
automatic test by setting BINM_PAGER.
As explained by jmc@ and deraadt@, the -s flag was added a very
long time ago when an antique version of groff(1) had an annoying
bug in terminal output that would randomly display blank lines in
the middle of pages. Clearly, -s has no longer been needed for
many years, so drop it from the default pager invocation.
OK deraadt@ jmc@ martijn@ job@ on the OpenBSD version of this patch.
Ingo Schwarze [Thu, 25 Jun 2020 20:45:09 +0000 (20:45 +0000)]
Briefly mention groff_mdoc(7) below SEE ALSO. While both authoritative
manual pages document the same content, comparing can occasionally help
in cases of doubt, and some people may prefer one style, some the other.
While here, modernize a few .Lks from http:// to https://.
OK jmc@
Ingo Schwarze [Mon, 22 Jun 2020 20:00:38 +0000 (20:00 +0000)]
Provide a real feature test for __attribute__().
Looking at version numbers like __GNUC__ is always a bad idea.
Believe it or not, this even makes ./configure shorter by one line.
Ingo Schwarze [Mon, 22 Jun 2020 19:20:40 +0000 (19:20 +0000)]
Because mandoc_aux.h and mandoc.h use __attribute__, all files that
include mandoc_aux.h or mandoc.h need to include config.h, too.
It is suspected that for example IRIX needs this, or it is likely
to throw errors in these files because the system compiler doesn't
understand __attribute__.
Issue reported by Kazuo Kuroi <kazuo at irixnet dot org>.