Ingo Schwarze [Thu, 20 Nov 2014 13:56:20 +0000 (13:56 +0000)]
Prevent negative arguments to the .ll request from causing integer
underflow. Found while preparing an audit of termp.rmargin.
Overflow can also happen, but i see no sane way to deal with it,
so just let it happen. It doesn't happen for any sane input anyway,
groff behaviour is undefined, and the resulting values are legal,
even though they are useless.
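
A minimal sketch of the kind of guard described above, not the actual
mandoc code: the helper name apply_line_length() and the choice to
clamp at zero are assumptions made for illustration. It applies a
signed delta to an unsigned margin without letting a negative value
wrap around, and leaves the overflow direction unhandled, as the
commit message says.

```c
#include <stddef.h>

/*
 * Hypothetical helper: apply a signed line-length delta to an
 * unsigned right margin.  Negative deltas larger than the margin
 * clamp to zero instead of wrapping around.
 */
size_t
apply_line_length(size_t rmargin, long delta)
{
	size_t dec;

	if (delta < 0) {
		/* Negate without overflowing on the most negative value. */
		dec = (size_t)(-(delta + 1)) + 1;
		return dec > rmargin ? 0 : rmargin - dec;
	}
	/* Overflow in the positive direction is deliberately ignored. */
	return rmargin + (size_t)delta;
}
```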
Ingo Schwarze [Thu, 20 Nov 2014 00:31:28 +0000 (00:31 +0000)]
Fix two minibugs reported by Thomas Klausner <wiz at NetBSD>:
1. The first argument of .Fn is not supposed to be parsed.
2. The .Fn macro is not supposed to reopen its scope after punctuation.
Ingo Schwarze [Wed, 19 Nov 2014 20:40:51 +0000 (20:40 +0000)]
Three fixes with respect to the names table:
1. Do not mask out NAME_FIRST before its first use.
2. Avoid duplicate NAME_FILE entries.
3. Correctly mask NAME_FILE for .so links.
Ingo Schwarze [Wed, 19 Nov 2014 03:08:17 +0000 (03:08 +0000)]
Escape sequences terminate high-level macro names, and when doing so,
they are ignored, just in the same way as for request names
and for low-level macro names.
This also cures a warning in the pod2man(1) preamble.
Ingo Schwarze [Wed, 19 Nov 2014 01:20:25 +0000 (01:20 +0000)]
Support the ".if v" conditional operator (vroff mode, always false)
for groff compatibility because pod2man(1) uses it that way.
Weirdly, groff documents it as "for compatibility with other
troff versions" but neither Heirloom nor Plan 9 has it.
Issue reported by giovanni@ via sthen@.
Ingo Schwarze [Tue, 18 Nov 2014 19:41:47 +0000 (19:41 +0000)]
Ignore invalid directories in man.conf(5) and MANPATH, even if their
parent directories exist, but complain about invalid directories
given on the command line.
Intended to fix an oddity reported by sthen@.
Ingo Schwarze [Tue, 18 Nov 2014 01:15:21 +0000 (01:15 +0000)]
In man(1) mode, prefer file name matches over .Dt name matches over
first .Nm entries over other NAME .Nm entries over SYNOPSIS .Nm entries.
For example, this makes sure "man ypbind" does not return yp(8).
Re-run "makewhatis" to profit from this change.
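
The preference order above can be sketched as a ranking over source
bits. The SRC_* names and their values are hypothetical and chosen
only for this illustration; they are not taken from the mandoc
database format.

```c
/* Hypothetical source bits; names and values are illustrative only. */
#define SRC_SYN		0x01	/* .Nm entry in SYNOPSIS */
#define SRC_NAME	0x02	/* other .Nm entry in NAME */
#define SRC_FIRST	0x04	/* first .Nm entry in NAME */
#define SRC_TITLE	0x08	/* .Dt name */
#define SRC_FILE	0x10	/* file name */

/* Return 0 for the most preferred kind of match, 5 for no match. */
int
match_rank(unsigned int bits)
{
	if (bits & SRC_FILE)
		return 0;
	if (bits & SRC_TITLE)
		return 1;
	if (bits & SRC_FIRST)
		return 2;
	if (bits & SRC_NAME)
		return 3;
	if (bits & SRC_SYN)
		return 4;
	return 5;
}
```

Under this sketch, a page matched by its file name always sorts ahead
of a page that only mentions the name in a later .Nm entry.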
Ingo Schwarze [Mon, 17 Nov 2014 06:44:58 +0000 (06:44 +0000)]
Multiple fixes with respect to in-line macros:
* .No selects the default font; relevant e.g. in .Bf blocks
* no need to force empty .Li elements
* closing delimiters as leading macro arguments do not suppress space
* opening delimiters at the end of a macro line do not suppress space
* correctly handle delimiter spacing in -Tman
As a side effect, these fixes let mandoc warn about empty .No macros
as requested by bentley@.
Ingo Schwarze [Sun, 16 Nov 2014 21:29:35 +0000 (21:29 +0000)]
When a line (in the sense of term_flushln()) contains white space only,
the `vbl' variable includes the left margin, but `vis' does not.
Prevent a `vis' underflow that caused a bogus blank line.
Bug reported by Carsten Kunze, found in less(1): .Bl -tag ... .It " "
Ingo Schwarze [Sun, 16 Nov 2014 20:46:21 +0000 (20:46 +0000)]
Delete five standards that are:
* not supported by groff
* not used in any OpenBSD, NetBSD, DragonFly or FreeBSD base manual
* superseded or retracted
* and more than ten years old
Triggered by a question from Carsten Kunze (Heirloom troff).
OK guenther@ jmc@
Ingo Schwarze [Fri, 14 Nov 2014 04:24:04 +0000 (04:24 +0000)]
Remove needless and harmful byte swapping on big endian architectures.
Problem found and patch provided by Martin Natano at bitrig, thanks!
Tested on macppc by natano@ and on i386, amd64, and sparc64 myself.
While here, sync with OpenBSD by removing some trailing whitespace.
Ingo Schwarze [Tue, 11 Nov 2014 19:04:55 +0000 (19:04 +0000)]
In man(1) mode without -a, stop searching after the first manual tree
that contained at least one match in order to not prefer mdoc(1) from
ports over mdoc(7). As a bonus, this results in a speedup.
Ingo Schwarze [Tue, 11 Nov 2014 02:43:41 +0000 (02:43 +0000)]
Let -h imply -c (that is, not use the pager).
Usually, -h output is short, so the pager is just a nuisance.
Also, traditional man(1) does not use a pager for -h.
Triggered by a remark of deraadt@ on ICB.
Ingo Schwarze [Mon, 10 Nov 2014 21:56:43 +0000 (21:56 +0000)]
add required vertical spacing before lists that begin
at the beginning of the first item of an enclosing list
right at the beginning of a new section or subsection;
minibug reported by Steffen Nurpmeso <sdaoden at yandex dot com>
Ingo Schwarze [Mon, 3 Nov 2014 23:18:39 +0000 (23:18 +0000)]
Allow the five man(7) font macros to concatenate their line arguments,
the same way the mdoc(7) macros marked MDOC_JOIN do it.
In -Thtml, this removes bogus <br/> when the font macros are used
in no-fill mode; issue found by jsg@ in the Xcursor(3) SYNOPSIS.
As a bonus, this slightly reduces the size of the syntax tree.
Ingo Schwarze [Sat, 1 Nov 2014 06:03:13 +0000 (06:03 +0000)]
Use struct buf in libroff, it is very natural there
and reduces the number of arguments of many functions.
While here, sprinkle some KNF.
No functional change.
Ingo Schwarze [Sat, 1 Nov 2014 04:08:43 +0000 (04:08 +0000)]
Refactor, no functional change: Remove the parse point from struct buf.
Some functions need multiple parse points, some none at all,
and it varies whether any of them need to be passed around.
So better pass them as a separate argument, and only when needed.
Ingo Schwarze [Thu, 30 Oct 2014 20:10:02 +0000 (20:10 +0000)]
Major bugsquashing with respect to -offset and -width:
1. Support specifying the .Bd and .Bl -offset as a macro default width;
while here, simplify the code handling the same for .Bl -width.
2. Correct handling of .Bl -offset arguments: unlike .Bd -offset, the
arguments "left", "indent", and "indent-two" have no special meaning.
3. Fix the scaling of string length -offset and -width arguments in -Thtml.
Triggered by an incomplete documentation patch from bentley@.
Ingo Schwarze [Wed, 29 Oct 2014 03:35:09 +0000 (03:35 +0000)]
Some fine tuning of console rendering of named special characters.
Correct ASCII rendering: \(lb \(<> \(sd
Make ASCII rendering agree with groff, using backspace overstrike:
\(da \(ua \(dA \(uA \(fa \(c* \(c+ \(ib \(ip \(/_ \(pp \(is \(dd \(dg
Ingo Schwarze [Wed, 29 Oct 2014 00:17:43 +0000 (00:17 +0000)]
In terminal output, unify handling of Unicode and numbered character
escape sequences just like it was earlier implemented for -Thtml.
Do not let control characters other than ASCII 9 (horizontal tab)
propagate to the output. groff allows them, but that really doesn't
look like a great idea.
Let mchars_num2char() return int such that we can distinguish invalid \N
syntax from \N'0'. This also reduces the danger of signed char issues
popping up.
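
A sketch of why an int return value helps, assuming a hypothetical
parser num_escape_value() rather than the real mchars_num2char():
with int, -1 can signal invalid syntax while 0 remains a valid result
for \N'0'. The upper limit of 255 is also an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical parser for the digits inside \N'...'.
 * Returns the character number, or -1 for invalid syntax,
 * so that a valid 0 stays distinguishable from an error.
 */
int
num_escape_value(const char *p, size_t sz)
{
	size_t i;
	int v;

	if (sz == 0)
		return -1;
	v = 0;
	for (i = 0; i < sz; i++) {
		if (p[i] < '0' || p[i] > '9')
			return -1;
		v = v * 10 + (p[i] - '0');
		if (v > 255)	/* assumed limit, keeps v from overflowing */
			return -1;
	}
	return v;
}
```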
Ingo Schwarze [Tue, 28 Oct 2014 18:49:33 +0000 (18:49 +0000)]
In -Tascii mode, print "<?>" only for Unicode escapes of unknown
representation, not for character escapes with unknown names.
According to groff, the latter produce no output, and we now warn
about them.
Ingo Schwarze [Tue, 28 Oct 2014 17:36:19 +0000 (17:36 +0000)]
Make the character table available to libroff so it can check the
validity of character escape names and warn about unknown ones.
This requires mchars_spec2cp() to report unknown names again.
Fortunately, that doesn't require changing the calling code because
according to groff, invalid character escapes should not produce
output anyway, and now that we warn about them, that's fine.
Ingo Schwarze [Tue, 28 Oct 2014 02:43:59 +0000 (02:43 +0000)]
Refine -Tascii rendering of Unicode characters, mostly to better agree
with groff, in particular in cases where groff uses backspace overstrike.
In two cases, agreement is impossible because groff clobbers the
previous line: \(*G \(*S
In a number of cases, groff rendering is so misleading that i chose
to render differently: \(Sd \(TP \(Tp \(Po \(ps \(sc \(r! \(r? \(de
While here, also correct the \(la and \(ra Unicode code points.
Ingo Schwarze [Mon, 27 Oct 2014 20:41:58 +0000 (20:41 +0000)]
Support overstriking by backspace in PostScript and PDF output.
Of course, this is only a minor improvement; it would be much better
to support non-ASCII characters in these output modes, but that
would require major changes that i'm not going to work on right now.
The main reason for doing this is that it makes it possible to get
ASCII output closer to groff.
Ingo Schwarze [Mon, 27 Oct 2014 16:29:06 +0000 (16:29 +0000)]
Handle output encoding for unicode, numbered and named escape sequences
in one common, safe way instead of three different ways. In particular,
* skip NUL, it is used to mean "no output desired"
* deny 0x01-0x1F and 0x7F-0x9F, print REPLACEMENT CHARACTER instead
* print 0x20-0x7E literally or name-encoded, as required
* print characters above 0x9F numerically
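
The four rules above can be sketched as one classification function.
The enum and function names are hypothetical, and treating negative
input like a denied code point is an assumption of this sketch.

```c
/* Hypothetical output classes for one code point. */
enum out_class {
	OUT_SKIP,	/* NUL: no output desired */
	OUT_REPLACE,	/* denied: print REPLACEMENT CHARACTER */
	OUT_LITERAL,	/* 0x20-0x7E: literal or name-encoded */
	OUT_NUMERIC	/* above 0x9F: print numerically */
};

enum out_class
classify_codepoint(int cp)
{
	if (cp == 0)
		return OUT_SKIP;
	if (cp < 0)	/* assumption: treat like a denied code point */
		return OUT_REPLACE;
	if (cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F))
		return OUT_REPLACE;
	if (cp <= 0x7E)
		return OUT_LITERAL;
	return OUT_NUMERIC;
}
```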
Ingo Schwarze [Mon, 27 Oct 2014 13:31:04 +0000 (13:31 +0000)]
Fix a regression in term.c rev. 1.229 reported by bentley@:
In UTF-8 output, do not print anything if mchars_spec2cp() returns 0.
In particular, this repairs handling of zero-width spaces (\&).
While here, let mchars_spec2cp() return 0xFFFD instead of -1
if the character is not found, simplifying the using code.
In HTML output, do not print obfuscated ASCII characters and
do not test for one-char escapes, mchars_spec2cp() already does that.
Ingo Schwarze [Sun, 26 Oct 2014 18:07:28 +0000 (18:07 +0000)]
In -Tascii mode, provide approximations even for some Unicode escape
sequences above codepoint 512 by doing a reverse lookup in the
existing mandoc_char(7) character table.
Again, groff isn't smart enough to do this and silently discards such
escape sequences without printing anything.
Ingo Schwarze [Sun, 26 Oct 2014 17:12:03 +0000 (17:12 +0000)]
Improve -Tascii output for Unicode escape sequences: For the first 512
code points, provide ASCII approximations. This is already much better
than what groff does, which prints nothing for most code points.
A few minor fixes while here:
* Handle Unicode escape sequences in the ASCII range.
* In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8
and the string "<?>" for -Tascii output.
* Handle all one-character escape sequences in mchars_spec2{cp,str}()
and remove the workarounds on the higher level.
Ingo Schwarze [Sat, 25 Oct 2014 15:23:56 +0000 (15:23 +0000)]
With the current architecture, we can't support inline equations
inside tables, sorry. So don't even try to parse tbl(7) blocks for
eqn(7) delimiters.
Broken table layout found in glPixelMap(3) while investigating
a bug report by Theo Buehler <theo at math dot ethz dot ch>.
Ingo Schwarze [Sat, 25 Oct 2014 14:35:37 +0000 (14:35 +0000)]
Report arguments to .EQ as an error, and simplify the code:
* drop trivial wrapper function roff_openeqn()
* drop unused first arg of function eqn_alloc()
* drop unused member "name" of struct eqn_node
While here, sync to OpenBSD by killing some trailing blanks.
Ingo Schwarze [Mon, 20 Oct 2014 01:43:48 +0000 (01:43 +0000)]
show the {MDOC,MAN}_EQN node, it contains interesting information,
in particular line and column numbers and flags;
but hide the uninteresting EQN_ROOT box
Ingo Schwarze [Thu, 16 Oct 2014 01:11:20 +0000 (01:11 +0000)]
Implement in-line equations, much needed by Xenocara manuals.
Put the steering into the roff parser rather than into the mdoc
parser such that it works for all macro languages and on both text
and macro lines.
Line breaks and blank characters generated before and after in-line
equations are not perfect yet, but let's do one thing at a time.
Ingo Schwarze [Tue, 14 Oct 2014 02:16:06 +0000 (02:16 +0000)]
Rudimentary implementation of the e, x, and z table layout modifiers
to equalize, maximize, and ignore the width of columns.
Does not yet take vertical rulers into account,
and does not do line breaks within table cells.
Considerably improves the lftp(1) manual; issue noticed by sthen@.
Ingo Schwarze [Mon, 13 Oct 2014 22:00:47 +0000 (22:00 +0000)]
Properly scale string length measurements for PostScript and PDF output;
this doesn't change anything for ASCII and UTF-8.
Problem reported by bentley@.
Ingo Schwarze [Mon, 13 Oct 2014 17:17:45 +0000 (17:17 +0000)]
Stricter syntax checking of Unicode character names:
Require exactly 4, 5 or 6 hex digits and allow nothing else.
This avoids mishandling stuff like \[ua] and \C'uA' as Unicode
and also fixes underlining in eqn(7) -Thtml output which uses \[ul].
Problem found and semantics suggested by kristaps@.
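
A minimal sketch of such a check, assuming the name has the form "u"
followed by the hex digits and that both upper and lower case digits
are accepted; is_unicode_name() is a hypothetical helper, not the
function used in mandoc.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical check: accept only 'u' followed by exactly
 * 4, 5 or 6 hex digits, so names like "ua" and "ul" are
 * not mistaken for Unicode escapes.
 */
int
is_unicode_name(const char *name, size_t len)
{
	size_t i;

	if (len < 5 || len > 7 || name[0] != 'u')
		return 0;
	for (i = 1; i < len; i++)
		if (!isxdigit((unsigned char)name[i]))
			return 0;
	return 1;
}
```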
Ingo Schwarze [Sun, 12 Oct 2014 19:31:41 +0000 (19:31 +0000)]
Improve error handling in the eqn(7) parser.
Get rid of the first fatal error, MANDOCERR_EQNSYNT.
In eqn(7), there is no need to be bug-compatible with groff, so there
is no need to abandon the whole equation in case of a syntax error.
In particular:
* Skip "back", "delim", "down", "fwd", "gfont", "gsize", "left",
"right", "size", and "up" without arguments.
* Skip "gsize" and "size" with a non-numeric argument.
* Skip closing delimiters that are not open.
* Skip "above" outside piles.
* For diacritic marks and binary operators without a left operand,
default to an empty box.
* Let piles and matrices take one argument rather than insisting
on a braced list. Let HTML output handle that, too.
* When rewinding, if the root box is guaranteed to match
the termination condition, no error handling is needed.
Ingo Schwarze [Sat, 11 Oct 2014 21:14:16 +0000 (21:14 +0000)]
warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@
Ingo Schwarze [Fri, 10 Oct 2014 12:19:25 +0000 (12:19 +0000)]
Make eqn(7) -Ttree output more useful:
* Reduce noise by not printing default attributes.
* Print missing "top" and "bottom" attributes.
* Print mnemonics, not code numbers for expression positions.
* Do not print unused "pile" attribute.
Re-write of eqn(7) parser and MathML output.
This adds parser-level support for the grammar described by the eqn
second-edition technical paper, "Typesetting Mathematics — User's Guide"
(Kernighan, Cherry).
The reason for this re-write is the grouping rules, which were not
possible given the existing implementation.
The re-write has also considerably simplified the HTML (and, if it ever
is completed, terminal) front-end.
Ingo Schwarze [Tue, 7 Oct 2014 14:07:03 +0000 (14:07 +0000)]
If a tbl(7) layout contains unknown font modifiers, fall back to the
default font rather than failing the whole table.
Needed by some pages in books/man-pages-posix.
Written on the plane back from EuroBSDCon in Sofia.
Crudely accommodate for matrices by way of adjacent tables. We don't do this
nicely right now because eqn uses column ordering.
Also add from/to support and to support.
Support a decent subset of eqn(7) in MathML.
This has basic support for positions (under, sup, sub, sub/sup) and piles.
It *does not* support right-left grouping (among many other things), e.g.,
Remove <p> in favour of <div class="spacer">.
This is good because <p> is brittle: it can't appear within other block
macros.
This fixes a regression of the original HTML5 patch as noted by schwarze@
on the tech@ list, 14/8/2014.
First, add space for default styling for HTML5 (non-fragment) output.
This uses a <style /> block right before the <link /> for the stylesheet.
Use this to kick out hardcoded header and footer table widths.