Ingo Schwarze [Sun, 26 Oct 2014 17:12:03 +0000 (17:12 +0000)]
Improve -Tascii output for Unicode escape sequences: For the first 512
code points, provide ASCII approximations. This is already much better
than what groff does, which prints nothing for most code points.
A few minor fixes while here:
* Handle Unicode escape sequences in the ASCII range.
* In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8
and the string "<?>" for -Tascii output.
* Handle all one-character escape sequences in mchars_spec2{cp,str}()
and remove the workarounds on the higher level.
Ingo Schwarze [Sat, 25 Oct 2014 15:23:56 +0000 (15:23 +0000)]
With the current architecture, we can't support inline equations
inside tables, sorry. So don't even try to parse tbl(7) blocks for
eqn(7) delimiters.
Broken table layout found in glPixelMap(3) while investigating
a bug report by Theo Buehler <theo at math dot ethz dot ch>.
Ingo Schwarze [Sat, 25 Oct 2014 14:35:37 +0000 (14:35 +0000)]
Report arguments to .EQ as an error, and simplify the code:
* drop trivial wrapper function roff_openeqn()
* drop unused first arg of function eqn_alloc()
* drop usused member "name" of struct eqn_node
While here, sync to OpenBSD by killing some trailing blanks.
Ingo Schwarze [Mon, 20 Oct 2014 01:43:48 +0000 (01:43 +0000)]
show the {MDOC,MAN}_EQN node, it contains interesting information,
in particular line and column numbers and flags;
but hide the uninteresting EQN_ROOT box
Ingo Schwarze [Thu, 16 Oct 2014 01:11:20 +0000 (01:11 +0000)]
Implement in-line equations, much needed by Xenocara manuals.
Put the steering into the roff parser rather than into the mdoc
parser such that it works for all macro languages and on both text
and macro lines.
Line breaks and blank characters generated before and after in-line
equations are not perfect yet, but let's do one thing at a time.
Ingo Schwarze [Tue, 14 Oct 2014 02:16:06 +0000 (02:16 +0000)]
Rudimentary implementation of the e, x, and z table layout modifiers
to equalize, maximize, and ignore the width of columns.
Does not yet take vertical rulers into account,
and does not do line breaks within table cells.
Considerably improves the lftp(1) manual; issue noticed by sthen@.
Ingo Schwarze [Mon, 13 Oct 2014 22:00:47 +0000 (22:00 +0000)]
Properly scale string length measurements for PostScript and PDF output;
this doesn't change anything for ASCII and UTF-8.
Problem reported by bentley@.
Ingo Schwarze [Mon, 13 Oct 2014 17:17:45 +0000 (17:17 +0000)]
Stricter syntax checking of Unicode character names:
Require exactly 4, 5 or 6 hex digits and allow nothing else.
This avoids mishandling stuff like \[ua] and \C'uA' as Unicode
and also fixes underlining in eqn(7) -Thtml output which uses \[ul].
Problem found and semantics suggested by kristaps@.
Ingo Schwarze [Sun, 12 Oct 2014 19:31:41 +0000 (19:31 +0000)]
Improve error handling in the eqn(7) parser.
Get rid of the first fatal error, MANDOCERR_EQNSYNT.
In eqn(7), there is no need to be bug-compatible with groff, so there
is no need to abondon the whole equation in case of a syntax error.
In particular:
* Skip "back", "delim", "down", "fwd", "gfont", "gsize", "left",
"right", "size", and "up" without arguments.
* Skip "gsize" and "size" with a non-numeric argument.
* Skip closing delimiters that are not open.
* Skip "above" outside piles.
* For diacritic marks and binary operators without a left operand,
default to an empty box.
* Let piles and matrices take one argument rather than insisting
on a braced list. Let HTML output handle that, too.
* When rewinding, if the root box is guaranteed to match
the termination condition, no error handling is needed.
Ingo Schwarze [Sat, 11 Oct 2014 21:14:16 +0000 (21:14 +0000)]
warn about parentheses in function names after .Fn and .Fo;
particularly useful when converting from other languages to mdoc(7);
feature suggested by bentley@
Ingo Schwarze [Fri, 10 Oct 2014 12:19:25 +0000 (12:19 +0000)]
Make eqn(7) -Ttree output more useful:
* Reduce noise by not printing default attributes.
* Print missing "top" and "bottom" attributes.
* Print mnemonics, not code numbers for expression positions.
* Do not print unused "pile" attribute.
Re-write of eqn(7) parser and MathML output.
This adds parser-level support for the grammar described by the eqn
second-edition technical paper, "Typesetting Mathematics — User's Guide"
(Kernighan, Cherry).
The reason for this re-write is the grouping rules, which were not
possible given the existing implementation.
The re-write has also considerably simplified the HTML (and, if it ever
is completed, terminal) front-end.
Ingo Schwarze [Tue, 7 Oct 2014 14:07:03 +0000 (14:07 +0000)]
If a tbl(7) layout contains unknown font modifiers, fall back to the
default font rather than failing the whole table.
Needed by some pages in books/man-pages-posix.
Written on the plane back from EuroBSDCon in Sofia.
Crudely accomodate for matrices by way of adjacent tables. We don't do this
nicely right now because eqn uses column ordering.
Also add from/to support and to support.
Support a decent subset of eqn(7) in MathML.
This has basic support for positions (under, sup, sub, sub/sup) and piles.
It *does not* support right-left grouping (among many other things), e.g.,
Remove <p> in favour of <div class="spacer">.
This is good because <p> is brittle: it can't appear within other block
macros.
This fixes a regression of the original HTML5 patch as noted by schwarze@
on the tech@ list, 14/8/2014.
First, add space for default styling for HTML5 (non-fragment) output.
This uses a <style /> block right before the <link /> for the stylesheet.
Use this to kick out hardcoded header and footer table widths.
Five year old typo reported by Theo Buehler at math dot ethz dot ch, thanks.
I nearly asked: ``What's wrong with it? It formats as "intended".''
(However, what Kristaps intended to write was "indented".)
Support backslash-escaping of white space in the query expression,
to be more similar to apropos(1) called from the shell.
Missing feature reported by Marcus MERIGHI <mcmer dash openbsd at
tor dot at> on misc@.
If a manpath directory (for example, a _whatdb entry from man.conf(5)
or an entry in the MANPATH environment variable) does not exist,
silently skip it. This brings makewhatis(8) back closer to the
behaviour of espie@'s version and ought to shut up the weekly(8)
whining observed by henning@ on machines not having xbase installed.
Also, don't error out after the first unusable manpath entry, still
try the others.
Of course, still complain about non-existent directories specified
on the command line and about any directories failing for other
reasons than ENOENT.
Do not report a page as arch=any merely because .Dt lacks the third argument.
Pages found outside arch-specific dirs still get arch=any, of course.
Issue reported by justinhenryhaynes at gmail dot com on misc@, thanks!
Simplify by handling empty request lines at the one logical place
in the roff parser instead of in three other places in other parsers.
No functional change.
Move main format autodetection from the parser dispatcher to the
roff parser where .Dd and .TH are already detected, anyway. This
improves robustness because it correctly handles whitespace or an
alternate control character before Dd. In the parser dispatcher,
provide a fallback looking ahead in the input buffer instead of
always assuming man(7). This corrects autodetection when Dd is
preceded by other macros or macro-like handled requests like .ll.
Triggered by reports from Daniel Levai about issues on Slackware Linux.
If a manual page is installed gzip(1)ed, let makewhatis(8) take
note in mandoc.db(5), such that man(1) -w and apropos(1) -w can
report the correct filename.
This is a prerequisite for letting apropos -a and man support
gzip'ed manuals in the future, which doesn't work yet.
Implement the traditional -h option for man(1): show the SYNOPSIS only.
As usual, we get mandoc -h and apropos -h for free.
Try stuff like "apropos -h In=dirent" or "apropos -h Fa=timespec".
Only useful for terminal output, so -Tps, -Tpdf, -Thtml ignore -h for now.
When makewhatis(8) finds an .so link after the manual being pointed to
has already been processed, add the file names to the names table, too,
not just to the mlinks table.
This fixes a bug where apropos(1) and the new man(1) wouldn't find some
of the Xenocara manuals via some of their .so links. After rebuilding,
run "makewhatis /usr/X11R6/man" or just wait for weekly(8).
In man(1) mode, change to the right directory before starting the parser,
just like traditional man(1) does, such that .so links have a chance to
work. After this point, we don't need the current directory for anything
else before exit, so we don't need to worry about getting back and we can
safely ignore failure.
Ingo Schwarze [Sat, 30 Aug 2014 18:08:10 +0000 (18:08 +0000)]
Introduce a man(1) -l option as an alias for mandoc -a.
Basically, this does the same as man -l in Linux man-db.
The point is that now all functionality of the combined tool
is reachable from the man(1) command name:
apropos = man -k, whatis = man -f, mandoc = man -cl.
Originally suggested by Carsten dot Kunze at arcor dot de,
current maintainer of the Heirloom Documentation Tools.
While here, add various missing information to the usage()
and to the manuals.
Ingo Schwarze [Thu, 28 Aug 2014 10:38:06 +0000 (10:38 +0000)]
On Linux, wcwidth() needs _XOPEN_SOURCE, or just _GNU_SOURCE for simplicity.
Besides, signedness of wchar_t and wint_t may differ, it i only
guaranteed that each wchar_t can be represented as a wint_t.
A problem report by Daniel Levai reminded me to fix this.
Ingo Schwarze [Sun, 24 Aug 2014 23:43:13 +0000 (23:43 +0000)]
When support for bold italic font was added to the parsers and to the
generic parts of the formatters some time ago, the PostScript- and
PDF-specific part of the formatters was neglected.
Now pascal@ reports that mandoc -Tps throws an assertion on perl(1),
apparently because that manual actually uses bold italic font.
So here is an overdue implementation of bold italic font support for
PostScript and PDF output.
Ingo Schwarze [Sat, 23 Aug 2014 00:34:59 +0000 (00:34 +0000)]
Let man(1) display preformatted manuals by simply reading them
from the file and copying them to the standard output.
This works even for mixed formats: "man -a groff mandoc" displays
groff(1) [formatted], mandoc(1) [unformatted], groff(7) [formatted],
and mandoc(7) [unformatted] in that order.
Ingo Schwarze [Fri, 22 Aug 2014 03:42:18 +0000 (03:42 +0000)]
mandoc -a, man, apropos -a, whatis -a now paginate by default
but provide an option -c to not paginate;
taking inspiration from manpage.c, hence adding (c) 2012 kristaps@
Ingo Schwarze [Thu, 21 Aug 2014 20:29:07 +0000 (20:29 +0000)]
Bugfix: make whatis(1) case-insensitive again.
The traditional whatis(1) was case-insensitve and it's still documented
that way, but that apparently got broken with or after the switch.
Ingo Schwarze [Thu, 21 Aug 2014 12:57:17 +0000 (12:57 +0000)]
Right after .Fl, a middle delimiter triggers an empty scope,
just like a closing delimiter. This didn't work in groff-1.15,
but it now works in groff-1.22.
After being closed by delimiters, .Nm scopes do not reopen.
Do not suppress white space after .Fl if the next node is a text node
on the same input line; that can happen for middle delimiters.