Ingo Schwarze [Fri, 10 Oct 2014 12:19:25 +0000 (12:19 +0000)]
Make eqn(7) -Ttree output more useful:
* Reduce noise by not printing default attributes.
* Print missing "top" and "bottom" attributes.
* Print mnemonics, not code numbers for expression positions.
* Do not print unused "pile" attribute.
Re-write of eqn(7) parser and MathML output.
This adds parser-level support for the grammar described by the eqn
second-edition technical paper, "Typesetting Mathematics — User's Guide"
(Kernighan, Cherry).
The reason for this re-write is the grouping rules, which were not
possible given the existing implementation.
The re-write has also considerably simplified the HTML (and, if it ever
is completed, terminal) front-end.
Ingo Schwarze [Tue, 7 Oct 2014 14:07:03 +0000 (14:07 +0000)]
If a tbl(7) layout contains unknown font modifiers, fall back to the
default font rather than failing the whole table.
Needed by some pages in books/man-pages-posix.
Written on the plane back from EuroBSDCon in Sofia.
Crudely accomodate for matrices by way of adjacent tables. We don't do this
nicely right now because eqn uses column ordering.
Also add from/to support and to support.
Support a decent subset of eqn(7) in MathML.
This has basic support for positions (under, sup, sub, sub/sup) and piles.
It *does not* support right-left grouping (among many other things), e.g.,
Remove <p> in favour of <div class="spacer">.
This is good because <p> is brittle: it can't appear within other block
macros.
This fixes a regression of the original HTML5 patch as noted by schwarze@
on the tech@ list, 14/8/2014.
First, add space for default styling for HTML5 (non-fragment) output.
This uses a <style /> block right before the <link /> for the stylesheet.
Use this to kick out hardcoded header and footer table widths.
Five year old typo reported by Theo Buehler at math dot ethz dot ch, thanks.
I nearly asked: ``What's wrong with it? It formats as "intended".''
(However, what Kristaps intended to write was "indented".)
Support backslash-escaping of white space in the query expression,
to be more similar to apropos(1) called from the shell.
Missing feature reported by Marcus MERIGHI <mcmer dash openbsd at
tor dot at> on misc@.
If a manpath directory (for example, a _whatdb entry from man.conf(5)
or an entry in the MANPATH environment variable) does not exist,
silently skip it. This brings makewhatis(8) back closer to the
behaviour of espie@'s version and ought to shut up the weekly(8)
whining observed by henning@ on machines not having xbase installed.
Also, don't error out after the first unusable manpath entry, still
try the others.
Of course, still complain about non-existent directories specified
on the command line and about any directories failing for other
reasons than ENOENT.
Do not report a page as arch=any merely because .Dt lacks the third argument.
Pages found outside arch-specific dirs still get arch=any, of course.
Issue reported by justinhenryhaynes at gmail dot com on misc@, thanks!
Simplify by handling empty request lines at the one logical place
in the roff parser instead of in three other places in other parsers.
No functional change.
Move main format autodetection from the parser dispatcher to the
roff parser where .Dd and .TH are already detected, anyway. This
improves robustness because it correctly handles whitespace or an
alternate control character before Dd. In the parser dispatcher,
provide a fallback looking ahead in the input buffer instead of
always assuming man(7). This corrects autodetection when Dd is
preceded by other macros or macro-like handled requests like .ll.
Triggered by reports from Daniel Levai about issues on Slackware Linux.
If a manual page is installed gzip(1)ed, let makewhatis(8) take
note in mandoc.db(5), such that man(1) -w and apropos(1) -w can
report the correct filename.
This is a prerequisite for letting apropos -a and man support
gzip'ed manuals in the future, which doesn't work yet.
Implement the traditional -h option for man(1): show the SYNOPSIS only.
As usual, we get mandoc -h and apropos -h for free.
Try stuff like "apropos -h In=dirent" or "apropos -h Fa=timespec".
Only useful for terminal output, so -Tps, -Tpdf, -Thtml ignore -h for now.
When makewhatis(8) finds an .so link after the manual being pointed to
has already been processed, add the file names to the names table, too,
not just to the mlinks table.
This fixes a bug where apropos(1) and the new man(1) wouldn't find some
of the Xenocara manuals via some of their .so links. After rebuilding,
run "makewhatis /usr/X11R6/man" or just wait for weekly(8).
In man(1) mode, change to the right directory before starting the parser,
just like traditional man(1) does, such that .so links have a chance to
work. After this point, we don't need the current directory for anything
else before exit, so we don't need to worry about getting back and we can
safely ignore failure.
Ingo Schwarze [Sat, 30 Aug 2014 18:08:10 +0000 (18:08 +0000)]
Introduce a man(1) -l option as an alias for mandoc -a.
Basically, this does the same as man -l in Linux man-db.
The point is that now all functionality of the combined tool
is reachable from the man(1) command name:
apropos = man -k, whatis = man -f, mandoc = man -cl.
Originally suggested by Carsten dot Kunze at arcor dot de,
current maintainer of the Heirloom Documentation Tools.
While here, add various missing information to the usage()
and to the manuals.
Ingo Schwarze [Thu, 28 Aug 2014 10:38:06 +0000 (10:38 +0000)]
On Linux, wcwidth() needs _XOPEN_SOURCE, or just _GNU_SOURCE for simplicity.
Besides, signedness of wchar_t and wint_t may differ, it i only
guaranteed that each wchar_t can be represented as a wint_t.
A problem report by Daniel Levai reminded me to fix this.
Ingo Schwarze [Sun, 24 Aug 2014 23:43:13 +0000 (23:43 +0000)]
When support for bold italic font was added to the parsers and to the
generic parts of the formatters some time ago, the PostScript- and
PDF-specific part of the formatters was neglected.
Now pascal@ reports that mandoc -Tps throws an assertion on perl(1),
apparently because that manual actually uses bold italic font.
So here is an overdue implementation of bold italic font support for
PostScript and PDF output.
Ingo Schwarze [Sat, 23 Aug 2014 00:34:59 +0000 (00:34 +0000)]
Let man(1) display preformatted manuals by simply reading them
from the file and copying them to the standard output.
This works even for mixed formats: "man -a groff mandoc" displays
groff(1) [formatted], mandoc(1) [unformatted], groff(7) [formatted],
and mandoc(7) [unformatted] in that order.
Ingo Schwarze [Fri, 22 Aug 2014 03:42:18 +0000 (03:42 +0000)]
mandoc -a, man, apropos -a, whatis -a now paginate by default
but provide an option -c to not paginate;
taking inspiration from manpage.c, hence adding (c) 2012 kristaps@
Ingo Schwarze [Thu, 21 Aug 2014 20:29:07 +0000 (20:29 +0000)]
Bugfix: make whatis(1) case-insensitive again.
The traditional whatis(1) was case-insensitve and it's still documented
that way, but that apparently got broken with or after the switch.
Ingo Schwarze [Thu, 21 Aug 2014 12:57:17 +0000 (12:57 +0000)]
Right after .Fl, a middle delimiter triggers an empty scope,
just like a closing delimiter. This didn't work in groff-1.15,
but it now works in groff-1.22.
After being closed by delimiters, .Nm scopes do not reopen.
Do not suppress white space after .Fl if the next node is a text node
on the same input line; that can happen for middle delimiters.
Ingo Schwarze [Thu, 21 Aug 2014 01:35:43 +0000 (01:35 +0000)]
* remove pointless separate -f and -k synopses, they take almost all args
* fix up descriptions of -f and -k
* remove excessive example for -k
* remove explicit BSD references
* add CVS Id
Ingo Schwarze [Thu, 21 Aug 2014 00:42:38 +0000 (00:42 +0000)]
Now that we have man(1) functionality, add a man(1) manual page.
I'm importing the totally unchanged OpenBSD version
such that all changes can easily be tracked in CVS.
Ingo Schwarze [Thu, 21 Aug 2014 00:32:15 +0000 (00:32 +0000)]
Implement classic man(1) output mode showing only one manual even
if there is more than one match, using traditional section priorities,
and implement man(1) -a (show all) output mode, not just for man(1),
but also for apropos(1) and whatis(1).
Control reading off the edge of our buffer in term_flushln().
This happens in specific conditions (trailing whitespace in certain
terminal modes), but in practise, it happens quite often (as reported by
valgrind).
In short, "Nothing about term_flushln() is simple. Srsly!" (schwarze@)
Discussed on tech@, ok schwarze@.
Ingo Schwarze [Mon, 18 Aug 2014 16:36:54 +0000 (16:36 +0000)]
When the first child of the node being validated gets deleted during
validation, man_node_unlink() switches to MAN_NEXT_CHILD. After
that, we have to switch back to MAN_NEXT_SIBLING after completing
validation, or subsequent parsing would add content into an already
closed node, clobbering potentially existing children, causing
information loss and a memory leak. Bug found by kristaps@ with
valgrind in groff(7) on Mac OS X.
Note that the switch back must be conditional, for if the node being
validated itself gets deleted, we must *not* go to MAN_NEXT_SIBLING,
which would not only yield wrong results in general but also crash
in malformed manuals having an empty paragraph before the first .SH,
for example OpenBSD c++filt(1).
Fix a corner case where \H<nil> (where <nil> is the \0 character) would
cause mandoc_escape() to read past the end of an allocated string.
Found when a script scanning of all Mac OSX manual accidentally also
scanned binary (gzip'd) files, discussed with schwarze@ on tech@.
Ingo Schwarze [Sun, 17 Aug 2014 22:10:29 +0000 (22:10 +0000)]
While all current callers pass valid data to ascii_hspan() only,
it's safer to assume incoming enum data might be invalid
and catch it instead of happily returning an unitialized int.
No functional change right now.
Ingo Schwarze [Sun, 17 Aug 2014 20:53:50 +0000 (20:53 +0000)]
Do not require getsubopt() to provide extern char *suboptarg.
We don't use it anyway in mandoc. Like this, fewer systems need
the compat implementation. In particular, we can now use the stock
getsubopt() on glibc and musl.
Besides, the comment in the BSD getsubopt.c that error messages are
tricky without *suboptarg is massively overblown. If you simply
save a copy of the pointer you pass into getsubopt(), that's quite
usable for an error message.
People start campaigning for the addition of *suboptarg to C libraries
on the grounds that mandoc wants it, but actually, i consider library
functions manipulating global data quite ugly, so stop pushing people
into that questionable direction.
While here, add an explicit Copyright header to the test file.
While it's obviously to me what Kristaps intended, others might
consider this file copyrightable and wonder what's up.
Ingo Schwarze [Sun, 17 Aug 2014 16:44:41 +0000 (16:44 +0000)]
KNF: fix indentation of previous commit, see style(9):
"Indentation is an 8 character tab. Second level indents are four spaces."
All the rest of this file already conforms.
Protect against accessing "n->next->child" by first checking "n->next".
Noticed in a crash against ".It Nm Fo" with no closing "Fc".
Original patch expanded by schwarze@ then extended even more.
Ingo Schwarze [Sun, 17 Aug 2014 03:24:47 +0000 (03:24 +0000)]
Fully integrate apropos(1) into mandoc(1).
Switch the argmode on the progname, including man(1).
Provide -f and -k options to switch the argmode.
Store the argmode inside struct search, generalizing the flags.
Derive the deftype from the argmode when needed instead of storing it.
Store the outkey inside struct search instead of passing it alone.
While here, get rid of the trailing blanks in Makefile.depend.
Ingo Schwarze [Sat, 16 Aug 2014 23:04:25 +0000 (23:04 +0000)]
When BUILD_DB is active, link apropos(1) into the mandoc binary.
This is the first step on the way to a man(1) implementation.
The new ./configure is flexible enough to make this step quite easy.
Ingo Schwarze [Sat, 16 Aug 2014 19:50:37 +0000 (19:50 +0000)]
If a stray .It follows .El, we are no longer in the list,
even though the list is still the last processed macro.
This fixes a regression introduced in mdoc_macro.c rev. 1.138:
Ulrich Spoerlein <uqs at FreeBSD> reports that various of their
kernel manuals trigger assertions.
Ingo Schwarze [Sat, 16 Aug 2014 19:00:01 +0000 (19:00 +0000)]
Improve build system and autodetection.
* Make ./configure standalone, that's what people expect.
* Let people write a ./configure.local from scratch, not edit existing files.
* Autodetect wchar, sqlite3, and manpath and act accordingly.
* Autodetect the need for -L/usr/local/lib and -lutil.
* Get rid of config.h.p{re,ost}, let ./configure only write what's needed.
* Let ./configure write a Makefile.local snippet, that's quite flexible.
Ingo Schwarze [Thu, 14 Aug 2014 22:33:10 +0000 (22:33 +0000)]
Some compilers apparently worry that abort() might return
and then throw a "may be used uninitialized" warning, so
sprinkle some /* NOTREACHED */. No functional change.
Noticed by Thomas Klausner <wiz at NetBSD dot org>.
Ingo Schwarze [Thu, 14 Aug 2014 00:31:43 +0000 (00:31 +0000)]
Revert previous, as requested by kristaps@.
The .Bf block can contain subblocks, so it has to render as an
element that can contain flow content. But <em> cannot contain
flow content, only phrasing content. Rendering .Em and .Bf differently
would by unfortunate, and closing out .Bf before subblocks and
re-opening it afterwards would merely complicate both the C code
of the program and the generated HTML code. Besides, converting
.Em to semantic HTML markup would require some content to be put
into <em> and some into <i>, but we cannot automatically distinguish
which is which, so strictly speaking, we can't use semantic HTML
here but have to fall back to physical markup. Wonders of HTML...
Begin cleaning up scaling units.
Start with the horizontal terminal specifiers, making sure that they match
up with troff.
Then move on to PS, PDF, and HTML, noting that we stick to the terminal
default width for "u".
Lastly, fix some completely-wrong documentation and note that we diverge
from troff w/r/t "u".
Ingo Schwarze [Wed, 13 Aug 2014 15:25:22 +0000 (15:25 +0000)]
Use <em> for .Em and .Bf -emphasis.
The vast majority of .Em in real-world manuals is stress emphasis,
for which <em> is the correct markup. Admittedly, there are some
instances of .Em usage for alternate quality, for which <i> would
be a better match. Most of these are technical terms that neither
allow semantic markup nor are keywords - for the latter, .Sy would
be preferable. A typical example is that the shell breaks input into
.Em words .
Alternate voice or mood, which would also require <i>, is almost
absent from manuals.
We cannot satisfy both stress emphasis and alternate quality, so
pick the one that fits more often and looks less wrong when off.
Patch from Guy Harris <guy at alum dot mit dot edu>.
ok joerg@ bentley@
Ingo Schwarze [Tue, 12 Aug 2014 19:28:16 +0000 (19:28 +0000)]
In mdoc(7) and man(7), if a width is given as a bare number without
specifying a unit, the implied unit is 'n' (on the terminal, one
character position; in PostScript, half of the current font size
in points), not 'u' (roff output device basic unit). No functional
change right now, but important for the upcoming scaling unit fixes.
Ingo Schwarze [Tue, 12 Aug 2014 19:28:03 +0000 (19:28 +0000)]
The macro SCALE_HS_INIT() is always passed the result of strlen() or
an equivalent number as its argument, and strlen() measures the width
of a string in characters, not in basic units. No functional change
right now, but important for the upcoming scaling unit fixes.