Ingo Schwarze [Tue, 10 Jan 2017 21:59:47 +0000 (21:59 +0000)]
For the .Ux/.Ox family of macros, do text production at the validation
stage rather than in each and every individual formatter, using the
new NODE_NOSRC flag. More rigorous and also ten lines less code.
Ingo Schwarze [Tue, 10 Jan 2017 12:53:07 +0000 (12:53 +0000)]
Introduce flags NODE_NOSRC and NODE_NOPRT for AST nodes.
Use them to mark generated nodes and nodes that shall not produce output.
Let -Ttree output mode display these new flags.
Use NODE_NOSRC for .Ar, .Mt, and .Pa default arguments.
Use NODE_NOPRT for .Dd, .Dt, and .Os.
These will help to make handling of text production macros more rigorous.
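
A minimal sketch of how such node flags might be consulted. The bit
values, the stripped-down struct node, and both helper functions are
assumptions made for illustration, not the definitions from the tree:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values; the real definitions live in the mandoc headers. */
#define NODE_NOSRC (1 << 0)	/* generated node, no counterpart in the source */
#define NODE_NOPRT (1 << 1)	/* node shall not produce any output */

/* Hypothetical, stripped-down AST node. */
struct node {
	int		 flags;
	const char	*text;
};

/* A formatter would skip nodes marked NODE_NOPRT. */
int
node_is_printed(const struct node *n)
{
	assert(n != NULL);
	return (n->flags & NODE_NOPRT) == 0;
}

/* A consumer interested in the source text would skip generated nodes. */
int
node_is_from_source(const struct node *n)
{
	assert(n != NULL);
	return (n->flags & NODE_NOSRC) == 0;
}
```

With flags like these, a default argument generated for .Ar would carry
NODE_NOSRC but still be printed, while the children of .Os would carry
NODE_NOPRT and be skipped by every formatter.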
Ingo Schwarze [Mon, 9 Jan 2017 17:49:57 +0000 (17:49 +0000)]
Use stdout rather than stdin for controlling the terminal
such that "cat foo.mdoc | man -l" works.
Issue reported by Christian Neukirchen <chneukirchen at gmail dot com>
and also tested by him on Void Linux with both glibc and musl.
The patch makes sense to millert@.
Ingo Schwarze [Mon, 9 Jan 2017 12:48:58 +0000 (12:48 +0000)]
The .No macro is not supposed to produce fixed-width font, it is not
the same as .Li, so don't use <code>.
Bug reported by <Anton dot Lindqvist at gmail dot com> on tech@.
Ingo Schwarze [Mon, 9 Jan 2017 01:37:03 +0000 (01:37 +0000)]
Warnings and errors that occur during mdoc_validate()
or during man_validate() have to affect the mandoc(1) EXIT STATUS.
Many thanks to <Yuri dot Pankov at gmail dot com> (illumos developer)
for reporting this regression.
Ingo Schwarze [Sun, 8 Jan 2017 22:51:55 +0000 (22:51 +0000)]
Indentation must be measured in units of the surrounding text,
not in units of the contained text. Consequently, "display"
and "lit" class tags must not be on the same element: First,
"display" must set up the indentation, still using the outer
units, and only after that, "lit" may change the font.
This fixes .Bd -literal which got the wrong indentation.
Bug reported by tb@.
Ingo Schwarze [Sun, 8 Jan 2017 02:01:17 +0000 (02:01 +0000)]
Tolerate bare tabs in SYNOPSIS .Cd for now.
It's used in half a dozen pages.
Even though I have been thinking about it for years,
I still can't suggest anything better.
The false positives are annoying.
Ingo Schwarze [Sun, 8 Jan 2017 00:11:23 +0000 (00:11 +0000)]
Stricter validation of the NAME section, in particular:
- require a comma between names
- reject all other text nodes
- reject all empty Nm below NAME, not only in the leading position
- reject Nm after Nd
Ingo Schwarze [Wed, 28 Dec 2016 17:34:18 +0000 (17:34 +0000)]
Make the second, section number argument of .Xr mandatory.
In fact, we have been requiring it for many years.
The only reason to not warn when it was missing
was excessive traditionalism - it was optional in 4.4BSD.
Ingo Schwarze [Wed, 7 Dec 2016 22:59:29 +0000 (22:59 +0000)]
When reporting "whitespace at end of input line" on lines ending with
roff(7) comments, let the column number in the message point to the
end of the line rather than to the beginning of the comment.
Improvement suggested by bluhm@.
Ingo Schwarze [Sat, 19 Nov 2016 15:24:51 +0000 (15:24 +0000)]
Do not install libmandoc.a by default.
The only environment where it is ever needed is NetBSD base.
Even NetBSD ports and pkgsrc had better not install it.
Triggered by a question from bentley@.
Ingo Schwarze [Tue, 8 Nov 2016 16:23:58 +0000 (16:23 +0000)]
Implement tag priority 0, which tags only keys that appear as
tag candidates exactly once, and use it for .Em and .Sy.
Written on the TGV Toulouse-Paris.
Ingo Schwarze [Tue, 18 Oct 2016 22:27:25 +0000 (22:27 +0000)]
The termination condition of the iteration logic in page_bymacro()
was overzealous. Consequently, macro=substr and macro~regexp searches
only returned all pages containing the first matching macro value,
rather than all pages containing any of the matching macro values.
Bug reported by tb@ - thanks!
Ingo Schwarze [Tue, 18 Oct 2016 16:06:44 +0000 (16:06 +0000)]
Compat glue for the FreeBSD comparison function prototype for fts_open(3)
which differs from what most other systems use.
While here, improve diagnostic output of ./configure tests.
Ingo Schwarze [Tue, 18 Oct 2016 14:15:33 +0000 (14:15 +0000)]
Simplify and correct support for reproducible builds, such that database
entries come in a well-defined order even in the presence of MLINKS.
Do this by using the compar() argument of fts_open(3) rather than
trying to sort later, which missed some cases.
This also shortens the code by a few lines.
Diff from Ed Maste <emaste @ FreeBSD>, adapted to our tree
and tweaked a bit by me, final version confirmed by Ed.
Ingo Schwarze [Sun, 9 Oct 2016 18:16:56 +0000 (18:16 +0000)]
Delete complicated code dealing with .Bl -tag without -width,
and just let it default to -width 6n, which agrees with the
traditional -width Ds that is still in widespread use.
I just pushed a patch upstream to GNU roff that does the same for
groff_mdoc(7). Before, groff contained code that was even more
complicated than mandoc, but both resulted in quite different
user-visible output. Now, both agree, and output is nicer for both.
Useless complication noticed by Carsten Kunze (Heirloom roff).
We cannot use fputs(3) in passthrough() because the stdout stream
might be in stdio wide orientation due to prior formatting of an
unformatted manual in man -aTutf8 mode. So for now, use fflush(3)
followed by unbuffered write(2) instead. Fixes output corruption
on glibc discovered on Linux while testing a diff to fix a loosely
related bug reported by <jmates at ee dot washington dot edu>.
I detest the concept of stdio stream orientation. One day, I will
rewrite term_ascii.c to always use narrow streams, even in UTF-8
output mode. But that's too much work for today.
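
The fflush(3) plus write(2) approach described above can be sketched
roughly as follows. The function name and its interface are
hypothetical and not taken from the tree; the point is only that no
byte-oriented stdio output function touches the stream:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: emit a NUL-terminated string on the file
 * descriptor underlying a stdio stream without calling any
 * byte-oriented stdio output function on that stream.
 * Returns 0 on success, -1 on error.
 */
int
write_bypassing_stdio(FILE *stream, const char *s)
{
	size_t	 len;
	ssize_t	 n;
	int	 fd;

	assert(stream != NULL && s != NULL);

	/* Push out whatever stdio still has buffered, to keep the order. */
	if (fflush(stream) == EOF)
		return -1;
	if ((fd = fileno(stream)) == -1)
		return -1;

	/* write(2) may be partial, so loop until everything is out. */
	len = strlen(s);
	while (len > 0) {
		if ((n = write(fd, s, len)) == -1)
			return -1;
		s += n;
		len -= (size_t)n;
	}
	return 0;
}
```

Flushing first matters: otherwise data still sitting in the stdio
buffer could reach the file descriptor after the directly written
bytes. The sketch does not retry on EINTR.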
Make sure an output device is allocated before calling terminal_sepline(),
fixing a NULL pointer access that happened when the first of multiple pages
shown was preformatted, as in "man -a groff troff".
Crash reported by <jmates at ee dot washington dot edu> on bugs@, thanks!
When "makewhatis -d" tries to add to a database that doesn't (yet) exist,
silently create it from scratch instead of printing a warning.
The annoying warning message was reported by ajacoutot@, and espie@
convincingly argues that a non-existing database can be considered
equivalent to an empty one.
Ingo Schwarze [Tue, 30 Aug 2016 22:01:07 +0000 (22:01 +0000)]
When the database is corrupt in the sense of containing invalid
pointers in the pages table, do not access NULL pointers, but
gracefully handle the errors.
Similar patches will be needed for the macro tables, too.
<attila at stalphonsos dot com> audited the code and pointed out to me
that dbm_get() can return NULL for corrupted databases, but that isn't
handled properly at various places.
Ingo Schwarze [Sun, 28 Aug 2016 16:15:12 +0000 (16:15 +0000)]
If a line inside .Bl -column starts with a tab character
and there was no preceding .It macro, do not read the byte
before the beginning of the line buffer.
Found by tb@ with afl(1).
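
The general shape of such a guard, as a sketch only: the helper name
is hypothetical, and the actual parser code looks different.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the byte preceding position pos in buf,
 * or '\0' if pos is at the very beginning of the buffer, such that
 * buf[-1] is never read.
 */
char
prev_byte(const char *buf, size_t pos)
{
	assert(buf != NULL);
	return pos > 0 ? buf[pos - 1] : '\0';
}
```

A caller looking behind a leading tab at position 0 then sees '\0'
instead of whatever happens to precede the line buffer in memory.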
Ingo Schwarze [Mon, 22 Aug 2016 16:15:26 +0000 (16:15 +0000)]
When trying to edit an existing database with makewhatis(8) -d or -u
but reading the database fails, report the full path to the database
on standard error, and mention that the database is automatically
recreated from scratch.
Suggested by espie@.
Ingo Schwarze [Mon, 22 Aug 2016 16:07:16 +0000 (16:07 +0000)]
When running into a mandoc.db(5) file still using the obsolete
format based on SQLite 3, say so in words that mortals can
understand rather than babbling about hex magic.
Suggested by espie@.
Ingo Schwarze [Sat, 20 Aug 2016 17:59:34 +0000 (17:59 +0000)]
When a mismatching end macro occurs while at least two nested blocks
are open, all except the innermost open block got a bogus MDOC_ENDED
marker, in some situations triggering segfaults down the road
which tb@ found with afl(1).
Fix the logic error by figuring out up front whether an end macro
has a matching body, and if it hasn't, don't mark any blocks as broken.
Ingo Schwarze [Sat, 20 Aug 2016 15:58:21 +0000 (15:58 +0000)]
When scanning upwards for a column list to put a .Ta macro in,
ignore body end markers of lists breaking other blocks.
This fixes a logic error that caused a NULL deref found by tb@ with afl(1).
Ingo Schwarze [Sat, 20 Aug 2016 14:43:50 +0000 (14:43 +0000)]
If a column list starts with implicit rows (that is, rows without .It)
and roff-level nodes (e.g. tbl or eqn) follow, don't run into an
assertion. Instead, wrap the roff-level nodes in their own row.
Issue found by tb@ with afl(1).
Ingo Schwarze [Wed, 17 Aug 2016 20:46:56 +0000 (20:46 +0000)]
When the content of a manual page does not specify a section, the
empty string got added to the list of sections, breaking the database
format slightly and causing the page to not be considered part of
any section, not even if a section could be deduced from the directory
or from the file name.
Bug found due to the bogus pcredemo(3) "manual" in the pcre-8.38p0 package.
Ingo Schwarze [Wed, 17 Aug 2016 18:59:37 +0000 (18:59 +0000)]
When reading back a mandoc.db(5) file in order to apply incremental
changes, do not prepend a stray NAME_FILE (0x10) byte to the first
names of pages.
Bug found while investigating another issue reported by sthen@.
Ingo Schwarze [Wed, 17 Aug 2016 18:10:39 +0000 (18:10 +0000)]
Make sure manuals in architecture-independent directories are treated
as architecture-independent even if they abuse the third (architecture)
argument of the .Dt macro for random stuff like "freetds reference manual".
While the .Dt syntax is not the same as the .TH syntax in man(7),
punishing offenders by treating them as architecture-dependent and
hence completely excluding them from searches is too severe.
Problem reported by sthen@.
Ingo Schwarze [Thu, 11 Aug 2016 13:30:25 +0000 (13:30 +0000)]
Even after switching from a pending head to the body, we have to
continue scanning upwards, because the enclosing block might already
be pending as well, e.g. .Bl .Bl .It Bo .El .It.
Tree corruption leading to a later NULL deref found by tb@ with afl(1).
Ingo Schwarze [Thu, 11 Aug 2016 10:47:16 +0000 (10:47 +0000)]
If a .Bd display is on the one hand doomed to be deleted because
it has no type, but is on the other hand breaking another block,
delete its end marker as well, or the end marker may remain behind
as an orphan, triggering an assertion in the terminal formatter.
Problem found by tb@ with afl(1).
Ingo Schwarze [Wed, 10 Aug 2016 20:17:50 +0000 (20:17 +0000)]
Don't deref NULL if the only child of the first .Sh is an empty
in-line macro, and don't printf("%s", NULL) if the first child
of the first .Sh is a macro; again found by tb@ with afl(1).
(No, you should never use macros in any .Sh at all, please.)
Ingo Schwarze [Wed, 10 Aug 2016 12:50:24 +0000 (12:50 +0000)]
When trying to figure out which C compiler make(1) wants to use,
pass it the POSIX -s option. On most systems, this won't make a
difference, but Bdale Garbee reported that the make(1) on his Debian
system, most likely some version of gmake, breaks Makefile.local
by printing some 'entering directory' messages. I failed to reproduce it
and Bdale didn't report back, but judging from gmake source code,
this is likely to help and unlikely to do harm elsewhere.
Ingo Schwarze [Wed, 10 Aug 2016 12:06:41 +0000 (12:06 +0000)]
When validating a .Bl list that defaults to -item for want of a type,
don't let a subsequent -width access mdoc_argnames[] out of bounds.
Found by tb@ with afl(1).
Ingo Schwarze [Wed, 10 Aug 2016 11:03:43 +0000 (11:03 +0000)]
Fix assertion failures caused by whitespace inside \o'' (overstrike)
sequences that jsg@ found with afl(1):
* Avoid writing \t\b in term.c.
* Handle trailing \b in term_ps.c.
Ingo Schwarze [Fri, 5 Aug 2016 23:15:08 +0000 (23:15 +0000)]
The concept of endianness seems to be somewhat newfangled, so the
respective conversion functions are not yet properly standardized.
Rumour has it that POSIX is working on it, though.
For now, sprinkle some configuration glue.
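
For comparison, byte order conversion can also be written without any
of the non-standard conversion functions. This is an alternative to
configuration glue, not what this commit does, and both function names
are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a big-endian 32-bit value from four bytes. */
uint32_t
be32_decode(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: encode a 32-bit value as four big-endian bytes. */
void
be32_encode(unsigned char *p, uint32_t v)
{
	assert(p != NULL);
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```

Shifting individual bytes gives the same result on little-endian and
big-endian hosts, so no feature test is needed for this variant.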
Ingo Schwarze [Thu, 4 Aug 2016 09:33:57 +0000 (09:33 +0000)]
Fix an assertion failure that happened when trying to add a page
with makewhatis -d to a completely empty database.
Reported by Mark Patruck <mark at wrapped dot cx>, thanks!
Ingo Schwarze [Tue, 2 Aug 2016 11:09:46 +0000 (11:09 +0000)]
POSIX allows PATH_MAX to not be defined, meaning "unlimited".
Found by Aaron M. Ucko <amu at alum dot mit dot edu> on the GNU Hurd,
via Bdale Garbee, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=829624
Also add EFTYPE at two places where it was forgotten.
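
Regarding PATH_MAX, one common workaround on systems that leave it
undefined is a fallback definition, sketched below. The value 4096 and
the helper function are assumptions for illustration; the log does not
say which approach the commit takes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* POSIX permits <limits.h> to omit PATH_MAX; supply an assumed fallback. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper: check whether a path, including its
 * terminating NUL byte, fits into a buffer of PATH_MAX bytes.
 */
int
path_fits(const char *path)
{
	assert(path != NULL);
	return strlen(path) < PATH_MAX;
}
```

A fallback like this keeps fixed-size buffers working, at the price of
an arbitrary limit on systems that do not actually impose one.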
Some base system pages, for example perl(1), contain non-ASCII
characters in their source code, so switch on charset autodetection
in the same way as in man(1) itself.
Issue reported by Pavan Maddamsetti at gmail dot com on bugs@.
Autodetect a suitable locale for -Tutf8 mode,
and allow overriding it manually.
Based on a patch from Svyatoslav Mishyn <juef at openmailbox dot org>
tweaked by me.
The idea originally came up in a conversation with Markus Waldeck.
No need to populate the TYPE_arch and TYPE_sec bits, the information
is provided directly to dba_page_add() in dbadd_mlink()
and to dba_page_new() in dbadd().
No need for a dedicated loop for NAME_FILE.
It's done in dbadd_mlink() anyway.
In this context, also record section numbers taken from filenames
and from .Dt and .TH macros, architectures taken from .Dt macros,
and fix the filtering of duplicate filename entries.
Now that our man.conf(5) format is mature and extremely simple,
delete manpath(1) support. With the mandoc-based man(1), manpath(1)
is utterly useless. Just set MANPATH_DEFAULT in configure.local
for sane operating system defaults, use man.conf(5) for machine-
specific modifications, and use ${MANPATH}, -m, and -M for user
preferences.
Remove the dependency on SQLite without loss of functionality.
Stop supporting systems that don't have mmap(3).
Drop the obsolete names_check() now that we deleted MLINKS.
Since the mdoc/man parser unification, the parser is always allocated
in mparse_alloc(), so delete all the curp->man == NULL checks.
Triggered by a patch from Christos Zoulas suggesting to add
yet another such check.
To remove the const qualifier from a pointer to an object - either
because we know it is actually mutable or because we are passing
it to a function that doesn't accept a const object but won't
actually attempt to modify it - simply casting from (const type *)
to (type *) is legal C and clearly expresses the intent.
So get rid of the obfuscating UNCONST macro.
Basic idea discussed with guenther@.
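
A small sketch of the plain cast. Both functions are hypothetical
stand-ins for an interface that takes a non-const pointer without
modifying the object:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical legacy interface: takes a non-const pointer
 * even though it never modifies the string.
 */
size_t
legacy_length(char *s)
{
	assert(s != NULL);
	return strlen(s);
}

/* Caller holding a const pointer. */
size_t
title_length(const char *title)
{
	/* The plain cast states the intent: drop const, nothing else. */
	return legacy_length((char *)title);
}
```

The cast is only safe as long as the callee really leaves the object
alone; modifying an object that was defined const is undefined
behaviour no matter how the qualifier was removed.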