Fixes: T} can be followed by a delimiter then more data. Make this
work and add documentation for it.
Also make tbl_term() not puke if the number of data cells is less than
the number of layout cells (which happens from time to time). This
still needs work because we should pad out empty cells so that the
borders all work out.
Stuff tbl_calc() into out.c so that it can be shared by all output modes
(isn't now, but will need to be, used by -T[x]html also). Necessitated
a lot of churn in getting tbl_calc* code out of tbl_term.c and into
out.c, including renaming some structures and so on. The abstraction is
in having a pointer to a wrapper function for calculating string widths.
The char devices use term_strlen and term_len; the others will probably
just use strlen().
While at it, remove some superfluous assertions in the tbl code. This
allows all tbl manuals to clear.
Lastly, set the right-margin to be the maximum margin for each table
span. This allows big, complicated tbl-pages like terminfo to be
displayed. They're ugly, but they work.
Ingo Schwarze [Tue, 4 Jan 2011 23:48:39 +0000 (23:48 +0000)]
Merge from OpenBSD (similar to my original fix committed on Oct 15, 2010):
For now, parse and ignore minimal column width specifications.
First step to get terminfo(5) to build.
Support `T{' and `T}' data blocks. When a standalone `T{' is
encountered as a line's last data cell, move into TBL_PART_CDATA mode
whilst leaving the cell's designation as TBL_DATA_NONE. When new data
arrives that's not a standalone `T}', append it to the cell contends.
Close out and warn appropriately.
Fix to make horizontal spanners in the layout be properly printed.
mandoc also now warns (so does tbl(1)) if a horizontal spanner is
specified along with data.
While here, fix up some documentation and uncomment the tbl reference.
Ingo Schwarze [Tue, 4 Jan 2011 01:23:18 +0000 (01:23 +0000)]
Multiple man(7) .IP and .TP fixes started during p2k10:
Affecting both -Tascii and -Thtml:
* The .IP HEAD uses the second argument as the width, not the last one.
* Only print the first .IP HEAD argument, not all but the last.
Affecting only -Tascii:
* The .IP and .TP HEADs must be printed without literal mode,
but literal mode must be restored afterwards.
* After the .IP and .TP bodies, we only want term_newln(), not
term_flushln(), or we would get two blank lines in literal mode.
* The .TP HEAD does not use TWOSPACE, just like .IP doesn't either.
* In literal mode, clear NOLPAD after each line, or subsequent lines
would get no indentation whatsoever.
Affecting only -Thtml:
* Only print next-line .TP children, instead of all but the first.
OK kristaps@ on the -Tascii part; and:
"Can you work this into man_html.c, too?"
Ingo Schwarze [Mon, 3 Jan 2011 23:53:51 +0000 (23:53 +0000)]
Partial cleanup of argument count validation in mdoc(7):
* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.
Looks fine to kristaps@.
Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.
Ingo Schwarze [Mon, 3 Jan 2011 23:24:16 +0000 (23:24 +0000)]
Calling a macro with fewer arguments than it is defined with is OK;
the remaining ones default to the empty string, not to NULL.
Regression reported and fix tested by kristaps@.
Ingo Schwarze [Mon, 3 Jan 2011 22:42:37 +0000 (22:42 +0000)]
Unify roff macro argument parsing (in roff.c, roff_userdef()) and man macro
argument parsing (in man_argv.c, man_args()), both having different bugs,
to use one common macro argument parser (in mandoc.c, mandoc_getarg()),
because from the point of view of roff, man macros are just roff macros,
hence their arguments are parsed in exactly the same way.
While doing so, fix these bugs:
* Escaped blanks (i.e. those preceded by an odd number of backslashes)
were mishandled as argument separators in unquoted arguments to
user-defined roff macros.
* Unescaped blanks preceded by an even number of backslashes were not
recognized as argument separators in unquoted arguments to man macros.
* Escaped backslashes (i.e. pairs of backslashes) were not reduced
to single backslashes both in unquoted and quoted arguments both
to user-defined roff macros and to man macros.
* Escaped quotes (i.e. pairs of quotes inside quoted arguments) were
not reduced to single quotes in man macros.
OK kristaps@
Note that mdoc macro argument parsing is yet another beast for no good
reason and is probably afflicted by similar bugs. But i don't attempt
to fix that right now because it is intricately entangled with lots of
unrelated high-level mdoc(7) functionality, like delimiter handling and
column list phrase handling. Disentagling that would waste too much
time now.
Switch on the `TS' documentation in roff.7. As per off-line discussion,
this may be moved to tbl.7, but for the time being, keep it in the
document as it's developed.
Also note that my handling of horizontal rules in layouts needs some
work.
Add in support for number table cells that account for escapes and so
on. Note also that -Tps and -Tpdf, with these last two commits, produce
more readable output ("less crappy").
Start using term_strlen() instead of strlen(). tbl_term.c can now
properly handle embedded escapes when calculating its widths. NOTE:
this doesn't yet apply to the decimal-point calculation.
Make width calculations occur within tbl_term.c, not tbl.c. This allows
for front-ends to make decisions about widths, not the back-end.
To pull this off, first make each tbl_head contain a unique index value
(0 <= index < total tbl_head elements) and remove the tbl_calc() routine
from the back-end.
Then, when encountering the first tbl_span in the front-end, dynamically
create an array of configurations (termp_tbl) keyed on each tbl_head's
unique index value. Construct the decimals and widths at this time,
then continue parsing as before.
The termp_tbl and indexes are required because we pass a const tbl AST
into the front-end.
Make sure we don't continue recursively parsing once we've exited with
failure (this had caused some segfaults with the new assert() call in
MAN_HALT and MDOC_HALT).
Clarified the role of MDOC_HALT in libmdoc functions by having accessor
functions assert() if they're called after MDOC_HALT is set.
This makes more sense than returning 0 because this return value is used
for parse errors, not programme-flow errors, and it's inconsistent to
use the same value for both. Plus, prior to this, I'd return 0 without
printing an error message, which would cause failure to go unreported to
the operator.
Turn on -Tascii tbl printing. The output still has some issues---I'm
not sure whether it's in the header calculation or term.c squashing
spaces or whatever, but let's get this in for general testing as soon as
possible.
Churn to get parts of 'struct tbl' visible from mandoc.h: rename the
existing 'struct tbl' as 'struct tbl_node', then move all option stuff
into a 'struct tbl' in mandoc.h.
This conflicted with a structure in chars.c, which was renamed.
Merge in the width, decimal, and positioning code for individual data rows
from tbl.bsd.lv. This is more or less verbatim, less queue macros and also
a check for NULL layout.
This concludes the back-end parsing for a little while, as the front-end
display may now be configured.
Plug in the "head" concept for tables. A tbl_head specifies the full
layout for each row, including vertical spacers. One grabs the tbl_head
for a row and iterates through each entry, plugging data from the
tbl_span into the header as appropriate.
This is pulled in more or less verbatim from tbl.bsd.lv. In fact, this
is verbatim except that lists macros are made into hard-coded lists (for
compatibility, as long-ago noted by joerg@).
Add -man support for tables. Like -mdoc, this consists of an
external-facing function man_addspan() (this required shuffling around
the descope routine) and hooks elsewhere.
Add table processing structures to -mdoc. This consists of an
external-facing function mdoc_addspan(), then various bits to prohibit
printing and scanning (this requires some if's to be converted into
switch's).
Ingo Schwarze [Thu, 30 Dec 2010 00:51:32 +0000 (00:51 +0000)]
Plan9 has a man(7) implementation that looks extremely archaic,
even more archaic than Solaris/Heirloom stuff; so that is quite
interesting from a perspective of compatibility and history.
Initial check-in of table data-row processing. For the time being, this
parses table data then throws it away immediately. It does not yet try
to cross-check data rows against layout or anything. This copied more
or less completely from tbl.bsd.lv.
Adding initial options processing (not hooked into parse yet). This is
more or less copied from tbl.bsd.lv and still needs integration with the
general mandoc framework, e.g., with error messages.
Initial tbl framework. Parse point is in libroff, which keeps a
reference to a current tbl parse and routes ALL text into the tbl parse
after stripping reserved words and making block-level pre-processing
(e.g., `ig'). This is consistent with an analysis of embedded `TS/TE'
in manuals with sprinkled -mdoc, roff, and -man macros.
Fact of a parse is exposed to main.c by a return value (ROFF_TBL), which
will trigger main.c to add a foreign parsed body to the -mdoc or -man
parse stream. This interface isn't in yet, but will follow the
parse-text functions in both libraries. I put this login in main.c
because I don't want libroff calling directly into libmdoc or libman.
As a consequence, a parsed row can be pushed directly into any -mdoc or
-man context (put a `Bd -literal -offset indent' into a `TE/TS' block to
see why this is necessary). It will then absorb formatting cues in the
front-ends.
A note on naming. I decided on libroff.h instead of tbl.h because this
is purely within the roff layer. Separate tbl implementations will
need, then, to interface with libroff. This is "how it should be"
because tbl is tightly linked with roff in terms of `ds' and other
formatting macros, as well as, of course, special characters and other
roffisms.
Ingo Schwarze [Mon, 27 Dec 2010 21:41:05 +0000 (21:41 +0000)]
In case an ID attribute is written in pieces, only protect the first
piece with a prepended 'x', not each piece, such that quoted and
unquoted .Sh, .Ss, and .Sx arguments are compatible with each other.
Fixing a bug reported by Nicolas Joly <njoly at NetBSD dot org>,
avoiding a regression in my first patch as pointed out by njoly as well.
"feel free to do so" kristaps@
Specifying both %T and %J in an `Rs' block causes the title to be quoted
instead of underlined. This only happens in -Tascii, as -T[x]html both
underlines and italicises.
As per schwarze@'s suggestions, roll back the refcount structure in
favour of a simpler shim for normalised data in the node allocation and
free routines. This removes the need to bump and copy references within
validator handlers, removes a pointer redirect, and also kills the
refcount structure itself. Data is assumed to "live" either in a
MDOC_BLOCK or MDOC_ELEM and is copied accordingly.
Drastically fix -T[x]html's handling of font-escape mode changes (i.e.,
using \fI or \fP). Now, using these modes will cause a font to be
rendered for each word; furthermore, setting mode within a word will do
the correct thing.
Second, make -man use real font tags (B, I, SMALL) to set its font
instead of using font modes and fix up the pre-macro unsetting of the
current mode.
This fixes how roff.7 wasn't validating (<P> closing out a font mode)
and has been checked against gcc.1 (more will come). I considered
failure to validate OUR manual to be a show-stopper for the up-coming
release.
Ingo Schwarze [Wed, 22 Dec 2010 23:53:55 +0000 (23:53 +0000)]
minor tweaks:
1. improve .Bl wording (from jmc@)
2. jmc@ noted that the .Mt default (the same as in groff) makes no sense,
and there is no better default we could use; thus, regard it as
implementation dependent and do not document it
3. fix formatting of one COMPATIBILITY note: move "and" out of .Sx
ok kristaps@, jmc@
Revert IGNPAR to a warning after clue-stick applied by schwarze@:
although technically-speaking a lost macro is an error (e.g.,
MANDOCERR_MACRO), casting out some extra whitespace (note, IGNPAR only
happens in conditions where whitespace already exists!) is hardly an
error matter.