Ingo Schwarze [Mon, 17 Jan 2011 00:21:29 +0000 (00:21 +0000)]
Refrain from throwing fatal errors for
* .br .sp .nf .fi .na with arguments - just skip the arguments
* .TH lacking arguments - use empty strings instead like groff
* .TH with excessive arguments - skip those
Reminded by joerg@, ok kristaps@.
Ingo Schwarze [Sun, 16 Jan 2011 20:12:45 +0000 (20:12 +0000)]
When processing a blank text line, do not break out of text processing
into macro processing code. This fixes a regression introduced in 1.95,
found because it caused segfaults in my regression suite.
OK kristaps@
Ingo Schwarze [Sun, 16 Jan 2011 04:00:34 +0000 (04:00 +0000)]
Implement the roff .rm request (remove macro).
Using the new roff_getname() function, this is really simple.
Breaks mandoc of the habit of reporting an error in each pod2man(1) preamble.
Reminded by a report from brad@; ok kristaps@.
Change how -Thtml behaves with tables: use multiple rows, with widths
set by COL, until an external macro is encountered. At that point,
close out the table and process the macro. When a table row is
encountered again, restart the table. This requires a bit of
tracking added to "struct html", but the change is very small and
follows the logic of meta-fonts. This all follows a bug report by
joerg@.
Downgrade -man message of ignored empty paragraph to MANDOC_IGNPAR. The
change in man_macro.c was prompted by an assertion failure caused by a
subtle problem: (1) a macro is removed, causing m->last to become
m->last->parent; (2) by jumping to m->last->parent after
post-validation, the original m->last->parent is skipped; (3) the
rewinder climbs to the root of the tree and aborts.
The original issue recorded in the TODO by schwarze@, reminded by Brad
Smith.
Make -man -Tascii not break within literal lines, e.g.,
.nf
.B hello world
.fi
Also, clean up the print_man_node() function a little bit. This problem
has long been in the TODO and was recently noted again by Brad
Smith. The -T[x]html fix will follow...
Add support for "^" vertical spanners. Unlike GNU tbl, raise
error-class messages when data specified in "^" cells (either as-is or
in blocks) is being ignored.
Also note again that horizontal spanners aren't really supported...
Ingo Schwarze [Tue, 11 Jan 2011 00:11:45 +0000 (00:11 +0000)]
Refactoring in preparation for .rm support:
Unify parsing of names given as roff request arguments into a new
function roff_getname(), which is rather different from the parsing
function for normal arguments, mandoc_getarg(), because names cannot
be quoted and cannot contain whitespace or escaped characters.
The new function now throws an ERROR when finding escaped characters
in a name.
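
The name rules above (no quoting, no whitespace, no escapes) can be
sketched roughly as follows. This is a minimal illustration only:
scan_name() and its return convention are hypothetical and are not the
actual roff_getname() interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not the real roff_getname(): scan a request
 * name at the start of s.  On success, store the name length in *len
 * and return a pointer past the name and any blanks following it.
 * Return NULL if the name contains a backslash (an escape).
 * A double quote is not special here: names cannot be quoted, so it
 * is treated as an ordinary name character.
 */
const char *
scan_name(const char *s, size_t *len)
{
    const char *p = s;

    assert(s != NULL && len != NULL);
    while (*p != '\0' && *p != ' ' && *p != '\t') {
        if (*p == '\\')
            return NULL;    /* escapes are an error in names */
        p++;
    }
    *len = (size_t)(p - s);
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}
```

Given "foo  bar", this reports a name of length 3 and returns a pointer
to "bar"; given "fo\o bar", it returns NULL.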
"I'm fine with this." kristaps@
When a row of data is being parsed and it's a line or double-line
(instead of data), re-use the last "layout" pointer instead of advancing
to the next one.
Fixes: T} can be followed by a delimiter then more data. Make this
work and add documentation for it.
Also make tbl_term() not puke if the number of data cells is less than
the number of layout cells (which happens from time to time). This
still needs work because we should pad out empty cells so that the
borders all work out.
Stuff tbl_calc() into out.c so that it can be shared by all output modes
(-T[x]html does not use it yet, but will need to). This necessitated
a lot of churn in getting the tbl_calc* code out of tbl_term.c and into
out.c, including renaming some structures and so on. The abstraction is
a pointer to a wrapper function for calculating string widths.
The char devices use term_strlen and term_len; the others will probably
just use strlen().
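
A rough sketch of the width-callback idea, assuming invented names
(width_fn, struct width_calc, column_width); the real declarations in
out.c may look quite different:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: measure one string for an output mode. */
typedef size_t (*width_fn)(const char *s, void *arg);

struct width_calc {
    width_fn  slen;   /* per-output-mode string width function */
    void     *arg;    /* opaque device state handed back to slen */
};

/* Fallback for output modes without their own metrics. */
size_t
plain_strlen(const char *s, void *arg)
{
    (void)arg;
    return strlen(s);
}

/* Width of the widest non-NULL cell, as measured by the callback. */
size_t
column_width(const struct width_calc *wc,
    const char *const cells[], size_t ncells)
{
    size_t i, w, max = 0;

    for (i = 0; i < ncells; i++) {
        if (cells[i] == NULL)
            continue;
        w = wc->slen(cells[i], wc->arg);
        if (w > max)
            max = w;
    }
    return max;
}
```

The shared calculation never needs to know which device it is measuring
for; a terminal front-end would plug in its own escape-aware function
where this sketch uses plain_strlen().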
While at it, remove some superfluous assertions in the tbl code. This
allows all tbl manuals to clear.
Lastly, set the right-margin to be the maximum margin for each table
span. This allows big, complicated tbl-pages like terminfo to be
displayed. They're ugly, but they work.
Ingo Schwarze [Tue, 4 Jan 2011 23:48:39 +0000 (23:48 +0000)]
Merge from OpenBSD (similar to my original fix committed on Oct 15, 2010):
For now, parse and ignore minimal column width specifications.
First step to get terminfo(5) to build.
Support `T{' and `T}' data blocks. When a standalone `T{' is
encountered as a line's last data cell, move into TBL_PART_CDATA mode
whilst leaving the cell's designation as TBL_DATA_NONE. When new data
arrives that's not a standalone `T}', append it to the cell contents.
Close out and warn appropriately.
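
The mode switch described above can be pictured as a tiny state
machine. The following is a simplified sketch that works on whole lines
only; the names (struct blockp, block_feed) are made up, and joining
continuation lines with a single blank is an assumption of the sketch,
not something stated in this log.

```c
#include <stddef.h>
#include <string.h>

enum part { PART_DATA, PART_CDATA };   /* stand-ins for TBL_PART_* */

struct blockp {
    enum part part;        /* current parse mode */
    char      cell[128];   /* accumulated text block contents */
    size_t    len;         /* bytes used in cell, without the NUL */
};

/* Feed one input line; return 1 when a text block was just closed. */
int
block_feed(struct blockp *bp, const char *line)
{
    size_t n;

    if (bp->part == PART_DATA) {
        if (strcmp(line, "T{") == 0) {
            /* Standalone T{ opens a block with an empty cell. */
            bp->part = PART_CDATA;
            bp->len = 0;
            bp->cell[0] = '\0';
        }
        return 0;
    }
    if (strcmp(line, "T}") == 0) {
        /* Standalone T} closes the block. */
        bp->part = PART_DATA;
        return 1;
    }
    n = strlen(line);
    /* Room needed: optional joining blank, the line, and the NUL. */
    if (bp->len + (bp->len > 0 ? 1 : 0) + n + 1 > sizeof(bp->cell))
        return 0;   /* sketch only: drop text that does not fit */
    if (bp->len > 0)
        bp->cell[bp->len++] = ' ';
    memcpy(bp->cell + bp->len, line, n + 1);
    bp->len += n;
    return 0;
}
```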
Fix to make horizontal spanners in the layout be properly printed.
mandoc also now warns (so does tbl(1)) if a horizontal spanner is
specified along with data.
While here, fix up some documentation and uncomment the tbl reference.
Ingo Schwarze [Tue, 4 Jan 2011 01:23:18 +0000 (01:23 +0000)]
Multiple man(7) .IP and .TP fixes started during p2k10:
Affecting both -Tascii and -Thtml:
* The .IP HEAD uses the second argument as the width, not the last one.
* Only print the first .IP HEAD argument, not all but the last.
Affecting only -Tascii:
* The .IP and .TP HEADs must be printed without literal mode,
but literal mode must be restored afterwards.
* After the .IP and .TP bodies, we only want term_newln(), not
term_flushln(), or we would get two blank lines in literal mode.
* The .TP HEAD does not use TWOSPACE, and neither does .IP.
* In literal mode, clear NOLPAD after each line, or subsequent lines
would get no indentation whatsoever.
Affecting only -Thtml:
* Only print next-line .TP children, instead of all but the first.
OK kristaps@ on the -Tascii part; and:
"Can you work this into man_html.c, too?"
Ingo Schwarze [Mon, 3 Jan 2011 23:53:51 +0000 (23:53 +0000)]
Partial cleanup of argument count validation in mdoc(7):
* Do not segfault on empty .Db, .Rs, .Sm, and .St.
* Let check_count() really throw the requested level, not always ERROR.
* Downgrade most bad argument counts from ERROR to WARNING.
* And some related internal cleanup.
Looks fine to kristaps@.
Note that the macros using eerr_ge1() still need to be checked at a later
time; but as all the others are done, let's use what we already have.
Ingo Schwarze [Mon, 3 Jan 2011 23:24:16 +0000 (23:24 +0000)]
Calling a macro with fewer arguments than it is defined with is OK;
the remaining ones default to the empty string, not to NULL.
Regression reported and fix tested by kristaps@.
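
As a small illustration of the rule above, assuming a hypothetical
accessor macro_arg() (not a function in roff.c), a missing argument can
be mapped to the empty string so that callers never see NULL:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return argument number i (1-based) of a macro
 * call that was given argc arguments, or "" if that argument is absent.
 */
const char *
macro_arg(const char *const argv[], int argc, int i)
{
    assert(i >= 1);
    if (i <= argc && argv[i - 1] != NULL)
        return argv[i - 1];
    return "";   /* absent arguments expand to nothing */
}
```

Code that substitutes the result into the macro body can then copy it
unconditionally, without a NULL check at every use.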
Ingo Schwarze [Mon, 3 Jan 2011 22:42:37 +0000 (22:42 +0000)]
Unify roff macro argument parsing (in roff.c, roff_userdef()) and man macro
argument parsing (in man_argv.c, man_args()), both having different bugs,
to use one common macro argument parser (in mandoc.c, mandoc_getarg()),
because from the point of view of roff, man macros are just roff macros,
hence their arguments are parsed in exactly the same way.
While doing so, fix these bugs:
* Escaped blanks (i.e. those preceded by an odd number of backslashes)
were mishandled as argument separators in unquoted arguments to
user-defined roff macros.
* Unescaped blanks preceded by an even number of backslashes were not
recognized as argument separators in unquoted arguments to man macros.
* Escaped backslashes (i.e. pairs of backslashes) were not reduced
to single backslashes both in unquoted and quoted arguments both
to user-defined roff macros and to man macros.
* Escaped quotes (i.e. pairs of quotes inside quoted arguments) were
not reduced to single quotes in man macros.
OK kristaps@
Note that mdoc macro argument parsing is yet another beast for no good
reason and is probably afflicted by similar bugs. But I don't attempt
to fix that right now because it is intricately entangled with lots of
unrelated high-level mdoc(7) functionality, like delimiter handling and
column list phrase handling. Disentangling that would waste too much
time now.
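
The escaping rules fixed in this commit can be sketched as a small
scanner. This is an illustration under assumptions, not the code of
mandoc_getarg(): next_arg() is a made-up helper that copies into a
caller-supplied buffer, and it keeps an escaped blank as the two bytes
backslash and blank for later escape processing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the next argument from *pp into out
 * (truncating to outsz - 1 bytes) and advance *pp past it.
 * Returns 1 if an argument was found, 0 at the end of the line.
 */
int
next_arg(const char **pp, char *out, size_t outsz)
{
    const char *p = *pp;
    size_t n = 0;
    int quoted = 0;
    char c;

    assert(out != NULL && outsz > 0);
    while (*p == ' ')
        p++;
    out[0] = '\0';
    *pp = p;
    if (*p == '\0')
        return 0;
    if (*p == '"') {            /* quoting starts only at a new word */
        quoted = 1;
        p++;
    }
    while ((c = *p) != '\0') {
        if (c == '\\' && p[1] == '\\') {
            p += 2;             /* backslash pair: emit one backslash */
        } else if (!quoted && c == '\\' && p[1] == ' ') {
            if (n + 1 < outsz)  /* escaped blank: keep it, do not split */
                out[n++] = '\\';
            c = ' ';
            p += 2;
        } else if (quoted && c == '"') {
            if (p[1] != '"') {  /* lone quote ends a quoted argument */
                p++;
                break;
            }
            p += 2;             /* quote pair: emit one quote */
        } else if (!quoted && c == ' ') {
            break;              /* unescaped blank ends the argument */
        } else {
            p++;
        }
        if (n + 1 < outsz)
            out[n++] = c;
    }
    out[n] = '\0';
    *pp = p;
    return 1;
}
```

Because backslash pairs are consumed first, a blank preceded by an even
number of backslashes ends the argument, while one preceded by an odd
number does not.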
Switch on the `TS' documentation in roff.7. As per off-line discussion,
this may be moved to tbl.7, but for the time being, keep it in the
document as it's developed.
Also note that my handling of horizontal rules in layouts needs some
work.
Add in support for number table cells that account for escapes and so
on. Note also that -Tps and -Tpdf, with these last two commits, produce
more readable output ("less crappy").
Start using term_strlen() instead of strlen(). tbl_term.c can now
properly handle embedded escapes when calculating its widths. NOTE:
this doesn't yet apply to the decimal-point calculation.
Make width calculations occur within tbl_term.c, not tbl.c. This allows
for front-ends to make decisions about widths, not the back-end.
To pull this off, first make each tbl_head contain a unique index value
(0 <= index < total tbl_head elements) and remove the tbl_calc() routine
from the back-end.
Then, when encountering the first tbl_span in the front-end, dynamically
create an array of configurations (termp_tbl) keyed on each tbl_head's
unique index value. Construct the decimals and widths at this time,
then continue parsing as before.
The termp_tbl and indexes are required because we pass a const tbl AST
into the front-end.
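
A minimal sketch of the index-keyed side table, with stand-in types
(struct head for tbl_head, struct colconf for termp_tbl) whose fields
are assumptions made for the example:

```c
#include <stdlib.h>

struct head {                  /* stand-in for tbl_head */
    int                ident;  /* unique index, 0 <= ident < nheads */
    const struct head *next;
};

struct colconf {               /* stand-in for termp_tbl */
    size_t width;
    size_t decimal;
};

/* Allocate one zeroed configuration per head; the caller frees it. */
struct colconf *
conf_alloc(size_t nheads)
{
    return calloc(nheads, sizeof(struct colconf));
}

/* Widen the column belonging to hp without touching the const AST. */
void
conf_note_width(struct colconf *conf, const struct head *hp, size_t w)
{
    if (w > conf[hp->ident].width)
        conf[hp->ident].width = w;
}
```

The head itself stays read-only; everything the front-end computes
lives in the array, looked up through the head's index.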
Make sure we don't continue recursively parsing once we've exited with
failure (this had caused some segfaults with the new assert() call in
MAN_HALT and MDOC_HALT).
Clarified the role of MDOC_HALT in libmdoc functions by having accessor
functions assert() if they're called after MDOC_HALT is set.
This makes more sense than returning 0 because this return value is used
for parse errors, not programme-flow errors, and it's inconsistent to
use the same value for both. Plus, prior to this, I'd return 0 without
printing an error message, which would cause failure to go unreported to
the operator.
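
The distinction drawn above, a return value of 0 for parse errors
versus assert() for being called after a halt, might look like this in
miniature. The names are hypothetical (PARSE_HALT stands in for
MDOC_HALT), and a NULL line merely plays the role of input that the toy
parser rejects.

```c
#include <assert.h>
#include <stddef.h>

#define PARSE_HALT (1 << 0)    /* stand-in for MDOC_HALT */

struct parser {
    int flags;
    int nlines;                /* lines accepted so far */
};

/* Return 1 on success, 0 on a parse error (which also sets the flag). */
int
parse_line(struct parser *p, const char *line)
{
    /* Entering again after a failure is a caller bug, not bad input. */
    assert(!(p->flags & PARSE_HALT));

    if (line == NULL) {        /* toy condition for "cannot parse" */
        p->flags |= PARSE_HALT;
        return 0;
    }
    p->nlines++;
    return 1;
}
```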
Turn on -Tascii tbl printing. The output still has some issues---I'm
not sure whether it's in the header calculation or term.c squashing
spaces or whatever, but let's get this in for general testing as soon as
possible.
Churn to get parts of 'struct tbl' visible from mandoc.h: rename the
existing 'struct tbl' as 'struct tbl_node', then move all option stuff
into a 'struct tbl' in mandoc.h.
This conflicted with a structure in chars.c, which was renamed.
Merge in the width, decimal, and positioning code for individual data rows
from tbl.bsd.lv. This is more or less verbatim, less queue macros and also
a check for NULL layout.
This concludes the back-end parsing for a little while, as the front-end
display may now be configured.
Plug in the "head" concept for tables. A tbl_head specifies the full
layout for each row, including vertical spacers. One grabs the tbl_head
for a row and iterates through each entry, plugging data from the
tbl_span into the header as appropriate.
This is pulled in more or less verbatim from tbl.bsd.lv. In fact, this
is verbatim except that list macros are made into hard-coded lists (for
compatibility, as long-ago noted by joerg@).
Add -man support for tables. Like -mdoc, this consists of an
external-facing function man_addspan() (this required shuffling around
the descope routine) and hooks elsewhere.
Add table processing structures to -mdoc. This consists of an
external-facing function mdoc_addspan(), then various bits to prohibit
printing and scanning (this requires some if's to be converted into
switch's).
Ingo Schwarze [Thu, 30 Dec 2010 00:51:32 +0000 (00:51 +0000)]
Plan9 has a man(7) implementation that looks extremely archaic,
even more archaic than Solaris/Heirloom stuff; so that is quite
interesting from a perspective of compatibility and history.
Initial check-in of table data-row processing. For the time being, this
parses table data then throws it away immediately. It does not yet try
to cross-check data rows against layout or anything. This is copied more
or less completely from tbl.bsd.lv.
Adding initial options processing (not hooked into parse yet). This is
more or less copied from tbl.bsd.lv and still needs integration with the
general mandoc framework, e.g., with error messages.