Ingo Schwarze [Sun, 8 Aug 2010 14:51:32 +0000 (14:51 +0000)]
simplify the code copying the macro name, and sync the
accompanying comment between man_pmacro() and mdoc_pmacro();
ok'd by kristaps@ together with main.c rev. 1.102
Ingo Schwarze [Sun, 8 Aug 2010 14:45:59 +0000 (14:45 +0000)]
Make sure we really throw away non-ASCII characters.
For example, on OpenBSD without locale settings,
isgraph(3) returns true for some eight-bit characters.
ok kristaps@
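
A minimal sketch of the kind of check this describes, not the code
that was committed to main.c. The helper name strip_nonascii() is
hypothetical, and keeping space and tab is an assumption. The point is
that the byte range is tested before isgraph(3) is consulted, so the
result does not depend on how the C library classifies eight-bit
characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: compact buf in place, keeping only bytes that
 * are 7-bit ASCII and either graphic, a space, or a tab.
 * Returns the new string length.
 */
size_t
strip_nonascii(char *buf)
{
	size_t		 in, out;
	unsigned char	 c;

	assert(buf != NULL);
	for (in = out = 0; buf[in] != '\0'; in++) {
		c = (unsigned char)buf[in];
		/* Range check first: isgraph(3) alone may accept 0x80-0xff. */
		if (c < 0x80 && (isgraph(c) || c == ' ' || c == '\t'))
			buf[out++] = (char)c;
	}
	buf[out] = '\0';
	return out;
}
```

Casting to unsigned char before calling isgraph(3) also avoids passing
a negative value on platforms where char is signed.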
Clean out the isgraph() checks in mdoc.c and man.c. These code paths
were never taken since main.c began skipping over unrecognisable
characters, so they were no-ops.
"Groff allows the initial macro on a line to be delimited by a space or
by a tab; so allow the tab in mandoc, too." Original problem noted by
schwarze@. Sync with OpenBSD.
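
As an illustration only, copying a macro name that may end at either a
space or a tab could look like the following. The function
copy_macro_name() and its parameters are hypothetical and are not the
man_pmacro() or mdoc_pmacro() code; names longer than the buffer are
silently truncated in this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the macro name starting at line[pos] into
 * name (capacity namesz, including the terminating NUL) and return the
 * offset of the first byte after the name.  Both ' ' and '\t' end it.
 */
size_t
copy_macro_name(const char *line, size_t pos, char *name, size_t namesz)
{
	size_t	 i = 0;

	assert(line != NULL && name != NULL && namesz > 0);
	while (line[pos] != '\0' && line[pos] != ' ' && line[pos] != '\t') {
		if (i + 1 < namesz)	/* always leave room for the NUL */
			name[i++] = line[pos];
		pos++;
	}
	name[i] = '\0';
	return pos;
}
```

Called on ".Nm<tab>foo" with pos pointing just past the control
character, it stops at the tab exactly as it would at a space.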
Ingo Schwarze [Fri, 6 Aug 2010 17:09:58 +0000 (17:09 +0000)]
tweaks from jmc@:
* correct a few obvious mistakes
* adopt some of jmc@'s recent changes to man(7)
* cut down just a little on the awful tendency
to stick a hyphen between two words.
Ingo Schwarze [Fri, 6 Aug 2010 17:07:11 +0000 (17:07 +0000)]
merge from OpenBSD:
- HISTORY is interesting even when there are STANDARDS
- more precise instructions what to put into AUTHORS
- add the version argument to the mdoc(7) .Os macro
IMPORTANT FIX: add missing braces around alloc failure conditional in
function-isation of PS_GROWBUF. Obviously the original commit was never
actually tested, as -Tps and -Tpdf errored out immediately.
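
The class of bug is easy to reproduce. The sketch below is not the
term_ps.c code: struct outbuf and outbuf_grow() are hypothetical
stand-ins for a function that replaces a buffer-growing macro. It
shows why the braces matter once the failure path holds more than one
statement.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical output buffer; the field names are illustrative. */
struct outbuf {
	char	*data;	/* heap storage, or NULL when empty */
	size_t	 used;	/* bytes currently in use */
	size_t	 size;	/* bytes allocated */
};

/*
 * Make room for at least "need" more bytes.
 * Returns 0 on success and -1 if the allocation fails.
 */
int
outbuf_grow(struct outbuf *b, size_t need)
{
	char	*p;

	assert(b != NULL && b->used <= b->size);
	if (need <= b->size - b->used)
		return 0;

	p = realloc(b->data, b->used + need);
	if (p == NULL) {
		/*
		 * Both statements belong to the failure path.  Without
		 * the braces, the return would run on every call.
		 */
		perror("realloc");
		return -1;
	}
	b->data = p;
	b->size = b->used + need;
	return 0;
}
```

The sketch does not guard against overflow in used + need; real code
would need to.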
Fix `ds' handling. This was stripping characters from "val", when the
syntax of `ds' is such that ALL text following the first
non-space/non-double-quote is part of the value. This also fixes the
warning of *(string++) = NULL reported by kristaps@ and joerg@.
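
One possible reading of that rule, as a sketch: ds_value() is a
hypothetical helper, not a function in roff.c, and treating exactly
one leading double quote as optional is an assumption based on the
usual roff convention. Everything after that point, including any
later quotes and trailing blanks, is returned untouched.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: args points just past the string name of a
 * `ds' request.  Skip the separating spaces and one optional leading
 * double quote, then return the rest of the line as the value.
 */
const char *
ds_value(const char *args)
{
	assert(args != NULL);
	while (*args == ' ')
		args++;
	if (*args == '"')
		args++;
	return args;
}
```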
Fix how `Bd -unfilled' and `Bd -literal' break lines. This unbreaks
displays to work as old groff shows them; however, new groff still does
some fancy shit.
Clean up some tight spots in mandoc's default mode: pessimistically
pre-allocate the output buffer for words and in-line the buffera()
function, which was only called in one place anyway.
Remove asciisz from chars.in. This frees up a nice chunk of memory at
the cost of running strlen() on ASCII strings (yes, I benchmarked
this by running mandoc_char(7) as input again and again, with
hundredth-second penalties... on my slow-ass alpha).
Initial PDF shim over PS. This produces working PDF output with -Tpdf.
It's currently missing the xref table (so you'll get a warning in most
PDF viewers). It also produces lots of redundant output, which will go
away once I get a better handle on the PDF spec. The code doesn't
really touch any existing functionality; it's a bunch of conditionals
atop the -Tps (term_ps.c) implementation. I'm checking it in now to
have it exist and be auditable. It needs clean-up, polish, and general
care (and xref!).
Bring `sp', `Sp', and `br' behaviour for -man in line with how -mdoc's
is handled: correctly. This removes superfluous line breaks in many
-man manuals.
Have `nf' and `fi' flush lines. This is necessary or the LITERAL will
be meaningless when invoked within a non-flushing context. This is based
on a formatting bug report submitted by Jonathan Gray (jsg@) via
Christian Weisgerber (naddy@).
In the SYNOPSIS, .Nm at the beginning of an input line starts
an .Nm block, and gets special handling (new line, indentation).
But .Nm in the middle of a line is just a normal in-line element,
so make sure it does NOT get the special handling.
Partly fixes the test(1) SYNOPSIS; indentation after "[" is still
excessive, which is an unrelated and more difficult issue.
Reminded of the problem by jmc@;
OK kristaps@.
Accommodate groff's crappy behaviour wherein an unrecognised
single-character escape (and ONLY this type of escape) will map back
into itself:
"If a backslash is followed by a character that does not
constitute a defined escape sequence the backslash is silently
ignored and the character maps to itself."
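
The quoted rule can be sketched as follows. This is a toy decoder, not
the mandoc implementation: unescape() is a hypothetical name, and only
two escapes are treated as "defined" here (\e, the printable escape
character, and \&, the zero-width character). Any other single
character after a backslash falls through to the default case and maps
to itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical toy decoder: rewrite buf in place and return the new
 * string length.  A lone trailing backslash is copied unchanged.
 */
size_t
unescape(char *buf)
{
	size_t	 in = 0, out = 0;

	assert(buf != NULL);
	while (buf[in] != '\0') {
		if (buf[in] != '\\' || buf[in + 1] == '\0') {
			buf[out++] = buf[in++];
			continue;
		}
		in++;			/* skip the backslash */
		switch (buf[in]) {
		case 'e':		/* \e prints a backslash */
			buf[out++] = '\\';
			break;
		case '&':		/* \& is zero-width: emit nothing */
			break;
		default:		/* unrecognised: maps to itself */
			buf[out++] = buf[in];
			break;
		}
		in++;
	}
	buf[out] = '\0';
	return out;
}
```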
Strip non-graphable input characters from input. The manuals
specifically say that this is not allowed, and were it allowed, output
would be inconsistent across output media (-Tps will puke,
non-your-charset terminals will puke, etc.).
With this done, simplify check_text() to only check escapes and for
tabs. Add in a new tab warning, too.
sync to OpenBSD:
* briefly mention the HISTORY of the man(7) language
* update the copyright notice
* improve the wording in a few places
* fix a couple of typos
including two suggestions from J.C. Roberts
feedback and ok jmc@, ok sobrado@ and kristaps@
Throw out a2roffdeco() in out.c for a readable version. The prior one
was completely unmaintainable. The new one is both readable and quite
similar to mandoc_special(), which in future versions will easily allow
throwing-away of unsupported escapes (such as \m). It's also a fair bit
smaller.
DECO_SIZE has been removed: this crap, like colours, will not be
supported.
mandoc_special() also has #if 0'd switch branches for ALL groff.7
escapes and some lint fixes.
By letting strncmp() do its job and not helping it with a prior length
check, we can remove the hard-coded length of all escape patterns. This
frees up a nice chunk of memory.
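
A sketch of the idea, under stated assumptions: the table below is
illustrative and not taken from chars.in, and esc_lookup() is a
hypothetical name. No length is stored per pattern. strncmp(3)
compares the first sz bytes, and one extra test for the terminating
NUL rejects patterns that are merely longer than the input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table: note there is no per-entry length field. */
struct esc {
	const char	*code;	/* escape name */
	const char	*out;	/* ASCII rendering */
};

static const struct esc escs[] = {
	{ "em", "--" },
	{ "en", "-" },
	{ "e", "\\" },
};

/*
 * Look up the sz-byte name at p, which need not be NUL-terminated but
 * must contain no NUL within its first sz bytes.
 * Returns the rendering, or NULL if the name is unknown.
 */
const char *
esc_lookup(const char *p, size_t sz)
{
	size_t	 i;

	assert(p != NULL);
	for (i = 0; i < sizeof(escs) / sizeof(escs[0]); i++)
		if (strncmp(escs[i].code, p, sz) == 0 &&
		    escs[i].code[sz] == '\0')
			return escs[i].out;
	return NULL;
}
```

If the pattern is shorter than sz, strncmp(3) already reports a
mismatch at the pattern's NUL, so the index code[sz] is only evaluated
when the pattern has at least sz bytes.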
After .Sm on, spacing ought to restart right away, before the next token,
and not with a delay, after the next token. But be careful not to cause
leading white space at the beginning of a line or column.
In OpenBSD, improves chmod(1), ksh(1), tar(1), ps(1) and probably many more.
ok kristaps@ and tested by jmc@ and sobrado@
Re-constitution of `ds' symbol processing. First, push the
roff_getstr() family of functions into roff.c with the "first_string"
directly in struct roff. Second, pre-process each line for reserved
words in libroff, splicing and re-running a line if it has one (this
allows defined symbols to be macros). Remove term.c's invocation of the
roff_getstrn() function. Removed function documentation in roff.3 and
added roff.7 `ds' documentation.
Proper `Bk -words' support: only suppress breaks within a line, but
allow end-of-line to break. This fixes the bad behaviour found when
macros within `Bk' never break.
Renamed mandoc.c to libmandoc.c. This is part of the effort to get a
cleaner namespace, separating functions used across the entire system
(mandoc.h: getting parsed-string values, or declarations necessary for
the AST data) from compiler functions (libmandoc.h: back-end functions
and declarations).