Make `struct roff' be passed into libmdoc and libman upon creation.
This is required for supporting in-line equations. While here, push
registers properly into roff and add a set/get/mod interface.
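A minimal sketch of what a set/get/mod register interface could look like, assuming registers live in a fixed array indexed by an enum. All names here (`struct roff_sketch', `reg_set', `reg_get', `reg_mod', `REG_NS') are hypothetical and not taken from the mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register identifiers; the real set is not shown here. */
enum reg_id {
	REG_NS = 0,	/* illustrative register only */
	REG_MAX
};

/* Hypothetical parser state holding only the register file. */
struct roff_sketch {
	int	 set[REG_MAX];	/* non-zero once the register was assigned */
	int	 val[REG_MAX];	/* current value, meaningful only if set */
};

/* Set: assign a value and mark the register as defined. */
void
reg_set(struct roff_sketch *r, enum reg_id id, int v)
{
	assert(id < REG_MAX);
	r->set[id] = 1;
	r->val[id] = v;
}

/* Get: fetch the value, or a caller-supplied default if unset. */
int
reg_get(const struct roff_sketch *r, enum reg_id id, int def)
{
	assert(id < REG_MAX);
	return r->set[id] ? r->val[id] : def;
}

/* Mod: adjust by a delta; an unset register starts from zero. */
void
reg_mod(struct roff_sketch *r, enum reg_id id, int delta)
{
	reg_set(r, id, reg_get(r, id, 0) + delta);
}
```

Keeping the "is it set" flag apart from the value lets a caller tell an unset register from one explicitly set to zero.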
Provide implementations of `define', `set', and `unset'.
Tie them into the stream of data.
Document these appropriately, bringing in the grammar as defined by the
original eqn manual (Kernighan/Cherry).
Add initial `define' support for eqn(7).
This works by iterating over a simple list. It's a slow, auditable
early implementation. Data is read (the reading function will be
reused), then parsed; the line is re-run if any input remains.
Note this function isn't the same as mandoc_getarg(), as eqn(7) uses a
different system for reading quoted strings.
This doesn't actually use the defines.
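As an illustration of the "simple list" approach, here is a sketch of a singly-linked list of definitions with a linear lookup. The type `struct eqn_def_sketch' and the functions `def_find' and `def_set' are hypothetical names, and overwriting a repeated key is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node: one `define' key and its replacement text. */
struct eqn_def_sketch {
	char			*key;
	char			*val;
	struct eqn_def_sketch	*next;
};

/* Linear scan: slow for many definitions, but easy to audit. */
struct eqn_def_sketch *
def_find(struct eqn_def_sketch *head, const char *key)
{
	assert(key != NULL);
	for ( ; head != NULL; head = head->next)
		if (strcmp(head->key, key) == 0)
			return head;
	return NULL;
}

/* Copy a string with malloc(); returns NULL on allocation failure. */
static char *
dup_str(const char *s)
{
	size_t	 sz = strlen(s) + 1;
	char	*p = malloc(sz);

	if (p != NULL)
		memcpy(p, s, sz);
	return p;
}

/*
 * Add a definition, or overwrite the value of an existing key.
 * Returns 1 on success and 0 on allocation failure.
 */
int
def_set(struct eqn_def_sketch **head, const char *key, const char *val)
{
	struct eqn_def_sketch	*d;
	char			*nval;

	if ((nval = dup_str(val)) == NULL)
		return 0;
	if ((d = def_find(*head, key)) != NULL) {
		free(d->val);
		d->val = nval;
		return 1;
	}
	if ((d = malloc(sizeof(*d))) == NULL ||
	    (d->key = dup_str(key)) == NULL) {
		free(d);
		free(nval);
		return 0;
	}
	d->val = nval;
	d->next = *head;	/* push onto the front of the list */
	*head = d;
	return 1;
}
```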
First step of making mandocdb a true makewhatis/mandb replacement:
accept a set of directories on the command line ("manpaths") that are
recursed for files. The databases are created in each manpath root.
This temporarily removes OP_UPDATE and OP_DELETE functionality, which
will be added back in.
Move parts of mandocdb that "do stuff" to the databases into their own
functions. This will make it easier to call repeatedly (for different
directories) as must be done with the new interfaces being developed.
Rename makewhatis [back] into mandocdb. This is to maintain consistency
with OpenBSD, which is sandboxing the code for merge. The rename makes
sense because the tool doesn't really make a `makewhatis' file in the
traditional sense, so the old name may be confusing.
Fairly straightforward patch adding basic update (-u) and remove (-r)
functionality to makewhatis. This is somewhat expensive (requiring the
index file to be trawled multiple times), but it's a good start.
Make sure that `br' and `sp' don't emit space before the initial `SH' in
-man. This actually seems to be what groff does. Sort-of. Anyway,
it's required to get perlpod pages rendered nicely, so until perlpod
stops producing shit, do it. Ok schwarze@.
Fix two issues: the first, where a `.\}' wasn't being interpreted as a
proper macro in some conditions, resulting in strange parse errors. The
second, where `\}' was being re-written as `\&'. Instead, we re-write
this as two spaces OR nothing at all, if at the end of line. This isn't
exactly what groff does (who knows...) but is a much safer and better
way than how I was doing it before.
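The rewrite rule described here (two spaces mid-line, nothing at end of line) can be sketched as an in-place edit. The function name `rewrite_close' is hypothetical, and the sketch ignores the case where the backslash is itself escaped.

```c
#include <assert.h>
#include <string.h>

/*
 * Rewrite every `\}' in a NUL-terminated line in place: two spaces if
 * more text follows, truncation if it ends the line.
 * Simplification: an escaped backslash before `}' is not handled.
 * Returns the number of sequences rewritten.
 */
int
rewrite_close(char *line)
{
	char	*cp;
	int	 n = 0;

	assert(line != NULL);
	cp = line;
	while ((cp = strstr(cp, "\\}")) != NULL) {
		n++;
		if (cp[2] == '\0') {
			*cp = '\0';	/* end of line: drop it entirely */
			break;
		}
		cp[0] = cp[1] = ' ';	/* same length, so no reallocation */
		cp += 2;
	}
	return n;
}
```

Because both outcomes are no longer than the two bytes they replace, the buffer never needs to grow.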
Clean up how -man -T[x]html handles TP, IP, and HP (dd lists and
indented paragraph macros, respectively). This cleans up code and also
cleans up the output quite a lot.
Fix a bug in the -man parser where deleting nodes (such as `PP' or `LP'
in certain situations) caused the next macros to be assigned as siblings
instead of child nodes to the original parent. Noticed and ok by
schwarze@.
The bufcat() function in -T[x]html was eating one byte off the end of its
concatenated string. This for some reason hasn't been found before now... ?
Anyway, fixed, and the IDs created are again correctly prefixed by a
letter as per the HTML spec.
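The commit doesn't say what caused the lost byte. As an illustration only, this is a bounded append that reserves exactly one byte for the terminating NUL and copies every remaining byte of the source. `struct buf_sketch', `buf_cat' and the tiny buffer size are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define	BUFSIZ_SKETCH	 16	/* hypothetical, deliberately small */

struct buf_sketch {
	char	 data[BUFSIZ_SKETCH];	/* always NUL-terminated */
	size_t	 len;			/* bytes used, excluding the NUL */
};

/*
 * Append as much of `s' as fits, leaving room for the terminating NUL.
 * Returns the number of bytes actually copied.
 */
size_t
buf_cat(struct buf_sketch *b, const char *s)
{
	size_t	 room, n;

	assert(b->len < sizeof(b->data));
	room = sizeof(b->data) - 1 - b->len;	/* reserve one byte for NUL */
	n = strlen(s);
	if (n > room)
		n = room;
	memcpy(b->data + b->len, s, n);	/* copy all n bytes, not n - 1 */
	b->len += n;
	b->data[b->len] = '\0';
	return n;
}
```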
Fix a TODO noted by schwarze@, originally by Christian Weisgerber:
literal mode (`nf') is ended by SH (and, it turns out, SS as well).
Noted the updated behaviour in man.7 as well.
Make scan for text tokens in a line recursive. This is really only for
the benefit of `Nd', which is the only [to date] node that can consist
of sub-nodes.
Ouch: predefined strings moved into roff.c weren't being reinitialised
after the first parse. Do this, but note there are more efficient ways
just waiting for a table of macros.
First, fix `sp 1' not implying `1v' (it now does), and make 1
followed by non-digits, e.g. `1g', really mean `1'. Next, fix some
spacing issues where `sp' was invoked in -man after sections or
subsections. Make sure this behaviour is mirrored in -Thtml.
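A sketch of the argument rule for `sp': a bare number defaults to the `v' unit, and an unrecognised suffix such as `g' is ignored. The names `struct sp_arg_sketch' and `sp_parse' are hypothetical, and the list of accepted units is an illustrative subset, not the real one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sp_arg_sketch {
	long	 scale;	/* numeric part */
	char	 unit;	/* scaling unit; 'v' by default */
};

/*
 * Parse an `sp' argument such as "1", "2i" or "1g".
 * Returns 1 on success, 0 if the string does not start with a number.
 */
int
sp_parse(const char *arg, struct sp_arg_sketch *out)
{
	char	*ep;

	assert(arg != NULL && out != NULL);
	out->scale = strtol(arg, &ep, 10);
	if (ep == arg)
		return 0;
	/* Unknown or missing suffix: fall back to vertical units. */
	if (*ep != '\0' && strchr("civmn", *ep) != NULL)
		out->unit = *ep;
	else
		out->unit = 'v';
	return 1;
}
```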
Try again to get the transfer from hash to btree working. This time
just closing and re-opening the database, as deleting records with
(*hash->del) either in the scan loop or after it causes uncertain
behaviour (left-over keys, mystery keys, etc.). This finally does the
Right Thing (tm).
Big change to makewhatis: use an in-memory hashtable to collapse
multiple types of the same name (e.g., "foo" being a manual name,
utility name, etc.) into a single bitmask'd region. This considerably
reduces the size of the keyword database.
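To illustrate the collapsing idea, here is a sketch of a small open-addressing table in which each name owns one slot and every type seen for that name is OR'd into a bitmask. The type constants, `struct kw_sketch', `kw_add' and the fixed capacity are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical keyword types; one bit each so they can be OR'd. */
#define	TYPE_MANUAL	0x01u
#define	TYPE_UTILITY	0x02u
#define	TYPE_FUNCTION	0x04u

#define	NSLOTS	64	/* illustrative fixed capacity, power of two */

struct kw_sketch {
	const char	*name;	/* borrowed pointer; NULL marks a free slot */
	unsigned int	 mask;	/* union of all types seen for this name */
};

static unsigned int
kw_hash(const char *s)
{
	unsigned int	 h = 5381;

	while (*s != '\0')
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/*
 * Record that `name' occurs with `type'.  Repeated names share one slot.
 * Returns the slot, or NULL if the table is full.
 */
struct kw_sketch *
kw_add(struct kw_sketch *tab, const char *name, unsigned int type)
{
	unsigned int	 i, h;

	assert(name != NULL && type != 0);
	h = kw_hash(name);
	for (i = 0; i < NSLOTS; i++) {
		struct kw_sketch *k = &tab[(h + i) & (NSLOTS - 1)];

		if (k->name == NULL)
			k->name = name;
		else if (strcmp(k->name, name) != 0)
			continue;	/* collision: linear probing */
		k->mask |= type;
		return k;
	}
	return NULL;
}
```

With this layout a name that is both a manual name and a utility name costs one record instead of two, which is where the size reduction would come from.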
Allow RS/RE blocks to nest. This requires first the syntax tree to
accommodate the fix, then for the front-ends. -T[x]html accepted the
syntax tree natively, but -Tascii had to use relative offsets. It's
quite a simple fix.
Add back in a check that the leading `-' exists for arguments. This
mysteriously disappeared in 1.14. No idea why. While here, remove an
unnecessary header and order the function prototypes.
Fix an assertion failure raised by the following interesting scenario: an
auto-opened `It' (i.e., a column list with a free-text first line) with
leading spaces in the line triggered the assertion when searching for
arguments.
This led to a fix giving a nice performance speed-up (a few percent,
with some quick trials): the search for flags immediately exits if the
macro has no flags, instead of having to first parse the leading word
then look it up. I also cleaned up the argv parsing stuff a little bit
and added more documentation.
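The early exit can be sketched as follows, assuming each macro carries a NULL-terminated list of flag names (or NULL when it takes none). `struct macro_sketch' and `flag_find' are hypothetical names, not the actual argv interface.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro descriptor: a NULL-terminated list of flag names. */
struct macro_sketch {
	const char		*name;
	const char *const	*flags;	/* NULL if the macro takes none */
};

/*
 * Return the index of the flag named by the leading word of `arg',
 * or -1 if that word is not a flag of this macro.
 */
int
flag_find(const struct macro_sketch *m, const char *arg)
{
	size_t	 len;
	int	 i;

	assert(m != NULL && arg != NULL);
	/* Fast path: no flags at all, so skip word parsing entirely. */
	if (m->flags == NULL)
		return -1;
	/* A flag must start with a leading `-'. */
	if (*arg++ != '-')
		return -1;
	len = strcspn(arg, " \t");	/* length of the leading word */
	for (i = 0; m->flags[i] != NULL; i++)
		if (strlen(m->flags[i]) == len &&
		    strncmp(m->flags[i], arg, len) == 0)
			return i;
	return -1;
}
```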
Fix some bad bits in the mandoc manual: `Xr' instead of `Sx', unescaped
stuff that should be escaped, and a style matter or two. Pointed out by
Jason McIntyre, thanks!
preconv is now on encoding-recognition parity with groff. This last
commit adds parsing of "File Variables" in the first two lines in order
to grok the encoding. This completes groff's recognition sequence (-e,
BOM, File variables, -D, default). I've also cleaned up the manual to
indicate this and for some general readability.
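A rough sketch of reading an Emacs-style file-variable cookie from one line, assuming the `-*- ... coding: NAME ... -*-' form. The function `filevar_coding' is a hypothetical name, only the "coding:" key is handled, and the caller is assumed to try the first two lines in turn.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Look for a "-*- ... coding: NAME ... -*-" cookie in one line and
 * copy NAME into `enc'.  Returns 1 if a name was found, 0 otherwise.
 */
int
filevar_coding(const char *line, char *enc, size_t encsz)
{
	const char	*beg, *end, *cp;
	size_t		 n;

	assert(line != NULL && enc != NULL && encsz > 0);
	if ((beg = strstr(line, "-*-")) == NULL)
		return 0;
	beg += 3;
	if ((end = strstr(beg, "-*-")) == NULL)
		return 0;
	if ((cp = strstr(beg, "coding:")) == NULL || cp >= end)
		return 0;
	cp += 7;	/* strlen("coding:") */
	while (cp < end && isspace((unsigned char)*cp))
		cp++;
	/* Copy up to `;', whitespace, the closing marker or buffer end. */
	for (n = 0; cp + n < end && n + 1 < encsz; n++) {
		if (cp[n] == ';' || isspace((unsigned char)cp[n]))
			break;
		enc[n] = cp[n];
	}
	enc[n] = '\0';
	return n > 0;
}
```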
preconv is now compiled by default in the Makefile.
Significantly improve preconv. Allow it to recode UTF-8 characters into
the \[uNNNN] strings (taking into account big-endian archs). Also allow
it to determine from the BOM whether it's a UTF-8 file. Also add the
initial manual. This has been tested over a random selection of UTF-8
documents, as
% preconv -e utf-8 foo.1 | ./mandoc -Tlocale
where -Tlocale is allowed (-DUSE_WCHAR).
Note that we're still missing the "type" indicator that preconv accepts.
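For illustration, a sketch of the two pieces involved: spotting a UTF-8 BOM and turning one UTF-8 sequence into a \[uNNNN] escape. The function names are hypothetical. Building the code point with shifts on single bytes keeps the result independent of host byte order, which is one possible way to handle the big-endian concern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True if the buffer starts with the UTF-8 byte order mark EF BB BF. */
int
has_utf8_bom(const unsigned char *in, size_t len)
{
	return len >= 3 && in[0] == 0xEF && in[1] == 0xBB && in[2] == 0xBF;
}

/*
 * Decode one UTF-8 sequence from `in' (at most `len' bytes) into a code
 * point.  Returns the number of bytes consumed, or 0 if the input is
 * truncated or malformed.  Overlong forms and surrogates are not
 * rejected in this sketch.
 */
size_t
utf8_decode(const unsigned char *in, size_t len, unsigned long *cp)
{
	size_t	 i, n;

	assert(in != NULL && cp != NULL);
	if (len == 0)
		return 0;
	if (in[0] < 0x80) {
		*cp = in[0];
		return 1;
	} else if ((in[0] & 0xE0) == 0xC0) {
		*cp = in[0] & 0x1F;
		n = 2;
	} else if ((in[0] & 0xF0) == 0xE0) {
		*cp = in[0] & 0x0F;
		n = 3;
	} else if ((in[0] & 0xF8) == 0xF0) {
		*cp = in[0] & 0x07;
		n = 4;
	} else
		return 0;	/* stray continuation or invalid lead byte */

	if (len < n)
		return 0;
	for (i = 1; i < n; i++) {
		if ((in[i] & 0xC0) != 0x80)
			return 0;
		*cp = (*cp << 6) | (in[i] & 0x3F);
	}
	return n;
}

/* Format a code point as a roff escape with at least four hex digits. */
int
cp_escape(unsigned long cp, char *out, size_t outsz)
{
	return snprintf(out, outsz, "\\[u%.4lX]", cp);
}
```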
If a predefined string is missing, emit a warning and make it an empty
string instead of passing it along to libmdoc/libman (where it'll be
printed verbatim, now). This is what groff seems to do, too (of course
without a warning).
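The fallback can be sketched as a table lookup that warns and returns an empty string on a miss. The table contents, `struct predef_sketch', `predef_lookup' and the warning text are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct predef_sketch {
	const char	*name;
	const char	*value;
};

/* Illustrative entries with made-up values; not the real table. */
static const struct predef_sketch predefs[] = {
	{ "R",  "(R)" },
	{ "Tm", "(Tm)" },
};

/*
 * Return the value of a predefined string.  A missing name yields an
 * empty string, never NULL, and a warning on `warn' if it is non-NULL.
 */
const char *
predef_lookup(const char *name, FILE *warn)
{
	size_t	 i;

	assert(name != NULL);
	for (i = 0; i < sizeof(predefs) / sizeof(predefs[0]); i++)
		if (strcmp(predefs[i].name, name) == 0)
			return predefs[i].value;
	if (warn != NULL)
		fprintf(warn, "warning: undefined string: %s\n", name);
	return "";
}
```

Returning "" rather than NULL means callers can splice the result into the line unconditionally.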
It's annoying that we don't have preconv, so throw together a quick
version and let it grow in-tree. Right now, this only supports the
Latin-1 and US-ASCII encodings. I'll do UTF-8 next. It's
call-compatible with GNU's preconv although I don't do fancy stuff like
BOM or header check. This will come. I used read.c's file-grokking
code.
Use the correct Unicode value for the zero-width space, which means that
spec2cp never needs to fall through to spec2str. Then clean out html.c
of its unnecessary print_res() function.
Remove predefined strings from the chars.in file, as they're now local
to predefs.in. This also turns "BOTH" entries directly into CHAR. The
res2str and spec2str are now effectively the same function.
Most important move in getting predefined strings entirely contained
within roff.c. These are now grokked from a table in the roff
allocation routine and rest in the newly-created predefs.in (for
consistency with chars.in). This is a first implementation and will
likely be optimised along with the ds/de lookup table itself.
This allows mandoc-defined predefined strings to be correctly removed or
whatnot; earlier they couldn't. What will follow is the stripping-away
of all predefined-string crud in the other parts of the system.
Have conditional closure for both text and macro lines call through to
ccond(). Fix the text handler to behave like the macro handler
regarding escaped \}. Make \} actually become a zero-width space, too,
and clean up the documentation in this regard.