First, roff_res() has no need to invoke ROFF_RERUN: since it's executed
before any other roff processing occurs, it's Ok to just let it do its
thing and pass through. Also, make sure this function is ALWAYS called,
not just when first_string is defined.
Second, add a new function, roff_parsetext(), that post-processes
non-macro lines. This, for the time being, amounts to detecting soft
hyphens. This fixes a long-standing bug in that -man now has proper
hyphen breaking!
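
A minimal sketch of what detecting soft hyphens in a text line might look
like. The name mark_hyphens() and the HYPH_MARK value are assumptions made
for illustration, not the actual roff_parsetext() code, and roff escapes
such as `\-' are ignored here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical marker byte for a breakable hyphen (value is an assumption). */
#define HYPH_MARK 30

/*
 * Replace every '-' that sits directly between two letters with
 * HYPH_MARK, so that a later formatter may break the line there.
 * Returns the number of hyphens marked.
 */
static size_t
mark_hyphens(char *p)
{
	size_t n = 0;
	size_t i;

	assert(p != NULL);
	if (p[0] == '\0')
		return 0;
	/* Start at 1 and stop one short of the end: both neighbours exist. */
	for (i = 1; p[i] != '\0' && p[i + 1] != '\0'; i++) {
		if (p[i] == '-' &&
		    isalpha((unsigned char)p[i - 1]) &&
		    isalpha((unsigned char)p[i + 1])) {
			p[i] = HYPH_MARK;
			n++;
		}
	}
	return n;
}
```

Leading hyphens (as in option flags) and doubled hyphens are left alone,
since they have no letter on both sides.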
Move checking of escapes into roff.c, where we're already stepping
through looking for user-defined escapes. This clears up a nice bit of
validation code.
Implement the first steps of equation parsing from within libmdoc.
This consists of a shim around the text parser that calls out to libroff
if equation components exist on the line. Right now this will do
nothing, as the equation delimiter always returns nil.
Scary-looking but otherwise harmless changes allow me to build for Windows.
That is to say, with mingw32. This amounts to the following:
(1) break compat.c into compat_strlcpy.c and compat_strlcat.c
(2) add compat_getsubopt.c (from OpenBSD) and test-getsubopt.c
(3) add test-strptime.c for HAVE_STRPTIME
(4) add ifdef bits here and there, where necessary
(5) remove some harmless unportable stuff (u_char, localtime_r)
I've added the appropriate mdocml.zip target to the Makefile, too.
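
For illustration, a compat file of this kind could hold a fallback along
these lines. This is a sketch in the style of strlcpy(3) under the
hypothetical name compat_strlcpy(), not the code actually shipped; a real
compat file would presumably be guarded by a configure-time test.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src into dst, writing at most siz bytes including the NUL.
 * Returns the length of src, so truncation shows as (result >= siz).
 */
static size_t
compat_strlcpy(char *dst, const char *src, size_t siz)
{
	size_t i;

	assert(src != NULL);
	/* Walk all of src to learn its length; copy only what fits. */
	for (i = 0; src[i] != '\0'; i++)
		if (i + 1 < siz)
			dst[i] = src[i];
	if (siz > 0)
		dst[i < siz ? i : siz - 1] = '\0';
	return i;
}
```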
Add support for tdefine and ndefine. Consolidate some error messages. Add
some more version notes (getting there). Have the equation name be captured.
The circumflex is also a special space character.
Note this and clean up some documentation in eqn.7.
Also add some version notes, although I'm not ready for a release yet.
Add all rendered symbols used by eqn. I use the Second-Edition User's
Manual (1978) for this, so it should catch most of them. They just map
into the mandoc_char escaped characters.
Use a macro instead of doing a string-fragment compare. I just get
worried that I'm going to write the wrong size on both sides of the
equality (I've already done it a few times). This cleans up the code
readability a bit.
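
A sketch of the kind of macro meant here. The names FRAG_EQ, FRAG_IS and
is_define() are made up for illustration; the point is only that each
length is written once, and that sizeof supplies the literal's length.

```c
#include <assert.h>
#include <string.h>

/* Compare two length-delimited fragments; lengths must match exactly. */
#define FRAG_EQ(p1, sz1, p2, sz2) \
	((sz1) == (sz2) && 0 == strncmp((p1), (p2), (sz1)))

/* Compare a fragment against a string literal. */
#define FRAG_IS(p, sz, lit) \
	FRAG_EQ((p), (sz), (lit), sizeof(lit) - 1)

/* Return non-zero if the fragment of length sz at p is the word "define". */
static int
is_define(const char *p, size_t sz)
{
	assert(p != NULL);
	return FRAG_IS(p, sz, "define");
}
```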
Accommodate hard-spaces written as tildes. For now, consider them regular
spaces. Also allow for tabs. Finally, have the parser correctly handle
open and close brackets smooshed against other terms. All of these
handle "details" noted in the CACM paper.
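
Roughly, the tokenising rules above can be sketched as follows. This
assumes the brackets in question are `{' and `}', uses the hypothetical
names is_sep() and next_tok(), and leaves out quoted strings entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Treat blanks, tabs and (for now) tildes as ordinary separators. */
static int
is_sep(char c)
{
	return c == ' ' || c == '\t' || c == '~';
}

/*
 * Return the next token at or after buf[*pos], store its length in
 * *sz and advance *pos past it. Returns NULL at end of input.
 * A brace is always a token of its own, even with no space around it.
 */
static const char *
next_tok(const char *buf, size_t *pos, size_t *sz)
{
	const char *start;

	assert(buf != NULL && pos != NULL && sz != NULL);
	while (is_sep(buf[*pos]))
		(*pos)++;
	if (buf[*pos] == '\0')
		return NULL;
	start = buf + *pos;
	if (*start == '{' || *start == '}') {
		(*pos)++;
		*sz = 1;
		return start;
	}
	/* A word runs up to the next separator or brace. */
	*sz = strcspn(start, " \t~{}");
	*pos += *sz;
	return start;
}
```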
Complete eqn.7 parsing. Features all productions from the original 1975
CACM paper in an LR(1) parse (1 -> eqn_rewind()). Right now the code is
a little jungly, but will clear up as I consolidate parse components.
The AST structure will also be cleaned up, as right now it's pretty ad
hoc (this won't change the parse itself). I added the mandoc_strndup()
function while here.
Finish the eqn syntactic parser. This correctly parses terms and does
the proper `define' dance, which amounts to pure word-replace (you can,
say, define `foo' as `define' then define `define' as something else).
eqn.c is now ready for some semantic parsing of `box' and `eqn'
productions as defined by the grammar.
Make `struct roff' be passed into libmdoc and libman upon creation.
This is required for supporting in-line equations. While here, push
registers properly into roff and add a set/get/mod interface.
Provide implementations of `define', `set', and `unset'.
Tie them into the stream of data.
Document these appropriately, bringing in the grammar as defined by the
original eqn manual (Kernighan/Cherry).
Add initial `define' support for eqn(7).
This works by iterating over a simple list. It's a slow, auditable
early implementation. Data is read (the reading function will be
reused) then parsed, then the line re-run if remaining stuff exists.
Note this function isn't the same as mandoc_getarg(), as eqn(7) uses a
different system for reading quoted strings.
This doesn't actually use the defines.
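
A sketch of a "simple list" of definitions with a linear lookup, to show
the shape of the approach. The struct and function names are hypothetical,
and allocation failure simply aborts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one `define' key/value pair. */
struct def {
	char		*key;
	char		*val;
	struct def	*next;
};

/* Duplicate a string, aborting on allocation failure. */
static char *
xdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p == NULL)
		abort();
	memcpy(p, s, n);
	return p;
}

/* Walk the list front to back; return the matching node or NULL. */
static struct def *
def_find(struct def *head, const char *key)
{
	assert(key != NULL);
	for ( ; head != NULL; head = head->next)
		if (strcmp(head->key, key) == 0)
			return head;
	return NULL;
}

/* Define key as val, replacing any earlier value. Returns the list head. */
static struct def *
def_set(struct def *head, const char *key, const char *val)
{
	struct def *d;

	assert(key != NULL && val != NULL);
	if ((d = def_find(head, key)) != NULL) {
		free(d->val);
		d->val = xdup(val);
		return head;
	}
	if ((d = malloc(sizeof(*d))) == NULL)
		abort();
	d->key = xdup(key);
	d->val = xdup(val);
	d->next = head;
	return d;
}
```

Every lookup is O(n) in the number of definitions, which is the "slow"
part; the upside is that there is very little to get wrong.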
First step of making mandocdb a true makewhatis/mandb replacement:
accept a set of directories on the command line ("manpaths") that are
recursed for files. The databases are created in each manpath root.
This temporarily removes OP_UPDATE and OP_DELETE functionality, which
will be added back in.
Move parts of mandocdb that "do stuff" to the databases into their own
functions. This will make it easier to call repeatedly (for different
directories) as must be done with the new interfaces being developed.
Rename makewhatis [back] into mandocdb. This is to maintain consistency
with OpenBSD, which is sandboxing the code for merge. It makes sense
because it doesn't really make a `makewhatis' file in the traditional
sense, so it may be confusing.
Fairly straightforward patch adding basic update (-u) and remove (-r)
functionality to makewhatis. This is somewhat expensive (requiring the
index file to be trawled multiple times), but it's a good start.
Make sure that `br' and `sp' don't emit space before the initial `SH' in
-man. This actually seems to be what groff does. Sort-of. Anyway,
it's required to get perlpod pages rendered nicely, so until perlpod
stops producing shit, do it. Ok schwarze@.
Fix two issues: the first, where a `.\}' wasn't being interpreted as a
proper macro in some conditions, resulting in strange parse errors. The
second, where `\}' was being re-written as `\&'. Instead, we re-write
this as two spaces OR nothing at all, if at the end of line. This isn't
exactly what groff does (who knows...) but is a much safer and better
way than how I was doing it before.
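
The second rewrite can be sketched like this. rewrite_close() is a
hypothetical name, and the sketch does not try to tell an escaped
backslash followed by `}' apart from a real `\}'.

```c
#include <assert.h>
#include <string.h>

/*
 * Rewrite each `\}' in the NUL-terminated line p in place: two blanks
 * when more text follows, nothing at all when it ends the line.
 */
static void
rewrite_close(char *p)
{
	assert(p != NULL);
	while ((p = strstr(p, "\\}")) != NULL) {
		if (p[2] == '\0') {
			/* End of line: drop the sequence entirely. */
			*p = '\0';
			return;
		}
		/* Same width as the original, so nothing needs to move. */
		p[0] = p[1] = ' ';
		p += 2;
	}
}
```

Replacing two bytes with two bytes keeps the buffer length unchanged, so
no reallocation or shifting is needed.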
Clean up how -man -T[x]html handles TP, IP, and HP (dd lists and
indented paragraph macros, respectively). This cleans up code and also
cleans up the output quite a lot.
Fix a bug in the -man parser where deleting nodes (such as `PP' or `LP'
in certain situations) caused the next macros to be assigned as siblings
instead of child nodes to the original parent. Noticed and ok by
schwarze@.
The bufcat() function in -T[x]html was eating one byte off the end of its
concatenated string. This for some reason hasn't been found before now... ?
Anyway, fixed, and the IDs created are again correctly prefixed by a
letter as per the HTML spec.
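
For reference, a bounded concatenation that keeps every byte that fits
might look like the sketch below. buf_cat() is a hypothetical stand-in,
not the real bufcat(), and the log does not say what the actual
off-by-one was; the sketch only shows the terminator being accounted for
exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string in dst, whose total capacity
 * is cap bytes. Truncates if needed and always NUL-terminates.
 * Returns the length of the resulting string.
 */
static size_t
buf_cat(char *dst, size_t cap, const char *src)
{
	size_t used, room, n;

	assert(dst != NULL && src != NULL && cap > 0);
	used = strlen(dst);
	assert(used < cap);
	room = cap - used - 1;	/* reserve one byte for the NUL, once */
	n = strlen(src);
	if (n > room)
		n = room;
	memcpy(dst + used, src, n);
	dst[used + n] = '\0';
	return used + n;
}
```

Building an ID is then a matter of starting the buffer with a letter (the
"x" in the usage below is an arbitrary choice) and appending the name.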
Fix a TODO noted by schwarze@, originally by Christian Weisgerber:
literal mode (`nf') is ended by SH (and, it turns out, SS as well).
Noted the updated behaviour in man.7 as well.
Make the scan for text tokens in a line recursive. This is really only for
the benefit of `Nd', which is the only [to date] node that can consist
of sub-nodes.
Ouch: predefined strings moved into roff.c weren't being reinitialised
after the first parse. Do this, but note there are more efficient ways
just waiting for a table of macros.
First, fix `sp 1' not implying `1v' (it now does), and make 1 followed
by non-digits, e.g. `1g', really mean `1'. Next, fix some
spacing issues where `sp' was invoked in -man after sections or
subsections. Make sure this behaviour is mirrored in -Thtml.
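
The `sp' argument rule can be sketched as below. parse_sp() is a
hypothetical helper, and the set of unit letters it accepts is an
assumption; anything else after the digits falls back to `v'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a vertical-spacing argument such as "1", "2i" or "1g".
 * Stores the numeric part in *n and the unit in *unit; a missing or
 * unrecognised unit becomes 'v'. Returns 0 if the argument does not
 * start with a digit, 1 otherwise.
 */
static int
parse_sp(const char *arg, long *n, char *unit)
{
	char *ep;

	assert(arg != NULL && n != NULL && unit != NULL);
	if (!isdigit((unsigned char)arg[0]))
		return 0;
	*n = strtol(arg, &ep, 10);
	/* Check for the NUL first: strchr() would match the terminator. */
	if (*ep != '\0' && strchr("cimnpPuv", *ep) != NULL)
		*unit = *ep;
	else
		*unit = 'v';
	return 1;
}
```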