In apropos_db.c, move all btree reading (and safety checks) into the
btree_read() function. Also, add a forgotten free() for the type of
grokked record.
Then in both mandocdb.c and apropos_db.c, make the "rec" field of the
btree be in network byte order.
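A minimal sketch of the byte-order handling, assuming a 32-bit record
number; rec_store() and rec_load() are hypothetical names, not the
functions in mandocdb.c or apropos_db.c. The value is converted with
htonl() before it is written and with ntohl() after it is read, so the
database reads the same on little- and big-endian hosts.

```c
#include <arpa/inet.h>	/* htonl(), ntohl() */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write a host-order record number into buf in
 * network (big-endian) byte order.  The 32-bit width is an assumption.
 */
void
rec_store(unsigned char buf[4], uint32_t rec)
{
	uint32_t	 be;

	be = htonl(rec);
	memcpy(buf, &be, sizeof(be));
}

/*
 * Hypothetical helper: read a record number back into host order.
 * memcpy() avoids unaligned access into the database buffer.
 */
uint32_t
rec_load(const unsigned char buf[4])
{
	uint32_t	 be;

	memcpy(&be, buf, sizeof(be));
	return ntohl(be);
}
```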
Fix mandocdb(8) to pass over the type when pruning the database. This
fixed `-d' perpetually adding the same files. While here, clean up the
code and document it. Remove -vv (complain if you want it back in).
Document the error messages in a DIAGNOSTICS section of mandocdb(8).
Add whatis(1) to www and start version information.
While here, change "mdoc macro compiler" to "UNIX manpage compiler", which
is more correct. I'm not sold on this language; I may end up just going
with mandoc(1)'s notation.
Note also that our archives are now hosted at gmane.
Note that mandocdb(8) record type is 64-bit and show all possible values.
Also slightly clarify the role of mdoc/man/cat. Finally, remove mandoc(1)
reference (it's not mentioned in the manual).
Snip some whitespace from apropos(1) and remove mandoc(1) ref from
whatis(1) (neither apropos nor whatis is related to mandoc from an
operator's perspective).
Make `-i' only apply to regular expressions. For the equality operator
(and thus the default), always use strcasestr(). Discussed on tech@
with schwarze@. While here, fix the apropos.c usage() message to be
consistent with apropos(1) and clean up the EXAMPLES in apropos(1).
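A rough sketch of the matching rule, not the code in apropos_db.c:
match_key() and ci_strstr() are hypothetical names. Since strcasestr()
is a non-standard extension, ci_strstr() stands in for it here so the
sketch builds anywhere. The case-insensitivity flag only reaches
regcomp(); the substring branch ignores it and always folds case.

```c
#include <sys/types.h>
#include <ctype.h>
#include <regex.h>
#include <string.h>

/* Portable stand-in for strcasestr(): case-insensitive substring search. */
const char *
ci_strstr(const char *hay, const char *needle)
{
	size_t	 i, n;

	n = strlen(needle);
	if (n == 0)
		return hay;
	for (; *hay != '\0'; hay++) {
		for (i = 0; i < n; i++)
			if (tolower((unsigned char)hay[i]) !=
			    tolower((unsigned char)needle[i]))
				break;
		if (i == n)
			return hay;
	}
	return NULL;
}

/*
 * Hypothetical matcher.  Returns 1 on match, 0 on no match and -1 if
 * the regular expression does not compile.  iflag is consulted for
 * regular expressions only.
 */
int
match_key(const char *key, const char *pat, int is_regex, int iflag)
{
	regex_t	 re;
	int	 rc;

	if (!is_regex)
		return ci_strstr(key, pat) != NULL;

	if (regcomp(&re, pat,
	    REG_EXTENDED | REG_NOSUB | (iflag ? REG_ICASE : 0)) != 0)
		return -1;
	rc = regexec(&re, key, 0, NULL, 0) == 0;
	regfree(&re);
	return rc;
}
```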
Ingo Schwarze [Mon, 28 Nov 2011 09:44:05 +0000 (09:44 +0000)]
Tweak whatis(1):
* Bugfix: Use all arguments, not just the last one.
* Use 'Nm~' instead of 'Nm,Nd~' to match OpenBSD behaviour.
* For the progname, accept '^whatis', not '^whatis$' to ease testing.
ok kristaps@
Ingo Schwarze [Mon, 28 Nov 2011 01:37:34 +0000 (01:37 +0000)]
Discuss the default behaviour up front before talking about options
modifying it; based on a remark by kristaps@.
While here, mention parsing of unformatted files
and the changed index format and fix a few minor issues.
Ingo Schwarze [Sun, 27 Nov 2011 23:27:31 +0000 (23:27 +0000)]
Reimplement the global command line options -a and -v
as static global variables, reducing the maze of arguments
passed around among various static functions.
Suggested by kristaps@.
Ingo Schwarze [Sun, 27 Nov 2011 23:11:37 +0000 (23:11 +0000)]
Save the manual type (mdoc, man, or cat) in the index file
of the mandoc databases, as suggested by kristaps@.
Given the well-structured code, this is surprisingly simple.
This changes the mandoc.index database format.
Run "sudo mandocdb" to regenerate your databases.
Ingo Schwarze [Sun, 27 Nov 2011 22:57:53 +0000 (22:57 +0000)]
Rudimentary handling of formatted manuals ("cat pages").
Coded on the train back from p2k11 in Budapest.
Kristaps has seen the patch and agreed with the direction.
Ingo Schwarze [Sat, 26 Nov 2011 22:38:11 +0000 (22:38 +0000)]
Sync to OpenBSD, mostly gratuitous and whitespace differences,
but a few serious things as well:
* -M overrides MANPATH
* -m prepends to the path
* put back database close calls that got lost in mandocdb
* missing sys/types.h in manpath.c, needed for size_t
ok kristaps@
Ingo Schwarze [Sat, 26 Nov 2011 11:23:56 +0000 (11:23 +0000)]
Store page titles in the correct case, and by default, only
put stuff into the database that man(1) will be able to retrieve.
However, support an option to use all directories and files.
feedback and ok kristaps@
I say that mandocdb(8) uses "man(1)'s method", but it doesn't. It just uses
the configuration file and ignores MANPATH. Everybody else uses MANPATH
(being apropos and man), so why shouldn't we?
Make a small manual for how to run man.cgi.
This exists almost entirely to document that /tmp must exist in a jailed
Apache directory for dbopen() not to fail. This was a massive headache
to track down.
(1) Insecure. This means that we're operating over the full file-system
with access to mandoc(1). In this mode, mandocdb entries are formatted
on-the-fly. The $INSECURE environment variable must be passed to
man.cgi for this mode to work.
(2) Secure. Manuals are assumed to be pre-formatted in a cache directory,
which may be set with $CACHE_DIR but defaults to /cache/man.cgi.
This mode works with manup(8), which updates the cached pages from
outside of the jail. man.cgi simply locates the manual file and
outputs it to stdout.
Export the manpath_manconf() function, slightly reordering manpath.c while
doing so. This will be used by a jailed man.cgi, as the cache built by
manup(8) creates a man.conf for it to use.
Add manup(8). This runs through mandocdb(8) databases (in the same way that
apropos(1) does so) and updates an HTML fragment cache for use by man.cgi.
Right now man.cgi is "online" in that it requires mandoc(1) in its path,
but this doesn't work for, say, OpenBSD's apache chroot(1). This allows
a cache to be maintained.
man.cgi works for the non-jailed case.
In other words, if you smash this into a cgi-bin directory, it will Just
Work for your system's manuals (it of course needs access to mandoc(1) and
your file-system, hence "non-jailed").
The notion of a jailed case is much more subtle and being worked on now.
Let apropos_db.h export the volume of manpages for a parsed record.
This is necessary since an array of records can have duplicate record
numbers in different mandoc.index files.
The volume [right now] is just the index of the parsed mandoc.index in
the manpaths. This is sensible because the order of the manpath is
significant (it's the order of duplicate-named manuals displayed by
man(1)) and is thus not likely to change.
Have mandocdb(8) take advantage of manpath.h.
This brings it in line with makewhatis(8), which, like apropos(1), will use
man.conf (or manpath(1)) if no manpath entries are provided.
Support for Open/NetBSD's /etc/man.conf and others' manpath(1).
Most of this code (except the manpath part) written by schwarze@.
This isn't hooked into anything yet.
Update historical record to be historical and not made-up. Data from
<manpages.bsd.lv/history.html>. Ok schwarze@ (with modifications) and
Jason McIntyre.
Clarify some behaviour, bringing schwarze@'s patch and mine closer together
(although I still don't have -M, which is a big piece).
First, the default search path is the cwd. This will change to use -M
once I look over that code.
If MANPATH is specified, this replaces the cwd.
Both of these are augmented by -m.
If paths don't exist or don't have databases, they're silently ignored.
This makes perfect sense: you may be given a superset of possible paths.
The corner case of no paths (where, say, MANPATH consists of bogus paths
or the cwd is unreadable) simply means that no paths are searched.
Integrate a moderately-patched version of schwarze@'s support for multiple
directories containing mandocdb(8) databases. Some changes follow:
(1) don't support -M yet;
(2) fall back to cwd if no prior manpath has been specified;
(3) resolve manpages using realpath() to prevent consecutive chdir()'s
over relative paths;
(4) note where further error-reporting is required;
(5) fix leaking memory on exit in several cases.
Merge schwarze@'s work for 64-bit types. This is based on a tweaked patch
submitted to tech@ on 16/11/2011, 01:39. It has been updated to account
for the logical-operator functions and to avoid keeping a live pointer into
the DBT value, which is not guaranteed to be consistent across calls into
the bdb library.
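A sketch of the copy-out idea, under assumptions: struct dbt_like is a
stand-in for the library's DBT so that this builds without <db.h>,
mask_copy() is a hypothetical name, and the value is assumed to begin
with the 64-bit type mask. The point is that the caller ends up with its
own copy and never holds a pointer into library-owned memory across a
later database call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the library's DBT: a data pointer and its size. */
struct dbt_like {
	void	*data;
	size_t	 size;
};

/*
 * Hypothetical helper: copy the leading 64-bit type mask out of a
 * value.  Returns 1 on success, 0 if the value is too short to hold a
 * mask.  After this returns, *mask no longer depends on val->data.
 */
int
mask_copy(const struct dbt_like *val, uint64_t *mask)
{

	if (val->size < sizeof(*mask))
		return 0;
	memcpy(mask, val->data, sizeof(*mask));
	return 1;
}
```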
Ingo Schwarze [Sat, 19 Nov 2011 13:29:47 +0000 (13:29 +0000)]
Improve misleading comment:
* Not sure there were any text nodes, might have been other stuff instead.
* Not sure it was just one node, maybe several were deleted.
* No problem if some nodes were deleted, as long as some valid ones are left.
* We do not leave early, but after cleaning out all the crap.
* We are not "bailing", but we consider the block valid after cleanup.
Ingo Schwarze [Mon, 14 Nov 2011 15:10:27 +0000 (15:10 +0000)]
Add lots of information about special characters that's actually needed
in practice, and discourage using fancy characters in manuals.
Text about "Dashes and Hyphens" by jmc@.
Feedback and ok jmc@, grudgingly ok kristaps@.
Have exprcomp() accept a string instead of an array-pointer. Also, collapse
the arguments in apropos(1) into a single string passed to exprcomp(). Ok
schwarze@.
Ingo Schwarze [Sun, 13 Nov 2011 13:15:14 +0000 (13:15 +0000)]
Make the default left text margin configurable from the command line,
just like the default right margin already is. This may be useful for
people with expensive screen real estate. Besides, it helps automated
man(7) to mdoc(7) output comparisons to validate -Tman output.
ok kristaps@ on an earlier version
Ingo Schwarze [Sun, 13 Nov 2011 10:49:57 +0000 (10:49 +0000)]
Inventing new keywords for mostly the same thing when a well-established
set of keywords already exists is a bad idea, so reuse the mdoc(7)
macro names as apropos(1) search types. This is a gain in brevity
as well. Some time ago, kristaps@ agreed in principle.
The search type bit field constants are used by both mandocdb(8) and
apropos(1) and should better stay in sync, so give them their own
header file.
Ingo Schwarze [Sun, 13 Nov 2011 00:53:13 +0000 (00:53 +0000)]
Fix two crashes that occur when walking very large (i.e. real-world) trees:
1) Avoid excessive, needless recursion, lest you overflow the stack;
2) Close all dir file descriptors, lest you run out of descriptors.
ok kristaps@
Split apropos.c into db.c and apropos.h with simpler code (re-written, but
inspired by apropos.c and mandoc-tools' mandoc-cgi.c). This uses UTF-8
right now for its re-writing, but will soon accommodate the regular
suspects (this is a rather simple matter).
I also introduce man.cgi (cgi.c), which is a standalone CGI that replaces
mandoc-tools' mandoc.cgi. Right now it's just a framework.
Ingo Schwarze [Mon, 7 Nov 2011 01:24:40 +0000 (01:24 +0000)]
When the HEAD scope of .TP is broken by another block macro,
do not abort with a FATAL error, but report a WARNING,
remove the broken .TP from the syntax tree, and prod on.
Reported repeatedly by ports people, at least by brad@ and jeremy@.
Also fixes rendition(4) in Xenocara.
ok kristaps@
Ingo Schwarze [Thu, 3 Nov 2011 20:48:52 +0000 (20:48 +0000)]
When .TH sets no date, leave the date field in the page footer blank,
do not use the current date. This removes a gratuitous output difference
with respect to groff.
ok kristaps@
Ingo Schwarze [Tue, 1 Nov 2011 14:59:27 +0000 (14:59 +0000)]
Clean up the description of .Dt:
- Volume and arch are both optional and not alternatives.
- Zap verbiage about what's obvious from the synopsis.
- For fixed argument strings, use .Cm, not .Ar.
Using lots of input from jmc@.
Also, state that the list of valid architectures varies by OS.
If a downstream distribution wants to provide a specific list,
maintaining a local patch is the way to go.
Ingo Schwarze [Mon, 24 Oct 2011 21:47:59 +0000 (21:47 +0000)]
Implement missing enclosures (Ao Do Qo Qq So Bro Brq)
and enclosure-like in-line macros (Ad Cd Dv Er Ev Li Ms Tn).
The .No macro works without explicit implementation.
Ingo Schwarze [Mon, 24 Oct 2011 21:41:45 +0000 (21:41 +0000)]
Handle infinite recursion the same way as groff:
When string expansion exceeds the recursion limit, drop the whole
input line, instead of leaving just the string unexpanded.
Ingo Schwarze [Mon, 24 Oct 2011 20:30:57 +0000 (20:30 +0000)]
Handle \N numbered character escapes the same way as groff:
If \N is followed by a digit, ignore \N and the digit.
If \N is followed by a non-digit, the next non-digit
ends the character number; the two delimiters need not match.
Kristaps calls that "gross, but not our fault".
For now, i'm fixing \N only. Other escapes taking numeric arguments
may or may not need similar handling, but \N is by far the most
important for practical purposes.
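The two \N rules above can be sketched roughly as follows. This is an
illustration, not the parser in the tree: parse_numbered() is a
hypothetical name, treating an empty number as "ignored" is an
assumption, and the sketch does not guard against overflow on very long
digit runs.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical parser for the text following "\N".
 * Stores the character number in *num, or -1 if the escape is ignored.
 * Returns the number of bytes consumed from p.
 */
size_t
parse_numbered(const char *p, int *num)
{
	size_t	 i;
	int	 n;

	*num = -1;
	if (p[0] == '\0')
		return 0;

	/* "\N5": no delimiter, so drop \N together with the digit. */
	if (isdigit((unsigned char)p[0]))
		return 1;

	/* p[0] is the opening delimiter; collect the digits after it. */
	n = 0;
	for (i = 1; isdigit((unsigned char)p[i]); i++)
		n = n * 10 + (p[i] - '0');
	if (i > 1)
		*num = n;

	/* Any non-digit closes the number; it need not equal p[0]. */
	if (p[i] != '\0')
		i++;
	return i;
}
```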
Ingo Schwarze [Thu, 20 Oct 2011 20:27:21 +0000 (20:27 +0000)]
Implement the missing text production macros (Bsx Bx Dx Fx Nx Ox Ux Bt Ud).
Some macros work without explicit implementation (At Db Os St).
ok kristaps@
Ingo Schwarze [Sun, 16 Oct 2011 12:20:34 +0000 (12:20 +0000)]
Remove a bunch of useless assignments,
and assert that print_bvspace cannot be called on NULL pointers.
No change in behaviour, none of these were bugs,
but the code becomes easier to understand.
Based on a clang report posted by joerg@; ok kristaps@.
Ingo Schwarze [Sun, 9 Oct 2011 22:10:53 +0000 (22:10 +0000)]
Always print <table> column widths in -T[x]html;
if desired, they can be overridden in the CSS file.
Suggested by kristaps@, and i always like to simplify code.
Use a binary tree (for now, unbalanced) for deduping the records in the
results array. This is much faster than the previous method, a linear
search, at a small cost. Note that array offsets are used instead of
storing the res pointer because we may realloc the results vector.
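A minimal sketch of the offset-based tree, assuming the record number is
the dedup key; struct res, struct results and results_add() are
hypothetical here and simpler than the real structures. Children are
stored as offsets into the vector, and the parent is remembered as an
offset too, so nothing dangles when realloc() moves the vector.

```c
#include <stdint.h>
#include <stdlib.h>

struct res {
	uint32_t	 rec;	/* record number: the dedup key */
	int		 lhs;	/* offset of left child, or -1 */
	int		 rhs;	/* offset of right child, or -1 */
};

struct results {
	struct res	*v;	/* results vector; offset 0 is the root */
	size_t		 sz;	/* entries in use */
	size_t		 cap;	/* entries allocated */
};

/* Returns 1 if rec was added, 0 if already present, -1 on failure. */
int
results_add(struct results *r, uint32_t rec)
{
	struct res	*nv;
	size_t		 ncap;
	int		 cur, parent, left;

	/* Walk the unbalanced tree by offset, looking for rec. */
	parent = -1;
	left = 0;
	cur = r->sz > 0 ? 0 : -1;
	while (cur >= 0) {
		if (rec == r->v[cur].rec)
			return 0;
		parent = cur;
		left = rec < r->v[cur].rec;
		cur = left ? r->v[cur].lhs : r->v[cur].rhs;
	}

	/* Grow the vector; this may move it, which offsets survive. */
	if (r->sz == r->cap) {
		ncap = r->cap ? r->cap * 2 : 8;
		if ((nv = realloc(r->v, ncap * sizeof(*nv))) == NULL)
			return -1;
		r->v = nv;
		r->cap = ncap;
	}

	/* Append the new entry and hook it under its parent. */
	r->v[r->sz].rec = rec;
	r->v[r->sz].lhs = r->v[r->sz].rhs = -1;
	if (parent >= 0) {
		if (left)
			r->v[parent].lhs = (int)r->sz;
		else
			r->v[parent].rhs = (int)r->sz;
	}
	r->sz++;
	return 1;
}
```

Start with a zeroed struct results and free() the vector when done. The
extra cost is two ints per entry; lookups stay fast only as long as the
insertion order keeps the tree from degenerating into a list.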
Get ready for version. I'm bumping the minor release to 1.12 because
this involves a major functionality addition (-Tman), a new utility
(apropos), and both apropos and mandocdb being built by default.
Tidy up -Tman output. This has NO functional change: (1) introduced a
state struct instead of using global statics; (2) documented throughout
the file; (3) fixed a situation of reaching past the end of our buffer
for zero-length strings; (4) alpha-ordered the functions. (1) and (3)
ok schwarze@. (2) and (4) are purely style and documentation.
Clean up file a bit: remove errx and err function pointers from the
state struct (directly using fprintf and perror to do this); add some
in-line documentation; move state init and destroy directly into the
main function.
Import apropos from mandoc-tools after inlining all source files
(originally including extern.h, state.c, and sort.c). The apropos
utility interfaces with the databases of mandocdb to provide semantic
searching capabilities. It Works For Me, but will need lots of cleanup
in the coming months.