When routing to a "result" page in the cgi, remember our input parameters
and repeat them in the search bar. This is handy. While here, make the
QUERY_STRING parser a bit simpler.
Make the stored "cat"/"mdoc"/"man" strings just be c/d/a single-character
bytes. This cuts down a little in index size and allows for cleaner
extraction of information.
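
A minimal sketch of the type-to-byte mapping described above. The helper name type_to_code() is hypothetical, and the sketch assumes the letters c/d/a correspond to "cat"/"mdoc"/"man" in the order listed:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: map a manual type name to its one-byte code.
 * Assumes cat -> 'c', mdoc -> 'd', man -> 'a', as listed in the text.
 * Returns '\0' for an unknown type.
 */
char
type_to_code(const char *type)
{
	assert(type != NULL);
	if (strcmp(type, "cat") == 0)
		return 'c';
	if (strcmp(type, "mdoc") == 0)
		return 'd';
	if (strcmp(type, "man") == 0)
		return 'a';
	return '\0';
}
```

Readers of the index can then switch on a single byte instead of comparing strings.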
Make paths in the mandocdb(8) index relative to the databases' path
prefix. This means that an index in, say, /usr/share/man will point to
man1/foo.1 instead of /usr/share/man/man1/foo.1. Not only does this
save a lot of space, it also allows manual trees to be moved around
without any side effects to the mandocdb(8) databases.
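
The prefix stripping can be sketched roughly as follows. path_relative() is a hypothetical name, not the function mandocdb(8) actually uses, and the handling of trailing slashes is an assumption:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the part of "path" that lies below
 * "prefix", or NULL if "path" is not located under "prefix".
 * The result points into "path"; nothing is allocated.
 */
const char *
path_relative(const char *prefix, const char *path)
{
	size_t	 len;

	assert(prefix != NULL && path != NULL);
	len = strlen(prefix);
	/* Ignore trailing slashes on the prefix. */
	while (len > 0 && prefix[len - 1] == '/')
		len--;
	/* The prefix must be followed by a path separator. */
	if (strncmp(path, prefix, len) != 0 || path[len] != '/')
		return NULL;
	/* Skip the separating slash(es). */
	path += len;
	while (*path == '/')
		path++;
	return path;
}
```

With the prefix /usr/share/man, the path /usr/share/man/man1/foo.1 yields man1/foo.1, while a path such as /usr/share/manual/foo.1 is rejected.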
- include search bar above result page (I relent: it's annoying to
follow three links then press back three times to get a search page);
- make man.cgi.css into man-cgi.css so Apache isn't confused by two
handlers (css, cgi);
- finally consolidate example.style.css to be under the div.mandoc css
selector;
- put catman pages under div.catman;
- put search bar under div#mancgi;
- reflect this properly in the bundled CSS files.
Have manpath.c properly use manpath(1), that is, using -C and -m and so on.
This also cleans up the code a little bit. While here, make some functions
static that are only used within manpath.c.
Add compatibility support for fgetln() on Linux. This uses the BSD-licensed
implementation from NetBSD tnftpd, Christos Zoulas (copyright message
retained in the compat_fgetln.c file). Patch verified by schwarze@. He
notes that you'll need -pthread for -static binaries (due to libdb), so
I've noted that -static should really only be used for BSD UNIX.
While here, add some forgotten goop to the Makefile, building and
cleaning extra manpages.
When 303'ing a search directly to a page, remember to specify its manroot.
Also allow for a CSS_DIR to specify alternate CSS locations.
Finally, remove some clutter, as I assume that "css" and "progname" are
already HTML-safe.
Finishing touches on multi-manroot man.cgi. If more than one root is
specified, write them out using a SELECT box. Else write nothing (the
manroot will still be checked if it's specified).
Switch on "manpath=" handling, which I call the "manroot" (as "manpath" is
reserved for paths within a manroot). This functionality is bare-bones:
right now, the default manroot is the first one scanned from the cache
directory. At some point this will be sexy and smooth, but it's easy to
upgrade functionality by modifying pathgen() and so forth. If a manroot
isn't parsed from the "manpath=", results are always empty.
- Deprecate kvals (key/value pairs for QUERY_STRING values). Since there's
only one place that uses this, kval_parse (now http_parse()) dumps directly
into struct query, which is more high-level.
- Put query values directly into struct req.
- The biggest difference is dynamic support for multiple "manroots". A
"manroot" is a path with an "etc/catman.conf" file. When the cgi starts,
it (prefix) recurses through its CACHE_DIR searching for "etc" directories.
When one's found, it sees if a catman.conf file exists. This is marked
as a manroot and appended to a list. The name of a manroot is the path
without slashes (e.g., OpenBSD/4.9 -> "OpenBSD 4.9").
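
The naming rule for a manroot (slashes become spaces) might look like this. manroot_name() is a hypothetical helper and the caller is assumed to free the result:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: derive a display name from a manroot path
 * by replacing each slash with a space, e.g. "OpenBSD/4.9" becomes
 * "OpenBSD 4.9".  Returns NULL on allocation failure; the caller
 * frees the result.
 */
char *
manroot_name(const char *path)
{
	char	*name, *cp;
	size_t	 sz;

	assert(path != NULL);
	sz = strlen(path) + 1;
	if ((name = malloc(sz)) == NULL)
		return NULL;
	memcpy(name, path, sz);
	for (cp = name; *cp != '\0'; cp++)
		if (*cp == '/')
			*cp = ' ';
	return name;
}
```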
Right now "manroot" isn't enabled. The first manroot is chosen as the
real one. I'll add the interface to it in the next checkins, but it'll be
quite simple.
Ingo Schwarze [Sat, 10 Dec 2011 16:53:39 +0000 (16:53 +0000)]
Fix selection of arch-specific manuals:
(1) Correctly compare cat vs. man paths.
(2) Compare arch (and section) names case-insensitively.
Problem noticed by kristaps@.
When specifying an architecture to whatis(1)/apropos(1)/man.cgi(7), do a
comparison only if the manual specifies an architecture, otherwise let it
through. Looked over by schwarze@. This brings us much more in line with
OpenBSD's behaviour.
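
A sketch of the architecture filter as described: a manual without an architecture always passes, otherwise the names are compared without regard to case. arch_match() and its argument convention are assumptions, not the actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical predicate: "want" is the architecture the user asked
 * for; "have" is the manual's architecture, NULL or "" if it has none.
 * Returns 1 if the manual should be shown, 0 otherwise.
 */
int
arch_match(const char *want, const char *have)
{
	assert(want != NULL);
	/* Manuals without an architecture are let through. */
	if (have == NULL || *have == '\0')
		return 1;
	/* Otherwise compare case-insensitively. */
	return strcasecmp(want, have) == 0;
}
```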
Ingo Schwarze [Fri, 9 Dec 2011 11:16:34 +0000 (11:16 +0000)]
Tweak pformatted():
* If the first section is empty, use the file name as .Nd.
* No need to check (len > 0) after successful fgetln(3).
* Improve some comments and strip trailing whitespace.
ok kristaps@
Clean up grok of preformatted manual description.
(1) put fclose() at the end, as line isn't valid afterward (see fgetln())
(2) clean up loops to be more readable to my old eyes
(3) mandate trailing newline, nul-terminate, and use strrchr
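
Points (1) and (3) follow from fgetln(3) returning a buffer that is not NUL-terminated and is no longer valid once the stream is closed. A sketch of the resulting handling, operating on a (line, len) pair such as fgetln(3) returns; line_dup() is a hypothetical helper, and copying the line is this sketch's choice rather than necessarily what the code does:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a (line, len) pair into a freshly
 * allocated, NUL-terminated string without the trailing newline.
 * Returns NULL if the line does not end in a newline or if the
 * allocation fails.  The caller frees the result.
 */
char *
line_dup(const char *line, size_t len)
{
	char	*buf;

	assert(line != NULL);
	/* Mandate the trailing newline. */
	if (len == 0 || line[len - 1] != '\n')
		return NULL;
	/* The newline's slot is reused for the terminating NUL. */
	if ((buf = malloc(len)) == NULL)
		return NULL;
	memcpy(buf, line, len - 1);
	buf[len - 1] = '\0';
	return buf;
}
```

Because the copy is independent of the stream's buffer, it stays usable after fclose().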
If arguments are passed to mandocdb(8) in "default" mode, then use
realpath() to convert them into absolute paths before putting the
traversed subdirectory filenames into the index.
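
Roughly, the conversion amounts to the following. paths_resolve() is a hypothetical name, and passing NULL as the second argument of realpath() (POSIX.1-2008 behaviour) is an assumption about the platform:

```c
#define _XOPEN_SOURCE 700	/* for realpath(3) on glibc */
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resolve each of the "argc" paths in "argv"
 * to an absolute path stored in "out".  Entries that cannot be
 * resolved become NULL.  Returns the number of resolved paths;
 * the caller frees each non-NULL entry of "out".
 */
int
paths_resolve(int argc, char *argv[], char *out[])
{
	int	 i, ok;

	assert(argv != NULL && out != NULL);
	for (ok = 0, i = 0; i < argc; i++) {
		out[i] = realpath(argv[i], NULL);
		if (out[i] != NULL)
			ok++;
	}
	return ok;
}
```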
First, remove the catman(8) jobstart() stuff. It only copies files.
Second, when creating the destination filename, append the index's file
(which is an absolute path) to the cache directory, not to the index's
directory name.
Apropos and man.cgi should strcasecmp their output sorting.
man.cgi should sort in the first place -- it wasn't before.
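
A sketch of case-insensitive sorting with qsort(3) and strcasecmp(3). It sorts a plain array of title strings; the real code presumably sorts result records, so title_cmp() and titles_sort() are illustrative names only:

```c
#include <assert.h>
#include <stdlib.h>
#include <strings.h>

/* qsort(3) comparator over an array of C strings, ignoring case. */
int
title_cmp(const void *p1, const void *p2)
{
	const char *s1 = *(const char *const *)p1;
	const char *s2 = *(const char *const *)p2;

	return strcasecmp(s1, s2);
}

/* Sort "n" titles in place, ignoring case. */
void
titles_sort(const char **titles, size_t n)
{
	assert(titles != NULL || n == 0);
	if (n > 0)
		qsort(titles, n, sizeof(*titles), title_cmp);
}
```

With a plain strcmp() comparator, all upper-case titles would sort before all lower-case ones.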
Revert uppercasing of man.cgi title.
Add skeleton man.cgi.css file. I don't think this should become more
complicated than this. Also make the title be printed out in caps as it
is in apropos(1) and whatis(1).
Accept old-school man.cgi parameters like "sektion" and "query". This still
needs work because specifying an arch with "arch=i386" will return results
that don't have an arch specified. I think this is weird, but it will need
to be supported if we want backwards compatibility.
Have a whatis/apropos mode, with the default (hitting enter within the
expression text) be whatis. This is a much nicer default than apropos,
which can be scary. While here, fix the cat.css location (erroneously
put in the response page instead of the catman page) and add bits for
a default style-sheet.
Ingo Schwarze [Wed, 7 Dec 2011 01:57:20 +0000 (01:57 +0000)]
Implement search support for 24 additional macros, extract more information
from Fn, and lift section restrictions from An Cd Er Ev Fn Fo In Pa St Va Vt
by removing 4 handler functions and 50 lines of code.
ok kristaps@
Add cat2html functionality. This keeps track of italic/bold mode per line
and properly handles some funny troff-isms we've exposed. I originally
wanted to use man2html.c (found on W3's website with no known author)
but the code is dodgy. This will need some more work (links, etc.) but
does a decent job thus far.
Note: I think it's better style NOT to use <pre>, and instead have each
line employ <BR> afterward. This allows browsers to break the lines if
necessary. This can be changed trivially (replacing the newline and pre
tags with the <BR> and new tag).
Ingo Schwarze [Sun, 4 Dec 2011 23:10:52 +0000 (23:10 +0000)]
Implement mdoc(7)-like output style variant for man(7) documents:
* one instead of three blank lines after the page header;
* one instead of three blank lines before the page footer;
* source instead of title(section) in the lower right corner.
Select this style variant with the undocumented command line option -Omdoc.
In the long run, we hope to unify the output of both languages and
to pull this out again, but that requires coordination with groff.
Grudgingly ok and, (as usual,-) more comments requested by kristaps@
Make catman and man.cgi understand the index type-field.
Also make catman's man.conf be generated as catman.conf to avoid clobbering
a real man.conf file.
Finally, add a placeholder catman() function to man.cgi for preformatted
manuals in the cache.
Ingo Schwarze [Sun, 4 Dec 2011 00:44:12 +0000 (00:44 +0000)]
Jumping out of man_unscope() for the root node is a bad idea
because that will skip root node validation, potentially entering
rendering modules with NULL pointers lurking in the meta data.
Instead, always validate the root node and (as suggested by joerg@)
assert validity of the meta data before using it in the renderers.
ok joerg@
Ingo Schwarze [Sat, 3 Dec 2011 23:59:14 +0000 (23:59 +0000)]
Remove an OpenBSD-specific tweak regarding .Xr spacing and make it
compatible with groff-1.21. This tweak was originally added for
compatibility with groff-1.15, which is no longer needed.
Ingo Schwarze [Sat, 3 Dec 2011 16:58:54 +0000 (16:58 +0000)]
When processing .Sh HEAD, as soon as we know which section this is,
fix up the section attributes of the HEAD, its parent BLOCK, and
all its (text) children. This is required because the section
attributes get set when each node is allocated, i.e. before processing
the content of the node itself. Thus, the listed nodes got the section
attribute of the preceding section. No need to fix up the BODY, all
is fine there already.
Found while implementing TYPE_Sh for mandocdb(8).
OK and comment requested by kristaps@.
Ingo Schwarze [Sat, 3 Dec 2011 16:08:51 +0000 (16:08 +0000)]
ISO style "%Y-%m-%d" dates are common in man(7) .TH.
They have been considered valid in the past, but were reformatted
to the mdoc(7) "Month day, year" style.
To make page footers more similar to groff, no longer reformat them,
just print them as they are.
This doesn't change anything with respect to what's considered valid
or what is warned about.
Ingo Schwarze [Fri, 2 Dec 2011 01:37:14 +0000 (01:37 +0000)]
In man(7), when no explicit volume name is given, use the default
volume name for the respective manual section, just like in mdoc(7).
This gives us nicer page headers for cvs(1), lynx(1), tic(1),
mkhybrid(8), and many curses(3) manuals.
ok kristaps@
To not break compatibility, I wrote a corresponding patch for GNU troff
which Werner Lemberg accepted upstream at rev. 1.65 of:
http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff
This is a little gross: Linux and Apple need some cajoling to work
with byte-swapping. Tested on Mac. Any Linux machines somebody can
test on? Anybody?
While here, note the correct byte-size in mandocdb(8) and also note
field widths and endianness. The btree is now endian-neutral.
In apropos_db.c, move all btree reading (and safety checks) into the
btree_read() function. Also, add a forgotten free() for the type of
grokked record.
Then in both mandocdb.c and apropos_db.c, make the "rec" field of the
btree be in network order.
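
Storing a field in network order can be sketched as below. The 32-bit width and the names rec_pack() and rec_unpack() are assumptions for illustration; see mandocdb(8) for the actual field widths:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store a record number into a 4-byte field
 * in network (big-endian) byte order.
 */
void
rec_pack(unsigned char *buf, uint32_t rec)
{
	uint32_t	 be;

	assert(buf != NULL);
	be = htonl(rec);
	memcpy(buf, &be, sizeof(be));
}

/* Read the record number back in host byte order. */
uint32_t
rec_unpack(const unsigned char *buf)
{
	uint32_t	 be;

	assert(buf != NULL);
	memcpy(&be, buf, sizeof(be));
	return ntohl(be);
}
```

A database written this way reads back identically on little-endian and big-endian hosts.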
Fix mandocdb(8) to pass over the type when pruning the database. This
fixed `-d' perpetually adding the same files. While here, clean up the
code and document it. Remove -vv (complain if you want it back in).
Document the error messages in a DIAGNOSTICS section of mandocdb(8).
Add whatis(1) to www and start version information.
While here, change "mdoc macro compiler" to "UNIX manpage compiler", which
is more correct. I'm not sold on this language; I may end up just going
with mandoc(1)'s notation.
Note also that our archives are now hosted at gmane.
Note that mandocdb(8) record type is 64-bit and show all possible values.
Also slightly clarify the role of mdoc/man/cat. Finally, remove mandoc(1)
reference (it's not mentioned in the manual).
Snip some whitespace from apropos(1) and remove mandoc(1) ref from
whatis(1) (both apropos/whatis aren't related to mandoc from an
operator's perspective).
Make `-i' only apply to regular expressions. For the equality operator
(and thus the default), always use strcasestr(). Discussed on tech@
with schwarze@. While here, fix the apropos.c usage() message to be
consistent with apropos(1) and clean up the EXAMPLES in apropos(1).
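
A sketch of the regular-expression side only: the flag decides whether REG_ICASE is passed to regcomp(3). The equality path, which always uses strcasestr(), is not shown. re_match() is a hypothetical name and the use of REG_EXTENDED is an assumption:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical matcher: return 1 if "text" matches the regular
 * expression "re", 0 if it does not, and -1 if "re" fails to
 * compile.  "icase" models the -i flag.
 */
int
re_match(const char *re, const char *text, int icase)
{
	regex_t	 preg;
	int	 flags, rc;

	assert(re != NULL && text != NULL);
	flags = REG_EXTENDED | REG_NOSUB | (icase ? REG_ICASE : 0);
	if (regcomp(&preg, re, flags) != 0)
		return -1;
	rc = regexec(&preg, text, 0, NULL, 0);
	regfree(&preg);
	return rc == 0;
}
```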
Ingo Schwarze [Mon, 28 Nov 2011 09:44:05 +0000 (09:44 +0000)]
Tweak whatis(1):
* Bugfix: Use all arguments, not just the last one.
* Use 'Nm~' instead of 'Nm,Nd~' to match OpenBSD behaviour.
* For the progname, accept '^whatis', not '^whatis$' to ease testing.
ok kristaps@
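
The progname check in the last item amounts to a prefix test rather than an exact match. A sketch, assuming the program name has already been reduced to its basename; is_whatis() is a hypothetical name:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical check: accept any program name that starts with
 * "whatis" ('^whatis'), not only the exact name ('^whatis$').
 */
int
is_whatis(const char *progname)
{
	assert(progname != NULL);
	return strncmp(progname, "whatis", strlen("whatis")) == 0;
}
```

This lets a test binary with a suffixed name, say whatis.test, still run in whatis mode.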
Ingo Schwarze [Mon, 28 Nov 2011 01:37:34 +0000 (01:37 +0000)]
Discuss the default behaviour up front before talking about options
modifying it; based on a remark by kristaps@.
While here, mention parsing of unformatted files
and the changed index format and fix a few minor issues.
Ingo Schwarze [Sun, 27 Nov 2011 23:27:31 +0000 (23:27 +0000)]
Reimplement the global command line options -a and -v
as static global variables, reducing the maze of arguments
passed around among various static functions.
Suggested by kristaps@.
Ingo Schwarze [Sun, 27 Nov 2011 23:11:37 +0000 (23:11 +0000)]
Save the manual type (mdoc, man, or cat) in the index file
of the mandoc databases, as suggested by kristaps@.
Given the well-structured code, this is surprisingly simple.
This changes the mandoc.index database format.
Run "sudo mandocdb" to regenerate your databases.
Ingo Schwarze [Sun, 27 Nov 2011 22:57:53 +0000 (22:57 +0000)]
Rudimentary handling of formatted manuals ("cat pages").
Coded on the train back from p2k11 in Budapest.
Kristaps has seen the patch and agreed with the direction.
Ingo Schwarze [Sat, 26 Nov 2011 22:38:11 +0000 (22:38 +0000)]
Sync to OpenBSD, mostly gratuitous and whitespace differences,
but a few serious things as well:
* -M overrides MANPATH
* -m prepends to the path
* put back database close calls that got lost in mandocdb
* missing sys/types.h in manpath.c, needed for size_t
ok kristaps@
Ingo Schwarze [Sat, 26 Nov 2011 11:23:56 +0000 (11:23 +0000)]
Store page titles in the correct case, and by default, only
put stuff into the database that man(1) will be able to retrieve.
However, support an option to use all directories and files.
feedback and ok kristaps@
I say that mandocdb(8) uses "man(1)'s method", but it doesn't. It just uses
the configuration file and ignores MANPATH. Everybody else uses MANPATH
(being apropos and man), so why shouldn't we?
Make a small manual for how to run man.cgi.
This exists almost entirely to document that /tmp must exist in a jailed
Apache directory for dbopen() not to fail. This was a massive headache
to track down.
(1) Insecure. This means that we're operating over the full file-system
with access to mandoc(1). In this mode, mandocdb entries are formatted
on-the-fly. The $INSECURE environment variable must be passed to
man.cgi for this mode to work.
(2) Secure. Manuals are assumed to be pre-formatted in a cache directory,
which may be set with $CACHE_DIR but defaults to /cache/man.cgi.
This mode works with manup(8), which updates the cached pages from
outside of the jail. man.cgi simply locates the manual file and
outputs it to stdout.
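
Selecting between the two modes could look roughly like this. mode_insecure() is a hypothetical name, and treating the mere presence of $INSECURE as enabling insecure mode is an assumption:

```c
#include <assert.h>
#include <stdlib.h>

#define	DEFAULT_CACHE_DIR "/cache/man.cgi"

/*
 * Hypothetical helper: return 1 if insecure mode was requested via
 * $INSECURE; otherwise return 0 and point *cache at the cache
 * directory, taken from $CACHE_DIR or the default.
 */
int
mode_insecure(const char **cache)
{
	const char	*cp;

	assert(cache != NULL);
	if (getenv("INSECURE") != NULL) {
		*cache = NULL;
		return 1;
	}
	cp = getenv("CACHE_DIR");
	*cache = (cp != NULL && *cp != '\0') ? cp : DEFAULT_CACHE_DIR;
	return 0;
}
```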
Export the manpath_manconf() function, slightly reordering manpath.c while
doing so. This will be used by a jailed man.cgi, as the cache built by
manup(8) creates a man.conf for it to use.
Add manup(8). This runs through mandocdb(8) databases (in the same way that
apropos(1) does so) and updates an HTML fragment cache for use by man.cgi.
Right now man.cgi is "online" in that it requires mandoc(1) in its path,
but this doesn't work for, say, OpenBSD's apache chroot(1). This allows
a cache to be maintained.
man.cgi works for the non-jailed case.
In other words, if you smash this into a cgi-bin directory, it will Just
Work for your system's manuals (it of course needs access to mandoc(1) and
your file-system, hence "non-jailed").
The notion of a jailed case is much more subtle and being worked on now.
Let apropos_db.h export the volume of manpages for a parsed record.
This is necessary since an array of records can have duplicate record
numbers in different mandoc.index files.
The volume [right now] is just the index of the parsed mandoc.index in
the manpaths. This is sensible because the order of the manpath is
significant (it's the order of duplicate-named manuals displayed by
man(1)) and is thus not likely to change.
Have mandocdb(8) take advantage of manpath.h.
This brings it in line with makewhatis(8), which, like apropos(1), will use
man.conf (or manpath(1)) if no manpath entries are provided.