1 ************************************************************************
2 * Official mandoc TODO.
3 * $Id: TODO,v 1.194 2014/12/10 21:54:13 schwarze Exp $
4 ************************************************************************
6 Many issues are annotated for difficulty as follows:
8 - loc = locality of the issue
9 * single file issue, affects one file only, or very few
10 ** single module issue, affects several files of one module
11 *** cross-module issue, significantly impacts multiple modules
12 and may require substantial changes to internal interfaces
13 - exist = difficulty of the existing code in this area
14 * affected code is straightforward and easy to read and change
15 ** affected code is somewhat complex, but once you understand
16 the design, not particularly difficult to understand
17 *** affected code uses a special, exceptionally tricky design
18 - algo = difficulty of the new algorithm to be written
19 * the required logic and code is straightforward
20 ** the required logic is somewhat complex and needs a careful design
21 *** the required logic is exceptionally tricky,
22 maybe an approach to solve that is not even known yet
23 - size = the amount of code to be written or changed
24 * a small number of lines (at most 100, usually much less)
25 ** a considerable amount of code (several dozen to a few hundred)
26 *** a large amount of code (many hundreds, maybe thousands)
27 - imp = importance of the issue
28 * mostly for completeness
29 ** would be nice to have
30 *** issue causes considerable inconvenience
32 Obviously, as the issues have not been solved yet, these annotations
33 are mere guesses, and some may be wrong.
35 ************************************************************************
36 * crashes
37 ************************************************************************
39 - The abort() in bufcat(), html.c, can be triggered via buffmt_includes()
40 by running -Thtml -Oincludes on a file containing a long .In argument.
41 Fixing this will probably require reworking the whole bufcat() concept.
42 loc ** exist * algo * size ** imp **
44 ************************************************************************
45 * missing features
46 ************************************************************************
48 --- missing roff features ----------------------------------------------
50 - .ad (adjust margins)
51 .ad l -- adjust left margin only (flush left)
52 .ad r -- adjust right margin only (flush right)
53 .ad c -- center text on line
54 .ad b -- adjust both margins (alias: .ad n)
55 .na -- temporarily disable adjustment without changing the mode
56 .ad -- re-enable adjustment without changing the mode
57 Adjustment mode is ignored while in no-fill mode (.nf).
58 loc *** exist *** algo ** size ** imp ** (parser reorg would help)
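A rough sketch of the four .ad modes listed above, for a single output
line only. This is Python, not mandoc code; the function name adjust()
and its signature are made up for illustration. The way extra blanks
are spread in mode b is an assumption, and real roff implementations
may distribute them differently.

```python
def adjust(words, width, mode="b"):
    """Lay out one output line of `words` within `width` columns.

    mode: 'l' flush left, 'r' flush right, 'c' centered,
          'b' or 'n' both margins (extra blanks spread between words).
    """
    line = " ".join(words)
    slack = width - len(line)
    if slack <= 0 or mode == "l":
        return line
    if mode == "r":
        return " " * slack + line
    if mode == "c":
        return " " * (slack // 2) + line
    if mode in ("b", "n"):
        gaps = len(words) - 1
        if gaps == 0:
            return line
        extra, rem = divmod(slack, gaps)
        out = []
        for i, word in enumerate(words[:-1]):
            # The leftmost `rem` gaps receive one additional blank.
            out.append(word + " " * (1 + extra + (1 if i < rem else 0)))
        out.append(words[-1])
        return "".join(out)
    raise ValueError("unknown adjustment mode: " + mode)
```

The sketch ignores .na, no-fill mode, and the usual rule that the last
line of a paragraph is not stretched.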
61 found by naddy@ in xloadimage(1)
62 loc ** exist *** algo * size * imp *
64 - .nr third argument (auto-increment step size, requires \n+)
65 found by bentley@ in sbcl(1) Mon, 9 Dec 2013 18:36:57 -0700
66 loc * exist * algo * size * imp **
68 - .ns (no-space mode) occurs in xine-config(1)
69 reported by brad@ Sat, 15 Jan 2011 15:45:23 -0500
70 loc *** exist *** algo *** size ** imp *
72 - .ta (tab settings) occurs in ircbug(1) and probably gnats(1)
73 reported by brad@ Sat, 15 Jan 2011 15:50:51 -0500
74 also Tcl_NewStringObj(3) via wiz@ Wed, 5 Mar 2014 22:27:43 +0100
75 also posix2time(3) Carsten Kunze Mon, 1 Dec 2014 13:03:10 +0100
76 loc ** exist *** algo ** size ** imp ***
78 - .ti (temporary indent)
79 found by naddy@ in xloadimage(1)
80 found by bentley@ in nmh(1) Mon, 23 Apr 2012 13:38:28 -0600
81 loc ** exist ** algo ** size * imp ** (parser reorg helps a lot)
84 found by jca@ in ratpoison(1) Sun, 30 Jun 2013 12:01:09 +0200
85 loc * exist ** algo ** size ** imp **
88 found in cclive(1) and nasm(1) asciidoc/DocBook output
89 bentley@ on discuss@ Sat, 21 Sep 2013 22:29:34 -0600
90 naddy@ Thu, 4 Dec 2014 16:26:41 +0100
91 loc ** exist ** algo ** size * imp ** (parser reorg helps a lot)
93 - \n+ and \n- numerical register increment and decrement
94 found by bentley@ in sbcl(1) Mon, 9 Dec 2013 18:36:57 -0700
95 loc * exist * algo * size * imp **
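Together with the .nr third argument item above, the intended semantics
can be sketched roughly as follows. This is Python for illustration
only; the class Registers and its method names are hypothetical.
Keeping the old step when .nr is given without a third argument is an
assumption and should be checked against groff.

```python
class Registers:
    """Toy model of roff number registers with auto-increment."""

    def __init__(self):
        self.value = {}
        self.step = {}

    def nr(self, name, value, step=None):
        # Models ".nr name value [step]".
        self.value[name] = value
        if step is not None:
            self.step[name] = step

    def read(self, name, sign=""):
        # sign "" models \n(name, "+" models \n+(name, "-" models \n-(name.
        # The register is changed first, then the new value is returned.
        step = self.step.get(name, 0)
        if sign == "+":
            self.value[name] = self.value.get(name, 0) + step
        elif sign == "-":
            self.value[name] = self.value.get(name, 0) - step
        return self.value.get(name, 0)
```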
97 - \w'' improve width measurements
98 would not be very useful without an expression parser, see below
99 needed for Tcl_NewStringObj(3) via wiz@ Wed, 5 Mar 2014 22:27:43 +0100
100 loc ** exist *** algo *** size * imp ***
102 - using undefined strings or macros defines them to be empty
103 wl@ Mon, 14 Nov 2011 14:37:01 +0000
104 loc * exist * algo * size * imp *
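The behaviour described above amounts to a lookup with a side effect.
A minimal sketch in Python, where use_string() is a hypothetical helper
name and the string table is modelled as a plain dict:

```python
def use_string(strings, name):
    # setdefault() stores "" under `name` if it is missing, so the
    # mere use of an undefined string leaves it defined as empty.
    return strings.setdefault(name, "")
```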
106 --- missing mdoc features ----------------------------------------------
108 - fix bad block nesting involving multiple identical explicit blocks
109 see the OpenBSD mdoc_macro.c 1.47 commit message
110 loc * exist *** algo *** size * imp **
112 - .Bl -column .Xo support is missing
114 restore .Xr and .Dv to
115 lib/libc/compat-43/sigvec.3
116 lib/libc/gen/signal.3
117 lib/libc/sys/sigaction.2
118 loc * exist *** algo *** size * imp **
120 - edge case: decide how to deal with blk_full bad nesting, e.g.
121 .Sh .Nm .Bk .Nm .Ek .Sh found by jmc@ in ssh-keygen(1)
122 from jmc@ Wed, 14 Jul 2010 18:10:32 +0100
123 loc * exist *** algo *** size ** imp **
125 - .Bd -centered implies -filled, not -unfilled, which is not
126 easy to implement; it requires code similar to .ce, which
127 we don't have either.
128 Besides, groff has a bug causing text right *before* .Bd -centered
129 to be centered as well.
130 loc *** exist *** algo ** size ** imp ** (parser reorg would help)
132 - .Bd -filled should not be the same as .Bd -ragged, but align both
133 the left and right margin. In groff, it is implemented in terms
134 of .ad b, which we don't have either. Found in cksum(1).
135 loc *** exist *** algo ** size ** imp ** (parser reorg would help)
137 - implement blank `Bl -column', such as
141 loc * exist *** algo *** size * imp *
143 - explicitly disallow nested `Bl -column', which would clobber internal
144 flags defined for struct mdoc_macro
145 loc * exist * algo * size * imp **
147 - In .Bl -column .It, the end of the line probably has to be regarded
148 as an implicit .Ta, if there could be one, see the following mildly
149 ugly code from login.conf(5):
150 .Bl -column minpasswordlen program xetcxmotd
151 .It path Ta path Ta value of Dv _PATH_DEFPATH
154 reported by Michal Mazurek <akfaew at jasminek dot net>
155 via jmc@ Thu, 7 Apr 2011 16:00:53 +0059
156 loc * exist *** algo ** size * imp **
158 - inside `.Bl -column' phrases, punctuation is handled like normal
159 text, e.g. `.Bl -column .It Fl x . Ta ...' should give "-x -."
161 - inside `.Bl -column' phrases, TERMP_IGNDELIM handling by `Pf'
162 is not safe, e.g. `.Bl -column .It Pf a b .' gives "ab."
163 but should give "ab ."
165 - check whether it is correct that `D1' uses INDENT+1;
166 does it need its own constant?
167 loc * exist ** algo ** size * imp **
169 - prohibit `Nm' from having non-text HEAD children
170 (e.g., NetBSD mDNSShared/dns-sd.1)
171 (mdoc_html.c and mdoc_term.c `Nm' handlers can be slightly simplified)
173 - support translated section names
174 e.g. x11/scrotwm scrotwm_es.1:21:2: error: NAME section must be first
175 that one uses NOMBRE because it is Spanish...
176 deraadt tends to think that section-dependent macro behaviour
177 is a bad idea in the first place, so this may be irrelevant
178 loc ** exist ** algo ** size * imp **
180 - When there is free text in the SYNOPSIS and that free text contains
181 the .Nm macro, groff somehow understands to treat the .Nm as an in-line
182 macro, while mandoc treats it as a block macro and breaks the line.
183 It is unclear what logic should distinguish in-line from block
184 instances; this needs investigation.
185 uqs@ Thu, 2 Jun 2011 11:03:51 +0200
186 uqs@ Thu, 2 Jun 2011 11:33:35 +0200
187 loc * exist ** algo *** size * imp **
189 --- missing man features -----------------------------------------------
191 - -T[x]html doesn't stipulate non-collapsing spaces in literal mode
193 --- missing tbl features -----------------------------------------------
195 - look at the POSIX manuals in the books/man-pages-posix port,
196 they use some unsupported tbl(7) features.
197 loc * exist ** algo ** size ** imp ***
199 - use Unicode U+2500 to U+256C for table borders
200 in tbl(7) -Tutf-8 output
201 suggested by bentley@ Tue, 14 Oct 2014 04:10:55 -0600
202 loc * exist ** algo * size * imp **
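To show what such borders could look like, here is a small Python
sketch that frames a grid of cells using only characters from the
U+2500 to U+256C range. The function box() is hypothetical and says
nothing about how the tbl(7) formatter would be structured.

```python
def box(rows):
    """Frame `rows` (equal-length lists of strings) with box-drawing characters."""
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def rule(left, mid, right):
        # One horizontal border line, e.g. the top edge of the table.
        return left + mid.join("\u2500" * w for w in widths) + right

    lines = [rule("\u250c", "\u252c", "\u2510")]
    for n, row in enumerate(rows):
        if n:
            lines.append(rule("\u251c", "\u253c", "\u2524"))
        cells = (cell.ljust(w) for cell, w in zip(row, widths))
        lines.append("\u2502" + "\u2502".join(cells) + "\u2502")
    lines.append(rule("\u2514", "\u2534", "\u2518"))
    return "\n".join(lines)
```

Calling print(box([["a", "bb"], ["ccc", "d"]])) draws a two-by-two table
with single-line borders.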
204 - allow standalone `.' to be interpreted as an end-of-layout
205 delimiter instead of being thrown away as a no-op roff line
206 reported by Yuri Pankov, Wed 18 May 2011 11:34:59 CEST
207 loc ** exist ** algo ** size * imp **
209 --- missing eqn features -----------------------------------------------
211 - The "size" keyword is parsed, but ignored by the formatter.
212 loc * exist * algo * size * imp *
214 - The spacing characters `~', `^', and tab are currently ignored,
215 see User's Guide (Second Edition) page 2 section 4.
216 loc * exist * algo ** size * imp **
218 - Mark and lineup are parsed and ignored,
219 see User's Guide (Second Edition) page 5 section 15.
220 loc ** exist ** algo ** size ** imp **
222 --- missing misc features ----------------------------------------------
224 - italic correction (\/) in PostScript mode
225 Werner LEMBERG on groff at gnu dot org Sun, 10 Nov 2013 12:47:46
226 loc ** exist ** algo * size * imp *
228 - When makewhatis(8) encounters a FATAL parse error,
229 it silently treats the file as formatted, which makes no sense
230 at all for paths like man1/foo.1 - and which also contradicts
231 what the manual says at the end of the description.
232 The end result will be ENOENT for file names returned
233 by mansearch() in manpage.file.
234 loc * exist * algo * size * imp **
236 - makewhatis(8) for preformatted pages:
237 parse the section number from the header line
238 and compare to the section number from the directory name
239 loc * exist * algo * size * imp **
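A possible shape for that check, sketched in Python rather than in the
C of makewhatis(8). The helper names, the assumed header layout
"NAME(section) ... NAME(section)", and the assumed directory names
"manN" or "catN" are all illustrative assumptions.

```python
import os
import re


def header_section(header_line):
    # Take the text in parentheses after the first word, e.g. "LS(1)".
    m = re.match(r"\s*[^\s(]+\(([^)\s]+)\)", header_line)
    return m.group(1) if m else None


def dir_section(path):
    # "cat1/ls.0" -> "1", "man3p/foo.3p" -> "3p"
    name = os.path.basename(os.path.dirname(path))
    m = re.match(r"(?:man|cat)(.+)$", name)
    return m.group(1) if m else None


def section_matches(header_line, path):
    h, d = header_section(header_line), dir_section(path)
    return h is not None and d is not None and h.lower() == d.lower()
```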
241 - Does makewhatis(8) detect missing NAME sections, missing names,
242 and missing descriptions in all the file formats?
243 loc * exist * algo * size * imp ***
245 - clean up escape sequence handling, creating three classes:
246 (1) fully implemented, or parsed and ignored without loss of content
247 (2) unimplemented, potentially causing loss of content
248 or serious mangling of formatting (e.g. \n) -> ERROR
249 see textproc/mgdiff(1) for nice examples
250 (3) undefined, just output the character -> perhaps WARNING
251 loc *** exist ** algo ** size ** imp *** (parser reorg helps)
253 - kettenis wants base roff, ms, and me Fri, 1 Jan 2010 22:13:15 +0100 (CET)
254 loc ** exist ** algo ** size *** imp *
256 --- compatibility checks -----------------------------------------------
258 - is .Bk implemented correctly in modern groff?
259 sobrado@ Tue, 19 Apr 2011 22:12:55 +0200
261 - compare output to Heirloom roff, Solaris roff, and
262 http://repo.or.cz/w/neatroff.git http://litcave.rudi.ir/
264 - look at AT&T DWB http://www2.research.att.com/sw/download
265 Carsten Kunze <carsten dot kunze at arcor dot de> has patches
266 Mon, 4 Aug 2014 17:01:28 +0200
268 - look at pages generated from reStructuredText, e.g. devel/mercurial hg(1)
269 These are a weird mixture of man(7) and custom autogenerated low-level
270 roff stuff. Figure out to what extent we can cope.
271 For details, see http://docutils.sourceforge.net/rst.html
272 noted by stsp@ Sat, 24 Apr 2010 09:17:55 +0200
273 reminded by nicm@ Mon, 3 May 2010 09:52:41 +0100
275 - look at pages generated from ronn(1) github.com/rtomayko/ronn
278 - look at pages generated from Texinfo source by yat2m, e.g. security/gnupg
279 First impression is not that bad.
281 - look at pages generated by pandoc; see
282 https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/Man.hs
283 porting planned by kili@ Thu, 19 Jun 2014 19:46:28 +0200
285 - check compatibility with Plan9:
286 http://swtch.com/usr/local/plan9/tmac/tmac.an
287 http://swtch.com/plan9port/man/man7/man.html
288 "Anthony J. Bentley" <anthonyjbentley@gmail.com> 28 Dec 2010 21:58:40 -0700
290 - check compatibility with the man(7) formatter
291 https://raw.githubusercontent.com/rofl0r/hardcore-utils/master/man.c
293 - check compatibility with
294 http://ikiwiki.info/plugins/contrib/mandoc/
295 https://github.com/schmonz/ikiwiki/compare/mandoc
296 Amitai Schlair Mon, 19 May 2014 14:05:53 -0400
298 ************************************************************************
299 * formatting issues: ugly output
300 ************************************************************************
302 - a column list with blank `Ta' cells triggers a spurious
303 start-with-whitespace printing of a newline
306 .It Em Authentication<tab>Key Length
307 ought to render "Key Length" with emphasis, too,
308 see OpenBSD iked.conf(5).
309 reported again by Nicolas Joly via wiz@ Wed, 12 Oct 2011 00:20:00 +0200
310 loc * exist *** algo *** size ** imp ***
312 - empty phrases in .Bl column produce too few blanks
313 try e.g. .Bl -column It Ta Ta
314 reported by millert Fri, 02 Apr 2010 16:13:46 -0400
315 loc * exist *** algo *** size * imp **
317 - .%T can have trailing punctuation. Currently, it puts the trailing
318 punctuation into a trailing MDOC_TEXT element inside its own scope.
319 That element should rather be outside its scope, such that the
320 punctuation does not get underlined. This is not trivial to
321 implement because .%T then needs some features of in_line_eoln() -
322 slurp all arguments into one single text element - and one feature
323 of in_line() - put trailing punctuation out of scope.
324 Found in mount_nfs(8) and exports(5), search for "Appendix".
325 loc ** exist ** algo *** size * imp **
327 - Trailing punctuation after .%T triggers EOS spacing, at least
328 outside .Rs (eek!). Simply setting ARGSFL_DELIM for .%T is not
329 the right solution, it sends mandoc into an endless loop.
330 reported by Nicolas Joly Sat, 17 Nov 2012 11:49:54 +0100
331 loc * exist ** algo ** size * imp **
333 - global variables in the SYNOPSIS of section 3 pages
334 .Vt vs .Vt/.Va vs .Ft/.Va vs .Ft/.Fa ...
335 from kristaps@ Tue, 08 Jun 2010 11:13:32 +0200
337 - in enclosures, mandoc sometimes fancies a bogus end of sentence
338 reminded by jmc@ Thu, 23 Sep 2010 18:13:39 +0059
339 loc * exist ** algo *** size * imp ***
341 - formatting /usr/local/man/man1/latex2man.1 with groff and mandoc
342 reveals lots of bugs both in groff and mandoc...
343 reported by bentley@ Wed, 22 May 2013 23:49:30 -0600
345 --- PDF issues ---------------------------------------------------------
347 - PDF output doesn't use a monospaced font for .Bd -literal
348 Example: "mandoc -Tpdf afterboot.8 > output.pdf && pdfviewer output.pdf".
349 Search the text "Routing tables".
350 Also check what PostScript mode does when fixing this.
351 reported by juanfra@ Wed, 04 Jun 2014 21:44:58 +0200
352 instructions from juanfra@ Wed, 11 Jun 2014 02:21:01 +0200
353 add a new <</Type /Font>> block to the PDF files with /BaseFont /Courier
354 and change the /Name from /F0 to the new font (/F5 (?)).
355 loc * exist ** algo ** size * imp **
357 --- HTML issues --------------------------------------------------------
359 - <dl><dt><dd> formatting is ugly
360 hints are easy to find on the web, e.g.
361 http://stackoverflow.com/questions/1713048/
362 see also matthew@ Fri, 18 Jul 2014 19:25:12 -0700
363 loc * exist * algo ** size * imp ***
365 - jsg on icb, Nov 3, 2014:
366 try to guess Xr in man(7) for hyperlinking
368 - The tables used to render the three-part page headers actually force
369 the width of the <body> to the max-width given for <html>.
370 Not yet sure how to fix that...
371 Observed by an Anonymous Coward on undeadly.org:
372 http://undeadly.org/cgi?action=article&sid=20140925064244&pid=1
373 loc * exist * algo ** size * imp ***
375 - consider whether <var> can be used for Ar Dv Er Ev Fa Va.
376 from bentley@ Wed, 13 Aug 2014 09:17:55 -0600
378 - check https://github.com/trentm/mdocml
380 ************************************************************************
381 * formatting issues: gratuitous differences
382 ************************************************************************
384 - .Fn reopens a new scope after punctuation in mandoc,
385 but closes its scope for good in groff.
386 Do we want to change mandoc or groff?
387 Steffen Nurpmeso Sat, 08 Nov 2014 13:34:59 +0100
388 loc * exist ** algo ** size * imp **
390 - In .Bl -enum -width 0n, groff continues on the same line after
391 the number, mandoc breaks the line.
392 mail to kristaps@ Mon, 20 Jul 2009 02:21:39 +0200
393 loc * exist ** algo ** size * imp **
395 - .Pp between two .It in .Bl -column should produce one,
396 not two blank lines, see e.g. login.conf(5).
397 reported by jmc@ Sun, 17 Apr 2011 14:04:58 +0059
398 reported again by sthen@ Wed, 18 Jan 2012 02:09:39 +0000 (UTC)
399 loc * exist *** algo ** size * imp **
401 - If the *first* line after .It is .Pp, break the line right after
402 the tag, do not pad with space characters before breaking.
403 See the description of the a, c, and i commands in sed(1).
404 loc * exist ** algo ** size * imp **
406 - If the first line after .It is .D1, do not assert a blank line
407 in between, see for example tmux(1).
408 reported by nicm@ 13 Jan 2011 00:18:57 +0000
409 loc * exist ** algo ** size * imp **
411 - Trailing punctuation after .It should trigger EOS spacing.
412 reported by Nicolas Joly Sat, 17 Nov 2012 11:49:54 +0100
413 Probably, this should be fixed somewhere in termp_it_pre(), not sure.
414 loc * exist ** algo ** size * imp **
417 should be "NetBSD 1.0A", not "NetBSD 1.0a",
418 see OpenBSD ccdconfig(8).
419 loc * exist * algo * size * imp **
421 - In .Bl -tag, if a tag exceeds the right margin and must be continued
422 on the next line, it must be indented by -width, not width+1;
423 see "rule block|pass" in OpenBSD ifconfig(8).
424 loc * exist *** algo ** size * imp **
426 - When the -width string contains macros, the macros must be rendered
427 before measuring the width, for example
428 .Bl -tag -width ".Dv message"
429 in magic(5), located in src/usr.bin/file, is the same
430 as -width 7n, not -width 11n.
431 The same applies to .Bl -column column widths;
432 reported again by Nicolas Joly Thu, 1 Mar 2012 13:41:26 +0100 via wiz@ 5 Mar
433 reported again by Franco Fichtner Fri, 27 Sep 2013 21:02:28 +0200
434 loc *** exist *** algo *** size ** imp ***
435 An easy partial fix would be to just skip the first word if it starts
436 with a dot, including any following white space, when measuring.
437 loc * exist * algo * size * imp ***
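The easy partial fix could look roughly like this. It is a Python
sketch of the measuring rule only; width_arg_len() is a hypothetical
name, and the result is counted in characters, standing in for the "n"
units used above.

```python
def width_arg_len(arg):
    """Length of a -width argument, ignoring one leading macro name."""
    if arg.startswith("."):
        # Drop the first word and the white space that follows it.
        parts = arg.split(None, 1)
        arg = parts[1] if len(parts) > 1 else ""
    return len(arg)
```

With this rule, ".Dv message" measures 7 rather than 11, matching the
magic(5) example, while arguments that do not start with a dot are
measured unchanged.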
439 - The \& zero-width character counts as output.
440 That is, when it is alone on a line between two .Pp,
441 we want three blank lines, not two as in mandoc.
442 loc ** exist ** algo ** size * imp **
444 - Header lines of excessive length:
445 Port OpenBSD man_term.c rev. 1.25 to mdoc_term.c
446 and document it in mdoc(7) and man(7) COMPATIBILITY
447 found while talking to Chris Bennett
448 loc * exist * algo * size * imp *
450 - trailing whitespace must be ignored even when followed by a font escape,
454 operate in batch mode
456 loc ** exist ** algo ** size * imp **
458 ************************************************************************
459 * warning issues
460 ************************************************************************
462 - check that MANDOCERR_BADTAB is thrown in the right cases,
463 i.e. when finding a literal tab character in fill mode,
464 and possibly change the wording of the warning message
465 to refer to fill mode, not literal mode
466 See the mail from Werner LEMBERG on the groff list,
467 Fri, 14 Feb 2014 18:54:42 +0100 (CET)
468 loc * exist ** algo ** size * imp **
470 - warn about attempts to call non-callable macros
471 Steffen Nurpmeso Tue, 11 Nov 2014 22:55:16 +0100
472 Note that formatting is inconsistent in groff.
473 .Fn Po prints "Po()", .Ar Sh prints "file ..." and no "Sh".
474 Relatively hard because the relevant code is scattered
475 all over mdoc_macro.c and all subtly different.
476 loc ** exist ** algo ** size ** imp **
478 - warn about "new sentence, new line"
479 loc ** exist ** algo *** size * imp **
481 - mandoc_special does not really check the escape sequence,
482 but just the overall format
483 loc ** exist ** algo *** size ** imp **
485 - integrate mdoclint into mandoc ("end-of-line whitespace" thread)
486 from jmc@ Mon, 13 Jul 2009 17:12:09 +0100
487 from kristaps@ Mon, 13 Jul 2009 18:34:53 +0200
488 from jmc@ Mon, 13 Jul 2009 17:45:37 +0059
489 from kristaps@ Mon, 13 Jul 2009 19:02:03 +0200
490 (mostly done, check what remains)
492 - -Tlint parser errors and warnings to stdout
493 to tech@mdocml, naddy@ Wed, 28 Sep 2011 11:21:46 +0200
494 wait! kristaps@ Sun, 02 Oct 2011 17:12:52 +0200
496 - for system errors, use errno/strerror/warn/err
498 ************************************************************************
499 * documentation issues
500 ************************************************************************
502 - mention hyphenation rules:
503 breaking at letter-letter in text mode (not macro args)
504 proper hyphenation is unimplemented
506 - talk about spacing around delimiters
507 to jmc@, kristaps@ Sat, 23 Apr 2011 17:41:27 +0200
509 - mark macros as: page structure domain, manual domain, general text domain
512 - mention /usr/share/misc/mdoc.template in mdoc(7)?
514 - Is all the content from http://www.std.com/obi/BSD/doc/usd/28.tbl/tbl
517 ************************************************************************
518 * performance issues
519 ************************************************************************
521 - Why are we using MAP_SHARED, not MAP_PRIVATE for mmap(2)?
522 How does SQLITE_CONFIG_PAGECACHE actually work? Document it!
523 from kristaps@ Sat, 09 Aug 2014 13:51:36 +0200
525 Several areas can be cleaned up to make mandoc even faster. These are:
527 - improve hashing mechanism for macros (quite important: performance)
529 - improve hashing mechanism for characters (not as important)
531 - the PDF file is HUGE: this can be reduced by using relative offsets
533 - instead of re-initialising the roff predefined-strings set before each
534 parse, create a read-only version the first time and copy it
535 loc * exist ** algo ** size * imp **
537 ************************************************************************
539 ************************************************************************
541 - Use libz directly instead of forking gunzip(1).
542 Suggested by bapt at FreeBSD among others.
544 - We use the input line number at several places to distinguish
545 same-line from different-line input. That plainly doesn't work
546 with user-defined macros, leading to random breakage.
548 - Find better ways to prevent endless loops
549 in roff(7) macro and string expansion.
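One simple approach is a depth limit, sketched here in Python for
string expansion only. The limit of 64, the function name expand(),
and the restriction to the \*[name] form are assumptions made for the
sake of the example; this is not a description of the existing code.

```python
import re

MAX_DEPTH = 64  # arbitrary illustrative limit

# Matches a string reference of the form \*[name].
_STRING_REF = re.compile(r"\\\*\[([^\]]+)\]")


def expand(text, strings, depth=0):
    # Give up instead of looping forever on self-referential strings.
    if depth > MAX_DEPTH:
        raise RecursionError("string expansion nested too deeply")

    def repl(match):
        # Undefined names expand to the empty string.
        value = strings.get(match.group(1), "")
        return expand(value, strings, depth + 1)

    return _STRING_REF.sub(repl, text)
```

A depth limit rejects legitimate but very deep nesting, so the limit
has to be chosen generously.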
551 - Finish cleanup of date handling.
552 Decide which formats should be recognized where.
553 Update both mdoc(7) and man(7) documentation.
554 Triggered by Tim van der Molen Tue, 22 Feb 2011 20:30:45 +0100
556 - struct mparse refactoring
557 Steffen Nurpmeso Thu, 04 Sep 2014 12:50:00 +0200
559 - Consider creating some views that will make the database more
560 readable from the sqlite3 shell. Consider using them to
561 abstract from the database structure, too.
562 suggested by espie@ Sat, 19 Apr 2014 14:52:57 +0200
564 ************************************************************************
565 * CGI issues
566 ************************************************************************
568 - Enable HTTP compression by detecting gzip encoding and filtering
570 - Sandbox (see OpenSSH).
571 - Enable caching support via HTTP 304 and If-Modified-Since.
572 - Allow for cgi.h to be overridden by CGI environment variables.
573 Otherwise, binary distributions will inherit the compile-time
574 behaviour, which is not optimal.
575 - Have Mac OS X systems automatically disable -static compilation of the
576 CGI: -static isn't supported.
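For the caching item in the list above, the core decision could be
sketched as follows. This is Python using only the standard library;
status_for() is a hypothetical helper, and treating an unparsable
header as absent is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for(if_modified_since, mtime):
    """Return 304 if the page is unchanged since the client's copy, else 200.

    if_modified_since: raw header value or None.
    mtime: timezone-aware datetime of the last modification.
    """
    if not if_modified_since:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable header: behave as if it was not sent
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution.
    return 304 if mtime.replace(microsecond=0) <= since else 200
```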
578 ************************************************************************
579 * to improve in the groff_mdoc(7) macros
580 ************************************************************************
582 - use uname(1) to set doc-default-operating-system at install time
583 tobimensch Mon, 1 Dec 2014 00:25:07 +0100