path: root/preconv.c
Commit message | Author | Age | Files | Lines
* Allow compilation on Mac OS X, which doesn't have MACHINE defined. | Kristaps Dzonsons | 2015-03-06 | 1 | -2/+2
  While there, specify some casts to satisfy the compiler warnings. OK schwarze@
* Rewrite the low-level UTF-8 parser from scratch. | Ingo Schwarze | 2014-12-19 | 1 | -76/+59
  It accepted invalid byte sequences like 0xc080-c1bf, 0xe08080-e09fbf, 0xeda080-edbfbf, and 0xf0808080-f08fbfbf, produced valid roff Unicode escape sequences from them, and the algorithm contained strong defenses against any attempt to fix it.
  This cures an assertion failure in the terminal formatter caused by sneaking in ASCII 0x08 (backspace) by "encoding" it as an (invalid) multibyte UTF-8 sequence, found by jsg@ with afl.
  As a bonus, the new algorithm also reduces the code in the function by about 20%.
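The following is a minimal sketch of a strict UTF-8 decoder that rejects the four invalid ranges named in the commit message above (two-byte overlongs, three-byte overlongs, UTF-16 surrogates, four-byte overlongs). It is not the code from preconv.c; the function name utf8_decode_strict and its interface are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the mandoc function: decode one UTF-8
 * sequence from buf[0..len-1] into *cp.  Returns the number of bytes
 * consumed, or 0 if the sequence is invalid or truncated.
 */
static size_t
utf8_decode_strict(const unsigned char *buf, size_t len, unsigned int *cp)
{
	unsigned int	 accum, lo;
	size_t		 i, need;

	if (len == 0)
		return 0;
	if (buf[0] < 0x80) {			/* plain ASCII */
		*cp = buf[0];
		return 1;
	}
	if (buf[0] >= 0xc2 && buf[0] <= 0xdf) {	/* 0xc0 and 0xc1 are overlong */
		need = 2;
		accum = buf[0] & 0x1f;
		lo = 0x80;
	} else if ((buf[0] & 0xf0) == 0xe0) {
		need = 3;
		accum = buf[0] & 0x0f;
		lo = 0x800;
	} else if (buf[0] >= 0xf0 && buf[0] <= 0xf4) {
		need = 4;
		accum = buf[0] & 0x07;
		lo = 0x10000;
	} else
		return 0;			/* stray continuation or bad lead */

	if (len < need)
		return 0;			/* truncated sequence */
	for (i = 1; i < need; i++) {
		if ((buf[i] & 0xc0) != 0x80)
			return 0;		/* not a continuation byte */
		accum = (accum << 6) | (buf[i] & 0x3f);
	}
	if (accum < lo)
		return 0;			/* overlong, e.g. 0xe08080 */
	if (accum >= 0xd800 && accum <= 0xdfff)
		return 0;			/* surrogates, 0xeda080-edbfbf */
	if (accum > 0x10ffff)
		return 0;			/* beyond Unicode */
	*cp = accum;
	return need;
}
```

With a decoder of this shape, a backspace smuggled in as the two bytes 0xc0 0x88 is rejected at the lead byte and never reaches the formatter as a code point.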
* Remove needless and harmful byte swapping on big endian architectures. | Ingo Schwarze | 2014-11-14 | 1 | -25/+5
  Problem found and patch provided by Martin Natano at bitrig, thanks! Tested on macppc by natano@ and on i386, amd64, and sparc64 myself.
  While here, sync with OpenBSD by removing some trailing whitespace.
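The removed code is not shown in this log. As a general illustration of why explicit byte swapping is unnecessary when a value is assembled arithmetically, the sketch below builds an integer from individual bytes with shifts; the result is the same on little and big endian hosts. The function name assemble_u32 is hypothetical and does not come from preconv.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine four bytes, most significant first,
 * into one integer.  Shifts operate on values, not on memory layout,
 * so no host-dependent swapping is needed afterwards.
 */
static uint32_t
assemble_u32(const unsigned char b[4])
{
	return ((uint32_t)b[0] << 24) |
	    ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) |
	    (uint32_t)b[3];
}
```

Swapping the bytes of such a value afterwards on a big endian machine would corrupt it, which is one plausible way byte swapping can be harmful as well as needless.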
* Refactor, no functional change: Remove the parse point from struct buf. | Ingo Schwarze | 2014-11-01 | 1 | -11/+12
  Some functions need multiple parse points, some none at all, and it varies whether any of them need to be passed around. So better pass them as a separate argument, and only when needed.
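A minimal sketch of the idea, assuming a buffer structure that holds only the data pointer and its size. The field names and both helper functions are hypothetical and are not taken from the mandoc sources; they only show a parse point passed as a separate argument where needed and omitted where not.

```c
#include <stddef.h>

/* Assumed layout for illustration: no parse position stored inside. */
struct buf {
	const char	*buf;	/* start of the data */
	size_t		 sz;	/* number of bytes in buf */
};

/*
 * Hypothetical helper that needs a parse point: skip blanks starting
 * at *offs and advance the caller's position.  Different callers can
 * keep several independent positions into the same buffer.
 */
static void
skip_blanks(const struct buf *b, size_t *offs)
{
	while (*offs < b->sz &&
	    (b->buf[*offs] == ' ' || b->buf[*offs] == '\t'))
		(*offs)++;
}

/* Hypothetical helper that needs no parse point at all. */
static int
buf_is_empty(const struct buf *b)
{
	return b->sz == 0;
}
```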
* KNF: indentation and sort variables by size; no functional change | Ingo Schwarze | 2014-10-26 | 1 | -5/+4
* integrate preconv(1) into mandoc(1); | Ingo Schwarze | 2014-10-25 | 1 | -368/+59
  enhances functionality and reduces code and docs by more than 300 lines
* Improve build system and autodetection. | Ingo Schwarze | 2014-08-16 | 1 | -4/+4
  * Make ./configure standalone, that's what people expect.
  * Let people write a ./configure.local from scratch, not edit existing files.
  * Autodetect wchar, sqlite3, and manpath and act accordingly.
  * Autodetect the need for -L/usr/local/lib and -lutil.
  * Get rid of config.h.p{re,ost}, let ./configure only write what's needed.
  * Let ./configure write a Makefile.local snippet, that's quite flexible.
* Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat. | Ingo Schwarze | 2014-08-10 | 1 | -3/+2
  Include <sys/types.h> where needed, it does not belong in config.h. Remove <stdio.h> from config.h; if it is missing somewhere, it should be added, but i cannot find a *.c file where it is missing.
* Sync to OpenBSD, no functional change: | Ingo Schwarze | 2013-06-02 | 1 | -8/+3
  * Add the missing mparse_parse_buffer prototype.
  * Drop the useless MAP_FILE constant: It's not specified in POSIX, so it's not required, it's the default anyway, and it's 0 anyway.
* Scary-looking but otherwise harmless changes allow me to build for Windows. | Kristaps Dzonsons | 2011-07-24 | 1 | -2/+8
  That is to say, with mingw32. This amounts to the following:
  (1) break compat.c into compat_strlcpy.c and compat_strlcat.c
  (2) add compat_getsubopt.c (from OpenBSD) and test-getsubopt.c
  (3) add test-strptime.c for HAVE_STRPTIME
  (4) add ifdef bits here and there, where necessary
  (5) remove some harmless unportable stuff (u_char, localtime_r)
  I've added the appropriate mdocml.zip target to the Makefile, too.
* Some small lint checks in preconv. Also add it to the default lint rule. | Kristaps Dzonsons | 2011-05-26 | 1 | -6/+6
* preconv is now on encoding-recognition parity with groff. | Kristaps Dzonsons | 2011-05-26 | 1 | -3/+105
  This last commit adds parsing of "File Variables" in the first two lines in order to grok the encoding. This completes groff's recognition sequence (-e, BOM, File variables, -D, default).
  I've also cleaned up the manual to indicate this and for some general readability. preconv is now compiled by default in the Makefile.
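The recognition sequence named above (-e, BOM, File variables, -D, default) can be sketched as a simple priority chain. This is an illustration only: the function names, the use of encoding names as strings, the already-parsed file variable argument, and "latin-1" as the final default are all assumptions, not details confirmed by this log.

```c
#include <stddef.h>

/* Return nonzero if the data starts with the UTF-8 byte order mark. */
static int
has_utf8_bom(const unsigned char *p, size_t sz)
{
	return sz >= 3 && p[0] == 0xef && p[1] == 0xbb && p[2] == 0xbf;
}

/*
 * Hypothetical helper: pick an encoding name in the stated order.
 * opt_e and opt_D stand for the -e and -D arguments (NULL if absent);
 * filevar stands for an encoding already extracted from the file
 * variables in the first two lines (NULL if none was found).
 */
static const char *
pick_encoding(const char *opt_e, const unsigned char *data, size_t sz,
    const char *filevar, const char *opt_D)
{
	if (opt_e != NULL)
		return opt_e;		/* 1. explicit -e wins */
	if (has_utf8_bom(data, sz))
		return "utf-8";		/* 2. byte order mark */
	if (filevar != NULL)
		return filevar;		/* 3. file variables */
	if (opt_D != NULL)
		return opt_D;		/* 4. -D fallback */
	return "latin-1";		/* 5. assumed built-in default */
}
```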
* Significantly improve preconv. | Kristaps Dzonsons | 2011-05-26 | 1 | -13/+117
  Allow it to recode UTF-8 characters into the \[uNNNN] strings (taking into account big-endian archs). Also allow it to determine from the BOM whether it's a UTF-8 file. Also add the initial manual.
  This has been tested over a random selection of UTF-8 documents, as
    % preconv -e utf-8 foo.1 | ./mandoc -Tlocale
  where -Tlocale is allowed (-DUSE_WCHAR).
  Note that we're still missing the "type" indicator that preconv accepts.
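A minimal sketch of the output side of that recoding: formatting an already decoded code point as a \[uNNNN] escape. The helper name roff_escape is hypothetical, and the choice of uppercase hexadecimal padded to at least four digits is an assumption based on the NNNN placeholder above.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the roff escape for code point cp into
 * out (at most outsz bytes including the terminating NUL).  Returns
 * the length snprintf(3) reports, so a result >= outsz means the
 * output was truncated.
 */
static int
roff_escape(char *out, size_t outsz, unsigned int cp)
{
	return snprintf(out, outsz, "\\[u%.4X]", cp);
}
```

A caller would typically pass ASCII bytes through unchanged and call such a helper only for code points of 0x80 and above.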
* It's annoying that we don't have preconv, so throw together a quick version and let it grow in-tree. | Kristaps Dzonsons | 2011-05-26 | 1 | -0/+316
  Right now, this only supports the Latin-1 and US-ASCII encoding. I'll do UTF-8 next. It's call-compatible with GNU's preconv although I don't do fancy stuff like BOM or header check. This will come. I used read.c's file-grokking code.
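Latin-1 is the simplest case because every byte value equals its Unicode code point. The sketch below converts a Latin-1 buffer to ASCII plus roff Unicode escapes; the function name latin1_to_roff, its interface, and the \[uNNNN] output form for this first version are assumptions for illustration, not the code from this commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy ASCII bytes unchanged and rewrite bytes
 * 0x80-0xff as "\[u00XX]".  Output is always NUL-terminated when
 * outsz > 0; conversion stops early if out is too small.  Returns
 * the number of bytes written, not counting the NUL.
 */
static size_t
latin1_to_roff(const unsigned char *in, size_t insz, char *out, size_t outsz)
{
	size_t	 i, pos;

	if (outsz == 0)
		return 0;
	pos = 0;
	for (i = 0; i < insz; i++) {
		if (in[i] < 0x80) {
			if (pos + 1 >= outsz)
				break;		/* no room for byte + NUL */
			out[pos++] = (char)in[i];
		} else {
			if (pos + 8 >= outsz)
				break;		/* escape is 8 bytes + NUL */
			pos += (size_t)snprintf(out + pos, outsz - pos,
			    "\\[u%.4X]", (unsigned int)in[i]);
		}
	}
	out[pos] = '\0';
	return pos;
}
```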