1180 Commits (51d7a62bee6bc14dc719c49cd688b4b5aa5dd1b0)

Author SHA1 Message Date
Daniel Kolesa 51d7a62bee eliminate -Wweak-vtables warnings 2018-01-05 22:48:38 +01:00
Daniel Kolesa 770ed476ca sink-writing decode 2018-01-05 22:16:18 +01:00
Daniel Kolesa 58ccfbe276 direct encoding funcs to u8/u16/uw from any other UTF 2018-01-05 21:49:00 +01:00
Daniel Kolesa daea42666e add range-based advancing encode funcs 2018-01-05 21:16:46 +01:00
Daniel Kolesa 61ed0ce71f printing of utf-16 and wide strings in format 2018-01-05 20:02:14 +01:00
Daniel Kolesa 200919d96f add funcs to deal with decoding/encoding of wchar_t values/sequences 2018-01-05 19:26:30 +01:00
Daniel Kolesa be25d42660 refactor unicode impl 2018-01-05 18:55:34 +01:00
Daniel Kolesa 44854072f7 prevent usage on potential broken platforms/toolchains 2018-01-05 03:05:17 +01:00
Daniel Kolesa d74736d8f4 add utf-16 decoding/encoding support 2018-01-05 02:18:36 +01:00
Daniel Kolesa 723c06c612 various warning fixes with -Weverything 2018-01-03 17:13:38 +01:00
Daniel Kolesa 2cbcc85fa8 add newlines between is-to funcs 2018-01-03 02:16:22 +01:00
Daniel Kolesa 5c204f7f54 remove unnecessary newlines 2018-01-03 02:14:24 +01:00
Daniel Kolesa fd4b26046c noexcept the ctype funcs 2018-01-03 02:12:23 +01:00
Daniel Kolesa ad149ff0f6 unicode-aware case-insensitive string compares 2018-01-03 01:22:07 +01:00
Daniel Kolesa 2949b2de0c add fallbacks for when string_utf.hh doesn't exist yet 2018-01-03 00:37:31 +01:00
Daniel Kolesa 34b27cd1c1 make the unicode tables const 2018-01-02 23:40:56 +01:00
Daniel Kolesa af635dc77a unicode fixes/cleanups 2018-01-02 23:28:37 +01:00
Daniel Kolesa 8aace1e65a flip the tolower/toupper funcs 2018-01-02 22:35:56 +01:00
Daniel Kolesa a0337c401e implement all the unicode ctype funcs, generate the tables 2018-01-02 22:23:18 +01:00
Daniel Kolesa 2b291bca39 implement the sorting logic in unicode generator 2018-01-02 19:04:12 +01:00
Daniel Kolesa 80aadd906e copyright for unicode data 2018-01-02 03:18:44 +01:00
Daniel Kolesa b3990a1d49 initial code for unicode lookup table generator 2018-01-02 03:00:56 +01:00
Daniel Kolesa 7bd668dab3 add unicode data 2018-01-02 01:41:33 +01:00
Daniel Kolesa 7c2bfa45df overload more Unicode stuff for UTF-32 slices 2018-01-02 00:30:58 +01:00
Daniel Kolesa dd2515de6c add a utility func to construct a container using a range 2018-01-01 21:06:25 +01:00
Daniel Kolesa b75f5f4881 implement utf-32 string printing in format 2018-01-01 20:59:39 +01:00
Daniel Kolesa c35f7377bf encode characters into utf-8 in format with our API 2018-01-01 03:11:00 +01:00
Daniel Kolesa 8e6852572c reject surrogate code points in decoding 2018-01-01 02:36:39 +01:00
Daniel Kolesa 0857edfef4 add a function to encode utf-32 to utf-8 2018-01-01 01:02:49 +01:00
Daniel Kolesa 278b6a6269 define string ranges over wchar/char16/char32 2017-12-31 23:42:46 +01:00
Daniel Kolesa c4f67b08b9 rename codepoint to decode 2017-12-31 20:06:36 +01:00
Daniel Kolesa b350eced7e move utf::length wrapper to header 2017-12-31 19:18:08 +01:00
Daniel Kolesa 48a7b45115 remove unneeded forward decl 2017-12-31 19:17:02 +01:00
Daniel Kolesa 1a07db8bac merge utf impl bits into one place 2017-12-31 19:16:16 +01:00
Daniel Kolesa d3cdbe2fcf expose unicode stuff through string slices 2017-12-31 19:12:51 +01:00
Daniel Kolesa fb2f9e3b0e add support for printing unicode and wide characters 2017-12-31 15:32:33 +01:00
Daniel Kolesa b2ee5c1bd0 add iter_codes to iterate a UTF-8 string by code points 2017-12-31 14:50:48 +01:00
Daniel Kolesa d6a13d8f97 expose multibyte-to-codepoint conversion 2017-12-31 03:26:15 +01:00
Daniel Kolesa 3c75d7db98 add some initial code for upcoming unicode support 2017-12-31 03:01:25 +01:00
Daniel Kolesa f7929a1b45 ditch char_traits in string stuff 2017-12-15 23:32:06 +01:00
Daniel Kolesa 84fc2bc9c4 do not rely on global locale when converting in format 2017-11-13 17:15:30 +01:00
Daniel Kolesa e5a21382af much better integer formatting (less stack space, better precision handling) 2017-11-13 02:13:42 +01:00
Daniel Kolesa baf0dd4ca6 MB_CUR_MAX is not a constant 2017-11-12 20:25:58 +01:00
Daniel Kolesa db28b66892 workaround the awful bullshit when formatting numbers under locale 2017-11-12 20:14:26 +01:00
    Because of the C++ locale APIs and libstdc++ being worthless trash,
    we need to resort to this kind of nonsense in order to avoid
    gibberish when dealing with grouping and decimal separators.

    Libc++ gets this right (comes up with ASCII style representations
    when requesting locale facets dealing with char type) but for some
    dumb reason libstdc++ comes up with representations that are
    garbage even when using a UTF-8 locale, so I guess we'll deal
    with it this way for the time being...

    That said, all of this code is probably broken on systems that
    don't use Unicode and honestly I don't care.
Daniel Kolesa 84572dfd01 use C locale in I/O and format by default 2017-11-12 19:15:43 +01:00
    Should probably get rid of state in locale-aware APIs, too.
Daniel Kolesa f882292f2a get rid of gcc specific workarounds 2017-11-10 20:19:25 +01:00
Daniel Kolesa 1893f4b941 gcc warning fix 2017-11-10 20:11:58 +01:00
Daniel Kolesa ec9ddb2aad yield current task in single-threaded coroutine scheduler on spawn 2017-11-05 22:10:22 +01:00
    This schedules tasks more aggressively, letting side tasks always run.
Daniel Kolesa 67525af4e5 gcc/libstdc++ 7.x fixes 2017-11-03 12:56:34 +01:00
Daniel Kolesa c6a854fac3 clearer readme 2017-06-19 17:09:30 +02:00