1299 Commits (27ce727eefd92c7db1d90c7f1e10b9f2ab4d5960)

Author SHA1 Message Date
Daniel Kolesa 3db863c68d allow encode to work on arbitrary ranges 2018-01-07 19:38:47 +01:00
Daniel Kolesa 818fd1e8e8 add utf::unit_bits and encode/iter_u taking number of bits (8/16/32) 2018-01-07 19:27:08 +01:00
Daniel Kolesa 798fcec6c8 remove the iter_u* wrappers, leave iter_u 2018-01-07 19:16:13 +01:00
Daniel Kolesa fb91f77eb0 remove the encode_u* wrappers (just use generic type-based ver) 2018-01-07 19:08:08 +01:00
Daniel Kolesa 4a2e5cd557 completely unify encode funcs 2018-01-07 18:44:12 +01:00
Daniel Kolesa 541fa43cbb reduce the encode_u* to 1 sink version per variant 2018-01-07 18:22:30 +01:00
Daniel Kolesa 24d1b5ec25 various traits and constants for unicode types 2018-01-07 17:13:53 +01:00
Daniel Kolesa be803bac7b remove sink-based decode (covered by encode_u32) 2018-01-07 16:37:45 +01:00
Daniel Kolesa 640a9714f0 universal templated encode, iter for all types 2018-01-07 02:17:05 +01:00
Daniel Kolesa 4343bb408d implement compare/case_compare for all slice types 2018-01-07 01:15:17 +01:00
Daniel Kolesa 92cfe0aaf5 iter_codes -> iter_u32, iter_bytes -> iter_u8 2018-01-06 02:10:38 +01:00
Daniel Kolesa d94e75cd18 add by-utf8-byte iteration for all slice types 2018-01-06 02:00:56 +01:00
Daniel Kolesa 4a992d64b5 move back to sane string range comparison operator defs 2018-01-06 01:17:18 +01:00
Daniel Kolesa 7912e699d5 remove char_traits usage 2018-01-06 01:08:19 +01:00
Daniel Kolesa e3362e6c9e allow encoding of noncharacters 2018-01-06 01:03:43 +01:00
Daniel Kolesa fa5ae71202 perform validity check when decoding utf-32 into itself 2018-01-06 00:52:50 +01:00
Daniel Kolesa ed82fa0233 unified length handling for all encodings 2018-01-06 00:42:14 +01:00
Daniel Kolesa aeb5023b30 relax the rules of zero-argument utf::length 2018-01-06 00:27:04 +01:00
Daniel Kolesa e5162233d4 add missing inline 2018-01-06 00:17:47 +01:00
Daniel Kolesa 51d7a62bee eliminate -Wweak-vtables warnings 2018-01-05 22:48:38 +01:00
Daniel Kolesa 770ed476ca sink-writing decode 2018-01-05 22:16:18 +01:00
Daniel Kolesa 58ccfbe276 direct encoding funcs to u8/u16/uw from any other UTF 2018-01-05 21:49:00 +01:00
Daniel Kolesa daea42666e add range-based advancing encode funcs 2018-01-05 21:16:46 +01:00
Daniel Kolesa 61ed0ce71f printing of utf-16 and wide strings in format 2018-01-05 20:02:14 +01:00
Daniel Kolesa 200919d96f add funcs to deal with decoding/encoding of wchar_t values/sequences 2018-01-05 19:26:30 +01:00
Daniel Kolesa be25d42660 refactor unicode impl 2018-01-05 18:55:34 +01:00
Daniel Kolesa 44854072f7 prevent usage on potential broken platforms/toolchains 2018-01-05 03:05:17 +01:00
Daniel Kolesa d74736d8f4 add utf-16 decoding/encoding support 2018-01-05 02:18:36 +01:00
Daniel Kolesa 723c06c612 various warning fixes with -Weverything 2018-01-03 17:13:38 +01:00
Daniel Kolesa 2cbcc85fa8 add newlines between is-to funcs 2018-01-03 02:16:22 +01:00
Daniel Kolesa 5c204f7f54 remove unnecessary newlines 2018-01-03 02:14:24 +01:00
Daniel Kolesa fd4b26046c noexcept the ctype funcs 2018-01-03 02:12:23 +01:00
Daniel Kolesa ad149ff0f6 unicode-aware case-insensitive string compares 2018-01-03 01:22:07 +01:00
Daniel Kolesa 2949b2de0c add fallbacks for when string_utf.hh doesn't exist yet 2018-01-03 00:37:31 +01:00
Daniel Kolesa 34b27cd1c1 make the unicode tables const 2018-01-02 23:40:56 +01:00
Daniel Kolesa af635dc77a unicode fixes/cleanups 2018-01-02 23:28:37 +01:00
Daniel Kolesa 8aace1e65a flip the tolower/toupper funcs 2018-01-02 22:35:56 +01:00
Daniel Kolesa a0337c401e implement all the unicode ctype funcs, generate the tables 2018-01-02 22:23:18 +01:00
Daniel Kolesa 2b291bca39 implement the sorting logic in unicode generator 2018-01-02 19:04:12 +01:00
Daniel Kolesa 80aadd906e copyright for unicode data 2018-01-02 03:18:44 +01:00
Daniel Kolesa b3990a1d49 initial code for unicode lookup table generator 2018-01-02 03:00:56 +01:00
Daniel Kolesa 7bd668dab3 add unicode data 2018-01-02 01:41:33 +01:00
Daniel Kolesa 7c2bfa45df overload more Unicode stuff for UTF-32 slices 2018-01-02 00:30:58 +01:00
Daniel Kolesa dd2515de6c add a utility func to construct a container using a range 2018-01-01 21:06:25 +01:00
Daniel Kolesa b75f5f4881 implement utf-32 string printing in format 2018-01-01 20:59:39 +01:00
Daniel Kolesa c35f7377bf encode characters into utf-8 in format with our API 2018-01-01 03:11:00 +01:00
Daniel Kolesa 8e6852572c reject surrogate code points in decoding 2018-01-01 02:36:39 +01:00
Daniel Kolesa 0857edfef4 add a function to encode utf-32 to utf-8 2018-01-01 01:02:49 +01:00
Daniel Kolesa 278b6a6269 define string ranges over wchar/char16/char32 2017-12-31 23:42:46 +01:00
Daniel Kolesa c4f67b08b9 rename codepoint to decode 2017-12-31 20:06:36 +01:00