| Vocabulary | Summary |
| Common support for ASCII, UTF-8 and UTF-16 character encodings | |
| Growable string buffers | |
| Splitting sequences and grouping sequence elements | |
| Fixed-size character arrays | |
| Vocabulary | Summary |
| ASCII character classes | |
| Simple markup language for generating HTML | |
| Converting strings to byte arrays and vice versa | |
| Parsing expression grammar and packrat parser | |
| Declarative EBNF language for writing PEG parsers | |
| Additional PEG parsers | |
| Regular expressions | |
| simple-tokenizer vocabulary | |
| Correct sorting of sequences of strings with embedded numbers | |
| Unicode 5.2 support | |
| Unicode grapheme and word breaking | |
| Unicode case conversion | |
| Unicode character categories | |
| Parsing words used by Unicode implementation | |
| Unicode string comparison and sorting (collation) | |
| Parsing Unicode data files | |
| Unicode string normalization | |
| Reads the UCD to get the script of a code point | |
| Word wrapping | |