Unicode guide/Implementations
Revision as of 21:52, 19 March 2022
This page is my attempt to document my research on how various languages and software implement Unicode strings.
Classifications
Here's a quick list of things I'll be classifying:
- Bytestring support
- Internal encoding
- String encoding
- Character type
- OS API encoding/type
- Supports bytes in strings
- Can encode/decode to other encodings
- How breaking by code points, grapheme clusters, words, paragraphs, etc. is done
- How ordering works
- How upper/lower casing and case folding work
- How finding works
- How regex works
- How locale tailoring is done
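To make a few of these classifications concrete, here's a minimal sketch using only Python 3's standard library. Python is picked purely as an illustration, and the variable names are my own, not part of any API. It touches on the string vs. bytestring split, encoding/decoding, counting by code points, case folding vs. lowercasing, and comparison before and after normalization.

```python
import unicodedata

# "é" written as a base letter plus a combining acute accent (two code points).
decomposed = "e\u0301"
# NFC normalization composes it into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# String type vs. bytestring type: str holds code points, bytes holds raw bytes.
utf8_bytes = composed.encode("utf-8")
utf16_bytes = composed.encode("utf-16-le")
round_trip = utf8_bytes.decode("utf-8")

# len() on a str counts code points, not grapheme clusters or bytes.
code_point_count = len(decomposed)

# Case folding is not the same as lowercasing: the German sharp s folds to "ss".
lowered = "Straße".lower()
folded = "Straße".casefold()

# Plain equality compares code points, so the two spellings of "é" differ
# until both sides are normalized to the same form.
equal_raw = decomposed == composed
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

Here `decomposed` looks like one character on screen but has a length of 2, which is the kind of gap between code points and grapheme clusters the breaking item above is meant to capture.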
C
- C++ too?
D
POSIX
DOS
Windows
- narrow APIs
- wide APIs
Rust
Java
Swift
Go
Kotlin
Python
- Python 2
- Python 3
Tcl
Lua
Squirrel
Perl
Ruby
Zig
Elixir
- Erlang too?
Raku
Haskell
PHP
- narrow APIs
- mbstring
JavaScript