Unicode guide/Implementations
Revision as of 18:33, 19 March 2022
This page is my attempt to document my research on how Unicode strings are implemented in various languages and software.
Classifications
Here's a quick list of things I'll be classifying:
- Bytestring support
- Internal encoding
- String encoding
- Character type
- OS API encoding/type
- Supports bytes in strings
- Can encode/decode to other encodings
- How breaking by code points, graphemes, words, paragraphs, etc. is done
- How ordering works
- How upper/lower/folding case works
- How finding works
- How regex works
- How locale tailoring is done
- bare encoding/runes (java, windows, wchar, rust, go, javascript, ruby, kotlin, zig, elixir)
- codepoint based (python, haskell, perl, tcl)
- grapheme-based (swift, raku) which lets you convert a string to codepoints?
- normalized (raku)
- bytestrings
- wchar
- windows
- rust
- java
- swift
- go
- kotlin
- java
- python, utf8b
- tcl
- linux/unix
- javascript
- perl
- ruby
- zig
- raku
- haskell
- elixir
- ICU
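To make the categories above a bit more concrete, here is a minimal Python sketch that measures the same text under several string models: UTF-8 bytes, UTF-16 code units, code points, and code points after NFC normalization. The `describe` helper is a hypothetical name of mine, not part of any library, and the sketch leaves out grapheme counting because Python's standard library has no grapheme segmentation.

```python
import unicodedata


def describe(s: str) -> dict:
    """Measure one string under several string models (hypothetical helper)."""
    return {
        # Length as a bytestring or UTF-8 encoded string would report it.
        "utf8_bytes": len(s.encode("utf-8")),
        # Length as a UTF-16 code unit string would report it.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        # Length as a code point string reports it (Python's own model).
        "code_points": len(s),
        # Code point length after NFC normalization.
        "nfc_code_points": len(unicodedata.normalize("NFC", s)),
    }


decomposed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
emoji = "\U0001F600"    # a single code point outside the Basic Multilingual Plane

print(describe(decomposed))
print(describe(emoji))
```

The decomposed "é" is 3 UTF-8 bytes, 2 UTF-16 code units, 2 code points, and 1 code point after NFC; the emoji is 4 UTF-8 bytes, 2 UTF-16 code units, and 1 code point. The answer to "how long is this string" depends on which model an implementation exposes, which is what the classification is meant to capture.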
C and C++
Python 2
Lua
PHP (ignoring mbstring)
POSIX APIs
Windows narrow APIs
DOS APIs
Squirrel
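For the "supports bytes in strings" question, the "python, utf8b" entry is worth a small illustration. As far as I understand it, UTF-8b is the idea behind Python 3's `surrogateescape` error handler (PEP 383): bytes that are not valid UTF-8 are smuggled through a `str` as lone surrogates and restored on encoding. The sketch below is a minimal example under that assumption; `bytes_to_str` and `str_to_bytes` are hypothetical helper names, and the sample filename is made up.

```python
def bytes_to_str(raw: bytes) -> str:
    """Decode as UTF-8, mapping each undecodable byte 0xNN to U+DCNN."""
    return raw.decode("utf-8", errors="surrogateescape")


def str_to_bytes(text: str) -> bytes:
    """Encode as UTF-8, turning escaped surrogates back into the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# Made-up Latin-1 filename: the byte 0xE9 is not valid UTF-8 in this position.
raw = b"caf\xe9.txt"

text = bytes_to_str(raw)
print(ascii(text))                # the invalid byte shows up as \udce9
print(str_to_bytes(text) == raw)  # the original bytes survive the round trip

# A strict encoder refuses the lone surrogate, so the escaped string is not
# valid Unicode text in the usual sense.
try:
    text.encode("utf-8")
except UnicodeEncodeError:
    print("strict UTF-8 encoding rejects the escaped byte")
```

This is roughly how a code point string can still carry arbitrary bytes from a bytestring API, at the cost of holding values that are not valid scalar values.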