Unicode guide/Implementations

This page documents my research into how various languages and software implement Unicode strings.

Classifications

Here's a quick list of things I'll be classifying:

- bare encoding/runes (Java, Windows, wchar, Rust, Go, JavaScript, Ruby, Kotlin, Zig, Elixir)

- codepoint-based (Python, Haskell, Perl, Tcl)

- grapheme-based (Swift, Raku); does this still let you convert a string to codepoints?

- normalized (Raku)

- bytestrings

- wchar

- Windows

- Rust

- Java

- Swift

- Go

- Kotlin

- Python, UTF-8b

- Tcl

- Linux/Unix

- JavaScript

- Perl

- Ruby

- Zig

- Raku

- Haskell

- Elixir

- ICU

C and C++

Python 2

Lua

PHP (ignoring mbstring)

POSIX APIs

Windows narrow APIs

DOS APIs

Squirrel
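
To make the classification categories concrete, here is a minimal Python sketch that measures one string under several of the models listed above: raw UTF-8 bytes, UTF-16 code units, codepoints, and codepoints after NFC normalization. The helper name `describe` and the sample string are my own illustrative choices, not taken from any of the implementations. Grapheme counting is left out because Python's standard library does not provide grapheme segmentation.

```python
import unicodedata


def describe(s: str) -> dict:
    """Measure the same string under several string models (hypothetical helper)."""
    return {
        # bytestring / bare UTF-8 view: length in bytes
        "utf8_bytes": len(s.encode("utf-8")),
        # bare UTF-16 view: each code unit is 2 bytes
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        # codepoint view: Python's own str length
        "codepoints": len(s),
        # normalized view: codepoints after NFC composition
        "nfc_codepoints": len(unicodedata.normalize("NFC", s)),
    }


# "e" followed by COMBINING ACUTE ACCENT (U+0301), then U+1F600, which lies outside the BMP
sample = "e\u0301\U0001F600"
counts = describe(sample)
print(counts)
```

For this sample the four counts all differ (7 bytes, 4 UTF-16 code units, 3 codepoints, 2 codepoints after NFC), which is the kind of disagreement the classification is meant to capture. The NFC count happens to match the number of user-perceived characters here, but that is a coincidence of the sample and not a general substitute for grapheme segmentation.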