Unicode guide/Implementations

From JookWiki
Revision as of 21:52, 19 March 2022 by Jookia (talk | contribs) (Add languages and implementations)

This page is my attempt to document my research on how various languages and software implement Unicode strings.

Classifications

Here's a quick list of things I'll be classifying:

  • Bytestring support
  • Internal encoding
  • String encoding
  • Character type
  • OS API encoding/type
  • Supports bytes in strings
  • Can encode/decode to other encodings
  • How breaking by code points, grapheme clusters, words, paragraphs, etc. is done
  • How ordering works
  • How upper/lower/folding case works
  • How finding works
  • How regex works
  • How locale tailoring is done
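
To make a few of these classifications concrete, here's a rough sketch using Python 3 and only its standard library. It isn't a finding from the research, just an illustration of the kind of behaviour I mean; the variable names are made up for the example, and other languages will give different answers to the same questions.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
# character (grapheme cluster), but more than one code point.
s = "e\u0301"

# String encoding / character type: the "length" depends on the unit counted.
code_points = len(s)                            # Python 3 str counts code points
utf8_bytes = len(s.encode("utf-8"))             # bytes in UTF-8
utf16_units = len(s.encode("utf-16-le")) // 2   # 16-bit code units in UTF-16

# Normalization can change the code point count without changing the text.
composed = unicodedata.normalize("NFC", s)

# Upper/lower/folding case: folding is not the same as lowercasing.
lowered = "Straße".lower()
folded = "Straße".casefold()

# Ordering: the default sort compares code points, with no locale tailoring.
ordered = sorted(["b", "a", "Z"])

# Bytes in strings: the surrogateescape error handler smuggles
# undecodable bytes through str and back again.
raw = b"abc\xff"
as_text = raw.decode("utf-8", errors="surrogateescape")
round_trip = as_text.encode("utf-8", errors="surrogateescape")

print(code_points, utf8_bytes, utf16_units, len(composed))
print(lowered, folded, ordered, round_trip == raw)
```

The standard library has nothing here for breaking by grapheme clusters or words, so that part of the list would need a third-party library in Python.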

C

- C++ too?

D

POSIX

DOS

Windows

- narrow APIs

- wide APIs

Rust

Java

Swift

Go

Kotlin

Python

Python 2

Python 3

Tcl

Lua

Squirrel

Perl

Ruby

Zig

Elixir

- Erlang too?

Raku

Haskell

PHP

- narrow APIs

- mbstring

JavaScript