== C and C++ ==
C and C++ provide limited functionality related to text handling.

* Character type: 8-bit, 16-bit or 32-bit, encoding not defined
* Byte strings: No, just regular arrays
* Internal encoding: None
* String encoding: Depends on locale
* Supports bytes in strings: Depends on locale encoding
* Supports surrogates in strings: Depends on locale encoding
* Supports invalid code points in strings: Depends on locale encoding
* Supports normalizing strings: No
* Supports querying character properties: No
* Supports breaking by code point: No
* Supports breaking by extended grapheme cluster: No
* Supports breaking by text boundaries: No
* Supports encoding and decoding to other encodings: Yes
* Supports Unicode regex extensions: No (not applicable in C as it has no regex)
* Classifies by: Locale information, only supports single characters
* Collates by: Locale information, supports arbitrary strings
* Converts case by: Locale information, only supports single characters
* Locale tailoring is done by: Current locale
* Wraps operating system APIs with Unicode ones: No

This could be classified as 'Unicode agnostic'; however, classification and case conversion are limited to single characters. As a result, even the limited functionality provided is broken: a mapping that involves more than one character, such as uppercasing 'ß' to 'SS', cannot be expressed.

Different platforms usually provide a clearer definition:

* On POSIX, characters are usually 8-bit ASCII-compatible values
* On Windows, wide characters (wchar_t) are 16-bit UTF-16-compatible values

Actual support for locales depends on your libc implementation, and this affects most languages that run on your computer. For example, on Linux glibc seems to be the only libc that supports uppercasing Unicode text.