Intro

The Unicode Consortium is the premier standards organization for internationalization of software and services, including the encoding of text for all modern computing systems. The Consortium supports internationalization through the Unicode Standard and by providing core libraries, software algorithms, and structured data.

Unicode is a universal character encoding standard designed to represent the characters of all the world's writing systems.

Unicode Format

Unicode assigns a unique code point to every character. A code point is written as U+ followed by its hexadecimal value, padded to at least four digits, for example U+0041.

Character       Unicode Code Point
A               U+0041
a               U+0061
あ (Japanese)   U+3042
م (Arabic)      U+0645
😀 (emoji)      U+1F600
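
The U+ notation in the table above can be reproduced with a short Python sketch. It relies only on the built-in ord() function, which returns the integer code point of a one-character string. The helper name code_point_label is an assumption introduced here for illustration and is not part of any library.

```python
def code_point_label(ch: str) -> str:
    """Return the U+XXXX label for a single character (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected a string containing exactly one code point")
    # ord() gives the code point as an integer; format it as uppercase
    # hexadecimal, zero-padded to at least four digits.
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    # Same characters as in the table above.
    for ch in ["A", "a", "あ", "م", "😀"]:
        print(ch, code_point_label(ch))
```

The reverse direction uses the built-in chr(): chr(0x3042) yields あ. Code points above U+FFFF, such as the emoji, simply come out with five or more hexadecimal digits.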

UTF-8 Encoding