Character Set Codelist

From DGIWG

The value domain of Character Set Codelist is defined in the following table.


| # | Code | English Name | Definition | Source |
|---|------|--------------|------------|--------|
| 1 | ucs2 | 2 byte fixed UCS | 16-bit fixed size Universal Character Set, based on ISO/IEC 10646 | |
| 2 | ucs4 | 4 byte fixed UCS | 32-bit fixed size Universal Character Set, based on ISO/IEC 10646 | |
| 3 | utf7 | UCS Transformation Format – 7 bits | 7-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | |
| 4 | utf8 | UCS Transformation Format – 8 bits | Character Set defined by IETF RFC 3629 | |
| 5 | utf16 | UCS Transformation Format – 16 bits | 16-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | |
| 6 | 8859part1 | ISO/IEC 8859-1 | Information technology – 8-bit single-byte coded graphic character sets – Part 1: Latin alphabet No. 1 | |
| 7 | 8859part2 | ISO/IEC 8859-2 | Information technology – 8-bit single-byte coded graphic character sets – Part 2: Latin alphabet No. 2 | |
| 8 | 8859part3 | ISO/IEC 8859-3 | Information technology – 8-bit single-byte coded graphic character sets – Part 3: Latin alphabet No. 3 | |
| 9 | 8859part4 | ISO/IEC 8859-4 | Information technology – 8-bit single-byte coded graphic character sets – Part 4: Latin alphabet No. 4 | |
| 10 | 8859part5 | ISO/IEC 8859-5 | Information technology – 8-bit single-byte coded graphic character sets – Part 5: Latin/Cyrillic alphabet | |
| 11 | 8859part6 | ISO/IEC 8859-6 | Information technology – 8-bit single-byte coded graphic character sets – Part 6: Latin/Arabic alphabet | |
| 12 | 8859part7 | ISO/IEC 8859-7 | Information technology – 8-bit single-byte coded graphic character sets – Part 7: Latin/Greek alphabet | |
| 13 | 8859part8 | ISO/IEC 8859-8 | Information technology – 8-bit single-byte coded graphic character sets – Part 8: Latin/Hebrew alphabet | |
| 14 | 8859part9 | ISO/IEC 8859-9 | Information technology – 8-bit single-byte coded graphic character sets – Part 9: Latin alphabet No. 5 | |
| 15 | 8859part10 | ISO/IEC 8859-10 | Information technology – 8-bit single-byte coded graphic character sets – Part 10: Latin alphabet No. 6 | |
| 16 | 8859part11 | ISO/IEC 8859-11 | Information technology – 8-bit single-byte coded graphic character sets – Part 11: Latin/Thai alphabet | |
| 17 | 8859part13 | ISO/IEC 8859-13 | Information technology – 8-bit single-byte coded graphic character sets – Part 13: Latin alphabet No. 7 | |
| 18 | 8859part14 | ISO/IEC 8859-14 | Information technology – 8-bit single-byte coded graphic character sets – Part 14: Latin alphabet No. 8 (Celtic) | |
| 19 | 8859part15 | ISO/IEC 8859-15 | Information technology – 8-bit single-byte coded graphic character sets – Part 15: Latin alphabet No. 9 | |
| 20 | 8859part16 | ISO/IEC 8859-16 | Information technology – 8-bit single-byte coded graphic character sets – Part 16: Latin alphabet No. 10 | |
| 21 | jis | JIS | Japanese code set used for electronic transmission | |
| 22 | shiftJIS | Shift JIS | Japanese code set used on MS-DOS based machines | |
| 23 | eucJP | EUC JAPAN | Japanese code set used on UNIX based machines | |
| 24 | usAscii | US ASCII | United States ASCII code set (ISO 646 US) | |
| 25 | ebcdic | EBCDIC | IBM mainframe code set | |
| 26 | eucKR | EUC KOREA | Korean code set | |
| 27 | big5 | BIG5 | Traditional Chinese code set used in Taiwan, Hong Kong of China and other areas | |
| 28 | GB2312 | GB2312 | Simplified Chinese code set | |
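
As an illustration of how software might consume this codelist, the following minimal sketch maps a subset of the codes to codec names from the Python standard library. The dictionary `CODELIST_TO_PYTHON` and the function `python_codec` are hypothetical names introduced here, and the pairing of codes with Python codecs is an assumption of this sketch, not part of the codelist. The codes ucs2, ucs4, jis and ebcdic are deliberately left out because no single Python codec corresponds to them unambiguously.

```python
import codecs

# Hypothetical mapping from codelist codes to Python codec names.
# These pairings are an assumption of this sketch, not defined by the codelist.
CODELIST_TO_PYTHON = {
    "utf7": "utf-7",
    "utf8": "utf-8",
    "utf16": "utf-16",
    "usAscii": "ascii",
    "shiftJIS": "shift_jis",
    "eucJP": "euc_jp",
    "eucKR": "euc_kr",
    "big5": "big5",
    "GB2312": "gb2312",
}

# 8859part1 .. 8859part16 map onto Python's iso8859_N codecs.
# The codelist has no entry for part 12, so it is skipped here as well.
for part in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16):
    CODELIST_TO_PYTHON[f"8859part{part}"] = f"iso8859_{part}"


def python_codec(code: str) -> codecs.CodecInfo:
    """Return the Python codec assumed to match a codelist code."""
    try:
        name = CODELIST_TO_PYTHON[code]
    except KeyError:
        # Codes without an assumed mapping (e.g. ucs2, ebcdic) end up here.
        raise ValueError(f"no Python codec assumed for codelist value {code!r}") from None
    return codecs.lookup(name)
```

For example, `"é".encode(python_codec("8859part1").name)` encodes the string using ISO/IEC 8859-1, while `python_codec("ebcdic")` raises `ValueError` because the sketch does not pick one particular EBCDIC code page.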