Coded Character Sets

This section lists the character sets you can use as your target character set.

The character sets are enumerated in kvcharset.h and defined in the Export class.

DEPRECATED: Target character sets other than UTF-8 and UTF-16 are deprecated in KeyView version 24.4 and later. OpenText recommends that you use only the widely supported UTF-8 and UTF-16 formats. Other character sets are still available, but might be removed in a future release.
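The recommendation above can be enforced with a small guard in calling code. The following is a minimal sketch, not KeyView code: the `DemoCharset` enumeration and both `demo_` functions are hypothetical stand-ins invented for illustration, and the enumerator values are placeholders. A real application would include kvcharset.h and compare against the `KVCS_` constants it defines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the enumeration in kvcharset.h.
 * The names and values here are placeholders, not the real ones. */
typedef enum {
    DEMO_KVCS_UNKNOWN,
    DEMO_KVCS_1252,
    DEMO_KVCS_UTF16,
    DEMO_KVCS_UTF8,
    DEMO_KVCS_COUNT /* number of entries, used for range checking */
} DemoCharset;

/* True only for the two target character sets that are not deprecated. */
static bool demo_is_recommended_target(DemoCharset cs)
{
    return cs == DEMO_KVCS_UTF8 || cs == DEMO_KVCS_UTF16;
}

/* Return the requested target if it is recommended; otherwise fall back
 * to UTF-8. The fallback policy is an assumption of this sketch. */
static DemoCharset demo_pick_target(DemoCharset requested)
{
    assert((int)requested >= 0 && requested < DEMO_KVCS_COUNT);
    return demo_is_recommended_target(requested) ? requested : DEMO_KVCS_UTF8;
}
```

With this guard, a request for the Windows Latin 1 stand-in (`DEMO_KVCS_1252`) is replaced by UTF-8, while a request for UTF-16 is passed through unchanged.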

| Coded Character Set | Description | Can be set as target charset? |
|---|---|---|
| KVCS_UNKNOWN | Unknown character set | N |
| KVCS_SJIS | Japanese (uses multibyte encoding), cp932 | Deprecated |
| KVCS_GB | Simplified Chinese (China, Singapore, Malaysia), cp936 | Deprecated |
| KVCS_BIG5 | Traditional Chinese (Taiwan, Hong Kong, Macau), cp950 | Deprecated |
| KVCS_KSC | Korean, cp949 | Deprecated |
| KVCS_1250 | Windows Latin 2 (Central Europe) | Deprecated |
| KVCS_1251 | Windows Cyrillic (Slavic) | Deprecated |
| KVCS_1252 | Windows Latin 1 (ANSI) | Deprecated |
| KVCS_1253 | Windows Greek | Deprecated |
| KVCS_1254 | Windows Latin 5 (Turkish) | Deprecated |
| KVCS_1255 | Windows Hebrew | Deprecated |
| KVCS_1256 | Windows Arabic | Deprecated |
| KVCS_1257 | Windows Baltic Rim | Deprecated |
| KVCS_1258 | Windows Vietnamese | Deprecated |
| KVCS_8859_1 | ISO 8859-1 Latin 1 (Western Europe, Latin America) | Deprecated |
| KVCS_8859_2 | ISO 8859-2 Latin 2 (Central Eastern Europe) | Deprecated |
| KVCS_8859_3 | ISO 8859-3 Latin 3 (S.E. Europe) | Deprecated |
| KVCS_8859_4 | ISO 8859-4 Latin 4 (Scandinavia/Baltic) | Deprecated |
| KVCS_8859_5 | ISO 8859-5 Latin/Cyrillic | Deprecated |
| KVCS_8859_6 | ISO 8859-6 Latin/Arabic | Deprecated |
| KVCS_8859_7 | ISO 8859-7 Latin/Greek | Deprecated |
| KVCS_8859_8 | ISO 8859-8 Latin/Hebrew | Deprecated |
| KVCS_8859_9 | ISO 8859-9 Latin/Turkish | Deprecated |
| KVCS_8859_14 | ISO 8859-14 | Deprecated |
| KVCS_8859_15 | ISO 8859-15 | Deprecated |
| KVCS_437 | DOS Latin US | Deprecated |
| KVCS_737 | DOS Greek | Deprecated |
| KVCS_775 | DOS Baltic Rim | Deprecated |
| KVCS_850 | DOS Latin 1 | Deprecated |
| KVCS_851 | DOS Greek | Deprecated |
| KVCS_852 | DOS Latin 2 | Deprecated |
| KVCS_855 | DOS Cyrillic | Deprecated |
| KVCS_857 | DOS Turkish | Deprecated |
| KVCS_860 | DOS Portuguese | Deprecated |
| KVCS_861 | DOS Icelandic | Deprecated |
| KVCS_862 | DOS Hebrew | Deprecated |
| KVCS_863 | DOS Canadian French | Deprecated |
| KVCS_864 | DOS Arabic | Deprecated |
| KVCS_865 | DOS Nordic | Deprecated |
| KVCS_866 | DOS Cyrillic Russian | Deprecated |
| KVCS_869 | DOS Greek 2 | Deprecated |
| KVCS_874 | Thai | Deprecated |
| KVCS_PDFMACDOC | PDF MAC DOC | N |
| KVCS_PDFWINDOC | PDF WIN DOC | N |
| KVCS_STDENC | Adobe Standard Encoding | N |
| KVCS_PDFDOC | Adobe standard PDF character set | N |
| KVCS_037 | EBCDIC code page 037 | Deprecated |
| KVCS_1026 | EBCDIC code page 1026 | Deprecated |
| KVCS_500 | EBCDIC code page 500 | Deprecated |
| KVCS_875 | EBCDIC code page 875 | Deprecated |
| KVCS_LMBCS | Lotus multibyte character set Group 1 and Group 2 | N |
| KVCS_UNICODE | Unicode, UCS-2 | |
| KVCS_UTF16 | 16-bit Unicode transformation format in little-endian byte order | Y |
| KVCS_UTF8 | 8-bit Unicode transformation format | Y |
| KVCS_UTF7 | 7-bit Unicode transformation format | Deprecated |
| KVCS_2022_JP | ISO 2022-JP, Japanese mail and news safe encoding (JIS-7) | N |
| KVCS_2022_CN | ISO 2022-CN, Chinese mail and news safe encoding | N |
| KVCS_2022_KR | ISO 2022-KR, Korean mail and news safe encoding | N |
| KVCS_WP6X | WordPerfect 6.x and higher character mapping | N |
| KVCS_10000 | Western European (Macintosh) | Deprecated |
| KVCS_KSC5601 | Unified Hangul | Deprecated |
| KVCS_GB2312 | Simplified Chinese (China, Singapore, Hong Kong) | Deprecated |
| KVCS_GB12345 | Traditional Chinese (China), analogue of GB2312 | Deprecated |
| KVCS_CNS11643 | Traditional Chinese (Taiwan), supplement to Big5 | Deprecated |
| KVCS_JIS0201 | Japanese, contains ASCII character set (JIS-Roman) | N |
| KVCS_JIS0212 | Japanese, supplement to JIS0208 | Deprecated |
| KVCS_EUC_JP | Japanese Extended UNIX Code | Deprecated |
| KVCS_EUC_GB | Simplified Chinese Extended UNIX Code | Deprecated |
| KVCS_EUC_BIG5 | Traditional Chinese Extended UNIX Code | N |
| KVCS_EUC_KSC | Korean Extended UNIX Code | N |
| KVCS_424 | EBCDIC Hebrew | N |
| KVCS_856 | PC Hebrew (old) | N |
| KVCS_1006 | IBM AIX Pakistan (Urdu) | N |
| KVCS_KOI8R | Cyrillic (Russian) | Deprecated |
| KVCS_PDF_JAPAN1 | Adobe-Japan1-2 character collection | N |
| KVCS_PDF_KOREA1 | Adobe-Korea1-0 character collection | N |
| KVCS_PDF_GB1 | Adobe-GB1-3 character collection | N |
| KVCS_PDF_CNS1 | Adobe-CNS1-2 character collection | N |
| KVCS_2022_JP_8 | ISO 2022-JP, Japanese mail and news safe encoding (JIS8) | N |
| KVCS_720 | Arabic DOS-720 | Deprecated |
| KVCS_VISCII | Vietnamese VISCII | Deprecated |
| KVCS_8859_10 | ISO 8859-10 (Latin 6 Nordic) | Deprecated [1] |
| KVCS_8859_13 | ISO 8859-13 (Latin 7 Baltic) | Deprecated [1] |
| KVCS_57002 | ISCII Devanagari (x-iscii-de) | Deprecated [1] |
| KVCS_57003 | ISCII Bengali (x-iscii-be) | Deprecated [1] |
| KVCS_57004 | ISCII Tamil (x-iscii-ta) | Deprecated [1] |
| KVCS_57005 | ISCII Telugu (x-iscii-te) | Deprecated [1] |
| KVCS_57006 | ISCII Assamese (x-iscii-as) | Deprecated [1] |
| KVCS_57007 | ISCII Oriya (x-iscii-or) | Deprecated [1] |
| KVCS_57008 | ISCII Kannada (x-iscii-ka) | Deprecated [1] |
| KVCS_57009 | ISCII Malayalam (x-iscii-ma) | Deprecated [1] |
| KVCS_57010 | ISCII Gujarati (x-iscii-gu) | Deprecated [1] |
| KVCS_57011 | ISCII Panjabi (x-iscii-pa) | Deprecated [1] |
| KVCS_GB18030b2 | Reserved for internal use | n/a |
| KVCS_GB18030 | GB18030 (Chinese 4-byte character set) | Deprecated |
| KVCS_8859_11 | ISO 8859-11 (Thai) | Deprecated |
| KVCS_8859_16 | ISO 8859-16 (Latin-10 South-Eastern Europe) | Deprecated |
| KVCS_ARABICMAC | Arabic Mac (x-mac-arabic) | Deprecated |
| KVCS_KOI8U | Cyrillic (KOI8U Ukrainian) | Deprecated |
| KVCS_HZGB2312 | The 7-bit representation of GB 2312 / RFC 1842 | n/a |
| KVCS_UTF32 | 32-bit Unicode transformation format | N |
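
An application that accepts a character set name from configuration might want to check the name against the table above before it attempts a conversion. The following is a rough sketch of one way to do that, assuming a hand-maintained lookup table. `DemoTargetStatus`, `DemoRow`, `demo_rows`, and `demo_target_status` are hypothetical names that are not part of kvcharset.h, and only five rows of the table are copied in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status values mirroring the third column of the table. */
typedef enum {
    DEMO_TARGET_YES,
    DEMO_TARGET_NO,
    DEMO_TARGET_DEPRECATED,
    DEMO_TARGET_NOT_LISTED /* name is not in the local copy of the table */
} DemoTargetStatus;

typedef struct {
    const char *name;
    DemoTargetStatus status;
} DemoRow;

/* A few rows copied from the table; extend as needed. */
static const DemoRow demo_rows[] = {
    { "KVCS_UNKNOWN", DEMO_TARGET_NO },
    { "KVCS_SJIS",    DEMO_TARGET_DEPRECATED },
    { "KVCS_1252",    DEMO_TARGET_DEPRECATED },
    { "KVCS_UTF8",    DEMO_TARGET_YES },
    { "KVCS_UTF32",   DEMO_TARGET_NO },
};

/* Linear search by exact, case-sensitive name. */
static DemoTargetStatus demo_target_status(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof demo_rows / sizeof demo_rows[0]; ++i) {
        if (strcmp(demo_rows[i].name, name) == 0) {
            return demo_rows[i].status;
        }
    }
    return DEMO_TARGET_NOT_LISTED;
}
```

A caller could treat `DEMO_TARGET_DEPRECATED` as a warning and `DEMO_TARGET_NO` as an error; that policy is a design choice of the sketch, not something the table prescribes.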