$utfencode

Returns the result of encoding the provided text in UTF-8.

$utfencode(text, C)

The optional parameter C is an 8-bit number (0-255) identifying the GDI charset whose code page is used to resolve each character code in the text to Unicode.

Available charset numbers

The following table describes the available charset numbers:

GDI charset number	Charset
000	ANSI_CHARSET
001	DEFAULT_CHARSET
002	SYMBOL_CHARSET
077	MAC_CHARSET
128	SHIFTJIS_CHARSET
129	HANGEUL_CHARSET
130	JOHAB_CHARSET
134	GB2312_CHARSET
136	CHINESEBIG5_CHARSET
161	GREEK_CHARSET
162	TURKISH_CHARSET
163	VIETNAMESE_CHARSET
177	HEBREW_CHARSET
178	ARABIC_CHARSET
186	BALTIC_CHARSET
204	RUSSIAN_CHARSET
222	THAI_CHARSET
238	EASTEUROPE_CHARSET
255	OEM_CHARSET

If no charset number is provided, the default is assumed (001).

Note: GDI charsets 1 and 255 are system dependent, so they are expected to return different results on different machines. Values not in the table are treated as the default (001).
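
The fallback rule can be sketched outside mIRC. The Python snippet below is only an illustration of the rule described here: the helper name resolve_charset is hypothetical and not part of mIRC, and the set of numbers is copied from the table above.

```python
# GDI charset numbers listed in the table above.
KNOWN_CHARSETS = {0, 1, 2, 77, 128, 129, 130, 134, 136, 161, 162,
                  163, 177, 178, 186, 204, 222, 238, 255}
DEFAULT_CHARSET = 1  # used when C is omitted or not in the table


def resolve_charset(c=None):
    """Return the charset number that would apply for C (hypothetical helper)."""
    if c is None or c not in KNOWN_CHARSETS:
        return DEFAULT_CHARSET
    return c


print(resolve_charset())     # C omitted
print(resolve_charset(161))  # GREEK_CHARSET, listed in the table
print(resolve_charset(5))    # not listed, falls back to the default
```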

Actual transition

Behind the scenes, $utfencode does the following for each character:

Looks up the character code in the selected charset's map to find the abstract character it represents, e.g. for the letter "C" the abstract character is LATIN CAPITAL LETTER C.
Locates that abstract character in Unicode's character map to obtain its code point.
Encodes that code point in UTF-8.
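
The three steps can be mimicked in Python as a rough sketch. This is not how mIRC is implemented: the function name utf8_from_codepage_byte is hypothetical, and the codec "cp1252" is an assumption standing in for ANSI_CHARSET (000) on a Western Windows system. Python's decode performs the first two lookups in a single call, so they are separated here only for readability.

```python
import unicodedata


def utf8_from_codepage_byte(byte_value, codec):
    """Map one code-page byte to (character name, code point, UTF-8 bytes)."""
    # Step 1: look up the byte in the code page to get the abstract character.
    char = bytes([byte_value]).decode(codec)
    # Step 2: find that character's Unicode code point.
    code_point = ord(char)
    # Step 3: encode the code point in UTF-8.
    return unicodedata.name(char), code_point, char.encode("utf-8")


# Byte 67 is the letter "C" in cp1252 (assumed stand-in for charset 000).
print(utf8_from_codepage_byte(67, "cp1252"))
```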

Supplying a value for C is meaningless when the text contains Unicode code points beyond the first 256 (0-255). These code pages are byte based, so they cannot make sense of such extended values.
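
The byte limit can be illustrated with a short Python sketch. The helper name fits_in_one_byte is hypothetical and only demonstrates the range check; it says nothing about how mIRC handles such values internally.

```python
def fits_in_one_byte(code_point):
    """True when the value can be stored as a single code-page byte (0-255)."""
    return 0 <= code_point <= 255


for value in (195, 255, 256, 915):
    print(value, fits_in_one_byte(value))

# A value above 255 cannot even be represented as one byte,
# so there is nothing for a byte-based code page to look up.
try:
    bytes([915])
except ValueError as error:
    print("not a byte:", error)
```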

Examples

$asc($utfdecode($utfencode($chr(195), 161)))

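This example returns 915, the Unicode code point of GREEK CAPITAL LETTER GAMMA. Character code 195 is resolved through GREEK_CHARSET (161) and encoded in UTF-8, then $utfdecode and $asc convert the result back to a number.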
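
The same round trip can be checked outside mIRC with Python. This is an analogy rather than mIRC's own code, and it assumes that Windows code page 1253 (Python codec "cp1253") corresponds to GREEK_CHARSET (161).

```python
# Assumption: codec "cp1253" stands in for GREEK_CHARSET (161).
raw = bytes([195])                               # like $chr(195)
encoded = raw.decode("cp1253").encode("utf-8")   # like $utfencode(..., 161)
decoded = encoded.decode("utf-8")                # like $utfdecode(...)
result = ord(decoded)                            # like $asc(...)

print(encoded)  # b'\xce\x93', the UTF-8 bytes for the Greek capital Gamma
print(result)   # 915
```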
