$utfdecode
From Scriptwiki
Revision as of 18:30, 25 February 2011 by NaNg
Returns the result of decoding the provided text from UTF-8.
$utfdecode(text, C)
The parameter C is an 8-bit number (0-255) identifying the GDI charset whose code page is used when resolving characters to and from Unicode.
Available charset numbers
For a list of available charset numbers, see $utfencode.
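As a rough orientation, GDI charset numbers are conventionally tied to Windows code pages. The sketch below is Python, not mIRC script, and lists only a small subset of the constants from the Windows GDI headers. The names `GDI_CHARSET_CODEPAGE` and `charset_to_codec` are hypothetical, and whether mIRC accepts each of these numbers is an assumption; the list on the $utfencode page remains the reference.

```python
# A few GDI charset numbers and the Windows code page conventionally tied to each.
# Illustrative subset only (assumption: mIRC may support more or fewer of these).
GDI_CHARSET_CODEPAGE = {
    0: 1252,    # ANSI_CHARSET (Western European on typical systems)
    161: 1253,  # GREEK_CHARSET
    162: 1254,  # TURKISH_CHARSET
    177: 1255,  # HEBREW_CHARSET
    178: 1256,  # ARABIC_CHARSET
    186: 1257,  # BALTIC_CHARSET
    204: 1251,  # RUSSIAN_CHARSET
    238: 1250,  # EASTEUROPE_CHARSET
}


def charset_to_codec(c: int) -> str:
    """Return the Python codec name for GDI charset number c (hypothetical helper)."""
    if not 0 <= c <= 255:
        raise ValueError("charset number must fit in 8 bits (0-255)")
    # Raises KeyError for charset numbers missing from the subset above.
    return f"cp{GDI_CHARSET_CODEPAGE[c]}"


print(charset_to_codec(161))  # cp1253
```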
Actual transition
Behind the scenes, $utfdecode performs the inverse operation of $utfencode, which is the following:
- The Unicode char code is encoded using $utfencode to form its UTF-8 analogue.
- The char code is looked up in Unicode's charset map to find which abstract character it denotes, e.g. for C the abstract character is LATIN CAPITAL LETTER C.
- The abstract character is located in the given charset map.
- The located char code is used to encode the character in the given charset's encoding.
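The steps above can be traced outside mIRC. The following is a minimal Python sketch of the same idea for a single character, not mIRC's actual implementation. The function name `decode_steps` is hypothetical, and the codec `cp1253` is assumed to correspond to charset number 161 (Greek).

```python
import unicodedata


def decode_steps(utf8_bytes: bytes, codec: str):
    """Trace one character from UTF-8 to a code page byte (illustrative only)."""
    # Recover the Unicode character from its UTF-8 form.
    ch = utf8_bytes.decode("utf-8")
    code_point = ord(ch)  # expects exactly one character
    # Find the abstract character that the Unicode char code denotes.
    name = unicodedata.name(ch)
    # Locate that character in the target charset and take its char code.
    target = ch.encode(codec)
    return code_point, name, target[0]


gamma_utf8 = chr(915).encode("utf-8")  # UTF-8 form of char code 915
print(decode_steps(gamma_utf8, "cp1253"))
# (915, 'GREEK CAPITAL LETTER GAMMA', 195)
```

A character that has no slot in the chosen code page makes `encode` raise `UnicodeEncodeError` in this sketch; how mIRC handles that case is not covered here.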
Decoding UTF-8 with code page references is more intricate and, at present, slightly flawed insofar as mIRC will attempt to decode invalid UTF-8.
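For comparison, a strict decoder rejects malformed input instead of attempting to decode it. The Python sketch below shows which kinds of byte sequences count as invalid UTF-8; it says nothing about what mIRC returns for them, and the helper name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True only if data is well-formed UTF-8 (strict check)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(b"\xce\x93"))  # True: complete two-byte sequence
print(is_valid_utf8(b"\xce"))      # False: lead byte with no continuation byte
print(is_valid_utf8(b"\xc0\x80"))  # False: overlong encoding
```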
Examples
$asc($utfdecode($utfencode($chr(915)), 161))
This example returns 195, which is the ANSI char code of the Greek capital letter Gamma in the Greek charset (161).