How to display ANSI chars in UTF-8 format
Latest revision as of 21:13, 25 February 2011
mIRC 7 and later changed the default text encoding from ANSI to UTF-8 only. As a result, users who still run older mIRC versions (6.*) and do not encode their non-English messages in UTF-8 send messages that cannot be displayed correctly in newer, UTF-8-only clients. Explaining the UTF-8 and ANSI encodings themselves is beyond the scope of the script wiki.
Note: All the following examples and information refer to mIRC 7.1 and later.
Viewing ANSI chars in UTF-8 format
To view ANSI encoded text in UTF-8 format, mIRC provides two identifiers, $utfencode and $utfdecode, which can be combined as in the following example:
  alias heb2utf {
    ; Called as an identifier ($heb2utf): return the converted text
    if ($isid) { return $utfdecode($utfencode($1-, 177)) }
    ; Called as a command (/heb2utf): display the converted text in the active window
    echo -ag $utfdecode($utfencode($1-, 177))
  }
This example transforms Hebrew ANSI chars to UTF-8 encoded chars (177 is the GDI charset number for Hebrew). It can be used either as an identifier or as a command: as an identifier it returns the converted text, and as a command it displays it.
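To illustrate what the alias undoes, here is a minimal Python sketch of the same round trip outside mIRC. It is not mIRC's implementation. It assumes that the receiving client shows each non-UTF-8 byte as the character with the same code (consistent with the $chr(224) to $chr(250) range used on this page) and that Hebrew ANSI corresponds to Windows code page 1255. The function names are hypothetical.

```python
def as_seen_by_utf8_client(ansi_bytes: bytes) -> str:
    # Assumption: bytes that are not valid UTF-8 are shown one character per byte.
    return ansi_bytes.decode("latin-1")


def heb2utf(garbled: str) -> str:
    # Recover the original bytes, then read them as Hebrew (code page 1255).
    return garbled.encode("latin-1").decode("cp1255")


original = "\u05e9\u05dc\u05d5\u05dd"   # the Hebrew word "shalom"
sent = original.encode("cp1255")        # what an ANSI client puts on the wire
garbled = as_seen_by_utf8_client(sent)  # accented Latin letters instead of Hebrew

print(garbled)
print(heb2utf(garbled))
```

The first line printed is the unreadable form a UTF-8-only client would show under these assumptions, and the second is the recovered Hebrew text.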
Sending ANSI chars from UTF-8 format
To send ANSI chars, the text must first be transformed to ANSI encoding and then sent explicitly, because mIRC otherwise converts the underlying bytes to UTF-8 and sends the message as UTF-8. To do so, use the raw command with the -n parameter, as in the following example:
  alias sayheb {
    ; Send the text to the active window as Hebrew ANSI, bypassing the UTF-8 conversion
    .raw -n PRIVMSG $active : $+ $utfdecode($utfencode($1-), 177)
    ; Show the original text locally, prefixed with your own nick
    echo -at < $+ $iif($chan, $nick($chan, $me).pnick, $me) $+ > $1-
  }
This example sends a Hebrew ANSI encoded message to the active window and echoes the original text locally.
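As a rough illustration of what ends up on the wire, the following Python sketch builds a PRIVMSG line whose body is encoded in a legacy code page instead of UTF-8. The function name `ansi_privmsg`, the target `#channel`, and the choice of code page 1255 for Hebrew ANSI are assumptions made for the example; this is not how mIRC works internally.

```python
def ansi_privmsg(target: str, text: str, codec: str = "cp1255") -> bytes:
    # Only the message body uses the legacy code page; the IRC command stays ASCII.
    return f"PRIVMSG {target} :".encode("ascii") + text.encode(codec) + b"\r\n"


text = "\u05e9\u05dc\u05d5\u05dd"            # the Hebrew word "shalom"
line = ansi_privmsg("#channel", text)        # "#channel" is a placeholder target

print(line)
print(len(text.encode("cp1255")), "bytes as ANSI")
print(len(text.encode("utf-8")), "bytes as UTF-8")
```

Each Hebrew letter takes one byte in the ANSI form and two bytes in UTF-8, which is why the two kinds of client cannot read each other's messages without conversion.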
Examples
Using the heb2utf alias above, incoming ANSI encoded messages can be transformed to UTF-8 automatically.
  on $*:TEXT:$(/[ $+ $chr(224) $+ - $+ $chr(250) $+ ]/):#,?: {
    ; Only react when the conversion actually changes the text
    if ($heb2utf($1-) != $1-) {
      if ($chan) {
        echo -gt $chan Possible Hebrew translation ( $+ $nick($chan, $nick).pnick $+ ): $heb2utf($1-)
      }
      elseif ($query($nick)) {
        echo -gt $nick Possible Hebrew translation ( $+ $nick $+ ): $heb2utf($1-)
      }
    }
  }
This example catches all channel and private messages that may contain Hebrew ANSI encoded chars and displays the transformed, UTF-8 encoded message.
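The filtering logic of the event can be sketched in Python as follows. This is an analogy under stated assumptions, not mIRC code: each ANSI byte is assumed to appear as the character with the same code, Hebrew ANSI is taken to be code page 1255, and `possible_hebrew` is a hypothetical name.

```python
import re

# Characters 224-250 are where Hebrew ANSI letters land when shown one character per byte.
HEBREW_ANSI = re.compile("[\u00e0-\u00fa]")


def possible_hebrew(message):
    """Return the converted text, or None when the message should be left alone."""
    if not HEBREW_ANSI.search(message):
        return None
    try:
        converted = message.encode("latin-1").decode("cp1255")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The message holds characters that cannot be mapped back to ANSI bytes.
        return None
    return converted if converted != message else None


print(possible_hebrew("hello"))
print(possible_hebrew("\u00f9\u00ec\u00e5\u00ed"))
print(possible_hebrew("caf\u00e9"))
```

The last call shows why the result is only a "possible" translation: an accented Latin letter in an ordinary word falls in the same range and is converted as well.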
This can be adapted to other languages by making the following changes:
- Find the ANSI code of the first letter in the new language's alphabet. In this example, 224 is the ANSI code of "Alef", the first letter of the Hebrew alphabet.
- Find the ANSI code of the last letter in the new language's alphabet. In this example, 250 is the ANSI code of "Tav", the last letter of the Hebrew alphabet.
- Replace the codes in the regex of the event trigger with the ones you found. This makes the event catch only messages that may be transformable, rather than all messages.
- Replace the GDI charset number (177) in the "heb2utf" alias with the charset number of the new language.
- Optional: Rename the alias "heb2utf" (and all references to it) to "<yourlang>2utf", e.g. "greek2utf".
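The first two steps can be done with a few lines of Python, shown here as a sketch. The helper name `ansi_range` is hypothetical, and the mapping from languages to Windows code pages (cp1255 for Hebrew, cp1253 for Greek) is an assumption about which ANSI encoding the older clients use. In the charset table, 177 comes from this page; 161 and 204 are the standard Windows GDI values for Greek and Russian.

```python
# GDI charset numbers to put in the alias (177 is the value used on this page).
GDI_CHARSET = {"hebrew": 177, "greek": 161, "russian": 204}


def ansi_range(first_letter, last_letter, codec):
    """Return the ANSI codes of the first and last letters in the given code page."""
    first = first_letter.encode(codec)[0]
    last = last_letter.encode(codec)[0]
    return first, last


hebrew = ansi_range("\u05d0", "\u05ea", "cp1255")       # alef .. tav
greek_lower = ansi_range("\u03b1", "\u03c9", "cp1253")  # lowercase alpha .. omega
greek_upper = ansi_range("\u0391", "\u03a9", "cp1253")  # uppercase alpha .. omega

print(hebrew, greek_lower, greek_upper)
```

For alphabets with upper and lower case, the two cases occupy separate ranges, so the regex in the event trigger may need both.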