| Author |
Message |
Anahel Halfop

Joined: 03 Jul 2009 Posts: 48 Location: Dom!
|
Posted: Mon Sep 28, 2009 4:00 pm Post subject: |
|
|
I had the same problem: it patched only main.h, so I manually edited tcl.h. You need to add
| Code: | | encoding = "utf-8"; |
after this:
| Code: | if (encoding == NULL) {
encoding = "iso8859-1";
} |
so it should look like this:
| Code: | if (encoding == NULL) {
encoding = "iso8859-1";
}
encoding = "utf-8"; |
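To make the effect of that extra line easier to see, here is a minimal standalone sketch in C. It is not eggdrop code: the function name `pick_encoding` and its `detected` parameter are hypothetical, standing in for whatever value the real source holds in `encoding` before the `if`. The point is only that the unconditional assignment wins no matter what came before it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patched logic (not eggdrop's real function).
 * `detected` is whatever encoding name was found earlier; it may be NULL. */
const char *pick_encoding(const char *detected)
{
    const char *encoding = detected;

    if (encoding == NULL) {
        encoding = "iso8859-1";  /* original fallback when nothing was detected */
    }
    encoding = "utf-8";          /* added line: overrides any earlier choice */

    return encoding;
}
```

With this change the fallback branch still runs, but its result is always replaced, so the function returns "utf-8" whether `detected` is NULL or names another encoding.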
|
|
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Mon Sep 28, 2009 5:11 pm Post subject: |
|
|
| De Kus wrote: | | Are you aware that the real problem has never been the messages going in and out but the channel/user names?! I see only multilingual messages, but not any channel name. Or am I just not looking closely enough? |
You are correct that it isn't that hard to get correct output for utf-8, depending on how the string is manipulated. If any elements within the string are replaced with strings in any other encoding, that encoding will break the utf-8 byte sequences. The rest of the string beyond that point will be shown as iso8859-1 (meaning each byte is rendered individually, rather than being sequenced properly).
Input has always been affected for me. I'm surprised you haven't experienced it yet. This is the same reason you cannot join a utf-8 channel and instead get the incorrect iso8859-1 encoding used. The same thing happens when trying to read a user's input from within a bind. It seems that for utf-8 any type of input fails (to see it fail, try nesting two languages in that utf-8: English and Japanese, or Russian and French. Using just one makes it too easy). There are ways to work around this, but they will still fail when dealing with accented vowels, the same way eggdrop's output does for some people (most of the time they use an elaborate string map to fix this condition, see for yourself). Myself, I've noticed that the Ã (byte 195 in iso8859-1) confuses the utf-8 string and breaks it back to iso8859-1 encoding. This happens when trying to render French accented sentences in utf-8 on an unpatched bot.
Plus this finally puts to rest the requests from those wishing for better utf-8 support within the script. So I felt it was worth mentioning ;P _________________ speechles' eggdrop tcl archive |
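For anyone wondering where the Ã comes from, here is a small sketch in plain C. It is only an illustration, not anything from eggdrop or Tcl, and `latin1_to_utf8` is a made-up helper name. It treats each input byte as one iso8859-1 character and re-encodes it as utf-8, which is roughly what happens when utf-8 text is read with the wrong encoding. French accented letters such as é are stored in utf-8 as two bytes starting with 0xC3 (decimal 195), and 0xC3 on its own is Ã in iso8859-1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treat every input byte as one ISO 8859-1 character
 * and re-encode it as UTF-8. Feeding it text that is already UTF-8
 * reproduces the garbling described above.
 * `out` must have room for at least 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator. */
size_t latin1_to_utf8(const char *in, char *out)
{
    size_t n = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)in; *p != '\0'; p++) {
        if (*p < 0x80) {
            out[n++] = (char)*p;                    /* ASCII passes through */
        } else {
            out[n++] = (char)(0xC0 | (*p >> 6));    /* lead byte: 0xC2 or 0xC3 */
            out[n++] = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing in the two utf-8 bytes of é (0xC3 0xA9) gives back four bytes, 0xC3 0x83 0xC2 0xA9, which a utf-8 client displays as "Ã©". Each original byte got rendered as its own character instead of being sequenced as one.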
|
shadrach Halfop
Joined: 14 Dec 2007 Posts: 74
|
Posted: Tue Sep 29, 2009 4:52 pm Post subject: |
|
|
| Thank you, I've got it working. |
|