egghelp.org community Forum Index
egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Problem with special characters ² ³ and °
egghelp.org community Forum Index -> Scripting Help
arfer
Master


Joined: 26 Nov 2004
Posts: 436
Location: Manchester, UK

Posted: Sun Dec 28, 2008 7:41 pm

With reference to my post above: I have since done much reading, and muttering under my breath, in an attempt to understand the difference between a ³ copied and pasted from, say, the Windows Character Map and a ³ generated from the hex notation \xB3. I am none the wiser. Even binary scanning the characters shows them to have the same underlying value.

At least I am 50% happy, in that I have a solution: simply build up the regsub pattern BOTH from an explicit copy/paste of the characters themselves AND from their implicit hex equivalents.

use --> [regsub -all -- {[°²³\xB0\xB2\xB3]} $varname "\\\\&"]

This is the result when I build up the variable using windows character map for the special characters :-

% set mytest "at xy³ + y² temperature is 14°"
at xy³ + y² temperature is 14°
% return [regsub -all -- {[°²³\xB0\xB2\xB3]} $mytest "\\\\&"]
at xy\³ + y\² temperature is 14\°

This is the result when I build up the variable using hex notation for the special characters :-

% set mytest "at xy\xB3 + y\xB2 temperature is 14\xB0"
at xy³ + y² temperature is 14°
% return [regsub -all -- {[°²³\xB0\xB2\xB3]} $mytest "\\\\&"]
at xy\³ + y\² temperature is 14\°

Works for both.
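
The same substitution can be sketched with the pattern written purely as code-point escapes, so that the script file stays ASCII and does not depend on the encoding it is saved or pasted in. This is only a minimal sketch: the proc name escape_special is made up for illustration, and it assumes the input string already holds properly decoded characters.

```tcl
# Hypothetical helper: put a backslash in front of the degree sign,
# superscript two and superscript three.
# The pattern names the characters by code point (handled by the regexp
# engine itself, since braces stop Tcl from substituting the escapes).
proc escape_special {text} {
    # {\\&} is the same substitution spec as "\\\\&":
    # a literal backslash followed by the matched character.
    return [regsub -all -- {[\u00B0\u00B2\u00B3]} $text {\\&}]
}

# Build the test string from escapes as well, so nothing here is pasted.
set mytest "at xy\u00B3 + y\u00B2 temperature is 14\u00B0"
puts [escape_special $mytest]
```

What gets printed still depends on how the terminal or IRC client decodes the output, which is a separate question from whether the regsub matched.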

My guess is that this is pretty much the dirty solution the original poster found.

Please, please, somebody explain what the difference is so that I may sleep peacefully.

/me wanders off threatening a terrible revenge on all descendants of Charles Babbage.
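
One way to compare a pasted character with its hex-escape twin is to print the code point together with the bytes it turns into under each encoding. The sketch below is an assumption about how one might check this, not something taken from the thread; describe_char is a made-up name.

```tcl
# Hypothetical helper: describe one character by its code point and by
# the bytes it becomes in iso8859-1 and in utf-8.
proc describe_char {ch} {
    scan $ch %c codepoint
    binary scan [encoding convertto iso8859-1 $ch] H* latin1
    binary scan [encoding convertto utf-8 $ch] H* utf8
    return [format "U+%04X iso8859-1=%s utf-8=%s" $codepoint $latin1 $utf8]
}

puts [describe_char "\xB3"]     ;# U+00B3 iso8859-1=b3 utf-8=c2b3
puts [describe_char "\u00B3"]   ;# same line: both escapes name one character
```

If a pasted ³ reports a [string length] of 2, or a different code point, then it was probably decoded with the wrong encoding somewhere between the keyboard and the interpreter, which would explain why it looks identical on screen but does not match.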
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Sun Dec 28, 2008 8:52 pm

It probably has more to do with encodings. This is with eggdrop v1.6.17 and Tcl 8.4:
Quote:
<speechles> .tcl set mytest "at xy³ + y² temperature is 14°"
<bot> Tcl: at xy? + y? temperature is 14?
<speechles> .tcl set testing [regsub -all -- {[°²³]} $mytest "\\\\&"]
<bot> Tcl: at xy\? + y\? temperature is 14\?

This is how it works as iso8859-1: those chars aren't represented correctly, so they get the question mark treatment. But apparently it has worked, because the escapes are properly placed. If we instead use "utf-8"...
Quote:
<speechles> .tcl set mytest [encoding convertto "utf-8" "at xy³ + y² temperature is 14°"]
<bot> Tcl: at xy³ + y² temperature is 14°
<speechles> .tcl set testing [regsub -all -- {[°²³]} $mytest "\\\\&"]
<sp33chy> Tcl: at xy³ + y² temperature is 14°

Fails, even though I can see the characters clearly...
The workaround, of course, is to use binary/octal/decimal/hex notation when referencing these characters, or to use the correct encoding to begin with...

When eggdrop finally supports utf-8, and latin charsets are no longer confused with iso8859-1 representations, all this will probably not need workarounds any longer.
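
As a rough sketch of "using the correct encoding to begin with": if the text reaches the script as raw UTF-8 bytes, it can be decoded into characters before the regsub runs. The assumption that the input is UTF-8 bytes is mine, and the variable names are only illustrative.

```tcl
# Simulate text that arrived as raw UTF-8 bytes (an assumption about the input).
set raw [encoding convertto utf-8 "at xy\u00B3 + y\u00B2 temperature is 14\u00B0"]

# Each byte counts as one character here, so the string is longer than it looks.
puts [string length $raw]

# Decode the bytes into characters first, then match on characters.
set text [encoding convertfrom utf-8 $raw]
puts [string length $text]

# The pattern uses code-point escapes, so it does not depend on how this
# script itself was saved or typed in.
set escaped [regsub -all -- {[\u00B0\u00B2\u00B3]} $text {\\&}]
```

The two lengths differ by three, one extra byte for each of the three special characters.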
arfer
Master


Joined: 26 Nov 2004
Posts: 436
Location: Manchester, UK

Posted: Sun Dec 28, 2008 9:06 pm

Works fine for me using partyline Tcl.

My guess is you are not using a utf-8 compliant IRC client or you are in the bot's partyline via telnet.

Should work in DCC CHAT within mIRC or XChat (I'm using mIRC) providing they are set to display utf-8 by default.
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Sun Dec 28, 2008 9:15 pm

arfer wrote:
Works fine for me using partyline Tcl.

My guess is you are not using a utf-8 compliant IRC client or you are in the bot's partyline via telnet.

Should work in DCC CHAT within mIRC or XChat (I'm using mIRC) providing they are set to display utf-8 by default.

Why does it matter what my irc client does? You aren't seeing the bigger picture. The encoding of what you are trying to regsub and the encoding of what you are regsubbing in both matter.
Quote:
<speechles> .tcl set mytest [encoding convertto "utf-8" "at xy³ + y² temperature is 14°"]
<bot> Tcl: at xy³ + y² temperature is 14°
<speechles> .tcl set mytest2 [encoding convertto "utf-8" "\[°²³\]"]
<sp33chy> Tcl: [°²³]
<speechles> .tcl set testing [regsub -all -- "$mytest2" $mytest "\\\\&"]
<sp33chy> Tcl: at xy\Â\³ + y\Â\² temperature is 14\Â\°

This is meant to demonstrate working outside of the bot's internal encoding or system encoding. You can get it to work, you just have to be explicit.
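
The doubled escapes in the quote above can be reproduced without an IRC client. After [encoding convertto "utf-8"], each special character is two bytes, and a bracket expression built the same way treats every byte as a separate member. The following is a sketch of that effect and of one possible byte-level pattern that keeps each pair together; the second pattern is my assumption, not something posted in the thread.

```tcl
# Subject and pattern both converted to UTF-8 bytes, as in the quote above.
set subject [encoding convertto utf-8 "xy\u00B3"]
set pattern [encoding convertto utf-8 "\[\u00B0\u00B2\u00B3\]"]

# The class now holds the lead byte 0xC2 as well, so every byte gets
# its own backslash.
set bytewise [regsub -all -- $pattern $subject {\\&}]
binary scan [encoding convertto iso8859-1 $bytewise] H* hex1
puts $hex1   ;# 78795cc25cb3

# Being explicit: match the lead byte plus one trail byte as a unit.
set paired [regsub -all -- {\u00C2[\u00B0\u00B2\u00B3]} $subject {\\&}]
binary scan [encoding convertto iso8859-1 $paired] H* hex2
puts $hex2   ;# 78795cc2b3
```

Decoding the second result with [encoding convertfrom utf-8] gives a single backslash in front of the ³.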

If you check out the unofficial incith google script, you can see that this issue causes problems in several places and has numerous workarounds.
arfer
Master


Joined: 26 Nov 2004
Posts: 436
Location: Manchester, UK

Posted: Sun Dec 28, 2008 9:31 pm

Sorry my mistake.

The bot's partyline seems incapable of interpreting/displaying UTF-8 characters by default. No amount of encoding seems to change that, as your post confirms.

My original posts used a public-command Tclsh and so were made through a bot channel in mIRC; hence my solution works, because mIRC displays UTF-8 by default.

My solution would likewise work in a Tcl script not confined to display within a non-UTF-8 environment.
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Sun Dec 28, 2008 9:38 pm

arfer wrote:
My solution would likewise work in a TCL script not confined to display within a non UTF-8 environment.

When eggdrop v1.6.20 is released, eggdrop should finally become a workable utf-8 environment. At that time I'll probably update my irc client; until then... work-around is the name of the game. ^_~
All times are GMT - 4 Hours