| Author |
Message |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Wed Jan 20, 2010 6:21 am Post subject: Please help with this character ( ¤) |
|
|
| Code: |
if {[string match -nocase "*¤*" $rname] == 1 || [string length $rname] > 45} { .... }
|
Any idea how to detect the character "¤"? The code above doesn't work, but the equivalent in my mIRC scripts works well.
In mIRC, $chr(164) returns "¤". I have no idea how to do the same in Tcl; it seems so confusing.
| Code: |
if ($chr(164) isin $9-) { .... }
|
|
|
 |
arfer Master

Joined: 26 Nov 2004 Posts: 436 Location: Manchester, UK
|
Posted: Wed Jan 20, 2010 8:52 am Post subject: |
|
|
| Code: |
if {[string match "*\xA4*" $rname] || [string length $rname] > 45} { .... }
|
The above should work fine because characters up to hex FF can be represented by using a backslash followed by an x and then the two digit hex number (164 decimal == A4 hex).
[12:51] <arfer> .tcl set varname \xA4
[12:51] <Baal> Tcl: ¤
BTW, you don't need to use == 1 with string equal / string match. Because they return 1 (true) or 0 (false), the result can be used directly in an 'if' statement. Nor do you need the -nocase option, since \xA4 is one specific character, i.e. it has no upper/lower case. _________________ I must have had nothing to do |
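A minimal sketch of the two points above, assuming a plain tclsh (not the bot's partyline) where the script text is decoded correctly. It shows that `\xA4`, `\u00A4` and `[format %c 164]` all produce the same one-character string, and that `string match` can be used directly as a condition. The sample `rname` value and the `flagged` variable are made up for illustration.

```tcl
# Three ways to spell code point 164 (hex A4); each yields the same one-character string.
set a "\xA4"
set b "\u00A4"
set c [format %c 164]
puts [expr {$a eq $b && $b eq $c}]   ;# prints 1
puts [scan $a %c]                    ;# prints 164

# string match already returns 0 or 1, so no "== 1" is needed in the condition.
set rname "spam $c"
set flagged 0
if {[string match "*\xA4*" $rname] || [string length $rname] > 45} {
    set flagged 1
}
puts $flagged                        ;# prints 1
```

This only demonstrates the syntax. It says nothing about how a particular bot decodes text it receives from IRC, which is where the thread's problem turns out to lie.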
|
 |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Wed Jan 20, 2010 11:54 pm Post subject: |
|
|
| arfer wrote: | | Code: |
if {[string match "*\xA4*" $rname] || [string length $rname] > 45} { .... }
|
The above should work fine because characters up to hex FF can be represented by using a backslash followed by an x and then the two digit hex number (164 decimal == A4 hex).
[12:51] <arfer> .tcl set varname \xA4
[12:51] <Baal> Tcl: ¤
BTW, you don't need to use == 1 with string equal / string match. Because they return 1 (true) or 0 (false), the result can be used directly in an 'if' statement. Nor do you need the -nocase option, since \xA4 is one specific character, i.e. it has no upper/lower case. |
The bot still cannot detect that character. I've also tried "*\00164*", which does not work either. |
|
 |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Fri Jan 22, 2010 2:17 am Post subject: |
|
|
How do I use it?
| Code: | if {[string match "*[format {%c} "164"]*" $rname] || [string length $rname] > 45} { .... }
|
Is this right? Thanks for the advice. |
|
 |
TCL_no_TK Owner

Joined: 25 Aug 2006 Posts: 509 Location: England, Yorkshire
|
Posted: Fri Jan 22, 2010 7:21 am Post subject: |
|
|
| Code: | set rname [format {%c} 164]
set test [scan $rname %c]
if {$test == 164} {
puts "It Worked!"
} |  _________________ TCL the misunderstood |
|
 |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Fri Jan 22, 2010 8:43 pm Post subject: |
|
|
| TCL_no_TK wrote: | | Code: | set rname [format {%c} 164]
set test [scan $rname %c]
if {$test == 164} {
puts "It Worked!"
} |  |
| Code: | | set rname [format {%c} 164] |
That does not seem to work. The character is already inside $rname. |
|
 |
TCL_no_TK Owner

Joined: 25 Aug 2006 Posts: 509 Location: England, Yorkshire
|
Posted: Fri Jan 22, 2010 10:13 pm Post subject: |
|
|
| Code: | set text "¤ Mp3 ¤ My Song - Yeah, it is.mp3 ¤"
set string [lindex [split $text] 0]
set test [scan $string %c]
if {$test == 164} {
putlog "detected that $text! starts with ¤ :P"
} | There's probably a much easier and better way to do this _________________ TCL the misunderstood |
|
 |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Fri Jan 22, 2010 11:12 pm Post subject: |
|
|
| TCL_no_TK wrote: | | Code: | set text "¤ Mp3 ¤ My Song - Yeah, it is.mp3 ¤"
set string [lindex [split $text] 0]
set test [scan $string %c]
if {$test == 164} {
putlog "detected that $text! starts with ¤ :P"
} | There's probably a much easier and better way to do this |
OK, that detects the character only when it is the first character of the string. With
set text "Mp3 ¤ My Song - Yeah, it is.mp3 ¤"
the return of %c is 77, the code for "M".
set hong [scan [string trimleft [lrange $rname 1 end]] %c] |
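Since `scan ... %c` only reads the first character, a position-independent check may be easier with `string first`, which returns the index of the first occurrence or -1. This is a sketch in a plain tclsh, assuming the string has already been decoded correctly; the variable names are illustrative only.

```tcl
# Build the character from its code point so the script does not depend on
# how the source text itself is encoded.
set ch [format %c 164]
set text "Mp3 $ch My Song - Yeah, it is.mp3 $ch"

# string first returns the index of the first occurrence, or -1 if absent.
set idx [string first $ch $text]
if {$idx >= 0} {
    puts "found at index $idx"   ;# prints: found at index 4
}

# A string without the character yields -1.
puts [string first $ch "Mp3 My Song"]   ;# prints -1
```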
|
 |
arfer Master

Joined: 26 Nov 2004 Posts: 436 Location: Manchester, UK
|
Posted: Sat Jan 23, 2010 12:37 pm Post subject: |
|
|
Sorry about my earlier post; I should have tested first. I have no clue why it does not work. Apart from the possible solutions using TCL_no_TK's approach, the only way I can get it to work is a regexp with the actual character inside a bracket expression ([ and ]), as follows :-
[16:27] <@arfer> % return [regexp -- {[¤]} "bla bla bla bla bla ¤"]
[16:27] <@Baal> 1
What really puzzles me is why none of the following will work :-
[16:29] <@arfer> % return [regexp -- {¤} "bla bla bla bla bla ¤"]
[16:29] <@Baal> 0
[16:29] <@arfer> % return [regexp -- {.*¤.*} "bla bla bla bla bla ¤"]
[16:29] <@Baal> 0
[16:35] <@arfer> % return [regexp -- {\xA4} "bla bla bla bla bla ¤"]
[16:35] <@Baal> 0
[16:36] <@arfer> % return [regexp -- {[\xA4]} "bla bla bla bla bla ¤"]
[16:36] <@Baal> 0
Anyway, the answer to your original query would be :-
| Code: |
if {([regexp -- {[¤]} $rname]) || ([string length $rname] > 45)} { .... }
|
I think!
I wouldn't mind some sort of explanation for this myself. Where is our friend nml375 when we need him? _________________ I must have had nothing to do |
|
 |
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Sat Jan 23, 2010 1:13 pm Post subject: |
|
|
This smells like a character set issue...
All four "non-working" tests evaluate to true under a simple tclsh environment for me. When pasting the same code into a telnet session (still using the very same putty setup), all ¤ are converted to $, causing the patterns with \xA4 to fail, but the others to work...
Doing something as simple as "scan [format %c 164] %c" returns 164, as expected. If I enter the ¤ into the strings and patterns using [format ...], all patterns work without a problem, including the \xA4 ones... _________________ NML_375, idling at #eggdrop@IrcNET |
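When a character set issue is suspected, it can help to look at the code points Tcl actually sees instead of trusting what the terminal displays. The sketch below uses a hypothetical helper, `codepoints` (not part of Tcl or eggdrop), and compares a correctly decoded string with the same text held as raw UTF-8 bytes.

```tcl
# Hypothetical helper: return the code point of every character in a string, as hex.
proc codepoints {s} {
    set out {}
    foreach c [split $s ""] {
        lappend out [format %02X [scan $c %c]]
    }
    return $out
}

set ch [format %c 164]

# Correctly decoded: the sign is the single character A4.
puts [codepoints "a$ch"]                              ;# prints: 61 A4

# The same text as undecoded UTF-8 bytes: the sign becomes C2 A4.
puts [codepoints [encoding convertto utf-8 "a$ch"]]   ;# prints: 61 C2 A4
```

Running something like this on the value the bot really receives would show which of the two forms a pattern has to match.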
|
 |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Sat Jan 23, 2010 2:17 pm Post subject: |
|
|
| nml375 wrote: | | This smells like a character set issue... |
Indeed, this is one of those iso8859-1/utf-8 rendering issues. The easiest way to solve it is to force the string you're comparing against into utf-8, then always use "\xa4" to do _any_ matching. -- OR -- simply patch your bot with thommey's utf-8 patch...
| Code: | ** via partyline -- those Â's in the bot's response aren't rendered on IRC, just here on the forum. They make all the difference when matching though.
<speechles> .tcl set text [encoding convertto utf-8 "Mp3 ¤ My Song - Yeah, it is.mp3 ¤"]
<bot> Tcl: Mp3 ¤ My Song - Yeah, it is.mp3 ¤
** A direct match of course fails ....
<speechles> .tcl set b [regexp {¤} $text]
<bot> Tcl: 0
** Using \xa4 always works no matter where...
<speechles> .tcl set b [regexp {\xa4} $text]
<bot> Tcl: 1
<speechles> .tcl if {[string match *\xa4* $text]} { set b "works" } { set b "fails" }
<bot> Tcl: works
<speechles> .tcl regexp -- {\xa4(.*?)\xa4} $text -> result
<bot> Tcl: 1
** notice the remnant of that silly  in the output. This is the utf-8 sequence broken by an unpatched bot. The bottom way fixes this.
<speechles> .tcl set showme $result
<bot> Tcl: My Song - Yeah, it is.mp3 Â
** the trick is use an atom (.) prior to the \xa4 in regexp's
<speechles> .tcl regexp -- {.\xa4(.*?).\xa4} $text -> result
<bot> Tcl: 1
<speechles> .tcl set showme $result
<bot> Tcl: My Song - Yeah, it is.mp3
** perfect! |
This is clearly an iso8859-1/utf-8 rendering/matching issue. I've dealt with these before...  _________________ speechles' eggdrop tcl archive |
|
 |
arfer Master

Joined: 26 Nov 2004 Posts: 436 Location: Manchester, UK
|
Posted: Sat Jan 23, 2010 2:44 pm Post subject: |
|
|
Thanks guys. Thought I was going nuts there for a while. _________________ I must have had nothing to do |
|
 |
Reynaldo Halfop
Joined: 11 May 2005 Posts: 54
|
Posted: Sat Jan 23, 2010 11:06 pm Post subject: |
|
|
set rname "bla bla bla bla bla ¤"
| Code: | | if {[regexp {\xA4} $rname] == 1 || [string length $rname] > 45} {..} |
This doesn't work! I'm using eggdrop v1.6.18.
This character is driving me silly.
How about using the scan method? Just scan the last characters of $name, because the spammers usually (though not always) have that character at the end of their realname. |
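For the "last character" idea, `string index` with the `end` index may be simpler than `scan`. This is a sketch only, assuming the realname has already been decoded correctly; if it has not, the same encoding caveat applies as for the other approaches in this thread. The `flagged` variable is illustrative.

```tcl
set ch [format %c 164]
set rname "bla bla bla bla bla $ch"

# Compare only the final character; trailing whitespace is trimmed first
# in case the realname is padded.
set flagged 0
if {[string index [string trimright $rname] end] eq $ch} {
    set flagged 1
}
puts $flagged   ;# prints 1
```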
|
 |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Sun Jan 24, 2010 10:37 am Post subject: |
|
|
You were already told how to make it work, my man. Use "convertfrom utf-8" instead of "convertto utf-8" and it works without introducing those Â remnants into the string.
| Code: | <speechles> .tcl set rname [encoding convertfrom utf-8 "bla bla bla bla bla ¤"]
<bot> Tcl: bla bla bla bla bla ¤
<speechles> .tcl set result [regexp -- {\xa4} $rname]
<bot> Tcl: 1
<speechles> .tcl set rname [encoding convertfrom utf-8 "bla bla bla bla bla bla bla bla bla bla bla ¤ bla bla bla bla bla bla bla bla"]
<bot> Tcl: bla bla bla bla bla bla bla bla bla bla bla ¤ bla bla bla bla bla bla bla bla
<speechles> .tcl if {[regexp {\xa4} [encoding convertfrom utf-8 $rname]] == 1 || [string length $rname] > 45} { set d "works" } { set d "fails" }
<bot> Tcl: works |
| Code: | | if {[regexp {\xa4} [encoding convertfrom utf-8 $rname]] == 1 || [string length $rname] > 45} {..} |
_________________ speechles' eggdrop tcl archive |
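The effect of `encoding convertfrom utf-8` can be reproduced in a plain tclsh. The sketch below simulates a realname that arrived as undecoded UTF-8 bytes (that is an assumption about the input; what a given eggdrop build actually hands to Tcl may differ) and shows where the stray "Â" comes from and how decoding removes it.

```tcl
set ch [format %c 164]

# Simulate a realname that arrived as raw UTF-8 bytes.
set raw [encoding convertto utf-8 "bla bla bla $ch"]

# Decode the bytes back into characters before matching.
set rname [encoding convertfrom utf-8 $raw]

puts [string length $raw]                           ;# prints 14: the sign occupies two bytes, C2 A4
puts [string length $rname]                         ;# prints 13: one character after decoding
puts [expr {[string index $raw end-1] eq "\xC2"}]   ;# prints 1: C2 is the stray "Â"
puts [regexp -- {\xa4} $rname]                      ;# prints 1
puts [expr {[string index $rname end] eq $ch}]      ;# prints 1
```

Note that decoding also changes `string length`, so a length limit such as `> 45` counts characters rather than bytes once it is applied to the decoded value.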
|