Stripping out character

 
bras
Voice


Joined: 03 Feb 2006
Posts: 7

Posted: Mon Feb 06, 2006 12:59 am    Post subject: Stripping out character

Hi,

I'm writing a script, but I'm having trouble removing a character from a text. The text is TIME
I know the characters are 171 and 187 in ASCII code (« and »), but I don't know how to represent them in a replacevar procedure. I tried:
set echo [replacevar $echo "\0171" ""]
set echo [replacevar $echo "\0187" ""]

Obviously that didn't work. Could anyone help me?
demond
Revered One


Joined: 12 Jun 2004
Posts: 3073
Location: San Francisco, CA

Posted: Mon Feb 06, 2006 1:51 am

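\0171 and \0187 are invalid character escapes (Tcl reads at most three octal digits after the backslash, so \0171 is \017 followed by a literal 1); they should be \253 and \273 (since 171 decimal is 253 octal and 187 is 273)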
Code:

string map {\253 {} \273 {}} $str
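
The decimal-to-octal conversion is easy to check from a Tcl shell. A minimal sketch (octal_escape is a made-up helper name for illustration, not a Tcl built-in):

```tcl
# Hypothetical helper: build the octal backslash escape for a decimal code.
proc octal_escape {dec} {
    return "\\[format %o $dec]"
}

foreach dec {171 187} {
    puts "$dec decimal -> [octal_escape $dec]"
}
# prints:
# 171 decimal -> \253
# 187 decimal -> \273
```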

_________________
connection, sharing, dcc problems? click <here>
before asking for scripting help, read <this>
use [code] tag when posting logs, code
bras
Voice


Joined: 03 Feb 2006
Posts: 7

Posted: Mon Feb 06, 2006 9:09 am

Hi demond, thanks very much for taking the time to help me. You were right about the codes, but I still don't seem to be able to strip them out. Here is what I'm doing:

Code:

bind pubm "m|m" *\00312TIME* dotime

proc replacevar {strin what withwhat} {
        set output $strin
        set replacement $withwhat
        set cutpos 0
        while { [string first $what $output] != -1 } {
                set cutstart [expr [string first $what $output] - 1]
                set cutstop  [expr $cutstart + [string length $what] + 1]
                set output [string range $output 0 $cutstart]$replacement[string range $output $cutstop end]
        }
        return $output
}

proc dotime { nick host handle channel text } {
        set text [split $text]
        set time [lrange $text 5 end]
        set echo $time
        set echo [replacevar $echo "\253" ""]
        set echo [replacevar $echo "\273" ""]
        putserv "PRIVMSG #newsnet :$echo"
}


I don't know why the replacevar proc is not working for these characters. It has always worked for me. An example of the text I'm stripping them out of would be:

In Rio de Janeiro : 23h 12m 30s TIME

What I want is only the time, which is not always in this format; that's why I'm trying to work with what is between : and «

Would you have any idea why I can't strip out « and »?

Thanks again!
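
One way to take "what is between : and «" is to cut on string positions instead of list indices. This is only a sketch under assumptions: extract_time is a hypothetical helper, and it assumes character 171 («) comes right after the time and that the text has already arrived with the byte values intact.

```tcl
# Hypothetical helper: return the text between the first ":" and the next
# character 171. The position of character 171 in the line is an assumption.
proc extract_time {line} {
    set start [string first ":" $line]
    if {$start < 0} { return "" }
    set stop [string first [format %c 171] $line $start]
    # No closing marker found: fall back to the end of the line.
    if {$stop < 0} { set stop [string length $line] }
    return [string trim [string range $line [expr {$start + 1}] [expr {$stop - 1}]]]
}

set sample "In Rio de Janeiro : 23h 12m 30s [format %c 171]TIME[format %c 187]"
puts [extract_time $sample]   ;# 23h 12m 30s
```

Working on the string directly also avoids the list quoting that [lrange] can introduce when the result is later used as plain text.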
demond
Revered One


Joined: 12 Jun 2004
Posts: 3073
Location: San Francisco, CA

Posted: Mon Feb 06, 2006 11:54 pm

get rid of that [replacevar] proc; Tcl has a built-in command for replacing string(s) within a string, called [string map] (there is also [string replace], of course, but it doesn't suit what you need)
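
To illustrate the difference between the two commands, here is a small sketch (the sample string is made up): [string replace] works on index positions you have to supply, while [string map] finds every occurrence itself.

```tcl
# Sample string with character 187 in the middle (built with format so the
# example does not depend on the file's encoding).
set s "foo[format %c 187]bar"

# string replace removes the characters between two indices.
puts [string replace $s 3 3]                     ;# foobar

# string map replaces every occurrence of each key, wherever it is.
puts [string map [list [format %c 187] ""] $s]   ;# foobar
```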
bras
Voice


Joined: 03 Feb 2006
Posts: 7

Posted: Tue Feb 07, 2006 12:44 am

I tried string map too; it didn't work either.

Quote:

set data [string map {"\273" ""} $time]


I can remove everything else, but not those signs. I can't understand why. Thanks anyway for your patience, demond.
demond
Revered One


Joined: 12 Jun 2004
Posts: 3073
Location: San Francisco, CA

Posted: Tue Feb 07, 2006 1:19 am

really?
Code:

% set a foo\273bar
foo?bar
% string map {\273 {}} $a
foobar

spock
Master


Joined: 12 Dec 2002
Posts: 319

Posted: Tue Feb 07, 2006 1:23 am

try \xAB and \xBB

actually f*** that, if demond's suggestion doesn't work then mine won't either (PEBKAC)
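
The hex and octal escapes name the same characters (0xAB = 171, 0xBB = 187), which can be confirmed in a Tcl shell with a short check:

```tcl
# Each comparison prints 1 when the two escapes produce the same character.
set same_ab [expr {"\xAB" eq "\253"}]
set same_bb [expr {"\xBB" eq "\273"}]
puts "$same_ab $same_bb"
```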
_________________
photon?
bras
Voice


Joined: 03 Feb 2006
Posts: 7

Posted: Tue Feb 07, 2006 10:09 am

Yep... neither worked for me...

I found out that it's happening because there are color escapes near the characters I'm working with. It's not \003, though... Are there any other ways to end a color escape besides \003 (and if so, which)?
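
In mIRC-style formatting, \017 (Ctrl-O) resets all formatting, so it can also end a color. A rough sketch for stripping such codes before further processing follows; strip_irc_codes is a hypothetical helper name, and the list of codes covers only the common mIRC-style ones.

```tcl
# Hypothetical helper: remove common mIRC-style control codes from a string.
proc strip_irc_codes {text} {
    # \003 optionally followed by a foreground and background colour number
    regsub -all "\003(\[0-9\]{1,2}(,\[0-9\]{1,2})?)?" $text "" text
    # bold (\002), reset (\017), reverse (\026), underline (\037)
    regsub -all "\[\002\017\026\037\]" $text "" text
    return $text
}

puts [strip_irc_codes "30s \00312TIME\017 done"]   ;# 30s TIME done
```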
bras
Voice


Joined: 03 Feb 2006
Posts: 7

Posted: Tue Feb 07, 2006 5:46 pm

Just to show what I'm talking about... forgot about the image
demond
Revered One


Joined: 12 Jun 2004
Posts: 3073
Location: San Francisco, CA

Posted: Tue Feb 07, 2006 11:43 pm

you simply don't know your codes

print them out with:
Code:

foreach c [split $str {}] {binary scan $c H2 x; putlog "$c \\x$x"}
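
The same idea can be tried outside the bot. This variant is a sketch only: dump_chars is a made-up name, and it uses [scan] with %c to get each character's full code instead of [binary scan], returning the result so it can be printed with puts or putlog.

```tcl
# Hypothetical helper: list every character of a string as a \xNN hex code.
proc dump_chars {str} {
    set out {}
    foreach c [split $str {}] {
        scan $c %c code
        lappend out [format "\\x%02X" $code]
    }
    return [join $out " "]
}

puts [dump_chars "[format %c 3]12TIME[format %c 187]"]
# prints: \x03 \x31 \x32 \x54 \x49 \x4D \x45 \xBB
```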

awyeah
Revered One


Joined: 26 Apr 2004
Posts: 1580
Location: Switzerland

Posted: Tue Jul 10, 2007 6:50 am

Actually, he's right. Today I was working on this; I researched the topic for 2-3 hours and tested on my bot.

The only codes which can be removed, stripped, or detected in a string or list are from the following range:

Code:

In octal: \300-\377
In hexadecimal: \xC0-\xFF


I tried everything from regexp and regsub to string map, but the codes in the range:

Code:

In octal: \200-\277
In hexadecimal: \x80-\xBF


were not detected in any way. I also ran some tests on this. Here is one of them.

In this one I use the whole range, 128 chars as you can see, and for regexp matching I used \200-\277 and \300-\377. All of them should have been detected, but only \300-\377 were.

Code:

<awyeah> .tcl string length ""
<adapter> Tcl: 128

<awyeah> !test ""
<adapter> Remaining: ""


I also used regsub and string map to substitute; they gave me similar answers.

So my conclusion, after spending the whole afternoon working on this, is:

In the character range:

Code:

octal: \200-\277 and \300-\377
hexadecimal: \x80-\xFF


Only the range:

Code:

In octal: \300-\377
In hexadecimal: \xC0-\xFF


is detectable.
_________________
·awyeah·

==================================
Facebook: jawad@idsia.ch (Jay Dee)
PS: Guys, I don't accept script helps or requests personally anymore.
==================================


Last edited by awyeah on Tue Jul 10, 2007 8:25 pm; edited 1 time in total
awyeah
Revered One


Joined: 26 Apr 2004
Posts: 1580
Location: Switzerland

Posted: Tue Jul 10, 2007 8:14 am

Follow up of my previous post. For testing:

In partyline I got this:

Code:

<awyeah> .tcl string map {"" "" "" "" "" "" "" "" "" "" "" ""} "werytyrtewretrwerwetertfg"
<adapter> Tcl: werytyrtewretrwerwetertfg

<awyeah> .tcl string match "**" "werytyrtewretrwerwetertfg"
<adapter> Tcl: 0

<awyeah> .tcl string match "**" "werytyrtewretrwerwetertfg"
<adapter> Tcl: 1


This indicates everything is working correctly in partyline.
Now see what happens when I load the Tcl into the bot and test.

For this proc (Tcl loaded into the bot):

Code:

bind pub - !test testing

proc testing {n u h c t} {
 set i [string map {"\x8A" "" "\x8C" "" "\x8E" "" "\x9C" "" "\x9E" "" "\x9F" ""} $t]
 putserv "PRIVMSG #adapter :String map: $i"
 if {[string match -nocase "*\x8C*" $t] || [string match -nocase "*\x9E*" $t]} {
 putserv "PRIVMSG #adapter :Match found"
 } else {
 putserv "PRIVMSG #adapter :No match found"
 }
}


and for the same string, I got these results:

Code:

<awyeah> !test "werytyrtewretrwerwetertfg"
<adapter> String map: "werytyrtewretrwerwetertfg"
<adapter> No match found


That means there is definitely something wrong.
I also checked this proc:

Code:

bind pub - !test testing

proc testing {n u h c t} {
 set i [string map {"" "" "" "" "" "" "" "" "" "" "" ""} $t]
 putserv "PRIVMSG #adapter :String map: $t"
 if {[string match -nocase "**" $t]} {
 putserv "PRIVMSG #adapter :Match found"
 } else {
 putserv "PRIVMSG #adapter :No match found"
 }
}


It also gave me the same result as above:

Code:

<awyeah> !test "werytyrtewretrwerwetertfg"
<adapter> String map: "werytyrtewretrwerwetertfg"
<adapter> No match found


Furthermore, from what I've read, there might be two identified problems in this case:

1) http://www.ascii.cl/htmlcodes.htm << this page lists that characters from the range \x80-\xBF (or \200-\277) are NOT defined in HTML 4 standard
2) From: /eggdrop/docs/known-problems

Quote:

* High-bit characters are being filtered from channel names. This is a
fault of the Tcl interpreter, and not Eggdrop. The Tcl interpreter
filters the characters when it reads a file for interpreting. Update
your Tcl to version 8.1 or higher.

* Version 8.1 of Tcl doesn't support unicode characters, for example, .
If those characters are handled in a script as text, you run into errors.
Eggdrop can't handle these errors at the moment.


However, strange as it may seem, my shell provider has Tcl version 8.4, patched up to 8.4.11.

I think these two are the basic problems that keep me from achieving my aim. If anyone has a comment on my conclusion, please follow up on my post.

Thanks,
JD
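
For anyone checking their own setup, the interpreter can report its patch level and default encoding. A small sketch (tcl_env_report is a made-up helper name; inside a bot the same commands can be run from the partyline with .tcl):

```tcl
# Hypothetical helper: report the Tcl patch level and the default encoding.
proc tcl_env_report {} {
    return "Tcl [info patchlevel], system encoding [encoding system]"
}

puts [tcl_env_report]
# List the encodings this Tcl build can convert from and to.
puts [lsort [encoding names]]
```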
awyeah
Revered One


Joined: 26 Apr 2004
Posts: 1580
Location: Switzerland

Posted: Wed Jul 11, 2007 3:02 am

Actually, I got it. It's quite easy: today I read up on encodings for the different ASCII character sets and then tested some. The two main ones that can be used in this case are cp1252 and iso8859-1.

I tried cp1252 with the proc below; it did not strip the characters completely, stripping some and leaving some weird characters, as you can see in the output.

Code:

bind pub - !test testing

proc testing {n u h c t} {
 regsub -all {[\200-\377]} [encoding convertfrom cp1252 $t] {} a
 putserv "privmsg #adapter :CP1252: $a"
 regsub -all {[\200-\377]} [encoding convertfrom iso8859-1 $t] {} b
 putserv "privmsg #adapter :ISO8859-1: $b"
}


When I used iso8859-1, everything was stripped completely, as I wanted; see the results below.

Code:

<awyeah> !test "dffdgdffgddsderyrtdfdfertdfseerftdstrydsrtsdfrtyrtdsffsddsfsddfsdtrysdfsdtytrrtjhmjhmmkhjrtmkhjk,hjh,kluihjkhjkuytiuyikwefsewrddssdfdfsffsfssdsdfsddstyfrtsdsdfsd"

<adapter> CP1252: "dffdgdffg&d dsderyrt!df`9R}dfertdfse""a:Serft~dsxtrydsrtsdfrtyrtdsffsddsfsddfsdtrysdfsdtytrrtjhmjhmmkhjrtmkhjk,hjh,kluihjkhjkuytiuyikwefsewrddssdfdfsffsfssdsdfsddstyfrtsdsdfsd"

<adapter> ISO8859-1: "dffdgdffgddsderyrtdfdfertdfseerftdstrydsrtsdfrtyrtdsffsddsfsddfsdtrysdfsdtytrrtjhmjhmmkhjrtmkhjk,hjh,kluihjkhjkuytiuyikwefsewrddssdfdfsffsfssdsdfsddstyfrtsdsdfsd"
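
A likely reason for the difference, shown as a sketch: cp1252 maps some bytes in \x80-\x9F to characters far outside \200-\377, so a regsub on that range no longer matches them, while iso8859-1 keeps every byte at its own value. decoded_codepoint is a hypothetical helper name.

```tcl
# Hypothetical helper: decode one byte with the given encoding and return
# the code of the resulting character.
proc decoded_codepoint {enc byte} {
    set ch [encoding convertfrom $enc [format %c $byte]]
    scan $ch %c code
    return $code
}

foreach enc {cp1252 iso8859-1} {
    puts [format "%s: byte 0x8C -> U+%04X" $enc [decoded_codepoint $enc 0x8C]]
}
# prints:
# cp1252: byte 0x8C -> U+0152
# iso8859-1: byte 0x8C -> U+008C
```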


Hence, to be able to handle the complete range \200-\377 (\x80-\xFF), you need to decode the text in the proc with encoding convertfrom iso8859-1.

Mission successful!
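
Put together as a reusable proc, the approach might look like the sketch below. strip_high_chars is a made-up name, and the sketch assumes the incoming text holds raw byte values that should be read as iso8859-1.

```tcl
# Hypothetical helper: decode raw bytes as iso8859-1, then drop every
# character in the range \x80-\xFF.
proc strip_high_chars {raw} {
    set text [encoding convertfrom iso8859-1 $raw]
    regsub -all "\[\u0080-\u00FF\]" $text "" text
    return $text
}

set raw "23h 12m 30s [format %c 171]TIME[format %c 187]"
puts [strip_high_chars $raw]   ;# 23h 12m 30s TIME
```

In a bind handler the proc would be called on the text argument before any matching or replacing is done.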
All times are GMT - 4 Hours