egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Regexp Problem

 
Jarek
Voice


Joined: 19 Nov 2007
Posts: 3

Posted: Mon Nov 19, 2007 10:45 am    Post subject: Regexp Problem

Hi Folks.

I'd like to get the profile id value out of this line:
Code:

<td align="center"><a class="profil_link" href="javascript:;" onclick="window.open('/profile/index.php?profile_id=20129','_blank','width=730,height=600,status=no,toolbars=no,scrollbars=yes');"><img class="td_border" src="/pictures/60x80/11-07/20129_47400a57eefb8.jpg" width="60" height="80" border="0" alt="jaroslove"></a></td>


How do I have to build the regular expression to get the value "20129"?

Thanks.
rosc2112
Revered One


Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Mon Nov 19, 2007 12:33 pm

Assuming the data is in a var called $html:
Code:

regexp {'/profile/index.php?profile_id=(.*?)'} $html fullmatch exactmatch

# the data you want will be in $exactmatch var.
Jarek
Voice


Joined: 19 Nov 2007
Posts: 3

Posted: Mon Nov 19, 2007 1:08 pm

Hm, $exactmatch is empty after doing this.

My proc looks like this:

Code:

proc poloniaflirt::internalCom { suche } {

  set fullmatch ""
  set exactmatch ""
  set log1 [open pf.txt a]
  set log2 [open reg.txt a]
  set pfsearchurl "http://www.polonia-flirt.de/search/index.php"
  set pfquery [::http::formatQuery sea_nickname "$suche" send "send"]
  set page [http::config -useragent "Mozilla/4.0 (compatible\; MSIE 6.0\; Windows NT 5.0)"]
  set page [::http::geturl $pfsearchurl -query $pfquery -timeout $poloniaflirt::pftimeout]
  set html [::http::data $page]
  puts $log1 "$html"
  close $log1
  regexp {'/profile/index.php?profile_id=(.*?)'} $html fullmatch exactmatch
  puts $log2 "$exactmatch"
  close $log2
  return $page

}
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Mon Nov 19, 2007 6:21 pm

Jarek wrote:
Code:
regexp {'/profile/index.php?profile_id=(.*?)'} $html fullmatch exactmatch

Code:
regexp {'/profile/index\.php\?profile_id=(.*?)'} $html fullmatch exactmatch

You need to \escape the period(.) and you need to \escape the question mark(?)
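
To see the difference without fetching the page, here is a minimal sketch that runs both patterns against a shortened copy of the line from the first post. The shortened $html value and the variable names unescaped, escaped and profile_id are only assumptions for illustration, not part of anyone's script.

```tcl
# Shortened copy of the line from the first post; braces keep the quotes literal.
set html {<a class="profil_link" href="javascript:;" onclick="window.open('/profile/index.php?profile_id=20129','_blank','width=730,height=600');">}

# Unescaped pattern: "php?" means "ph" plus an optional "p",
# so the literal "?" in the page is never matched.
set exactmatch ""
set unescaped [regexp {'/profile/index.php?profile_id=(.*?)'} $html fullmatch exactmatch]

# Escaped pattern: \. and \? stand for the literal characters.
set profile_id ""
set escaped [regexp {'/profile/index\.php\?profile_id=(.*?)'} $html fullmatch profile_id]

puts "unescaped matched: $unescaped"
puts "escaped matched: $escaped, profile_id: $profile_id"
```

With the escaped pattern the capture group should hold 20129, while the unescaped one reports no match and leaves $exactmatch empty.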
rosc2112
Revered One


Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Mon Nov 19, 2007 7:29 pm

I don't think that would make much difference, as both . (dot) and ? are wildcard chars, so it should have matched the string.

Is the var $html empty of data? You don't handle any error conditions, so it could be that the data is not being retrieved.

Here is an example of getting html data and handling error conditions, then fishing out the data you want:

Code:

set xeurl "http://www.xe.com/ucc/convert.cgi"
set xequery [::http::formatQuery Amount "$amount" From "$fromcur" To "$tocur"]
if {[catch {set page [::http::geturl $xeurl -query $xequery -timeout $xeutimeout]} error]} {
        # geturl threw an error (e.g. "couldn't open socket"), so $page was never set
        # and there is no token to clean up
        puthelp "PRIVMSG $nick :Error: couldn't connect to XE.com. Try again later"
        return
}
if { [::http::status $page] == "timeout" } {
        puthelp "PRIVMSG $nick :Error: Connection timed out to XE.com."
        ::http::cleanup $page
        return
}
set html [::http::data $page]
::http::cleanup $page

if {[regexp {>Live rates at (.*?)</span>} $html match xetime]} {
        #some of the IF above has been deleted for this example
        # manipulate the data:
        regsub -all {<!.*?>} $fromamount {} fromamount
        regsub -all {<!.*?>} $toamount {} toamount
        puthelp "PRIVMSG $chan :XE.COM: \002$fromamount\002 equals \002$toamount\002 as of $xetime"
} else {
        puthelp "PRIVMSG $chan :Could not obtain results from XE.com, sorry!"
}
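
The same "check before you use it" idea applies to the regexp step itself: regexp returns 1 on a match and 0 otherwise, so the result can be tested instead of trusting the capture variable. A minimal sketch, assuming a hypothetical helper named extractProfileId that is not part of the original script, and using \d+ for the capture since the id in the sample line is numeric:

```tcl
# Hypothetical helper: returns the profile id found in the page text,
# or an empty string when the page does not contain one.
proc extractProfileId {html} {
    # regexp returns 1 on a match; "->" is just a throwaway variable for the full match.
    if {[regexp {profile_id=(\d+)} $html -> id]} {
        return $id
    }
    return ""
}

set found [extractProfileId "window.open('/profile/index.php?profile_id=20129','_blank')"]
set missing [extractProfileId "no results for this nickname"]
puts "found: '$found' missing: '$missing'"
```

The caller can then compare the result with "" and report "no such user" rather than logging an empty line.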
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2857

Posted: Mon Nov 19, 2007 9:56 pm

Half right, half wrong...
. matches any character, a literal dot included, so it survives not being escaped.
? however does not match any characters by itself; it is a quantifier that matches 0 or 1 occurrences of the preceding atom (in this case the character p). To match a literal question mark it must be escaped.
_________________
NML_375, idling at #eggdrop@IrcNET
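
The point about . and ? can be checked in isolation with a few short test strings. This is only a sketch; the sample strings and the names r1 to r5 are made up for the demonstration.

```tcl
# As a pattern, "php?profile" reads: "ph", then zero or one "p", then "profile".
set pattern {php?profile}

set r1 [regexp $pattern "phprofile"]            ;# zero "p" after "ph"
set r2 [regexp $pattern "phpprofile"]           ;# one "p" after "ph"
set r3 [regexp $pattern "php?profile"]          ;# literal "?" in the text
set r4 [regexp {php\?profile} "php?profile"]    ;# escaped "?" is a literal
set r5 [regexp {index.php} "indexXphp"]         ;# unescaped "." accepts any character

puts "$r1 $r2 $r3 $r4 $r5"
```

Only the third test should fail, which is the case the original script ran into; the last one shows why the unescaped dot still worked, although it is looser than intended.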
Jarek
Voice


Joined: 19 Nov 2007
Posts: 3

Posted: Tue Nov 20, 2007 8:23 am

Quote:
You need to \escape the period(.) and you need to \escape the question mark(?)


Thanks, mate! This was the right thing. I had to escape the special chars. Now it works!
All times are GMT - 4 Hours