egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

RegExp help please [SOLVED]

 
Wannabe
Voice


Joined: 10 Feb 2006
Posts: 17

Posted: Thu May 03, 2007 2:48 pm    Post subject: RegExp help please [SOLVED]

Hey, I'm still learning regular expressions and I'm pretty much stumped on this one. Several people have tried to help me already and it's just not right yet. I'm using the http package to read a website and then parse it for a specific piece of information, which is stored in a table.

The two lines that I'm looking at are:

<td width="45%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">Kills per Death:</font></td>
<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>

What I need to do is check that the previous line contains "Kills per Death:", and then pull the value from the next line, which in this case is 0.6193. Several sections of the table are identical apart from the label text, which is why I need to check both lines.

I have tried many different regexps to get this working, and they all return nothing.

I would be very grateful for any help.


Last edited by Wannabe on Thu May 03, 2007 8:11 pm; edited 1 time in total
Sir_Fz
Revered One


Joined: 27 Apr 2003
Posts: 3793
Location: Lebanon

Posted: Thu May 03, 2007 2:59 pm

Code:
regexp {\d+\.\d+} {<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>} value

This will store 0.6193 in the variable value.
_________________
Follow me on GitHub

- Opposing

Public Tcl scripts


Last edited by Sir_Fz on Thu May 03, 2007 3:07 pm; edited 1 time in total
Wannabe

Posted: Thu May 03, 2007 3:07 pm

That gives me a result of 4.01. I'm not sure where it's getting that from, but it's not correct.

The entire source that the regexp needs to search through is the source of this page: http://ns.wireplay.co.uk/hlstats.php?mode=playerinfo&player=55

If that's any help. I really don't understand the regexp you gave me at all; I struggle to get my head around it.


EDIT:
OK, I found that it matched the 4.01 in this line:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Again, I don't really have a clue how the regexp works, or I'd try fixing it myself.
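
The 4.01 result is what regexp is expected to do here: without -all it reports only the first match in the string it is given, and on the whole page that is the HTML version number in the DOCTYPE. A minimal sketch, assuming a two-line sample built from the lines quoted in this thread rather than the live page:

```tcl
# Sample input (an assumption): the DOCTYPE line followed by the stats line.
set html {<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>}

# Without -all, regexp stops at the first match in the string.
regexp {\d+\.\d+} $html first
puts $first    ;# prints 4.01

# -all -inline returns every match as a list, which shows that the
# wanted value is there but is not the first one.
set all [regexp -all -inline {\d+\.\d+} $html]
puts $all      ;# prints 4.01 0.6193
```

So the pattern itself is fine; it needs either a narrower input or some surrounding context to pin it to the right cell.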
Sir_Fz

Posted: Thu May 03, 2007 3:13 pm

Well, \d matches any digit, \. matches a literal period '.', and + (or {1,}) means one or more of the preceding item. An alternative regexp you can try is:
Code:
regexp {<.+><.+>(.+)<.+><.+>} {<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>} grbg value

$value should contain 0.6193.
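
Because .+ is greedy, a pattern built from it depends on the exact number of tags on the line. A tighter variant, shown here only as a sketch, uses the negated class [^<]+ so the capture cannot run past the next tag. Anchoring on class="fontNormal" is an assumption based on the two lines quoted in the first post:

```tcl
set line {<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>}

# [^<]+ matches a run of characters that are not "<", so the capture
# ends where the closing </font> tag begins.
# "->" is just a throwaway variable name for the full match.
if {[regexp {class="fontNormal">([^<]+)</font>} $line -> value]} {
    puts $value
}
```

Checking the return value of regexp matters: when nothing matches, the variable is never set, and reading it afterwards raises an error.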
Wannabe

Posted: Thu May 03, 2007 3:24 pm

I think I explained it badly. The regexp has to work on the entire source of the website, not just the two lines I posted. That's the reason I wanted to match the words "Kills per Death:", so that I could be sure it was the right data.

The problem I have is the two separate lines, which I don't know how to deal with. But thanks for explaining that regexp, it actually makes sense to me now :)
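
One way to handle the two separate lines is a single regexp over the whole source: in Tcl's default mode \s and negated classes such as [^>] also match newlines, so one pattern can span the label cell and the value cell. This is a sketch under the assumption that the markup looks like the two quoted lines. The first row of the sample ("Some Other Stat:") is invented purely to show that it gets skipped:

```tcl
# Sample source: the first row is made up for illustration,
# the last two rows are the lines quoted in the first post.
set html {<td width="45%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">Some Other Stat:</font></td>
<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">1.2345</font></td>
<td width="45%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">Kills per Death:</font></td>
<td width="55%"><font face="Verdana, Arial, sans-serif" size=2 class="fontNormal">0.6193</font></td>}

# Label, closing tags, optional whitespace (including the newline),
# then the opening tags of the next cell and the captured text.
set re {Kills per Death:</font></td>\s*<td[^>]*><font[^>]*>([^<]+)</font>}

if {[regexp $re $html -> kpd]} {
    puts "Kills per Death: $kpd"
} else {
    puts "label not found"
}
```

The pattern is tied to this exact tag layout, so it would need adjusting if the page's markup changed.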
Sir_Fz

Posted: Thu May 03, 2007 3:30 pm

If you had provided your code, this would have been easier. The concept is simple; this should explain it:
Code:
# variable $lines is a list containing the html source
set notFound 1
foreach line $lines {
 if {$notFound && [regexp {Kills\sper\sDeath:} $line]} {
  set notFound 0
 } elseif {!$notFound} {
  regexp {\d+\.\d+} $line value
  break
 }
}
# $value contains the number.
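
The same idea can be wrapped in a proc that reports failure instead of leaving value unset when the label or the number is missing. This is a sketch, not part of the code above: valueAfterLabel is a hypothetical helper name, and the two-element list stands in for the real page source.

```tcl
# Hypothetical helper: returns the first decimal number on the line that
# follows the line containing $label, or "" if either step fails.
proc valueAfterLabel {lines label} {
    set found 0
    foreach line $lines {
        if {$found} {
            # Only the line directly after the label is inspected.
            if {[regexp {\d+\.\d+} $line value]} {
                return $value
            }
            return ""
        }
        if {[string first $label $line] >= 0} {
            set found 1
        }
    }
    return ""
}

# Shortened sample lines (assumed), standing in for [split $html \n].
set lines [list \
    {<td>Kills per Death:</td>} \
    {<td>0.6193</td>}]

set hit  [valueAfterLabel $lines "Kills per Death:"]
set miss [valueAfterLabel $lines "No Such Label:"]
puts $hit
puts [expr {$miss eq "" ? "not found" : $miss}]
```

string first is used for the label because it is a fixed string, so no regexp escaping is needed for it.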

Wannabe

Posted: Thu May 03, 2007 3:49 pm

I've attempted to do what you suggested, however it never seems to find the "Kills per Death:" line.

I'm wondering if I've split the file wrong. I've written $::html to a text document, and it comes out exactly as it is in the source, so I'm not sure why it wouldn't work. My code is:

Code:

set notFound 1
set lines [split $::html \n]
foreach line $lines {
    if {$notFound && [regexp {Kills\sper\sDeath:} $line]} {
        set notFound 0
    } elseif {!$notFound} {
        putquick "PRIVMSG $chan : Line $line found"
        regexp {\d+\.\d+} $line value
        putquick "PRIVMSG $chan : Value is $value"
        break
    }
}
Sir_Fz

Posted: Thu May 03, 2007 6:36 pm

It worked fine for me; I tested it in tclsh:
Code:
proc bla {} {
 set url "http://ns.wireplay.co.uk/hlstats.php?mode=playerinfo&player=55"
 set token [::http::geturl $url]
 set content [::http::data $token]
 ::http::cleanup $token
 set notFound 1
 foreach line [split $content \n] {
  if {$notFound && [regexp {Kills\sper\sDeath:} $line]} {
   set notFound 0
  } elseif {!$notFound} {
   regexp {\d+\.\d+} $line value
   puts $value
   break
  }
 }
}

Quote:
% package require http
2.5.2
% bla
0.6649
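
When the parsing finds nothing, it helps to rule out the fetch itself first. The sketch below wraps the http calls used above and checks the transfer status and HTTP code before returning the body. fetchPage and the 10-second timeout are assumptions, not something from this thread. ::http::geturl can also raise an error of its own (for example on an unreachable host), which the caller would have to catch.

```tcl
package require http

# Hypothetical wrapper: returns the page body, or raises an error if the
# transfer did not finish with status "ok" and HTTP code 200.
proc fetchPage {url {timeoutMs 10000}} {
    set token  [::http::geturl $url -timeout $timeoutMs]
    set status [::http::status $token]   ;# e.g. ok, timeout, error
    set code   [::http::ncode $token]    ;# numeric HTTP code, may be empty
    set body   [::http::data $token]
    ::http::cleanup $token
    if {$status ne "ok" || $code != 200} {
        error "fetch failed: status=$status code=$code"
    }
    return $body
}
```

A call such as [fetchPage $url] would then replace the geturl/data/cleanup lines in the proc above, and the result can be split on \n as before.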



Last edited by Sir_Fz on Thu May 03, 2007 8:58 pm; edited 1 time in total
Wannabe

Posted: Thu May 03, 2007 8:10 pm

Yep, sorry about that. I accidentally deleted a character when removing some junk code, which made it check the wrong html variable, hence no result. It's all fixed and working now, thanks :)
Sir_Fz

Posted: Thu May 03, 2007 8:54 pm

This should be a faster way to grab the information, since lsearch locates the label line without an explicit loop:
Code:
proc blo {} {
 set url "http://ns.wireplay.co.uk/hlstats.php?mode=playerinfo&player=55"
 set token [::http::geturl $url]
 set content [split [::http::data $token] \n]
 ::http::cleanup $token
 if {[set i [lsearch -glob $content {*Kills per Death:*}]]!=-1} {
  regexp {\d+\.\d+} [lindex $content [incr i]] value
  puts $value
 }
}
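
The lsearch approach can be tried without fetching anything by running it on a small list. The sketch below uses shortened sample lines (an assumption) and also guards the regexp call, so a missing label or a following line without a number ends in "not found" rather than an unset variable:

```tcl
# Shortened sample lines standing in for [split [::http::data $token] \n].
set content [list \
    {<td>Kills per Death:</td>} \
    {<td>0.6193</td>}]

# lsearch -glob returns the index of the first matching element, or -1.
set i [lsearch -glob $content {*Kills per Death:*}]

# lindex past the end of the list yields "", so the regexp simply fails.
if {$i != -1 && [regexp {\d+\.\d+} [lindex $content [expr {$i + 1}]] value]} {
    puts $value
} else {
    puts "not found"
}
```

If the label ever contained glob characters such as *, ?, [ or \, they would need escaping in the pattern.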

All times are GMT - 4 Hours