Help with regexp/regsub

 
Metuant
Voice


Joined: 28 Jul 2007
Posts: 3

Posted: Sat Jul 28, 2007 6:38 pm    Post subject: Help with regexp/regsub

Hi,

I'm trying to parse some information from a website using regsub and regexp, but I'm completely useless at regular expressions, so now that they've updated their website my regexp no longer works.

The block of information I'm trying to parse (which is sometimes repeated multiple times - hence the while in the code) is:

<tr>
<td class='tablebottom'><img src="/img/member.gif" alt="[M]"/></td>
<!--name--><td class='tablebottom'>Abyssal whip</td>
<td class="tablebottom" title="Former average price: 1,650,000gp [decreased by 50,000gp]"><img src="/img/market/p_d.gif" alt="This price has decreased" /></td>
<!--price--><td class="tablebottom">1,550,000gp - 1,650,000gp</td>
<td class="tablebottom" width="20"><a href="/priceguide.php?report=45&amp;par=" title="Report Incorrect Price"><img src="/img/!.gif" alt="[!]" border="0" /></a></td>
<td class="tablebottom"><a href="/priceguide.php?category=45">Obsidian &amp; Abyssal</a></td>
</tr>

</table></form><br />

I'm trying to grab the item name (Abyssal whip) and its price (1,550,000gp - 1,650,000gp)
Using...

Code:
                  
      while {[regexp "<!--name--><td class=\'tablebottom\'>(.*?)</td>\n\n<!--price--><td class=\"tablebottom\">(.*?)</td>\n<td class=\"tablebottom\" width=\"20\">" $data junk tname tprice]} {
         
         regsub "<!--name--><td class=\'tablebottom\'>[addslashes $tname]</td>\n\n<!--price--><td class=\"tablebottom\">[addslashes $tprice]</td>\n<td class=\"tablebottom\" width=\"20\">" $data - data
         if {$i == 0 || ([string match [string tolower [string range $item 0 1]] [string tolower [string range $tname 0 1]]] && [string length $tname] < [string length $name])} {
            set name $tname
            set price $tprice


I'm assuming that you can't just use \n\n to skip the line of useless data, as I'd hoped.

Any help is appreciated
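
For anyone reading this later, here is a minimal sketch of why \n\n does not skip a line. The three lines of markup are copied from the block above; the variable names strict and skipline are only for illustration. "\n\n" matches two newline characters in a row with nothing between them, whereas the page has a whole line of markup between the name cell and the price cell. Something like \n[^\n]*\n (a newline, one line of anything, another newline) is one way to say "skip exactly one line".

```tcl
# The name cell, the unwanted cell, and the price cell, joined by single newlines.
set data [join {
    {<!--name--><td class='tablebottom'>Abyssal whip</td>}
    {<td class="tablebottom" title="Former average price: 1,650,000gp [decreased by 50,000gp]"><img src="/img/market/p_d.gif" alt="This price has decreased" /></td>}
    {<!--price--><td class="tablebottom">1,550,000gp - 1,650,000gp</td>}
} "\n"]

# \n\n needs two adjacent newlines, which never occur in $data, so this returns 0.
set strict [regexp {<!--name--><td class='tablebottom'>(.*?)</td>\n\n<!--price-->} $data]

# \n[^\n]*\n allows one full line between the two newlines, so this returns 1
# and stores the item name in tname.
set skipline [regexp {<!--name--><td class='tablebottom'>(.*?)</td>\n[^\n]*\n<!--price-->} $data -> tname]

puts "strict=$strict skipline=$skipline name=$tname"
```

This only holds while the site keeps exactly one line between the two cells, so a looser pattern is probably safer if the layout changes again.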
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Sun Jul 29, 2007 3:56 am

Perhaps sanitize the data before you attempt to parse it, so that newlines, carriage returns, tabs, etc. are eliminated before you get to that step.
Code:
regsub -all "\t" $data "" data
regsub -all "\n" $data "" data
regsub -all "\r" $data "" data
regsub -all "\v" $data "" data
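
If you would rather do that clean-up in one pass, the following is a rough sketch of two equivalent options. The sample string and the names cleaned and cleaned2 are made up for the example. The first uses a single bracket expression with regsub, the second uses string map and avoids the regex engine entirely.

```tcl
# Made-up sample containing a tab, a CR/LF pair and a vertical tab.
set data "a\tb\r\nc\vd"

# Option 1: one regsub with a bracket expression covering all four characters.
regsub -all {[\t\n\r\v]} $data "" cleaned

# Option 2: string map replaces each listed character with the empty string.
set cleaned2 [string map [list "\t" "" "\n" "" "\r" "" "\v" ""] $data]

puts "$cleaned $cleaned2"
```

Either way, keep in mind that the text on both sides of a removed newline ends up joined together, so any later pattern must not rely on line breaks.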

You can then use a non-greedy wildcard, .*?, to skip whatever lies between the cells you want instead of matching \n\n exactly. This snippet should work:
Code:
      while {[regexp "<!--name--><td class=\'tablebottom\'>(.*?)</td>.*?<!--price--><td class=\"tablebottom\">(.*?)</td>.*?<td class=\"tablebottom\" width=\"20\">" $data junk tname tprice]} {

         regsub "<!--name--><td class=\'tablebottom\'>[addslashes $tname]</td>.*?<!--price--><td class=\"tablebottom\">[addslashes $tprice]</td>.*?<td class=\"tablebottom\" width=\"20\">" $data - data
         if {$i == 0 || ([string match [string tolower [string range $item 0 1]] [string tolower [string range $tname 0 1]]] && [string length $tname] < [string length $name])} {
            set name $tname
            set price $tprice
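
As a possible alternative to the while/regsub loop, here is a self-contained sketch that collects every name/price pair in one call with regexp -all -inline. It is not the script from this thread: the variable names pattern and items are assumptions, and the second table row ("Example item") is invented purely to show the repetition. It relies on . matching newlines in Tcl by default, so it works with or without the clean-up step.

```tcl
# First row is taken from the question; the second row is a made-up placeholder.
set data [join {
    {<tr>}
    {<!--name--><td class='tablebottom'>Abyssal whip</td>}
    {<td class="tablebottom" title="Former average price: 1,650,000gp [decreased by 50,000gp]"><img src="/img/market/p_d.gif" alt="This price has decreased" /></td>}
    {<!--price--><td class="tablebottom">1,550,000gp - 1,650,000gp</td>}
    {</tr>}
    {<tr>}
    {<!--name--><td class='tablebottom'>Example item</td>}
    {<td class="tablebottom">placeholder cell</td>}
    {<!--price--><td class="tablebottom">100gp - 200gp</td>}
    {</tr>}
} "\n"]

# Braces keep the pattern free of backslash-escaped quotes.
# The non-greedy .*? skips whatever sits between the name cell and the price cell.
set pattern {<!--name--><td class='tablebottom'>(.*?)</td>.*?<!--price--><td class="tablebottom">(.*?)</td>}

# -all -inline returns a flat list: whole match, name, price, whole match, name, price, ...
set items {}
foreach {whole tname tprice} [regexp -all -inline -- $pattern $data] {
    lappend items [list $tname $tprice]
}

foreach pair $items {
    puts "[lindex $pair 0]: [lindex $pair 1]"
}
```

Because nothing is substituted back into $data, this version needs no addslashes call; the name comparison from the original loop could then run over $items.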
Metuant
Voice


Joined: 28 Jul 2007
Posts: 3

Posted: Sun Jul 29, 2007 7:03 am

Thanks for the help - it works great :]
All times are GMT - 4 Hours