romprod Halfop
Joined: 19 Oct 2001 Posts: 49
|
Posted: Mon Nov 29, 2010 2:45 pm Post subject: html parser |
|
|
I'm trying to create a basic script that rips info from a URL and spits it out to a channel, but for some reason it isn't working. Can anyone point out the obvious to me, please? It's driving me crazy!
| Code: | # Config
set url "http://feed43.com/3222412860174114.xml"
set dcctrigger "test"
# End of config
if {![info exists egghttp(version)]} {
    putlog "egghttp.tcl was NOT successfully loaded."
    putlog "egghttp_example.tcl has not been loaded as a result."
} else {
    proc your_callbackproc {sock} {
        global url
        set headers [egghttp:headers $sock]
        set body [egghttp:data $sock]
        regsub -all "\n" $body "" body
        regsub -all -nocase {<br>} $body "<br>\n" body
        regexp {<b>(.*)<br/>} $body - team
        putlog "Team: $team"
    }
    bind dcc o|o $dcctrigger our:dcctrigger
    proc our:dcctrigger {hand idx text} {
        global url
        set sock [egghttp:geturl $url your_callbackproc]
        return 1
    }
    putlog "egghttp_example.tcl has been successfully loaded."
}
|
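One thing worth checking in the script above: if the regexp finds no match, `team` is never set and the `putlog` line raises an error inside the callback. Below is a minimal sketch of a guarded extraction, assuming the page body is already in a string. `extract_team` is a hypothetical helper name, not part of egghttp.tcl, and the non-greedy pattern is my assumption about what was intended.

```tcl
# Hypothetical helper: returns the text between <b> and the first <br> or
# <br/>, or an empty string when the pattern is not found.
proc extract_team {body} {
    # Collapse newlines so the match can span what were separate lines.
    regsub -all "\n" $body "" body
    # Non-greedy (.*?) stops at the first line break tag, not the last one.
    if {[regexp -nocase {<b>(.*?)<br/?>} $body -> team]} {
        return [string trim $team]
    }
    return ""
}
```

Inside the callback, the result could be tested with `if {$team ne ""}` before calling `putlog`, so a page without the expected markup is skipped quietly instead of throwing an error.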
|
|
romprod Halfop
Joined: 19 Oct 2001 Posts: 49
|
Posted: Tue Nov 30, 2010 10:57 am Post subject: |
|
|
The above script didn't work because of the page it was getting data from. I've changed the source now and it is working, but I'm unable to make it loop through to the next line of text. I'll also include a sample of the HTML I'm trying to parse.
| Code: | set rssfeed "http://www.fred.co.uk"
set trigger "!latest"
set channel "#12321"
if {![info exists egghttp(version)]} {
    putlog "egghttp.tcl was NOT successfully loaded."
    putlog "egghttp_example.tcl has not been loaded as a result."
} else {
    proc your_callbackproc {sock} {
        global rssfeed channel
        set headers [egghttp:headers $sock]
        set body [egghttp:data $sock]
        regexp {"><h2>(.*?)</h2>} $body - date
        puthelp "PRIVMSG $channel : $date"
        set xml { $body }
        foreach line [split $xml "\n"] {
            regexp {<td valign="top" class="tblRow colmNum000">(.*?)</td><td valign="top" class="tblRow">(.*?)</td></tr>} $body - time1 game1
            puthelp "PRIVMSG $channel : $time1 $game1"
        }
    }
    bind pub -|* $trigger top:trigger
    proc top:trigger {nick host hand chan text} {
        global rssfeed
        set sock [egghttp:geturl $rssfeed your_callbackproc]
        return 1
    }
} |
The HTML that I need to parse:
| Code: | <div class="content"><h1>Barclays Premier League fixtures</h1></div><div class="tblContain"><h2>4 Dec 2010</h2><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz1</td><td valign="top" class="tblRow">xxx1</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz2</td><td valign="top" class="tblRow">xxx2</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz3</td><td valign="top" class="tblRow">xxx3</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz4</td><td valign="top" class="tblRow">xxx4</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz5</td><td valign="top" class="tblRow">xxx5</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz6</td><td valign="top" class="tblRow">xxx6</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz7</td><td valign="top" class="tblRow">xxx7</td></tr></table><br/><h2>5 Dec 2010</h2><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz8</td><td valign="top" class="tblRow">xxx8</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz9</td><td valign="top" class="tblRow">xxx9</td></tr></table><br/><h2>6 Dec 2010</h2><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz10</td><td valign="top" class="tblRow">xxx10</td></tr></table><br/><h2>11 Dec 2010</h2><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz11</td><td valign="top" class="tblRow">xxx11</td></tr></table><table class="tblResults" cellpadding="0" cellspacing="2" border="0"><tr>
<td valign="top" class="tblRow colmNum000">zzz12</td><td valign="top" class="tblRow">xxx12</td></tr></table></div><div class="content infoArea"> |
The only output I get at the moment is:
| Code: | [02:43:56] <@nick> !latest
[02:44:01] <+bot> 4 Dec 2010
[02:44:03] <+bot> zzz1 xxx1 |
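A likely reason for the single line, judging from the script as posted: braces in Tcl suppress variable substitution, so `set xml { $body }` stores the literal text ` $body ` rather than the page. Splitting that on newlines gives one element, so the loop runs once, and because the regexp inside is applied to `$body` instead of `$line`, it returns the first match on the page. A small sketch of the difference, using a made-up three-line body:

```tcl
# Made-up stand-in for the downloaded page.
set body "row1\nrow2\nrow3"

# Braces: no substitution, the variable holds the literal text " $body ".
set literal { $body }
# Plain reference: the variable holds the three lines.
set copy $body

set literal_count [llength [split $literal "\n"]]
set copy_count    [llength [split $copy "\n"]]

puts "braced:   $literal_count element(s)"
puts "unbraced: $copy_count element(s)"
```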
But I would like:
| Code: | [02:43:56] <@nick> !latest
[02:44:01] <+bot> 4 Dec 2010
[02:44:03] <+bot> zzz1 xxx1
[02:44:03] <+bot> zzz2 xxx2
[02:44:03] <+bot> zzz3 xxx3
[02:44:03] <+bot> zzz4 xxx4
[02:44:03] <+bot> zzz5 xxx5
etc etc etc
|
Thanks in advance! |
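One way to get every row, sketched under assumptions: `regexp -all -inline` returns all matches as a flat list, which `foreach` can walk three items at a time (full match, time, game). `parse_fixtures` is a hypothetical helper name, and the pattern is copied from the sample HTML above, so it would stop matching if the site changes its markup. This sketch lists every row on the page and does not group rows by date.

```tcl
# Hypothetical helper: returns a list of "time game" strings, one per row.
proc parse_fixtures {body} {
    set pattern {<td valign="top" class="tblRow colmNum000">(.*?)</td><td valign="top" class="tblRow">(.*?)</td></tr>}
    set rows {}
    # -all -inline yields: fullmatch time game fullmatch time game ...
    foreach {match time game} [regexp -all -inline $pattern $body] {
        lappend rows "[string trim $time] [string trim $game]"
    }
    return $rows
}

# Cut-down sample in the same shape as the HTML posted above.
set sample {<tr>
<td valign="top" class="tblRow colmNum000">zzz1</td><td valign="top" class="tblRow">xxx1</td></tr></table><tr>
<td valign="top" class="tblRow colmNum000">zzz2</td><td valign="top" class="tblRow">xxx2</td></tr></table>}

foreach row [parse_fixtures $sample] {
    puts $row
}
```

Inside the egghttp callback, the same loop could send each row with `puthelp "PRIVMSG $channel :$row"` in place of `puts`.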
|