Elfriede Halfop
Joined: 07 Aug 2007 Posts: 67
|
Posted: Fri May 21, 2010 7:35 am Post subject: Parse url from web content |
|
|
Hopefully someone can tell me what's wrong with this. I'm trying to parse a URL out of a web page, but all I get is "Data:" many times ^^ I've searched a lot on this forum, but I still don't get how to parse it :/ I just want to output the first matching URL.
| Code: |
bind pub - !geturl geturl:proc
proc geturl:proc {nick host handle channel text} {
    set url [lindex $text 0]
    set token [::http::geturl $url]
    set content [::http::data $token]
    ::http::cleanup $content
    foreach line [split $content \n] {
        if {[regexp -nocase {http(.*?)} $content match url]} {
            sendmsg #test "Data: [join $url]"
        }
    }
}
|
|
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Fri May 21, 2010 2:26 pm Post subject: |
|
|
Try using the greedy quantifier * instead of the non-greedy *?. At the end of a pattern, the non-greedy (.*?) is satisfied by matching nothing at all, which is why you only see "Data:".
Also, the output is most likely not a list, so don't use join. Similarly, $text is a string, not a list, so use split before attempting to use lindex.
Next, use $line, not $content, in your regular expression; otherwise the foreach loop would be pretty pointless...
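| Code: |
package require http

bind pub - !geturl geturl:proc
proc geturl:proc {nick host handle channel text} {
    # $text is a string, so split it into a list before using lindex
    set url [lindex [split $text] 0]
    set token [::http::geturl $url]
    set content [::http::data $token]
    # cleanup expects the token, not the page content
    ::http::cleanup $token
    foreach line [split $content \n] {
        # test each line on its own, not the whole page
        if {[regexp -nocase {http(.*)} $line match url]} {
            sendmsg #test "Data: $url"
        }
    }
}
|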
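To see the two points above in isolation, here is a minimal sketch that runs in a plain tclsh, without Eggdrop. The sample line and the sample command text are made up for illustration and do not come from the thread.

```tcl
# Hypothetical sample line, standing in for one line of fetched page content.
set line {<a href="http://example.com/page">link</a>}

# Non-greedy: a trailing (.*?) is already satisfied by zero characters,
# so the capture variable ends up empty.
regexp -nocase {http(.*?)} $line match lazy

# Greedy: (.*) takes everything up to the end of the line.
regexp -nocase {http(.*)} $line match greedy

puts "lazy:   '$lazy'"
puts "greedy: '$greedy'"

# String vs. list: user input may contain characters that are not valid
# list syntax (here an unbalanced brace), so split it into a list first.
set text "http://example.com/ {oops"
set first [lindex [split $text] 0]
puts "first word: $first"

# Without split, lindex has to parse the raw string as a list;
# catch reports whether that raised an error.
puts "lindex on raw string failed: [catch {lindex $text 0}]"
```

The greedy capture still lacks the leading "http" and runs to the end of the line, because the capture group only covers what follows "http".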
_________________ NML_375, idling at #eggdrop@IrcNET |
|
Elfriede Halfop
Joined: 07 Aug 2007 Posts: 67
|
Posted: Fri May 21, 2010 3:45 pm Post subject: |
|
|
Many thanks for your answer, but the output currently looks like this:
Data: ://imdb.de/title/... Û
The "http" is cut off, and there is a space after the URL where the output should end. Can you please fix that?
PS: How do I stop at the first match? ^^ |
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Fri May 21, 2010 5:04 pm Post subject: |
|
|
If you want the full text matched by the pattern, including the leading "http", use $match instead of $url in your sendmsg command.
To stop further processing within the foreach loop, put the break command right after the sendmsg command. |
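The advice above does not cover the trailing text after the URL. One possible way to handle it, sketched here as an assumption rather than as the fix the poster ended up using, is to let the pattern itself stop at whitespace, quotes, or angle brackets. The helper name first_url and the sample page are hypothetical; the code runs in a plain tclsh.

```tcl
# Hypothetical helper (not from the thread): return the first URL found
# in $content, or an empty string if there is none.
proc first_url {content} {
    foreach line [split $content \n] {
        # $match receives the whole matched text, including "http".
        # The bracket expression ends the match at whitespace, quotes,
        # or angle brackets instead of running to the end of the line.
        if {[regexp -nocase {https?://[^[:space:]"'<>]+} $line match]} {
            # Returning here stops at the first hit, like break would.
            return $match
        }
    }
    return ""
}

# Made-up page content for illustration only.
set page "<p>intro</p>\n<a href=\"http://example.com/a b\">x</a>\n<a href=\"http://example.org/\">y</a>"
puts [first_url $page]
```

Inside the original bind procedure, the same pattern could be used with sendmsg followed by break. Because regexp reports the first match in its input, running it once on the whole $content instead of line by line should give the same first URL.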
|
Elfriede Halfop
Joined: 07 Aug 2007 Posts: 67
|
Posted: Sat May 22, 2010 3:21 am Post subject: |
|
|
Many thanks!!! Now it's working just like I wanted it to. |
|