theice Voice
Joined: 13 Mar 2008 Posts: 36
|
Posted: Sun Mar 23, 2008 12:48 am Post subject: parsing another website |
|
|
| Code: | set title [lrange $text 0 end]
putserv "PRIVMSG $c :$title:"
regexp {<td><b>"<a href="/wiki/.*?" title="$title">.*?</a>"</b></td>(.*?)</tr>} $data - data
regexp {<td><b><a href="/wiki/.*?" title="(.+?)">.*?</a></b></td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>.*?<td>(.+?)</td>} $data - artist guitar bass drums vocals band
putserv "PRIVMSG $c :by-$artist , Difficulties: Guitar-$guitar , Bass-$bass , VoX-$vocals , Drums-$drums , Band-$band"
http::cleanup $data
} |
Working partially:
http://en.wikipedia.org/wiki/List_of_songs_in_Rock_Band
I'm trying to grab the information from the site. The problem is, it's using different types of HTML coding for each title =[
| Code: | [00:47] <@|ICE|> .song Black Hole Sun
[00:47] <+ICEdrop> Black Hole Sun:
[00:47] <+ICEdrop> by-Jet (band) , Difficulties: Guitar-Tier 6 , Bass-Tier 6 , VoX-Tier 7 , Drums-Tier 5 , Band-Tier 6 |
Instead of grabbing the row for the correct $title, it grabs the very first one, "Are You Gonna Be My Girl". |
|
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Sun Mar 23, 2008 11:13 pm Post subject: Re: parsing another website |
|
|
| theice wrote: | | Code: | | regexp {<td><b>"<a href="/wiki/.*?" title="$title">.*?</a>"</b></td>(.*?)</tr>} $data - data |
|
This is wrong and will never work, because substitution does not take place inside curly braces: the pattern contains the literal text $title rather than its value. What you want is known as a dynamic regexp. Look at the wikipedia/wikimedia portion of the unofficial google script; it uses these for #subtag look-ups. To use one correctly, first build the pattern in a double-quoted string stored in a variable (so $title gets substituted), then pass that variable to regexp.
| Code: | set dynregex "<td><b>\"<a href=\"/wiki/.*?\" title=\"$title\">.*?</a>\"</b></td>(.*?)</tr>"
if {![regexp -- $dynregex $data - data]} {
    # not found
} else {
    # found: data now holds the rest of the matching table row
} |
Notice, you MUST escape quotes within other quotes, but within curly braces there is no need.
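To make the difference concrete, here is a minimal sketch rather than a drop-in fix. The sample `title` and `data` values are made up for illustration, and the `regsub` escaping step is an extra precaution that is not part of the snippet above: it assumes titles may contain regexp metacharacters, as in "Jet (band)". It also hints at why the original script always reported the first song: with the braced pattern the first regexp never matches, so $data is left untouched and the second regexp simply finds the first row of the page.

```tcl
# Hypothetical sample values standing in for the bot's real input.
set title "Black Hole Sun"
set data {<tr><td><b>"<a href="/wiki/Black_Hole_Sun" title="Black Hole Sun">Black Hole Sun</a>"</b></td><td>Soundgarden</td></tr>}

# Braces: no substitution, so the pattern holds the literal text $title.
set braced {title="$title"}
# Quotes: $title is replaced by its value before regexp ever sees the pattern.
set quoted "title=\"$title\""

puts $braced   ;# title="$title"
puts $quoted   ;# title="Black Hole Sun"

# Assumption: user input may contain regexp metacharacters, so prefix
# every non-word character with a backslash to make it match literally.
regsub -all {\W} $title {\\&} safe

set dynregex "<td><b>\"<a href=\"/wiki/.*?\" title=\"$safe\">.*?</a>\"</b></td>(.*?)</tr>"
if {[regexp -- $dynregex $data - row]} {
    puts "rest of row: $row"
} else {
    puts "not found"
}
```

The `--` before the pattern is there so that a pattern starting with a dash is not mistaken for a regexp switch.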
Also, what is the purpose of this beauty?! | Code: | | set title [lrange $text 0 end] | Remember, do not confuse lists with strings, or vice versa. When you do, unexpected behavior occurs, and you will be constantly fighting it later with code kludges and messy filters to compensate. It's always better to do it correctly to begin with. | Code: | | set title [join [lrange [split $text] 0 end]] | Notice the split (to protect against special characters mischievous users may try as input), then an lrange on the list that split creates, and afterwards a join to turn this list back into a string. Remember the #1 rule of Tcl: never confuse a list and a string. |
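A small sketch of what goes wrong when a raw string is handed to a list command, and how split avoids it. The input text here is invented for illustration; it contains an unbalanced brace and a stray quote, the kind of thing a user might type on IRC.

```tcl
# Hypothetical user input with an unbalanced brace and a stray quote.
set text "open \{brace and \"quote"

# lrange has to parse its argument as a list, and this text is not valid list syntax.
if {[catch {lrange $text 0 end} err]} {
    puts "lrange on raw text failed: $err"
}

# split always yields a well-formed list, whatever characters the text contains.
set words [split $text]
puts "word count: [llength $words]"

# List commands are safe on the split result; join turns it back into a string.
set firstTwo [join [lrange $words 0 1]]
puts "first two words: $firstTwo"
```

The pattern is split on the way in, list commands in the middle, join on the way out.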
|
metroid Owner
Joined: 16 Jun 2004 Posts: 771
|
Posted: Wed Apr 02, 2008 3:23 pm Post subject: |
|
|
Though you told him how to use split and join properly, you still didn't fix that nasty lrange.
Using lrange $var 0 end on a list just returns the whole list, which is the same as not doing anything at all.
In this case, you can just use set title $text, because "set title [join [lrange [split $text] 0 end]]" gives back the same string (the only difference being that tabs or newlines in $text would come back as spaces). |
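A quick way to check this is to compare the input with the round-tripped result. This is only a sketch with made-up sample strings; the second case is included on the assumption that input could contain a tab, which split treats as a separator and join replaces with a space.

```tcl
# Ordinary space-separated text survives the split/lrange/join round trip unchanged.
set text "Black Hole Sun"
set roundTrip [join [lrange [split $text] 0 end]]
set same [string equal $text $roundTrip]
puts $same        ;# 1

# Assumed edge case: a tab is a split separator, and join puts a space in its place.
set tabbed "Black\tHole Sun"
set sameTabbed [string equal $tabbed [join [split $tabbed]]]
puts $sameTabbed  ;# 0
```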
|