holycrap Op
Joined: 21 Jan 2008 Posts: 152
|
Posted: Sat Dec 12, 2009 7:57 pm Post subject: |
|
|
Maybe Santa will give us a present on this one.  |
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Tue Dec 15, 2009 3:44 pm Post subject: |
|
|
holycrap wrote: | Maybe Santa will give us a present on this one. |
I guess that makes me Santa...
Here is what you need to change:
Code: |
# valid, set our url
set input(url) "http://horoscopes.astrology.com/daily${input(query)}.html"
foreach sign [split ${incith::horoscope::en_chinese} " "] {
  if {$input(query) == $sign} {
    set input(url) "http://horoscopes.astrology.com/dailychinese${input(query)}.html"
    break
  }
}
|
to
Code: |
# valid, set our url
set input(url) "http://feeds.astrology.com/dailyoverview"
foreach sign [split ${incith::horoscope::en_chinese} " "] {
  if {$input(query) == $sign} {
    set input(url) "http://feeds.astrology.com/dailychinese"
    break
  }
}
|
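The URL switch above can also be written without the loop. This is only a minimal sketch, assuming `en_chinese` holds a space-separated list of sign names as in the snippet; the proc name `pick_feed_url` and the sample sign list are made up for illustration and are not part of the incith script.

```tcl
# Hypothetical helper, not part of the incith script.
# chinese_signs: space-separated sign names, like ${incith::horoscope::en_chinese}
proc pick_feed_url {query chinese_signs} {
  # lsearch -exact returns -1 when the query is not in the list
  if {[lsearch -exact [split $chinese_signs " "] $query] != -1} {
    return "http://feeds.astrology.com/dailychinese"
  }
  return "http://feeds.astrology.com/dailyoverview"
}

# sample list for illustration only
set signs "rat ox tiger rabbit dragon snake horse goat monkey rooster dog pig"
puts [pick_feed_url "dragon" $signs]
puts [pick_feed_url "aries" $signs]
```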
and then you have to change this:
Code: |
# html parsing
#
# fetch the sign and the horoscope
regexp {<div class="all_about_head_pad">ALL ABOUT (.+?)</div>} $html - output(sign)
regexp {<p style="margin-bottom: 20px;">(.*?)</p>} $html - output(horoscope)
|
to
Code: |
regsub -all {(?:<!\[CDATA\[)} $html {} html
# html parsing
#
# fetch the sign and the horoscope
set output(sign) [string totitle $output(query)]
set regex "<title>$output(sign) (.*?)</title>.+<description><p>(.*?)</p>"
regexp $regex $html - junk output(horoscope)
|
And now the script will be using the RSS feeds and should be working again. Enjoy!
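For anyone who wants to try the parsing step outside the bot, here is a small sketch that runs the same two steps (strip the CDATA opener, then match title and description) against a made-up feed snippet. The sample XML and the proc name `parse_sign` are assumptions for illustration; the real feed may be laid out differently.

```tcl
# Made-up sample in the rough shape the regex expects; not real feed data.
set html {<rss><channel><title>Daily Overview</title>
<item><title>Aries Horoscope for Dec 15</title>
<description><![CDATA[<p>Sample text for Aries.</p>]]></description></item>
<item><title>Taurus Horoscope for Dec 15</title>
<description><![CDATA[<p>Sample text for Taurus.</p>]]></description></item>
</channel></rss>}

# Hypothetical wrapper around the two steps from the post.
proc parse_sign {html sign} {
  # drop the CDATA openers so <description> is followed directly by <p>
  regsub -all {<!\[CDATA\[} $html {} html
  set sign [string totitle $sign]
  set regex "<title>$sign (.*?)</title>.+<description><p>(.*?)</p>"
  if {[regexp $regex $html - junk horoscope]} {
    return $horoscope
  }
  # no item for this sign
  return ""
}

puts [parse_sign $html aries]
puts [parse_sign $html taurus]
```

The first quantifier in the pattern is non-greedy, and in Tcl that makes the whole expression prefer the shortest match, which is why the `.+` in the middle does not run on into the next item.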
Edit: The lazy can download it here |
|
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Tue Dec 15, 2009 7:14 pm Post subject: |
|
|
Here's a question.. If it is taken from an rss page, why do you need to query the website always?
Code: |
# remember when the feed was last built
regexp -nocase {<lastbuilddate>(.*?)</lastbuilddate>} $html - ::horoscope(lastbuild)
# -inline returns the matches as a list, so capture the result instead of passing match variables
set parents [regexp -all -inline -nocase -- {<item>(.*?)</item>} $html]
foreach {junk child} $parents {
  regexp -nocase {<title>(.*?) Horoscope} $child - title
  regexp -nocase {<link>(.*?)</link>} $child - link
  regexp -nocase {<description>(.*?)</description>} $child - desc
  set ::horoscope($title) "$desc @ $link"
}
|
The code above would let you create an array keyed on the sign itself; each array element would be composed of "horoscope @ link". This is how an RSS-based approach should work: you don't need to make an http request for every query. Instead, you time the query to start checking for an update based on the header fields returned in the reply, like below.
Quote: | <speechles> !webby http://feeds.astrology.com/dailyoverview --header
<sp33chy> Astrology.com Daily Overview Horoscopes ( http://cli.gs/mAq1D )( 200; text/xml; utf-8; 25423 bytes )
<sp33chy> Server=GFE/2.0; Last-Modified=Tue, 15 Dec 2009 23:02:29 GMT; Expires=Tue, 15 Dec 2009 23:02:29 GMT; ETag=UbQ446fscwam0RXAMW7hNN8HDGI; Date=Tue, 15 Dec 2009 23:02:29 GMT; Cache-Control=private, max-age=0 |
Together with the Last-Modified header, you can know exactly when you should download new html. But polling the site wouldn't need to start until one hour before midnight on their server's time, which appears to be GMT 0, making this exercise pretty easy.
1) initialize and gather required elements
2) parse html from daily and chinese sites
3) add elements to horoscope arrays, store last-modified timestamp taken from header
Then when a user types anything, you simply compare against the array. If it's not an array element, then it's not a valid sign. This makes it much more intuitive. When a sign is found, we simply return what is stored within the array making it incredibly fast. No http waits at all for any user.
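The lookup described above could look roughly like this. It is a sketch only: the array name `horoscope_cache`, the proc name `lookup_sign`, and the sample entries are hypothetical stand-ins for whatever the parsing step stores.

```tcl
# Hypothetical cache, filled once per day by the parsing step.
array set horoscope_cache {
  Aries  "sample horoscope text @ http://example.com/aries"
  Taurus "sample horoscope text @ http://example.com/taurus"
}

# Return the stored text, or an empty string when the sign is unknown.
proc lookup_sign {query} {
  global horoscope_cache
  # keys are stored title-cased, so normalize the user's input the same way
  set key [string totitle [string trim $query]]
  if {[info exists horoscope_cache($key)]} {
    return $horoscope_cache($key)
  }
  return ""
}

puts [lookup_sign "aries"]
```

Because an unknown sign is simply a missing array element, no separate list of valid signs has to be maintained.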
At one hour before midnight GMT 0 (England) time, the script will set a state telling itself that every 5 minutes or so it should start checking the site's headers against its stored timestamp (http::geturl with the -validate option; it wouldn't make sense to read the body until we need it). If the timestamp isn't equal, it's time to initialize again: make another http::geturl without validating and store our array and timestamp so we can do this again in 24 or so hours.
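A rough sketch of that header check, using the standard Tcl http package. The proc names and the `stored` argument are hypothetical, and it only helps if the server reports a stable Last-Modified value, which the update further down shows this feed does not.

```tcl
package require http

# Pure comparison step, kept separate so it can be tried without a network.
# Returns 1 when a non-empty timestamp differs from the stored one.
proc feed_changed {stored current} {
  return [expr {$current ne "" && $current ne $stored}]
}

# Headers-only request: -validate 1 skips the body.
# Hypothetical helper; not called here, since it needs network access.
proc last_modified {url} {
  set token [::http::geturl $url -validate 1 -timeout 10000]
  set stamp ""
  # header names can vary in case, so compare case-insensitively
  foreach {name value} [::http::meta $token] {
    if {[string equal -nocase $name "Last-Modified"]} {
      set stamp $value
    }
  }
  ::http::cleanup $token
  return $stamp
}

puts [feed_changed "Tue, 15 Dec 2009 23:02:29 GMT" "Wed, 16 Dec 2009 23:01:10 GMT"]
```

Inside the bot, the result of `last_modified` would be passed to `feed_changed` together with the timestamp saved at the last full download.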
Perhaps this is the real christmas gift...
and while it's possible I could do this, I like waiting to see if incith pays attention to this thread... ;D
-> Update: Well, it seems they already knew people might do this, so it appears they do some devious playing with header attributes...
Quote: | <speechles> !webby http://feeds.astrology.com/dailyoverview --validate
<sp33chy> Validated: http://feeds.astrology.com ( http://is.gd/5p58d )( 200; text/xml; utf-8; 0 bytes )
<sp33chy> Server=GFE/2.0; Last-Modified=Tue, 15 Dec 2009 23:39:57 GMT; Expires=Tue, 15 Dec 2009 23:39:57 GMT; ETag=UbQ446fscwam0RXAMW7hNN8HDGI; Date=Tue, 15 Dec 2009 23:39:57 GMT; Cache-Control=private, max-age=0
<sp33chy> X-XSS-Protection=0; X-Content-Type-Options=nosniff
<speechles> !webby http://feeds.astrology.com/dailyoverview --validate
<sp33chy> Validated: http://feeds.astrology.com ( http://cli.gs/497hU )( 200; text/xml; utf-8; 0 bytes )
<sp33chy> Server=GFE/2.0; Last-Modified=Tue, 15 Dec 2009 23:38:17 GMT; Expires=Tue, 15 Dec 2009 23:40:08 GMT; ETag=UbQ446fscwam0RXAMW7hNN8HDGI; Date=Tue, 15 Dec 2009 23:40:08 GMT; Cache-Control=private, max-age=0
<sp33chy> X-XSS-Protection=0; X-Content-Type-Options=nosniff
<speechles> !webby http://feeds.astrology.com/dailyoverview --validate
<sp33chy> Validated: http://feeds.astrology.com ( http://tinyurl.com/klkhyv )( 200; text/xml; utf-8; 0 bytes )
<sp33chy> Server=GFE/2.0; Last-Modified=Tue, 15 Dec 2009 23:39:38 GMT; Expires=Tue, 15 Dec 2009 23:40:18 GMT; ETag=UbQ446fscwam0RXAMW7hNN8HDGI; Date=Tue, 15 Dec 2009 23:40:18 GMT; Cache-Control=private, max-age=0
<sp33chy> X-XSS-Protection=0; X-Content-Type-Options=nosniff |
Note: a validate request cannot detect page body size or title, hence "validated" and "0 bytes" in those places.
The page is still the same, but the last-modified is a lie. Smells like cake to me. So the timestamp will NOT help here, but since they update before midnight rolls over, just set a simple time bind to "00 00 * * *". Of course this would only work as-is for those using GMT 0; you must adjust the bind to fit your time zone. Then every single day when midnight comes to England the script will initialize the horoscope array and update it for everyone. This means only one query is required (pretend we never timeout/error) every 24 hours, regardless of how many people are hammering your bot.
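A sketch of what that bind might look like, including one way to work out the local minute and hour that correspond to midnight GMT. The proc names and the `::horoscope_last_refresh` variable are hypothetical; `bind time` is eggdrop's own command, so it is guarded here to keep the file loadable in a plain tclsh.

```tcl
# Hypothetical refresh proc; eggdrop calls time binds with
# minute hour day month year.
proc refresh_horoscopes {minute hour day month year} {
  # fetch and parse the feeds here, then overwrite the cache array
  set ::horoscope_last_refresh [clock seconds]
}

# Build a "minute hour * * *" mask for midnight GMT in the bot's local time.
# Computed once at load time, so a DST change needs a rehash.
proc local_mask_for_gmt_midnight {} {
  set gmt_midnight [clock scan "00:00" -gmt 1]
  return "[clock format $gmt_midnight -format {%M %H}] * * *"
}

# bind only exists inside eggdrop
if {[llength [info commands bind]]} {
  bind time - [local_mask_for_gmt_midnight] refresh_horoscopes
}

puts [local_mask_for_gmt_midnight]
```

On a bot running on GMT this produces the same "00 00 * * *" mask as above; elsewhere the hour shifts by the local offset.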
Incith, are you listening?  _________________ speechles' eggdrop tcl archive
Last edited by speechles on Tue Dec 15, 2009 8:00 pm; edited 2 times in total |
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Tue Dec 15, 2009 7:54 pm Post subject: |
|
|
speechles wrote: | Here's a question.. If it is taken from an rss page, why do you need to query the website always? |
1.) Because I wanted to make as few changes as possible to the original code - to simplify the changes for people.
2.) I'm just starting in TCL coding - lol
Oh and I fixed the ram/goat/sheep bug too... same link as before ^^ |
|
cache Master
Joined: 10 Jan 2006 Posts: 306 Location: Mass
|
Posted: Wed Dec 16, 2009 12:30 am Post subject: |
|
|
Trixar_za wrote: | speechles wrote: | Here's a question.. If it is taken from an rss page, why do you need to query the website always? |
1.) Because I wanted to make as few changes as possible to the original code - to simplify the changes for people.
2.) I'm just starting in TCL coding - lol
Oh and I fixed the ram/goat/sheep bug too... same link as before ^^ |
Can you paste the full code? The top part you are asking us to change doesn't match what we all have, which is this:
Code: | proc fetch_html {input} {
set query "http://horoscopes.astrology.com/"
set input [string tolower $input]
regsub -- "(?q)${incith::horoscope::command_char}" $input {} input
|
|
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Wed Dec 16, 2009 4:20 am Post subject: |
|
|
cache wrote: |
Can you paste the full code? The top part you are asking us to change doesn't match what we all have, which is this:
Code: | proc fetch_html {input} {
set query "http://horoscopes.astrology.com/"
set input [string tolower $input]
regsub -- "(?q)${incith::horoscope::command_char}" $input {} input
|
|
Maybe I'm using an older version... I'll download the latest and have a quick look.
Ok... The newest svn version uses the same procs as mine, but comes with the botnet support added in.
Yours seems to have this in your fetch_html proc, where mine doesn't. If you upload yours I can have a quick look if you want.
Oh and for those that are interested... to fix the ram/sheep/goat bug you need to change:
Code: |
# ram or goat becomes sheep
if {$input(query) == "ram" || $input(query) == "goat"} {
  set input(query) "sheep"
}
|
to
Code: |
# ram or sheep becomes goat
if {$input(query) == "ram" || $input(query) == "sheep"} {
  set input(query) "goat"
}
|
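If more aliases turn up later, the same mapping can be kept in one small proc. This is just a sketch; `normalize_sign` is a made-up name and the script itself does the check inline as shown above.

```tcl
# Hypothetical helper: map alternative names onto the sign the feed uses.
proc normalize_sign {query} {
  set query [string tolower $query]
  switch -exact -- $query {
    ram -
    sheep   { return "goat" }
    default { return $query }
  }
}

puts [normalize_sign "Ram"]
puts [normalize_sign "dragon"]
```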
Last edited by Trixar_za on Wed Dec 16, 2009 4:48 am; edited 3 times in total |
|
cache Master
Joined: 10 Jan 2006 Posts: 306 Location: Mass
|
Posted: Thu Dec 17, 2009 2:11 am Post subject: |
|
|
Thank you very much.. for some reason version 1.2 worked till now, I had no idea there was a new version to edit. |
|
pogue Voice

Joined: 17 May 2009 Posts: 28
|
Posted: Sun Dec 20, 2009 2:10 pm Post subject: |
|
|
Can someone paste the updated code on pastebin or something?
Thanks in advance,
pogue
|
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Mon Dec 21, 2009 12:32 pm Post subject: |
|
|
You can see the pastebin'd script here
and you can download the already modified copy here
Hope that helps  |
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Tue Jan 05, 2010 4:19 am Post subject: |
|
|
I'm thinking of expanding this script a bit to include some of the other feeds provided by http://www.astrology.com/rss
Just with the daily horoscopes you get dailyoverview, dailyquickie, dailyextended, dailyastroslam (one of my favorites), dailysinglelove, dailycoupleslove, dailyflirt, dailyteenhoroscope, dailybeautyscope, dailybabyscope, dailycatscope, dailydogscope and dailyhomeandgarden, which all use a similar feed format, while dailygayscope, dailyfoodscope, dailylesbianscope, dailyfinancescope, dailygreenscope and dailymomscope use a different layout, which the script can probably be adjusted for. There are also weekly and monthly feeds (although, luckily, fewer than the daily ones).
I'm asking here first to see if there is any interest in such a script and, secondly, because maybe people should choose what they want in there first |
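One way the feed list could be organized, as a sketch only. The feed names are the ones listed above; the grouping labels `standard` and `alternate`, the proc names, and the assumption that every feed lives under `http://feeds.astrology.com/<name>` (as dailyoverview and dailychinese do) are guesses and not checked against the site.

```tcl
# Feed names as listed in the post, grouped by layout.
set feeds(standard) {
  dailyoverview dailyquickie dailyextended dailyastroslam dailysinglelove
  dailycoupleslove dailyflirt dailyteenhoroscope dailybeautyscope
  dailybabyscope dailycatscope dailydogscope dailyhomeandgarden
}
set feeds(alternate) {
  dailygayscope dailyfoodscope dailylesbianscope dailyfinancescope
  dailygreenscope dailymomscope
}

# Return the layout group for a feed name, or "" when it is not listed.
proc feed_layout {name} {
  global feeds
  foreach layout {standard alternate} {
    if {[lsearch -exact $feeds($layout) $name] != -1} {
      return $layout
    }
  }
  return ""
}

# ASSUMPTION: every feed shares the feeds.astrology.com/<name> URL pattern.
proc feed_url {name} {
  if {[feed_layout $name] eq ""} { return "" }
  return "http://feeds.astrology.com/$name"
}

puts [feed_layout dailyastroslam]
puts [feed_url dailyflirt]
```

The layout name could then pick which parsing regex to run, so supporting a new feed would mostly mean adding its name to one of the two lists.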
|
pogue Voice

Joined: 17 May 2009 Posts: 28
|
Posted: Sun Jan 17, 2010 8:39 pm Post subject: |
|
|
Trixar_za wrote: | I'm asking here first to see if there is any interest in such a script and, secondly, because maybe people should choose what they want in there first |
Sounds like a lot of neat functionality, but I don't think personally I would use it. I just use the astrology script kind of as a laugh or when no one is talking to start conversations.
I can think of some other cool websites that would be fun to grab RSS feeds and other stuff off of though...
But, thanks for updating this nonetheless!
pogue
|
|
achilles1900 Voice
Joined: 21 Apr 2008 Posts: 30
|
Posted: Tue Aug 31, 2010 9:44 am Post subject: |
|
|
Hi Trixar,
just wanted to say thanks a lot for fixing this, it helped me out today.
Really appreciate your work helping the TCL scripting community. By the way, I want to start learning TCL myself; can you give me any pointers on where to start?
thanks in advance,
Achilles |
|
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Posted: Thu Sep 02, 2010 8:23 pm Post subject: |
|
|
If you really want my advice about this, then I would suggest reading this article about TCL: http://antirez.com/articoli/tclmisunderstood.html
Next, try looking at simple scripts and figuring out how they work. Eggdrop adds a whole host of its own commands and syntax, which takes some getting used to.
Start by figuring out how they work, then try modifying the code to work the way you want it to. Complicated scripts that don't follow normal programming logic are best (like some stats scripts). Then take everything you learned from that and write a completely original script from scratch. That's how I did it. _________________ http://www.trixarian.net/Projects |
|
cache Master
Joined: 10 Jan 2006 Posts: 306 Location: Mass
|
Posted: Mon Apr 29, 2013 12:44 pm Post subject: |
|
|
This horoscope script stopped working. It has been repeating the same old horoscope for a few days now, but the website is showing new ones each day. Anyone else? |
|
crazyVTr Voice
Joined: 05 Jun 2012 Posts: 2
|
Posted: Mon Apr 29, 2013 1:59 pm Post subject: |
|
|
The feeds need to be updated.
I got mine working again by changing:
Code: |
# valid, set our url
set input(url) "http://feeds.astrology.com/dailyoverview"
foreach sign [split ${incith::horoscope::en_chinese} " "] {
  if {$input(query) == $sign} {
    set input(url) "http://feeds.astrology.com/dailychinese"
    break
  }
}
|
to this
Code: |
# valid, set our url
set input(url) "http://www.astrology.com/horoscopes/daily-horoscope.rss"
foreach sign [split ${incith::horoscope::en_chinese} " "] {
  if {$input(query) == $sign} {
    set input(url) "http://www.astrology.com/horoscopes/daily-chinese.rss"
    break
  }
}
|
|
|