
incith:google :)

testebr
Halfop
Posts: 86
Joined: Thu Dec 01, 2005 12:22 pm

Post by testebr »

incith:google-1.8.6

Video search always returns "Sorry, no search results were found."

Is this a known problem?

Thanks
euphoriac
Voice
Posts: 4
Joined: Thu Jul 06, 2006 5:19 pm

Post by euphoriac »

Bug spotted with 'sponsored links' in google search.

i.e. ".g planet earth" gives:

Code: Select all

BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/nature/animals/planetearth/ | Earth @ http://seds.lpl.arizona.edu/nineplanets/nineplanets/earth.html | </p><p class=g><a href="/url?q=http://ww @ /search?q=planet+earth+clothing&revid=2076090134&sa=X&oi=revisions_inline&ct=revision&cd=1>planet
incith
Master
Posts: 275
Joined: Sat Apr 23, 2005 2:16 am
Location: Canada

Post by incith »

Thanks. Also, if anyone is still getting the class=l problem, I still have a temporary version @ http://incith.com:88/~incith/eggdrop/incith-google.tcl
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Here's something I promised long ago, but it's not quite as promised, because I did what I preferred.

I was going to make the !review parser use GameSpot's complicated form-submission engine; instead I found it much easier to use a single query and field matching. Form submission would have been nicer, but it's more susceptible to breaking.

Anyway, to keep it short and sweet, here's what you get:

Code: Select all

#   !google [define:|spell:] <search terms> <1+1> <1 cm in ft>  #
#      <patent ##> <weather city|zip> <??? airport>             #
#   !images <search terms>                                      #
#   !groups <search terms> ------------------ currently broken  #
#   !news <search terms>                                        #
#   !local <what> near <where>                                  #
#   !localuk <what> near <where>                                #
#   !book <search terms> -------------------- currently broken  #
#   !video <search terms>                                       #
#   !fight <word(s) one> vs <word(s) two>                       #
#   !youtube <search terms>                                     #
#   !atomfilms <search terms> --------------- currently broken  #
#   !ifilms <search terms> ------------------ currently broken  #
#   !gamespot <search terms>                                    #
#   !gamefaqs <system> in <region>                              #
#   !blog <search terms>                                        #
#   !ebay <search terms>                                        #
#   !ebayfight <word(s) one> vs <word(s) two>                   #
#   !wikipedia <search terms>                                   #
#   !locate <ip or hostmask>                                    #
#   !review <gamename> [@ <system>]                             #
#   !torrent <search terms>                                     #
#   !top <system>                                               #
#   !popular <system>                                           #
If you like it, praise incith, because without his script I wouldn't have found the desire to plug sites into it. Big thanks to madwoota as well for keeping things running. What I've done is only to enhance the script for gaming channels. It's messy, dirty, hack-ridden and bloated for sure, but for the most part it works. If you find a use for it, please thank incith and madwoota rather than me.

http://ereader.kiczek.com/UNOFFICIAL-in ... -v1.94.tcl
The version numbering is chosen to avoid conflicting with any of the official Tcl scripts.

testebr and euphoriac, see below:

Code: Select all

<speechles> !g planet earth
<sp33chy> 68,800,000 results | Earth @ http://seds.lpl.arizona.edu/nineplanets/nineplanets/earth.html | BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/nature/animals/planetearth/ | BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/sn/tvradio/programmes/planetearth/ | Earth - Wikipedia, the free encycloped @ http://en.wikipedia.org/wiki/Earth
<speechles> !v funny
<sp33chy> 901,643 videos | Funny Animals (... Clips of an @ http://video.google.com/videoplay?docid=-6768191643962653988 | Funny Commercials II (So many fo @ http://video.google.com/videoplay?docid=-4686887310667479716 | Asian Backstreet Boys Funny Vid @ http://video.google.com/videoplay?docid=-5721216010568488162 | NTU Student survey - Funny comment @ http://video.google.com/videoplay?docid=4677717832230761610
incith
Master
Posts: 275
Joined: Sat Apr 23, 2005 2:16 am
Location: Canada

Post by incith »

Damn man, that's crazy!

I still stand by my firm beliefs that 2.0 is due out soon! =P

I'm glad this script has kept so much attention in the eggdrop community, and I'd also like to take this space to thank Google for not harassing me one bit so far for this script scraping their site (of course I'm sure they don't even notice the hits).
darkwolf
Voice
Posts: 9
Joined: Mon Feb 26, 2007 1:13 pm

Post by darkwolf »

Would there be any way to get better output for the gamefaqs results, please?

The best I could get was by using "\n" as the separator so that it creates a new line, but it would be nicer to have each date followed by all the games for that date, and so on.
Something like this...

With the \n:

XBOX360 North America (USA)
02/26 Major League Baseball 2K7
02/27 Bullet Witch
Dance Dance Revolution Universe
Samurai Warriors 2 Empires
03/01 Alone in the Dark
Battlefield: Bad Company
03/06 Def Jam: Icon
Tom Clancy's Ghost Recon Advanced Warfighter 2
03/13 Call of Duty 3 (Gold Edition)

with date and game for each date(The way it should be):

XBOX360 North America (USA)

02/26 Major League Baseball 2K7
02/27 Bullet Witch - Dance Dance Revolution Universe - Samurai Warriors 2 Empires
03/01 Alone in the Dark - Battlefield: Bad Company
03/06 Def Jam: Icon - Tom Clancy's Ghost Recon Advanced Warfighter 2
03/13 Call of Duty 3 (Gold Edition)

I'm not good enough at Tcl to end up with a format like this, but it would be really nice if someone could do that.

thanks
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Of course there is a way to do that. I thought of doing it that way at first, but it looked ugly to me, so I left the code for it intact, merely #commented out. :)

Code: Select all

        } elseif {[string len $game] > 1}  {
          append output "${incith::google::seperator}${game}"
          #append output ",${game}"
        }
This is the part that handles games which do not have a date preceding them. Notice that at present it just adds a separator (whatever you picked); below it, #commented out, is the line that would instead add a comma and the game.
This is all you have to change:

Code: Select all

        } elseif {[string len $game] > 1} {
          #append output "${incith::google::seperator}${game}"
          append output "- ${game}"
        }
Now, with the \n newline set as the separator and the hyphen joining games that have no date, you have what you want.
darkwolf wrote: with date and game for each date(The way it should be):
The way it should be depends on taste. You have yours, and I have mine... at least we agree on something ;)
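For anyone who wants to see the grouping idea in isolation, here is a minimal standalone sketch. It is not taken from the script: the proc name group_by_date, the flat list of entries and the MM/DD date pattern are all assumptions made for illustration. It starts a new line for each dated entry and joins undated entries onto the current line with " - ".

```tcl
# Hypothetical helper, not part of incith-google.tcl.
# Assumes each entry either starts with an MM/DD date or is a bare game title.
proc group_by_date {entries {joiner " - "}} {
    set lines {}
    set current ""
    foreach entry $entries {
        if {[regexp {^\d\d/\d\d\s} $entry]} {
            # Dated entry: finish the previous line and start a new one.
            if {$current ne ""} { lappend lines $current }
            set current $entry
        } elseif {[string length $entry] > 1} {
            # Undated entry: it belongs to the most recent date.
            if {$current eq ""} {
                set current $entry
            } else {
                append current $joiner $entry
            }
        }
    }
    if {$current ne ""} { lappend lines $current }
    return [join $lines "\n"]
}

# Sample data copied from the release list earlier in the thread.
set entries {
    "02/26 Major League Baseball 2K7"
    "02/27 Bullet Witch"
    "Dance Dance Revolution Universe"
    "Samurai Warriors 2 Empires"
    "03/01 Alone in the Dark"
    "Battlefield: Bad Company"
}
puts [group_by_date $entries]
```

Note that the joiner here has a space on both sides, so titles do not run straight into the hyphen.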
Last edited by speechles on Mon Feb 26, 2007 11:02 pm, edited 2 times in total.
darkwolf
Voice
Posts: 9
Joined: Mon Feb 26, 2007 1:13 pm

Post by darkwolf »

Thanks for the quick answer, and sorry about the comment; I meant the way I would like it ;)

nice script!
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Another small annoyance found, since I prefer the bot to give the top 4 results.. heh
http://www.google.com/search?q=pft&hl=en&safe=off

If you try that URL, you will see Google gives junk results between the 3rd and the 4th results that throw off the parser.

edit:
Solved. Before, it was only able to parse past a couple of these, such as the 'planet earth' kind (as complained about above), but I have now got the parser to skip all those 'sponsored' links completely.
http://ereader.kiczek.com/UNOFFICIAL-in ... -v1.94.tcl
incith
Master
Posts: 275
Joined: Sat Apr 23, 2005 2:16 am
Location: Canada

Post by incith »

madwoota has updated google to 1.8.6a and submitted it to the Tcl archive. As always, it can be downloaded @ http://xrl.us/incithgoogle (the latest version he has available, released or not).
(19:58:07) <@incith> !g planet earth
(19:58:09) <@visitant> BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/nature/animals/planetearth/ | Earth @ http://seds.lpl.arizona.edu/nineplanets ... earth.html | Planet Earth Clothing @ http://www.planetearthstreetwear.com/index.php | Planet Earth Clothing: Planet Earth @ http://www.skatewarehouse.com/MAPLANETEARTH.html
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

That doesn't really fix it though, since you're allowing Google to inject paid results into your searches.
<speechles> !g planet earth
<sp33chy> 111,000,000 results | Earth @ http://seds.lpl.arizona.edu/nineplanets ... earth.html | BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/nature/animals/planetearth/ | BBC - Science & Nature - Planet Earth @ http://www.bbc.co.uk/sn/tvradio/programmes/planetearth/ | Earth - Wikipedia, the free encycloped @ http://en.wikipedia.org/wiki/Earth
This is much closer to the actual Google results (yes, I have subresults enabled, I like them :)) when shown without the helpful or paid results. You can see this by simply adding &start=1 to the query string, which makes Google skip the very first result (the one it counts as 0). Google then assumes you have already got the helpful hints and seen the paid results, and that you are paging through the results starting at the second one.

Start at 1st result --> http://www.google.com/search?hl=en&q=pl ... gle+Search
Start at 2nd result --> http://www.google.com/search?hl=en&q=pl ... ch&start=1

Notice the difference?
With the top URL, Google is overly helpful, even going so far as to suggest related things you might want to click (i.e. "See results for: planet earth clothing", which is paid advertising) within the search results, effectively skewing them. The bottom URL shows how results 2-4 should actually look, which is identical to how my parser displayed them :wink:. To each his own; I'm just keeping everyone aware of the differences :)
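As a rough illustration of the &start=1 trick, a small helper could tack the parameter onto whatever search URL the script has built. This is only a sketch: the proc name with_start and the example URL are made up here, and the real script may assemble its query differently.

```tcl
# Hypothetical helper, not part of incith-google.tcl.
# Appends a start offset to a search URL, using "?" or "&" as appropriate.
proc with_start {url offset} {
    # If the URL already has a query string, continue it with "&".
    set sep [expr {[string first "?" $url] >= 0 ? "&" : "?"}]
    return "${url}${sep}start=${offset}"
}

# Example URL is an assumption for illustration only.
puts [with_start "http://www.google.com/search?hl=en&q=planet+earth" 1]
```

Keep in mind that starting at offset 1 also drops the genuine first result, which is the trade-off described above.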
rosc2112
Revered One
Posts: 1454
Joined: Sun Feb 19, 2006 8:36 pm
Location: Northeast Pennsylvania

Post by rosc2112 »

Just an idea, if you wanted a cleaner result from google, without the ads and garbage, take a look at http://www.scroogle.org/scraper.html which is a google scraper, it produces cleaner output and no ads. Might help with all the parser conflicts, although scroogle doesn't have the image search and other features from google.
incith
Master
Posts: 275
Joined: Sat Apr 23, 2005 2:16 am
Location: Canada

Post by incith »

There is also http://www.google.com/xhtml and http://www.google.com/imode as pointed out by a user in my channel, but these do not search images or do define: lookups etc as I recall.
rosc2112
Revered One
Posts: 1454
Joined: Sun Feb 19, 2006 8:36 pm
Location: Northeast Pennsylvania

Post by rosc2112 »

Actually, both of those have image search, I just looked:

http://www.google.com/xhtml/search?mres ... ite=images

http://www.google.com/imode/search?mres ... ite=images


and define:

http://www.google.com/xhtml/search?mres ... ite=search

http://www.google.com/imode/search?mres ... ite=search


Something else that might be helpful in this script's development:
http://code.google.com/apis/
Maybe one of the existing APIs would simplify things for the script.
incith
Master
Posts: 275
Joined: Sat Apr 23, 2005 2:16 am
Location: Canada

Post by incith »

I have made a small fix to the script today which will solve some errors/issues of "Illegal characters in URL path" messages.

It can be downloaded @ http://incith.com:88/~incith/eggdrop/incith-google.tcl until madwoota adds it into the CVS (http://xrl.us/incithgoogle) or the next publicly released version of google.

diff -Narub:

Code: Select all

--- incith-google.tcl   2007-02-27 01:12:26.000000000 -0700
+++ eggdrop/scripts/incith-google.tcl   2007-03-02 16:39:17.000000000 -0700
@@ -1048,6 +1048,7 @@
       regexp -nocase -- {^(.+?) near (.+?)$} $input - search location
       # for the rest
       regsub -all -- {\+} $input {%2B} input
+      regsub -all -- {\"} $input {%22} input
       regsub -all -- { } $input {+} input

       # GOOGLE
@@ -1092,7 +1093,7 @@
       # beware, changing the useragent will result in differently formatted html from Google.
       set ua "Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7e"
       set http [::http::config -useragent $ua]
-      set http [::http::geturl $query -timeout [expr 1000 * 10]]
+      set http [::http::geturl "$query" -timeout [expr 1000 * 10]]
       set html [::http::data $http]

       # generic pre-parsing
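To show why the order of those regsub calls matters, here is a standalone sketch that wraps the same three substitutions from the diff in a proc. The name encode_query is made up for this example; the script itself does these steps inline. Literal plus signs have to be escaped before spaces are turned into plus signs, otherwise the two become indistinguishable.

```tcl
# Hypothetical wrapper around the three substitutions shown in the diff.
proc encode_query {input} {
    # 1. Escape literal plus signs first, so "1+1" survives as a plus.
    regsub -all -- {\+} $input {%2B} input
    # 2. Escape double quotes (the line added by the fix).
    regsub -all -- {\"} $input {%22} input
    # 3. Only now turn spaces into plus signs.
    regsub -all -- { } $input {+} input
    return $input
}

puts [encode_query {"planet earth" 1+1}]
```

This only covers the three characters handled in the diff; other reserved characters (a literal % or &, for instance) would pass through unescaped in this sketch.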