UNOFFICIAL incith-google 2.1x (Nov30,2o12)

speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

Just a quick quote from a fellow on IRC who was just patching his eggdrop with utf-8 and testing the script for me.
<speechles> <@anahel> speechles with thommey utf patch it's looks great <-- haw, toldja it would
<speechles> you might even get lucky and utf-8 input works too ;P
<anahel> yeah it works too :)
<anahel> tested it with polish and japanese and it worked :)
So all of you experiencing any issues and able to utf-8 patch your bot should of course investigate doing this. Afterwards, you can safely enable the below config setting:

Code:

    # enable dirty decoding? This attempts to use the regular "dirty" method
    # of rendering html elements which works well with iso8859-1 and other
    # latin variants. This does not work well at all with russian, japanese,
    # and any other non-latin variants. So keep this at 0 if you want a truly
    # multi-language bot, but keep in mind you may see unrendered &x12345; html
    # elements. This is because I don't know of a method to transcode these
    # to proper utf-8 characters yet.. :P
    # ------
    variable dirty_decode 0
With a properly utf-8 patched bot, this option can safely be set to 1 and you will experience no rendering mistakes on either input or output.

Code:

    variable encoding_conversion_input 0
    variable encoding_conversion_output 1
...
    variable automagic 1
...
    variable utf8workaround 1
Set all of the options in this second block to 0 when using a properly utf-8 patched bot. If you do still experience issues (when using thommey's utf-8 patch), mention it here so they can be resolved. Consider yourselves beta testers.. ;)
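Putting the two blocks together, the settings for a bot running thommey's utf-8 patch would look roughly like the sketch below. The variable names are the ones quoted above; the real config has other settings between them, so treat this as a summary rather than a drop-in file.

```tcl
# Summary for a properly utf-8 patched bot (sketch only; the real config
# has other settings between these lines).
variable dirty_decode 1               ;# safe to enable once the bot is patched
variable encoding_conversion_input 0
variable encoding_conversion_output 0
variable automagic 0
variable utf8workaround 0
```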
Anahel
Halfop
Posts: 48
Joined: Fri Jul 03, 2009 6:18 pm
Location: Dom!

Post by Anahel »

speechles wrote:Just a quick quote from a fellow on irc just patching his eggdrop with utf-8 and testing the script for me.
<speechles> <@anahel> speechles with thommey utf patch it's looks great <-- haw, toldja it would
<speechles> you might even get lucky and utf-8 input works too ;P
<anahel> yeah it works too :)
<anahel> tested it with polish and japanese and it worked :)
:D

Here's the result of thommey's utf-8 patch and speechles' modifications:

Code:

<tomek> !wiki .ja japan
<~Nyaa> ジャパン | ジャパン (Japan) は、英語で日本を意味する単語。 @ http://ja.wikipedia.org/wiki/%E3%82%B8%E3%83%A3%E3%83%91%E3%83%B3
<tomek> !tr ja@en 私
<~Nyaa> Google says: (ja->en) Translation: Japanese » English
<~Nyaa> I
<tomek> !tr ja@en 日本人
<~Nyaa> Google says: (ja->en) Translation: Japanese » English
<~Nyaa> Japanese
<tomek> !tr pl@en gość
<~Nyaa> Google says: (pl->en) Translation: Polish » English
<~Nyaa> dude
<tomek> !tr en@pl leaf
<~Nyaa> Google says: (en->pl) Translation: English » Polish
<~Nyaa> Liść
<tomek> !wiki .bg bulgaria
<~Nyaa> България — Уикипедия | Република България е държава в Европа. Разположена е в източната част на Балканския полуостров и заема 22% от неговата територия. Площта ѝ е 110 843км², от които 110 510 км² суша и 333 км² водна площ. Населението е около 7640000 души (2007). Столица на 
<tomek> !wiki polska
<~Nyaa> Polska – Wikipedia, wolna encyklopedia | Polska, oficjalnie Rzeczpospolita Polska – państwo położone w Europie Środkowej nad Morzem Bałtyckim. Graniczy z Niemcami (na zachodzie), Czechami i Słowacją (na południu), Ukrainą i Białorusią (na wschodzie), na północnym wschodzie z Litwą oraz na północy z Rosją (obwód kaliningradzki). Ponadto polska granica wyłącznej strefy ekonomicznej na Bałtyku graniczy
<~Nyaa> ze strefami Danii i Szwecji. Pod względem powierzchni zajmuje 68. miejsce na świecie i dziewiąte w Europie. Pod względem zaludnienia zajmuje 33. miejsce na świecie. Kraj jest podzielony na 16 województw, które dzielą się na powiaty i gminy. Za umowną datę założenia państwa polskiego jest często przyjmowany rok 966, kiedy władca Mieszko I przyjął chrześc @ http://pl.wikipedia.org/wiki/Polska
<tomek> !tr en@ru russia
<~Nyaa> Google says: (en->ru) Translation: English » Russian
<~Nyaa> Россия
<tomek> !g ぉぃ
<~Nyaa> 8,700,000 Results | 当分「未定」らしい...(ぉぃ @ http://ww4.tiki.ne.jp/~hasuike/ | 中村葵ブログ「中村葵の*ぽかぽか*ぁぉぃ日和(*´∀`*)」by Ameba @ http://ameblo.jp/aoi-nakamura/ | あ゛ぁやっちゃったなぁ… ぉぃ… な毎日w @ http://ameblo.jp/gintoki-sakata-vol2/ | 仕事中に寝る(ぉぃ…) 仕事中に寝る(ぉぃ…)とは、仕事中に寝る(ぉ  @ http://www.karadakara.com/dict/keyword/
shadrach
Halfop
Posts: 74
Joined: Fri Dec 14, 2007 6:29 pm

Post by shadrach »

Does thommey's patch work for 1.6.19ctcpfix? Do I have to change anything? Code refers to 1.6.18.
Anahel
Halfop
Posts: 48
Joined: Fri Jul 03, 2009 6:18 pm
Location: Dom!

Post by Anahel »

shadrach wrote:Does thommey's patch work for 1.6.19ctcpfix? Do I have to change anything? Code refers to 1.6.18.
I'm using 1.6.19+ctcp+ssl. I had to compile the bot again, and to apply the patch I had to edit the files manually, because patch -p0 < didn't work (it patched only one file).

So you need to download the source again, apply thommey's patch, and compile it again.
MellowB
Voice
Posts: 24
Joined: Wed Jan 23, 2008 6:02 am
Location: Germany

Post by MellowB »

The future is now indeed!
Wonderful work @ UTF-8 support, works great with my patched eggdrop. :D
Thanks for your continuing great work here, much appreciated.
On the keyboard of life, always keep one finger on the ESC key.
ajc13
Voice
Posts: 4
Joined: Tue Oct 13, 2009 11:53 pm

Post by ajc13 »

Looking for some assistance, my apologies if this is the wrong spot.

When I attempt to invoke '!google' I receive the following:
Tcl error [incith::google::public_message]: can't read "state(body)": no such variable

Suggestions/redirections?

incith-google 1.9.9t (Sep25,2oo9)

running eggdrop v1.6.19+ctcpfix
OS: Linux 2.6.28-15-server
Tcl library: /usr/share/tcltk/tcl8.5
Tcl version: 8.5.6 (header version 8.5.6)
Tcl is threaded.

[23:05] Incith:Google compression test successful, found Trf package! Gzip enabled.
[23:05] - UNOFFICIAL incith:google-1.9.9t loaded.
neofutur
Voice
Posts: 6
Joined: Fri Oct 02, 2009 9:38 pm
Location: irc://chat.freenode.net#bitcoin-hosting

Post by neofutur »

ajc13 wrote:Looking for some assistance, my apologies if this is the wrong spot.

When I attempt to invoke '!google' I receive the following:
Tcl error [incith::google::public_message]: can't read "state(body)": no such variable
Same here, same message:
Tcl error [incith::google::public_message]: can't read "state(body)": no such variable

It seems Google changed their homepage recently to obfuscate results...

wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" "http://www.google.com/search?btnI=&q=Deprecated Function"

recently became very different from:

wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" "http://www.google.fr/search?btnI=&q=Deprecated Function"

The google.com page now seems very obfuscated.
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

ajc13 wrote:Looking for some assistance, my apologies if this is the wrong spot.

When I attempt to invoke '!google' I receive the following:
Tcl error [incith::google::public_message]: can't read "state(body)": no such variable

Suggestions/redirections?

incith-google 1.9.9t (Sep25,2oo9)

running eggdrop v1.6.19+ctcpfix
OS: Linux 2.6.28-15-server
Tcl library: /usr/share/tcltk/tcl8.5
Tcl version: 8.5.6 (header version 8.5.6)
Tcl is threaded.

[23:05] Incith:Google compression test successful, found Trf package! Gzip enabled.
[23:05] - UNOFFICIAL incith:google-1.9.9t loaded.
Appears it found and enabled gzip. You can try disabling support for this.

Code:

# Change this to 0 to disable gzip completely
variable use_gzip 1
If the error still persists, then I'll need more information from you. For example, right after this occurs, once the file ig-debug.txt has been created in your eggdrop's root, what does your copy of it contain? Is it empty? Btw, ig-debug.txt contains the gathered html after a successful get/strip. Gzip inflation occurs before this file is written/created, and it only occurs if the sending website has indicated the data is gzip encoded as well. Hopefully merely turning off gzip support solves it, although for me it works well. The only difference is I'm using the zlib package to support it. I haven't had time to test using Trf, nor have any users using Trf complained, so this may be the first problem concerning those using that package. The method (headerless unzip) using Trf to support gzip was borrowed from scottey's rss synd script and as such is expected to work the same in either script.
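As a rough illustration of that point (inflate only when the reply is marked as gzip encoded), here is a minimal sketch. It is not the script's actual code: maybe_gunzip is a hypothetical helper, and it assumes the built-in zlib command from Tcl 8.6 rather than the Trf or zlib packages discussed here.

```tcl
# Hypothetical helper, not taken from incith-google.
# headers: flat list of name/value pairs describing the reply.
# body:    raw response body.
proc maybe_gunzip {headers body} {
    foreach {name value} $headers {
        if {[string equal -nocase $name "Content-Encoding"]
                && [string match -nocase "*gzip*" $value]} {
            # Only inflate when the server says the body is gzipped.
            return [zlib gunzip $body]
        }
    }
    # No gzip marker: hand the body back untouched.
    return $body
}

# Usage: one gzipped reply, one plain reply.
set packed [zlib gzip "hello"]
puts [maybe_gunzip [list Content-Encoding gzip] $packed]
puts [maybe_gunzip [list Content-Type text/html] "plain"]
```

Checking the header first means a plain reply is never fed to the inflater, which is the behaviour described above.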

Also, there will be a new version shortly to fix a few issues I've found and corrected in ebay (shipping/bid fix), google (result totals work again), and youtube (HD fixes). There are still bound to be tiny little inconsistencies here or there, and since the scope of this script is so large I spend more time keeping the larger things that work right working than fixing the minor few which don't. Stay tuned ;P
ajc13
Voice
Posts: 4
Joined: Tue Oct 13, 2009 11:53 pm

Post by ajc13 »

Thanks for the response.
speechles wrote: Appears it found and enabled gzip. You can try disabling support for this.
It initially complained that Trf was not found.

This is an Ubuntu host, so I installed the libtrf-tcl package (2.1.2~20071113-2).

Setting the disable and retesting.

On retest, the bot blocks/hangs (high cpu%), and ig-debug.txt contains... an ugly line from google...
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

neofutur wrote:wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" "http://www.google.fr/search?btnI=&q=Deprecated Function"

the google.com page seems now very obfuscated
This is most likely because the IP used is still one in use by the script/bot. If you've allowed gzip and the script can find support for it, either in the Trf/zlib packages or in commands that are already available, it will affix a header attribute to each query it makes. This attribute tells the website to send the reply back gzipped (compressed). What you're seeing with wget is probably their reply sent to you compressed as well, since it's made using the same IP as the script which just made a gzip request earlier. Make sense?
speechles
Revered One
Posts: 1398
Joined: Sat Aug 26, 2006 10:19 pm
Location: emerald triangle, california (coastal redwoods)

Post by speechles »

ajc13 wrote:On retest, I get the bot blocking hanging (high cpu%), ig-debug.txt contains... an ugly line from google...
Ugly line? Does it look like compressed data?
Btw, it's always going to be just a single line. The script strips all newlines; the .txt is made after the html is cleaned up and right before it's sent back to the main procedure for further processing.

And is it just !google that doesn't work? Have you tried any other triggers? Do any of these suffer similarly? Since I'm not experiencing it myself, it's easier for me to spot the source of the problem if you could do a little detective work as well. Have you changed "debugnick" within the config to your nickname? If so, does the bot message you the query it just made? What was the query string? With these answers I should be able to correct the problem.
neofutur
Voice
Posts: 6
Joined: Fri Oct 02, 2009 9:38 pm
Location: irc://chat.freenode.net#bitcoin-hosting

Post by neofutur »

speechles wrote:
neofutur wrote:wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" "http://www.google.fr/search?btnI=&q=Deprecated Function"

the google.com page seems now very obfuscated
This is most likely because the IP used is still one in use by the script/bot. If you've allowed gzip and the script can find support for it in either Trf/zlib packages or finds the commands it needs already available it will affix a header attribute to each query it makes. This attribute tells the website to send the reply back gzipped (compressed). What your seeing with wget is probably their reply sent to you compressed as well since it's made using the same IP as the script which just made a gzip request earlier. Make sense?
No, I made it from another IP address, and the reply was not gzipped.
neofutur
Voice
Posts: 6
Joined: Fri Oct 02, 2009 9:38 pm
Location: irc://chat.freenode.net#bitcoin-hosting

Post by neofutur »

speechles wrote:
ajc13 wrote: Tcl error [incith::google::public_message]: can't read "state(body)": no such variable
Appears it found and enabled gzip. You can try disabling support for this.

Code:

# Change this to 0 to disable gzip completely
variable use_gzip 1
This workaround worked for me!
The script is working with variable use_gzip 0,

and I get the same error again if I go back to use_gzip 1.

Thanks for your answers!

To help you I also tried debugnick:
the debug message arrives when I have use_gzip 0, but I receive nothing with use_gzip 1.

Same for ig-debug.txt: the file is written when I have use_gzip 0, but nothing is written to ig-debug.txt when I have use_gzip 1.
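
For anyone poking at this locally, a guard like the following avoids the raw Tcl error when the body is missing. This is only a sketch: safe_body is a hypothetical helper, not part of incith-google, and it assumes the token names a global array, the way tokens from the Tcl http package do. It does not explain why the body is missing with use_gzip 1; it only turns the error into an empty result.

```tcl
# Hypothetical helper, not part of incith-google.
# token: name of the global state array for one http request.
proc safe_body {token} {
    upvar #0 $token state
    if {![info exists state(body)]} {
        # No body was stored: return an empty string instead of
        # raising 'can't read "state(body)"'.
        return ""
    }
    return $state(body)
}

# Usage with a stand-in array instead of a real http token.
array set ::fake_token {status ok}
puts "body is: '[safe_body ::fake_token]'"
```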
Last edited by neofutur on Wed Oct 14, 2009 7:41 pm, edited 1 time in total.
ajc13
Voice
Posts: 4
Joined: Tue Oct 13, 2009 11:53 pm

Post by ajc13 »

speechles wrote: Ugly line? Does it look like compressed data?
btw, It's always going to be just a single line. The script strips all newlines, the .txt is made after the html is cleaned up and right before it's sent back to the main procedure to do further processing.

And.. does just !google not work. Have you tried any other triggers? Do any of these suffer similarly? It's easier for me to spot the source of the problem since I'm not experiencing it if you could do a little detective work as well. Have you changed "debugnick" within the config to your nickname.
Am trying mate, appreciate the patience.

Yes, the single line of dense html, javascript.

After setting debugnick I tried .google utah; the bot retrieved a maps line and I received a msg from the bot with the query.

When I do '.google google', it does not post the query in private and the bot then falls offline, though it does manage to create your ig-debug.txt.

It appears to get stuck processing that result?

That result posted:
http://pastebin.com/m2392959a
tscolin
Voice
Posts: 1
Joined: Thu Oct 15, 2009 2:13 pm

Post by tscolin »

I get this error when using google fight:

[14:52] Tcl error [incith::google::public_message]: can't read "matches1": no such variable

?fight doesn't work :(