speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Wed Feb 02, 2011 7:04 pm Post subject: |
|
|
| spithash wrote: | ok, adding an extra } fixed it, but it's not working at all now.
| Code: | # parse the html
while {$results < $incith::google::youtube_results} {
# somewhat extenuated regexp due to allowing that there might be an image next to the title
if {[regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title=".+?">(.+?)</a>.*?id="video\-description.*?>(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3]} {
if {[string match "*</span>*" $desc]} {
regexp -nocase {<span class="video-time">(.*?)</span.*?href="/watch\?v=(.+?)".+?title="(.*?)">.*?id="video\-description.*?">(.*?)</p.*?class="date\-added">(.+?)</span.*?class="viewcount">(.+?)</span} $html - ded4 cid desc ded ded2 ded3
}
regsub -nocase {<span class="img">.*?</div> </div>} $html "" html
}
}
|
my script looks like this ^
But after I do a !youtube search, it shows nothing and does nothing.
EDIT: the bot ping times out and never comes back after the search. |
Buyer beware, you can't guess at how to fix it. Your bot is endlessly looping that while, forever... It is harder than one thinks to alter a script and have it function correctly, isn't it? Yes. In this case it is....
Why? Because you've merely changed the scrape, not the scrub as well. That isn't an inline regexp you see. That is your plain jane ordinary ol' regular one that will continue to match. There is a corresponding scrubber (in this case, the regsub below) that goes hand-in-hand with this type of scraping method. If the regsub cannot scrub, then the regexp will continue to match the exact same parts of text. Forever. I didn't make it this way, it was made this way originally by incith. Here is how you should likely alter that regsub to fix the scrub and that nasty endless looping. Change the regsub below: | Code: | | regsub -nocase {<span class="img">.*?</div> </div>} $html "" html |
"<span class="img">" and "</div> </div>" used to encapsulate each item. They no longer do, so this also needs correcting. Hopefully I'll have a correct fix this weekend; until then, try changing the regsub above to this: | Code: | | regsub -nocase {<span class="video-time">.*?</span.*?href="/watch\?v=.+?".+?title=".+?">.+?</a>.*?id="video\-description.*?>.*?</p.*?class="date\-added">.+?</span.*?class="viewcount">.+?</span} $html "" html |
This is a complex scrubber that wastes clock cycles, but it will do until I get around to fixing it properly. See if this works. _________________ speechles' eggdrop tcl archive |
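To illustrate the scrape-and-scrub pairing described above, here is a minimal sketch. It is not the script's actual code: the sample `html` string, the `max_results` variable (standing in for `$incith::google::youtube_results`) and the `<li>` patterns are all made up for the demonstration. The point it tries to show is that the loop only moves forward if the regsub really removes what the regexp just matched, and that bailing out when either step fails keeps the bot from spinning forever.

```tcl
# Hypothetical sample input; the real script scrapes YouTube's HTML.
set html {<li>first</li><li>second</li><li>third</li>}
set max_results 5   ;# stand-in for $incith::google::youtube_results
set results 0
set titles {}

while {$results < $max_results} {
    # Scrape: grab the first remaining item.
    if {![regexp -nocase {<li>(.*?)</li>} $html - title]} {
        break   ;# nothing left to match, stop instead of looping forever
    }
    lappend titles $title
    incr results
    # Scrub: remove the item just matched so the next pass sees the next one.
    # With a variable name given, regsub returns the number of substitutions;
    # 0 means the scrub pattern no longer fits the page.
    if {[regsub -nocase {<li>.*?</li>} $html "" html] == 0} {
        break
    }
}
puts $titles
```

If the scrub pattern goes stale (as happened when the page layout changed), the second `break` ends the loop after one result rather than matching the same text again and again.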
neocratic Voice
Joined: 16 May 2010 Posts: 15
|
Posted: Fri Feb 04, 2011 1:40 pm Post subject: Re: About !g time country |
|
|
| speechles wrote: |
There will be shortly. It's just with this script I need to devote a serious slice of continuous time. It can't be short bursts of 15 minutes here or there. This weekend I will have that time to eliminate some of the problems that have over time resurfaced: google time, wikipedia, wikimedia, youtube, etc ... These all have issues in one way or another. When fixing these I will likely find even more issues and correct these along the way. This is why I tend to let things stack up before releasing a fix because I want to evolve it forward correcting long standing issues (like no bold in results when utf-8 patched?), inconsistent encodings, etc. The things that in the long run will create a better end product. Rushing to fix regex parsing bugs is a short term fix with no evolution to me..
But suffice to say, you don't need to read any of that diatribe above if you don't want to. It's just words. But expect a new version of this script this weekend. It will most assuredly correct the "time" problem you are experiencing. |
Thanks a lot for the reply. I now understand what you are trying to say. I will be waiting for the next version update. |
spithash Master

Joined: 12 Jul 2007 Posts: 248 Location: Libera
|
Posted: Mon Feb 07, 2011 1:43 pm Post subject: |
|
|
speechles: do you have the youtube fix uploaded somewhere?
I must have done something wrong, I don't know, but I tried what you said in your previous post without any luck =/
I see sp33chy is working great though.. _________________ Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl |
bfoos Voice
Joined: 30 Sep 2010 Posts: 6
|
Posted: Mon Feb 07, 2011 1:48 pm Post subject: |
|
|
| spithash wrote: | speechles: do you have the youtube fix uploaded somewhere?
I must have done something wrong, I don't know, but I tried what you said in your previous post without any luck =/
I see sp33chy is working great though.. |
http://forum.egghelp.org/viewtopic.php?p=95867#95867 |
spithash Master

Joined: 12 Jul 2007 Posts: 248 Location: Libera
|
Posted: Mon Feb 07, 2011 1:57 pm Post subject: |
|
|
| bfoos wrote: | !yt was more broken than that. A better temporary solution is to set...
variable youtube_results 0
Then add...
"yt:g:site:youtube.com %search%"
Under Custom Trigger Phrasing.
speechles is due to address this issue amongst others in an upcoming update. |
Actually to be honest, it was my fault for not reading that post..
It worked great after doing so..
Thanks! _________________ Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl |
pogue Voice

Joined: 17 May 2009 Posts: 28
|
Posted: Thu Feb 17, 2011 3:06 am Post subject: |
|
|
I'm seeing some problems in the wikipedia lookup now. I attempted to set up debug to see if there was any error, but nothing was sent to me.
Here is the query, all queries produce the same result:
| Quote: | [12:58am] <~pogue> !wiki suez canal
[12:58am] <+BodyBuildingBot> Jump to: navigation, search |
I am using 2.0.0a
Info on the bot:
| Quote: | I am BodyBuild, running eggdrop v1.6.19: 13 users (mem: 841k).
Online for 1 day, 00:54 (background) - CPU: 00:06 - Cache hit: 4.0%
Admin: Kelso
Config file: bbbot.conf
OS: Linux 2.6.18-194.17.1.el5
Tcl library: /usr/share/tcl8.4
Tcl version: 8.4.13 (header version 8.4.13)
Tcl is threaded. |
Here is the full text of the script I'm using (only alterations in the options section @ the beginning)
http://tcl.pastebin.com/8gd9GE3R
Help would be appreciated!
Thanks,
pogue _________________ Helpful Tools:
spithash Master

Joined: 12 Jul 2007 Posts: 248 Location: Libera
|
Posted: Thu Feb 17, 2011 3:23 pm Post subject: |
|
|
They changed Wikipedia's website; that's why you get this error. _________________ Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl |
spithash Master

Joined: 12 Jul 2007 Posts: 248 Location: Libera
|
Posted: Fri Feb 25, 2011 3:09 pm Post subject: |
|
|
OK, speechles fixed the wiki and added the temporary youtube fix in a kinda-working pre-release, so I thought it would be great to share it with you.
Make sure you set the script up yourselves, because it is edited by me the way I have it loaded.
NOTE: MY BOT IS UTF-8 PATCHED, SO YOU NEED TO CHANGE THOSE SETTINGS BACK TO DEFAULT (SEE A PREVIOUS RELEASE OR SOMETHING)
| Code: | variable dirty_decode 1
# enable gzip compression for bandwidth savings? Keep in mind
# this semi-breaks some of the present utf-8 work-arounds and
# eggdrop may mangle encodings with gzip compression that it
# doesn't when uncompressed html is used (default). A setting
# of 0 defaults to uncompressed html, a 1 or higher gzip.
# ------
# NOTE: If you do not have Trf or zlib packages, setting this
# to 0 is recommended. Leaving it at 1 is fine as well, as the
# script will attempt to find these commands or packages every
# rehash or restart. But to keep gzip from ever being used it
# is best to set the below variable to 0.
# NOTE2: If you have Trf or zlib packages present, then this
# should always be set to 1. You save enormous bandwidth and
# time using this. If your bot is patched and you have Trf/zlib
# then you should definitely leave this at 1 and you will never
# suffer issues.
# ------
variable use_gzip 0
# THIS IS TO BE USED TO DEVELOP A BETTER LIST FOR USE BELOW.
# To work-around certain encodings, it is now necessary to allow
# the public a way to trouble shoot some parts of the script on
# their own. To use these features involves the two settings below.
# -- DEBUG INFORMATION GOES BELOW --
# set debug and administrator here
# this is used for debugging purposes
# ------
variable debug 1
variable debugnick spithashhh
# AUTOMAGIC
# with this set to 1, the bottom encode_strings setting will become
# irrelevant. This will make the script follow the charset encoding
# the site is telling the bot it is.
# This DOES NOT affect wiki(media/pedia), it will not encode automatic.
# Wiki(media/pedia) still requires using the encode_strings section below.
# ------
# NOTE: If your bot is utf-8 patched, leave this option at 1, the only
# time to change this to 0 is if you're having rendering problems.
# ------
variable automagic 1
# UTF-8 Work-Around (for eggdrop, this helps automagic)
# If you use automagic above, you may find that any utf-8 charsets are
# being mangled. To keep the ability to use automagic, yet when utf-8
# is the charset defined by automagic, this will make the script instead
# follow the settings for that country in the encode_strings section below.
# ------
# NOTE: If your bot is utf-8 patched, set this to 0. Everyone else, use 1.
# ------
variable utf8workaround 0
|
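The gzip note in the settings above says the script looks for Trf or zlib on every rehash or restart. A rough sketch of what such a probe could look like follows. The proc name `gzip_available` is made up for illustration, and the real script's detection logic may well differ; this only shows a way to check before deciding what to put in `use_gzip`.

```tcl
# Hypothetical probe, not taken from incith-google.
proc gzip_available {} {
    # Newer Tcl interpreters (8.6 and later) have a built-in zlib command.
    if {[llength [info commands zlib]]} { return 1 }
    # Older interpreters may have one of the optional packages installed.
    foreach pkg {zlib Trf} {
        if {![catch {package require $pkg}]} { return 1 }
    }
    return 0
}

set use_gzip [gzip_available]
puts "use_gzip: $use_gzip"
```

On a bot like pogue's (Tcl 8.4) this would presumably report 0 unless Trf or a zlib package has been installed separately.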
So anyway, speechles is way too busy to make a complete release but soon enough, he will get back to that.
Until then, play with this one, here it is:
http://bsdunix.info/spithash/nagger/incith-google-v2.00a+wikiANDyoutubeTEMPfix.tcl _________________ Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl |
pogue Voice

Joined: 17 May 2009 Posts: 28
|
Posted: Tue Mar 01, 2011 3:51 am Post subject: |
|
|
| spithash wrote: | | OK, speechles fixed the wiki and added the temporary youtube fix in a kinda-working pre-release, so I thought it would be great to share it with you. |
Thanks spithash & speechles! _________________ Helpful Tools:
Mabus4444 Halfop
Joined: 30 Oct 2006 Posts: 51
|
Posted: Mon Mar 07, 2011 10:51 am Post subject: |
|
|
2.0 fixes the youtube problem but the wiki problem still isn't fixed for me. I get this error in the console;
Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent _________________ http://www.dalnetdebates.com/ |
Trixar_za Op

Joined: 18 Nov 2009 Posts: 143 Location: South Africa
|
Mabus4444 Halfop
Joined: 30 Oct 2006 Posts: 51
|
Posted: Sat Mar 12, 2011 7:45 pm Post subject: |
|
|
I'm using http.tcl version 2.5.2
I tried loading your copy instead, and restarted the bot. Same error message. _________________ http://www.dalnetdebates.com/ |
spithash Master

Joined: 12 Jul 2007 Posts: 248 Location: Libera
|
Mabus4444 Halfop
Joined: 30 Oct 2006 Posts: 51
|
Posted: Tue Mar 22, 2011 11:07 am Post subject: |
|
|
Thanks for the updated version.
The problem persists however, tried a rehash and a full restart to no avail. I get the following message in the console;
Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent _________________ http://www.dalnetdebates.com/ |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Thu Mar 24, 2011 1:46 am Post subject: |
|
|
| Mabus4444 wrote: | Thanks for the updated version.
The problem persists however, tried a rehash and a full restart to no avail. I get the following message in the console;
Tcl error [incith::google::public_message]: Unknown option -urlencoding, must be: -accept, -proxyfilter, -proxyhost, -proxyport, -useragent |
My good sir, the answer is simple. The answer is clear, the answer is close, and the answer is here: | Code: | | set http [::http::config -useragent $ua -urlencoding "utf-8"] |
Change that, to look like this... | Code: | | set http [::http::config -useragent $ua] |
Also, there might be one or two more copies of this line elsewhere in the script to change: | Code: | | set http [::http::config -useragent $ua -urlencoding "utf-8"] |
To look like this: | Code: | | set http [::http::config -useragent $ua] |
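For anyone who would rather not edit the line by hand each time, here is a hedged sketch of an alternative: only pass -urlencoding when the loaded http package knows about it. This is not what the script does; the `ua` value is a placeholder, and the check relies on `::http::config` returning its current option/value pairs when called with no arguments. It avoids `dict` and `{*}` so it should also run on Tcl 8.4.

```tcl
package require http

set ua "Mozilla/5.0 (compatible; example)"   ;# placeholder user agent

# With no arguments, ::http::config returns the current settings as
# option/value pairs, so we can see which options this version supports.
array set httpcfg [::http::config]

set opts [list -useragent $ua]
if {[info exists httpcfg(-urlencoding)]} {
    # Only newer http versions accept this option.
    lappend opts -urlencoding utf-8
}
eval ::http::config $opts
```

On an older http.tcl such as the 2.5.2 mentioned earlier in the thread, the `-urlencoding` branch would simply be skipped, which should avoid the "Unknown option" error.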
Done... Ready for more?
Now before you begin, and apply all these changes.
Use the version below to update and exchange them.
::::: >> @everyone, especially Mabus4444
New Version: Incith:Google v2.0.0b
The version above corrects several small bugs, and enhances mediawiki/wikimedia and now parses wikia sites 100% as well. This brings a plethora of new custom trigger-phrases built-in, with literally thousands more for you to design yourself.
| Code: | "fg:wm:.familyguy.wikia.com %search%"
"ad:wm:.americandad.wikia.com %search%"
"sp:wm:.southpark.wikia.com %search%"
"sw:wm:.starwars.wikia.com %search%"
"na:wm:.naruto.wikia.com %search%"
"in:wm:.inuyasha.wikia.com %search%"
"gr:wm:.gremlins.wikia.com %search%"
"wow:wm:.wowwiki.com %search%"
"smf:wm:.smurf.wikia.com %search%"
"sm:wm:.sailormoon.wikia.com %search%"
"pk:wm:.pokemon.wikia.com %search%"
"ss:wm:.strawberryshortcake.wikia.com %search%"
"mlp:wm:.mlp.wikia.com %search%"
"lps:wm:.lps.wikia.com %search%"
"ant:wm:.ants.wikia.com %search%"
"gm:wm:.gaming.wikia.com %search%"
"nt:wm:.nothing.wikia.com %search%"
"ff:wm:.finalfantasy.wikia.com %search%" |
All of these "custom trigger" phrases above are short-cuts for reaching wikimedia sites with long names. If you need any explanation of how to construct custom trigger phrases, just ask. Deeply nested and complex trigger combinations are possible, which may not be apparent to the casual user of this script.
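As a rough illustration of how a phrase like the ones listed above might be read, here is a small sketch. It assumes the format is new-trigger : existing-trigger : arguments, with %search% swapped for whatever the user typed. The proc `expand_trigger` is hypothetical and is not how incith-google parses these internally.

```tcl
# Hypothetical helper, not part of incith-google.
proc expand_trigger {phrase search} {
    # Split into three fields; colons inside the last field are kept.
    if {![regexp {^([^:]+):([^:]+):(.*)$} $phrase - trigger command template]} {
        error "malformed trigger phrase: $phrase"
    }
    # Substitute the user's search terms for the %search% placeholder.
    set query [string map [list %search% $search] $template]
    return [list $trigger $command $query]
}

set expanded [expand_trigger "fg:wm:.familyguy.wikia.com %search%" "peter griffin"]
puts $expanded
```

Under that reading, !fg peter griffin would behave like !wm .familyguy.wikia.com peter griffin.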
If you experience issues, shout them out. Yes, youtube is still technically broken. It is merely wrapped through google, with some custom trigger-phrasing logic to make it appear to work. This _will_ eventually be addressed when the time comes. This demonstrates the power of custom trigger phrases and the potential they have to do wonderful things. So remember, youtube doesn't work. It will soon; until then, investigate the !video or !v trigger, which does work. Everybody forgets about that trigger... _________________ speechles' eggdrop tcl archive |