egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

forked: UNOFFICIAL incith-google 2.0.0c (Sep9,2o11)
xREVx
Voice


Joined: 25 Jul 2007
Posts: 15

Posted: Sun Feb 05, 2012 10:52 am

oh i just thought you were gone, speechles. glad to hear from you!! :D

in the meantime, here's a quick fix for google results:

add these lines below the block whose comment says "# added because of recent google changes, needed to clean-up *.google.* links":

Code:
regsub -all {%20class=(.*)$} $link { } link
regsub -all {</(.+)>} $desc { } desc
regsub -all {<div(.+)$} $desc { } desc
regsub -all {</(.*)$} $desc { } desc
regsub -all {<a href=(.*)$} $desc { } desc


so your code will look like this:

Code:
# added because of recent google changes, needed to clean-up *.google.* links
if {[string match "*url\?*" $link]} {
regexp -- {url\?q=(.+?)$} $link - link
regexp -- {(.+?)\&sig=} $link - link
regexp -- {(.+?)\&usg=} $link - link
regexp -- {\?url=(.+?)$} $link - link
}

# quick fix
regsub -all {%20class=(.*)$} $link { } link
regsub -all {</(.+)>} $desc { } desc
regsub -all {<div(.+)$} $desc { } desc
regsub -all {</(.*)$} $desc { } desc
regsub -all {<a href=(.*)$} $desc { } desc


this should keep us rolling until speechles releases the official fix ;)

before the quick fix:
Quote:
<~User> !g test
<~Bot> Test.com Web Based Testing and Certification Software v2.0</a></h3><div clas @ http://test.com/%20class=l%20onmousedown=return%20rwt
(this,'','','','1','AFQjCNFOu11ntRBzX7MsPNhB_fDzErp8qg','','0CDQQFjAA',null,event)


after:
Quote:
<~User> !g test
<~Bot> Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/
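
To see what the quick fix does outside the bot, here is a minimal standalone sketch. It runs the same five regsub lines on the "before" output quoted above. The rwt(...) argument list is shortened for readability, and the two string trim calls are an addition for this sketch (the regsubs substitute a single space, so trailing blanks are left behind); neither is claimed to be part of the original script.

```tcl
# Sample values taken from the "before" output (rwt arguments shortened).
set link {http://test.com/%20class=l%20onmousedown=return%20rwt(this,'','','','1')}
set desc {Test.com Web Based Testing and Certification Software v2.0</a></h3><div clas}

# The quick fix, unchanged.
regsub -all {%20class=(.*)$} $link { } link
regsub -all {</(.+)>} $desc { } desc
regsub -all {<div(.+)$} $desc { } desc
regsub -all {</(.*)$} $desc { } desc
regsub -all {<a href=(.*)$} $desc { } desc

# Assumption for this sketch only: trim the blanks the substitutions leave.
set link [string trim $link]
set desc [string trim $desc]

puts "$desc @ $link"
```

With this sample the printed line should correspond to the "after" quote.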


Last edited by xREVx on Sun Feb 05, 2012 7:22 pm; edited 1 time in total
Arkadietz
Halfop


Joined: 14 Jul 2006
Posts: 67
Location: cat /dev/zero > /dev/null;

Posted: Sun Feb 05, 2012 3:52 pm

Thanks a lot, xREVx
_________________
On a unix system everything is a file; if something is not a file, it is a process.
xREVx
Voice


Joined: 25 Jul 2007
Posts: 15

Posted: Sun Feb 05, 2012 7:35 pm

np mate!

just added another line of code to my previous post, check it out :)
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Thu Feb 23, 2012 2:02 am

I wasn't having that problem, but I will add it anyways in case it comes up (can't stand when HTML is spewed all over IRC like that).

I need to get back to fixing these things. I was on 1.99 (and fixed it) for so long I still need to understand how this one works.

I really want weather working, but I did it the wrong way last time. I meant to put more time into this script myself but haven't, so I know where speechles is coming from.
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Thu Feb 23, 2012 4:15 am

Normal search fix
Search for this around line 1647 (my line numbers might vary from yours):
Code:
# regular search
and then modify this:
Code:
        # regular search
        } else {
          if {![regexp -- {class=g(?!b).*?<a href="(.+?)".*?>((?!<).+?)</a>} $html - link desc]} {
            if {[regexp -- {class=r.*?<a href="(.+?)".*?>((?!<).+?)</a>} $html - link desc ]} {
              regsub -- {class=r.*?<a href=".+?".*?>(?!<).+?</a>} $html "" html
            }
          } else {
            regsub -- {class=g(?!b).*?<a href=".+?".*?>.+?</a>} $html "" html
To this:
Code:
        # regular search
        } else {
          if {![regexp -- {class="?g(?!b).*?<a href="(.+?)".*?>((?!<).+?)</a>} $html - link desc]} {
            if {[regexp -- {class="?r.*?<a href="(.+?)".*?>((?!<).+?)</a>} $html - link desc ]} {
              regsub -- {class="?r.*?<a href=".+?".*?>(?!<).+?</a>} $html "" html
            }
          } else {
            regsub -- {class="?g(?!b).*?<a href=".+?".*?>.+?</a>} $html "" html


Alternatively, you can manually find/replace all instances of:
Code:
class=r
with:
Code:
class="?r"?
and do the same for "g" or any others that might vary. Please note this won't work in the wildcard-only "match" parts. It only works in regex strings!

A single ? makes the character to its left optional: it matches zero or one occurrence of that character and nothing more (unlike .*?, which can match any number of characters). This stops parts of the script from breaking when the quotes around the class id are added or removed. I strongly suggest doing it for all instances of "r", but I'm not doing it just yet as I am still troubleshooting other parts and don't want to cause regressions elsewhere.
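
To make the optional-quote idea concrete, here is a minimal sketch using made-up sample strings (not real Google markup). It also illustrates a Tcl-specific caveat that, as far as the re_syntax manual describes it, seems relevant here: a regex takes its greedy or non-greedy preference from its first quantifier, so a leading greedy "? may cause a later .+? to run long. Writing the optional quote as "?? (non-greedy) is one possible way around that. This is an observation about Tcl regexes in general, not a claim about how the script itself behaves.

```tcl
# Made-up sample markup: one unquoted and one quoted class attribute.
set html {<h3 class=r>First</h3><h3 class="r">Second</h3>}

# "? makes the quote optional, so both spellings match.
foreach s [list {<h3 class=r>} {<h3 class="r">}] {
  puts [regexp -- {class="?r"?>} $s]
}

# Leading greedy "? : the capture may run on past the first </h3>.
regexp -- {class="?r"?>(.+?)</h3>} $html - greedyDesc
puts $greedyDesc

# Leading non-greedy "?? : the whole regex prefers the shortest match.
regexp -- {class="??r"??>(.+?)</h3>} $html - lazyDesc
puts $lazyDesc
```

If the two captures differ on your Tcl version, the "?? spelling is probably the safer one for patterns that start with the optional quote.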

Before:
Code:
<~TommyTom> !g average penis length
<~TTBot> 1,410,000 results
<~TommyTom> !g test
<~TTBot> 3,410,000,000 results | Test Your Awareness: Do The Test - YouTube @ http://www.youtube.com/watch?v=Ahg6qcgoay4
<~TommyTom> !g test pdf
<~TTBot> 1,820,000,000 results


As you can see, you get no results, or only one (usually videos, it seems).

After:
Code:
<~TommyTom> !g average penis length
<~TTBot> 1,410,000 results | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size
<~TommyTom> !g test
<~TTBot> 3,410,000,000 results | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/ | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/ | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/
<~TommyTom> !g test pdf
<~TTBot> 1,820,000,000 results | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf


I think the descriptions are being truncated mid-tag (note the "<di"), so the partial <div> never gets stripped out. The cleanup code probably needs to run BEFORE the desc truncation.
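
A rough sketch of what "cleanup before truncation" could look like. The proc name clean_desc and the maxlen parameter are hypothetical and do not come from the script; the idea is just to strip complete tags, then drop any tag left unterminated at the end, and only then cut the text to length.

```tcl
# Hypothetical helper (not part of incith-google): clean first, truncate last.
proc clean_desc {desc {maxlen 40}} {
  # Replace complete tags such as </a> or </h3> with a space.
  regsub -all {<[^>]*>} $desc { } desc
  # Drop a tag that was cut off before its closing bracket.
  regsub {<[^>]*$} $desc {} desc
  # Collapse runs of whitespace, trim, then truncate.
  regsub -all {\s+} $desc { } desc
  return [string range [string trim $desc] 0 [expr {$maxlen - 1}]]
}

puts [clean_desc {Test.com Web Based Testing</a></h3><div clas}]
puts [clean_desc {abcdefghij} 5]
```

Because truncation happens last, a tag can no longer be cut in half and slip past the tag-stripping step.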

=================
Time fix

~line 1512
find:
Code:
        # time:
        } elseif {[string match "*src=\"http://www.google.com/chart*chc=localtime*" $html] == 1} {
          regexp -nocase -- {src="http://www.google.com/chart\?chs=.*?chc=localtime.*?><td valign=[a-z]+>(.+?)</table>} $html - desc
          regsub -- {<br>} $desc ". " desc
          regsub -all {<.*?>} $desc "" desc
          regsub -- {chc=localtime} $html {} html
replace with:
Code:
        # time:
        } elseif {[string match "*class=\"g tpo\"*class=\"s rbt\"*class=obcontainer*" $html] == 1} {
          regexp -nocase -- {class="g tpo".*?class="s rbt".*?class=obcontainer.*?<table.*?<td.*?>(.+?)</table>} $html - desc
          regsub -- {<br>} $desc ". " desc
          regsub -all {<.*?>} $desc "" desc
          regsub -- {class="g tpo".*?class="s rbt".*?class=obcontainer.*?<table.*?<td.*?>.+?</table>} $html {} html

Before:
Code:
<~TommyTom> !g time in new york
<~TTBot> 11,400,000,000 results | Current time in New York, United States - daylight savings time 2012 ... @ http://24timezones.com/world_directory/current_new_york_time.php | Current time in New York, United States - daylight savings time 2012 ... @ http://24timezones.com/world_directory/current_new_york_time.php | Current time in New York, United States - daylight savings time 2012 ... @
<~TTBot> http://24timezones.com/world_directory/current_new_york_time.php


After:
Code:
<~TommyTom> !g time in new york
<~TTBot> 3:38am Thursday (EST) - Time in New York, NY


Since it's plain-text now (no URLs or images), I removed the cleanup code.

Be careful with this one because I don't know if those class IDs will change or get reused. I tried to match on "time in" as well, even with the bold tags, but it wouldn't match, so this isn't as solid a match as I would like. Wish they had put some kind of image URL in there to match on...

======
Going to bed now. I was looking into the weird "apple" search result (probably has to do with it showing the map of an Apple store) and also the "define:" area. I need to figure out which match is being triggered for apple (it's probably just going into the wrong area because of all the "answers" and the ad(s)). It would be extremely helpful if I could see what "define:" output should look like (old logs or old posts, if anyone has any), as I don't quite get the code in there and don't recall what it looks like (plus, it's been broken since I found this script, so I dunno if it's changed).

Edit:
Fixed the regsub in the time: section so that you still get the "long answer" if you have that option set (default is short).


Last edited by tommytom on Fri Feb 24, 2012 4:32 pm; edited 3 times in total
Trixar_za
Op


Joined: 18 Nov 2009
Posts: 143
Location: South Africa

Posted: Thu Feb 23, 2012 9:59 am

The trouble with replacing class=r and class=g is that using ? like that only works with regex and not with string matches (which use glob). That could potentially break the script. The second problem is that some of the regexes search for the whole thing, so class="?r> wouldn't work; it needs to be class="?r"?>. Just a thought ;)

Well done on the rest of it though :)
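
The glob-versus-regex point can be checked with a few lines of plain Tcl. This is only an illustration with made-up sample strings: in regexp the ? means "optional", while in string match it means "exactly one character of any kind" and the quotes stay literal.

```tcl
# Two spellings of the same attribute (made-up samples).
set unquoted {<h3 class=r>}
set quoted   {<h3 class="r">}

# regexp: the quotes are optional, so both spellings match.
set reUnquoted [regexp -- {class="?r"?>} $unquoted]
set reQuoted   [regexp -- {class="?r"?>} $quoted]

# string match (glob): ? must consume one character, so neither matches.
set globUnquoted [string match {*class="?r"?>*} $unquoted]
set globQuoted   [string match {*class="?r"?>*} $quoted]

# One glob-safe alternative: test each spelling separately.
set either [expr {[string match {*class=r>*} $quoted] || [string match {*class="r">*} $quoted]}]

puts "regexp: $reUnquoted $reQuoted  glob: $globUnquoted $globQuoted  either: $either"
```

So a blind search and replace across the whole script would quietly turn every affected string match into a test that never succeeds.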
_________________
http://www.trixarian.net/Projects
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Thu Feb 23, 2012 3:51 pm

Yes, you are right. That's why I didn't do it. What I really meant was "add it manually/automatically to each one properly"; the "properly" was assumed and I didn't check into it. I realized the "?r" situation later (I will go strike out my previous post).

You could probably build a regex s&r in certain editors, but that's still asking for trouble. If it ain't broke, don't fix it. I guess just add the "?r"? type stuff when it comes up (a new one that changed and got broken by this minute problem).

Also, to add to the crap a s&r would have broken, it would break class="r blah". Well, it would still match, but there is no need for the ? in there since it will always have quotes because of the space (AFAIK).

Sooooo many things broken still. "population of", "define:", "apple" (and others like it), etc. Seems like they are all falling into a "short answer" trap and only showing one result or none at all.

I put my old weather fix into my script just to have weather working (it's a bit lackluster and I undid some stuff I shouldn't have, but I gotta learn how it is supposed to look.. no one has posted a log/ss for me to see).

Edit:
I revised it. And you are absolutely correct about the regular wildcard match string (glob?). I used class=r myself and didn't even think of this. Would have broken them.
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Fri Feb 24, 2012 12:01 am    Post subject: "more answers" fix

"more answers" fix

@line ~1426
Find this:
Code:
        # more answers
        } elseif {[string match "*\{google.rrep('answersrep'*" $html]} {
          regexp -- {<div id=res class=med role=main>.*?<h3 class=r>(.*?)</h3>} $html - desc


Change to:
Code:
        # more answers
        } elseif {[string match "*\{google.rrep('answersrep'*" $html]} {
          regexp -- {class=\"g answers.+?>.+?class=\"?r\"?>(.+?)<} $html - desc


before:
Code:
<~TommyTom> !g shrek release date
(desc variable error in console. No output in IRC.)


after:
Code:
<~TommyTom> !g shrek release date
<~TTBot> Best guess for Shrek Release Date is May 18, 2001


Not sure if it's supposed to be that plain. No bolding. In Opera, the date is bold.
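
The "desc variable error" in the before case is what you would expect when a regexp finds nothing and desc is then read without ever being set. A minimal sketch of guarding against that follows. The proc name extract_answer and the sample markup are made up for illustration; the pattern is the one from the fix above, written without the backslashes (inside braces they are equivalent).

```tcl
# Hypothetical helper: return the captured answer, or "" when nothing matched.
proc extract_answer {html} {
  if {[regexp -- {class="g answers.+?>.+?class="?r"?>(.+?)<} $html - desc]} {
    return $desc
  }
  return ""
}

# Made-up sample markup shaped like the pattern expects.
set sample {<li class="g answers"><h3 class="r">Best guess for Shrek Release Date is May 18, 2001</h3></li>}
puts [extract_answer $sample]

# No answer box: the helper returns an empty string instead of erroring.
puts [string length [extract_answer {<p>no answer box here</p>}]]
```

Checking the return value of regexp before touching desc means a future markup change gives you an empty result instead of a console error.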
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Fri Feb 24, 2012 4:43 pm

I refixed the "time:" section in this post: http://forum.egghelp.org/viewtopic.php?p=98848#98848

Please recopy that part.
I left the stripping in (I don't think it's needed), put that last regsub back in, and replaced its pattern with the new regex that finds the match.

Didn't realize at the time that the regsub is to strip the match criteria so that the next loop(s) will find the "long answer" search results (normal results, not answers).

Honestly, I think this is inefficient. Could you not set a $answerFound variable or something so that the second loop will not go into that section?

You could skip all the elseifs as well making the code more optimal.

Each regex/wildcard match update will only require updating that one line, not the regsub at the end as well to strip it. Each of those would just have "set answerFound 1" at the end and never have to edit that part again.

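(pseudo code.. don't know TCL that well)
Code:
set answerFound 0
# ...then, inside the results loop:
# ($someAnswerRegex is a placeholder for whichever answer pattern applies)
if {!$answerFound} {
  # do answer stuff in here
  if {[regexp -- $someAnswerRegex $html - desc]} {
    # blah blah
    set answerFound 1
  }
} else {
  # regular search results here
}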


Not familiar enough with this code yet to say if that is possible, but it should be with some restructuring.
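
For what it's worth, the flag idea can be tried out in isolation. The sketch below is a self-contained toy, not a patch for the script: the proc name collect_results, the class=answer marker, and the sample markup are all invented for illustration. The answer branch fires at most once because of the flag, so nothing has to be stripped from $html to stop it matching again; only regular results are consumed with regsub.

```tcl
# Toy model of the results loop (all names and markup are hypothetical).
proc collect_results {html {max 3}} {
  set results {}
  set answerFound 0
  for {set i 0} {$i < $max} {incr i} {
    if {!$answerFound && [regexp -- {<div class=answer>(.+?)</div>} $html - desc]} {
      # Answer section: remember that it fired instead of stripping it.
      set answerFound 1
      lappend results $desc
    } elseif {[regexp -- {<h3 class=r>(.+?)</h3>} $html - desc]} {
      # Regular result: keep it, then remove it so the next pass finds the next one.
      lappend results $desc
      regsub -- {<h3 class=r>.+?</h3>} $html "" html
    } else {
      break
    }
  }
  return $results
}

set html {<div class=answer>3:38am Thursday (EST)</div><h3 class=r>First</h3><h3 class=r>Second</h3>}
puts [join [collect_results $html] " | "]
```

Unlike the pseudo code above, this version also falls through to the regular results when no answer section matches at all.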
speechles
Revered One


Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Fri Feb 24, 2012 11:49 pm

Quote:
<~TommyTom> !g average penis length
<~TTBot> 1,410,000 results | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size | Human penis size - Wikipedia, the free encyclopedia <di @ http://en.wikipedia.org/wiki/Human_penis_size
<~TommyTom> !g test
<~TTBot> 3,410,000,000 results | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/ | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/ | Test.com Web Based Testing and Certification Software v2.0 @ http://test.com/
<~TommyTom> !g test pdf
<~TTBot> 1,820,000,000 results | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf | [PDF] PDF Test Page www.educati @ http://www.education.gov.yk.ca/pdf/pdf-test.pdf


This is not fixing anything. Notice it's the same 3 results. Instead of clogging my thread with all this FIX-THIS-PART bullshit, please CREATE YOUR OWN THREAD. I plan on fixing this script correctly, myself. Hence this thread should not contain a flood of fixes by anyone else. Does this make it clear? Do we all understand? I've had some work issues, health issues, and other real life things happening in my life. Please take it upon yourself to create yourself a new thread and stop cluttering mine with this junk...

Tommytom, thanks for the effort but it is not correct, breaks the multi-language features, and appears rough and messy. Worst of all, it is not based on the latest google which I have not made public yet :P. It uses webby's encoders to correct encodings. And it fixes quite a few issues. The more people tamper with my thread, the longer I shall delay releasing that version here.

When I release fixes, It will be a complete script. If you want to continue this "edit here", "find this"... Please do it in ANOTHER thread... Please GO BACK and REMOVE your posts from my thread. Re-create them in your own. This makes my thread look like utter SH!T having all this bullsh!t....

You said yourself, you don't understand the code and are making shots in the dark at fixing it. I know the code, I have a debug version which makes this easy. I know the script like the back of my hand.

Can a moderator please remove all of these posts, this one included, any that were made after this date --> Posted: Sun Feb 05, 2012 12:05 am

Thanks to that moderator aplenty. ;)
_________________
speechles' eggdrop tcl archive
spithash
Master


Joined: 12 Jul 2007
Posts: 248
Location: Libera

Posted: Sat Feb 25, 2012 12:28 am

speechles is right. this doesn't fix anything. it's just a mess up. it will only show the same result 3 times.

I don't want to be rude to anybody but, I will say this clearly: patience is gold.

there are reasons some things don't happen when we want them to happen. most of the time, if they did happen when we wanted them to, disaster would come along. despite that, we always, but always, have to think twice.

why is this script one of the most wanted?
the answer is that because it's so awesome that everyone is using it.

another question is, why don't we all let the coder himself fix it when he is ready? BECAUSE WE'RE ALL TRYING TO STEAL SOME OF HIS GLORY AND/OR RUIN THE SCRIPT. seriously, guys, I would never speak to you like this but it has come to the point where nobody has any patience for things that are nothing more to us than a hobby

I hope I didn't offend anybody.

peace.

PS: I personally admire some people's effort on fixing stuff, but as speechles said, keep it out of here, since this is the official thread for the script.
_________________
Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2857

Posted: Sat Feb 25, 2012 12:40 pm

Moderated: Split from original thread UNOFFICIAL incith-google 2.0.0c (Sep9,2o11).

/NML_375

_________________
NML_375, idling at #eggdrop@IrcNET
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Sun Feb 26, 2012 10:57 am

Man, you guys are pretty ungrateful.
Some guy tries to help fix things and you boot him out.

I guess I will just leech then.

I'm not fixing it perfectly, sure, but I am giving hotfixes for those that actually want the script to work (even partially), not 90-99% broken.

I don't see why you have an unreleased script if it is so perfect/better than my fixes. Kinda rude yourself to keep something like that to yourself if it works.

Anyways, if anyone wants me to continue to share, PM or reply.
Doubt I will be back.

Edit: Thanks for the feedback on the duplicated results (didn't notice in my haste), but I still don't like the attitude. I would have made it better, but NVM.
xREVx
Voice


Joined: 25 Jul 2007
Posts: 15

Posted: Sun Feb 26, 2012 11:21 am

No need to walk away like that, tommytom. This is a forked thread so we should be allowed to do whatever we want here.

This could be the thread where people get fixes faster ;)

I'm grateful for what he has been doing, I've been using the script for quite a while now but I also agree he's pretty arrogant sometimes, and maybe because he thinks he's so great he's also too afraid of making mistakes, and that could be why he takes ages to release a fix...

I've shared my quick fix with the only intention of helping the community, because I thought it would be a bad attitude on my part if I kept the fix to myself while there are many others using the script and needing the fix.

Anyways, please keep doing it. Also, as a suggestion, maybe it'd be easier for everyone if you hosted the whole tcl file and just said what you've changed in it :)
tommytom
Voice


Joined: 09 Sep 2011
Posts: 16

Posted: Sun Feb 26, 2012 11:59 am

Well, my fixes are experimental and only to get parts of it working. I have edited my copy so much, I will eventually have to take a clean copy and put my own fixes back in. I don't want to share this mangled copy I have.

I will make a fully edited beta script or something later maybe but my script has some custom bug fixes only for my editor (notepad++) that no one else would need.

Do you have a better suggestion for a syntax highlighting editor (not many for TCL, I'm sure)? notepad++ isn't that great (for TCL anyways). elseif isn't colored and unclosed regex quotes break the coloring (until another quote is found, the lines in between are colorless, as if the whole thing were quoted text). I currently have to add ;#" at the end of problem regex lines to fix the lines after them.
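
For anyone curious why the ;#" trick is harmless, here is a tiny sketch with a made-up sample string. To Tcl, the ; ends the command and everything after # is a comment, so the extra quote only exists to rebalance the editor's highlighter.

```tcl
# Made-up sample; the regex below contains a single, unbalanced quote.
set html {<a href="http://test.com/>Test</a>}

# The trailing ;#" is a comment to Tcl and a closing quote to the editor.
regexp -- {href="(.+?)>} $html - link ;#"

puts $link
```

The regexp behaves exactly the same with or without the trailing comment.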

I'm more of a utilitarian. I get it working, nothing more.
If I have to break a few things to get it working partially (over not at all), then I will. However, I don't want to turn that into a collection of these and call it some special script. If speechles wants to take my regexes or something to fix the main one "properly" without breaking the other languages, he can. (I don't care about those. We have google translate script(s) and mostly everyone speaks/reads English, and 100% do in my channel.) My goal was to get the important parts working, even partially, so that speechles could take SOMETHING out of it if he hadn't fixed it already.

That said, if you want to take a clean copy (pre-fork), apply the fixes, and share it, then feel free. I really don't care. I'm fixing it for myself (and my channel) and posting how I did it. If someone has a better way of doing it, I don't care. Do it and share it if you like.
All times are GMT - 4 Hours