speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Sat Jan 12, 2008 6:22 pm
I can actually do one better than that. I've expanded the Wikipedia section of the script to allow for regional dialects within each country. Below is a small demonstration of how this looks on IRC. | Quote: | <speechles> !w .sr@sr-el dilan dog
<sp33chy> Dilan Dog | Dilan Dog (engl. Dylan Dog) je lik iz stripa koji je stvorio italijanski pisac i novinar Ticijano Sklavi za italijansku izdavačku kuću Serđo Boneli Editore, tj. Boneli Komiks. Prvi broj je izašao u Italiji 1986. Dilan Dog je već nakon nekoliko brojeva postigao veliku popularnost. Početkom 1990ih, Dilan je u celom svetu izlazio u mesečnom tiražu većem od 1.000.000 primeraka. Dark
<sp33chy> Hors Komiks je izdavao englesku verziju Dilana Doga, a u bivšoj Jugoslaviji objavljivao ga je novosadski Dnevnik. Trenutno Ludens objavljuje stripove u Hrvatskoj i Srbiji. @ http://sr.wikipedia.org/sr-el/%D0%94%D0%B8%D0%BB%D0%B0%D0%BD_%D0%94%D0%BE%D0%B3 [1 Redirect(s)] |
!wikipedia [.country.code[@region-dialect]] <search terms>
The syntax hasn't changed, only been expanded upon. The parts that aren't bolded are not required; in cases where they are omitted, defaults will be applied.
This is only a Wikipedia update to address the problem with languages such as Serbian above (the Wikimedia addition is coming later); it's just something in the meantime, up for a beta test... You WILL need to check the encode strings section and change the entry given for sr-el. Otherwise it will fall back to utf-8, as I presently have it set, and that displays badly on IRC as you can see in my quote above. You're better off using a proper encoding. All regional-dialect and country encodings can be added to or expanded upon in this way.
Get the new script HERE <v1.9.7a>. If you have problems or see an obvious bug or two, let me hear about it.
Also, as a side note you might not already know: to force this as the default for Latin-script Serbian, set the config like so: variable wiki_country "sr@sr-el". Wikipedia will then behave this way without using switches; this changes the default behavior. Hope you understand. |
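To make that concrete, the two relevant config pieces would look something like this (the iso8859-2 value is only an example on my part, not a tested recommendation; use whatever encoding actually displays correctly for your clients): | Code: | # default Wikipedia country/dialect, applied when no switch is given
variable wiki_country "sr@sr-el"

# give the sr-el dialect tag a real encoding instead of letting it fall to utf-8
# NOTE: iso8859-2 is just an illustrative value here
variable encode_strings {
sr:utf-8
sr-el:iso8859-2
} |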
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Mon Jan 14, 2008 4:09 pm
Okay, I had some time to catch Wikimedia up to the level where Wikipedia now is. Wikimedia has now been expanded with some interesting things.
!wikimedia [.wiki.site.to.parse.com[@[+]language]] <search terms>
There are now 3 possible ways to use wikimedia (for those who aren't native English speakers):
!wikimedia .sr.wikipedia.org - standard wiki search using standard eggdrop encoding.
!wikimedia .sr.wikipedia.org@sr-el - standard wiki search using sr-el encoding.
!wikimedia .sr.wikipedia.org@+sr-el - complete /sr-el/ search as well as using sr-el encoding.
| example from irc wrote: | <speechles> !wm .sr.wikipedia.org@+sr-el something
<sp33chy> Something to Remember | Something to Remember (srp. Нешто за памћење) je Madonina kompilacija najboljih balada, izdata 7. novembra 1995. godine od strane Warner Bros. Records. Album sadrži i tri nove pesme: I Want You, You'll See i One More Chance, kao i remiks pesme Love Don't Live Here Anymore sa albuma Like a Virgin. Prodata je u oko 8 miliona primeraka. @
<sp33chy> http://sr.wikipedia.org/sr-el/Something_to_Remember [1 Redirect(s)] |
Now, I know the encoding above isn't correct; it's using UTF-8 for sr-el at the moment (Serbian users should correct this in the encode_strings section of the config). The @ signifies you wish to use the encoding for that language, while @+ signifies a complete search as well as the encoding for that language. Both follow the encode_strings section. So it is now possible to use customized encoding tags with wikimedia.
Get the new script HERE <v1.9.7a>
Additionally, to force a language together with a wiki site as the default (which then requires no switches), set your config like so: variable wikimedia_site "wiki.yoursite.net@+lang", where lang is defined in the encode_strings section of the config; this is something users need to tailor themselves. A concrete example is sketched just below.
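For instance, to make the Serbian Wikipedia with sr-el handling the default (just an example configuration; substitute your own site and encoding tag): | Code: | # default site for !wikimedia, with a complete sr-el search and encoding
variable wikimedia_site "sr.wikipedia.org@+sr-el"

# the tag after @+ needs a matching entry in encode_strings
# (utf-8 shown only as a placeholder; see the note about sr-el above)
variable encode_strings {
sr-el:utf-8
} |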
One day I'll write a user's manual of sorts to explain the expanded options available under each trigger, so that at least the person setting up the bot understands its complete functionality and intended behaviors. 
Last edited by speechles on Mon Jan 21, 2008 4:45 pm; edited 1 time in total |
BeBoo Halfop
Joined: 26 Sep 2007 Posts: 42
Posted: Tue Jan 15, 2008 12:18 pm
Having some problems with weather. Normal searching works fine but I get this in DCC chat with bot:
11:17 optix: [08:17] Tcl error [incith::google::public_message]: can't read "w5": no such variable
Did google tweak their site again?
Thanks!! |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Tue Jan 15, 2008 2:30 pm
| BeBoo wrote: | Having some problems with weather. Normal searching works fine but I get this in DCC chat with bot:
11:17 optix: [08:17] Tcl error [incith::google::public_message]: can't read "w5": no such variable
Did google tweak their site again?
Thanks!! |
Wow, it seems overnight Google decided to change a ton of <p class=e> into <div class=e>. In fixing all of these and checking what I could, I found this affected most of the Google OneBox results. It has now been corrected, as you can see below. Everything should work as it did before. | Quote: | <speechles> !g .ca weather:ontario
<sp33chy> 14,600 Results | Weather for Ontario: -17C, Wind: NE at 6 km/h, Humidity: 86%; Forecast: Tue, Cloudy (-13C|-18C); Wed, Mostly sunny (-3C|-6C); Thu, Chance of snow (-3C|-8C)
<speechles> !g misspeled werds intentianly
<sp33chy> Did you mean: misspelled words intentionly. No standard web pages containing all your search terms were found. Your search - misspeled werds intentianly - did not match any documents.
<speechles> !g IBM
<sp33chy> 261,000,000 Results | IBM: (INTL BUSINESS MACH) = $102.37 -0.56 (-0.54%) Jan 15 12:58pm ET Mkt Cap: 141.06B @ http://finance.google.com/finance?q=IBM | News results for IBM @ http://news.google.com/news?q=IBM | IBM United States @ http://www.ibm.com/ | IBM Support & downloads - United State @ http://www.ibm.com/support/
<speechles> !g spell:i like to misspel stuf
<sp33chy> Did you mean: i like to misspell stuff |
Notice the weather in the first query? Or the third query above with IBM? Weather, stocks, and news are OneBox results. All OneBox results should function properly (except for flights and phone number lookups, both silly for IRC use). So IRC-based Google now mirrors/looks/works exactly as web-based Google does.
Get the new script HERE <1.9.7c>
Also included are eBay results that contain 'eBay store' results. If the number of auctions found is less than the number you have set to display, the script will now show these 'eBay store' results after the initial auction results. Underlined store names indicate them clearly, in a style similar to the "did you mean" which Google has always used.
Seems !local also had some issues with Google's recent webpage changes; this has been addressed as well... enjoy |
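For anyone curious what a breakage like this amounts to, here is a tiny illustrative sketch, not the script's actual parsing code (the real thing is considerably more involved): the parser just has to accept the new <div class=e> wrapper alongside the old <p class=e> one. | Code: | # minimal sketch: widen the OneBox match to accept <p class=e> or <div class=e>
set html {<div class=e>Weather for Ontario: -17C, Wind: NE at 6 km/h</div>}
set onebox ""
regexp -nocase {<(?:p|div) class=e>(.*?)</(?:p|div)>} $html -> onebox
puts $onebox   ;# prints the OneBox text regardless of which tag google uses |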
MellowB Voice
Joined: 23 Jan 2008 Posts: 24 Location: Germany
Posted: Wed Jan 23, 2008 6:09 am
So let this be my first posting around here then. :S
First off, big thanks for all the great work you invest in this script, speechles.
The google script sure is one of the most useful scripts for eggdrops and is heavily used/abused on our channel. ^^
So yeah, great work there. :]
But the actual reason I'm posting here is a shameless request.
Well, is there any way you could get the script to run in a default UTF-8 mode? Eggdrops work pretty well with UTF-8 so far, and more and more IRC clients are moving to UTF-8 display by default too. It hopefully will become the de facto standard on most IRC channels in the future.
So yeah, it would be great if the google script could accept search requests for all the different modes in UTF-8 and also display the results as such. Most of the pages queried are in UTF-8 in the first place anyway, so that would just be the logical step. Is there some way this will happen in the future, or is there even some way already? Or would it be too much work to change the whole script to UTF-8?
Would be glad to hear about that so yeah, cya around. ^^ _________________ On the keyboard of life, always keep one finger on the ESC key. |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Wed Jan 23, 2008 9:31 pm
| MellowB wrote: | | Would be glad to hear about that so yeah, cya around. ^^ | The problem is that eggdrop has obvious flaws when using UTF-8 (bugs, you might call them) which cause it to behave irrationally. The workaround is either patching eggdrop's source, as mentioned earlier here, or constructing an encoding conversion system that works around it by pointing each language intelligently to its correct encoding (not using strictly UTF-8 for all). This script can do the latter. Intelligent conversions depend entirely on your encode_strings config settings. That lookup table uses the same tags as the country switches you would use. If you want UTF-8 for everything, just set "com", as well as whatever countries you use in your defaults, to utf-8 in the encode_strings section. If you leave any of these fields blank, eggdrop will use its internal encoding system, which looks similar to UTF-8.
Example: say you have "com" as the google default and perhaps "com" for wikipedia, and then for ebay you might use "fr". Keep in mind this is an example, not what normal users would do. Anyway, to force UTF-8 for both of these, in the encode_strings section you would need | Code: | variable encode_strings {
com:utf-8
fr:utf-8
# ... etc etc etc on and on as you see fit add more ...
} |
This forces the script to use UTF-8 encoding for all calls made with a "com" switch, and the same for calls with "fr".
Now, if a user instead types !google .it pizza, it will use the internal eggdrop encoding because "it" is an undefined encode_string. I haven't had enough time to test every encoding with eggdrop, which is why the encode_string table is so small. But feel free to add more as you see fit to correct this limitation. If you can get over the initial learning curve of setting up this script (there are a lot of variables, indeed) then you should be able to understand what I've said.. hopefully  |
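To give a feel for what that table does internally, here is a rough conceptual sketch of the tag-to-encoding lookup (this is not the script's actual proc, just an illustration you can paste into a tclsh): | Code: | # illustrative only - not the real incith-google code
# the config list, in the same tag:encoding form the script uses
set encode_strings {
com:utf-8
fr:utf-8
}

proc lookupEncoding {switch} {
    global encode_strings
    foreach entry $encode_strings {
        # each entry looks like tag:encoding
        set parts [split $entry ":"]
        if {[lindex $parts 0] eq $switch} {
            return [lindex $parts 1]
        }
    }
    # undefined tag: empty string, meaning eggdrop's internal encoding is used
    return ""
}

puts [lookupEncoding com]   ;# utf-8
puts [lookupEncoding it]    ;# (empty - "it" is undefined, so eggdrop handles it) |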
MellowB Voice
Joined: 23 Jan 2008 Posts: 24 Location: Germany
Posted: Thu Jan 24, 2008 5:34 am
Thanks for the help there, I'll sure try that later today.
I think I already fixed some encoding issues with the Eggdrop itself (I remember changing some Tcl source back then), so maybe my bot is already ready and just the script itself was the culprit so far. So yeah, let's see if I can get that done today; I was already able to get Japanese results with the wikimedia encoding language triggers.
But besides that, YouTube search bailed today. Seems they did some huge changes to the page in the last few hours or whatever; I'm getting a
| Code: | | [10:26] Tcl error [incith::google::public_message]: can't read "reply": no such variable |
from the bot now if I do a !yt search. So yeah, looking forward to an update. ^^ _________________ On the keyboard of life, always keep one finger on the ESC key. |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Thu Jan 24, 2008 9:29 am
| MellowB wrote: | But beside that, YouTube search bailed today. Seems they did some huge changes to the page in the last few hours or whatever, getting a
| Code: | | [10:26] Tcl error [incith::google::public_message]: can't read "reply": no such variable |
from the bot now if i do a !yt search. So yeah, looking for an update. ^^ |
YouTube corrected; get it at any v1.9.7c link above.. have a fun  |
MellowB Voice
Joined: 23 Jan 2008 Posts: 24 Location: Germany
Posted: Sat Jan 26, 2008 11:47 am
Ok, just updated and the encoding strings thing is nice, works like a charm.
But there is still a problem if we give the bot some unicode request like:
| Quote: | [16:42:34] <MellowB> !yt 榎本くるみ
[16:42:36] <Cocco> No Videos found for '' |
or
| Quote: |
[16:43:45] <daniel[> !yt Hkon
[16:43:46] <Cocco> No Videos found for ''
|
Could this be fixed or is that a problem with my bot not being able to read the unicode from the channel correctly?
Output now works fine like:
_________________ On the keyboard of life, always keep one finger on the ESC key. |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Sun Jan 27, 2008 3:15 pm
| MellowB wrote: | Ok, just updated and the encoding strings thing is nice, works like a charm.
But there is still a problem if we give the bot some unicode request like:
| Quote: | [16:42:34] <MellowB> !yt 榎本くるみ
[16:42:36] <Cocco> No Videos found for '' |
or
| Quote: |
[16:43:45] <daniel[> !yt Hkon
[16:43:46] <Cocco> No Videos found for ''
|
Could this be fixed or is that a problem with my bot not being able to read the unicode from the channel correctly?
Output now works fine like:
|
I finally had some time to dig into why this was working correctly for some triggers and not for others. I've figured out the reason, and once I get off work tonight I can thoroughly bug-test it and eliminate the debug code.. But for something to see, look below; it does indeed fix the problem.. YAY! Simply adding a urlencoder/encoding routine solves the problem (input: is the debug code) and yeah... my IRC client doesn't support utf-8 so it looks funny, but believe me, this is correct!.. This will also be added to wikipedia/wikimedia to support their codes, which use . instead of %
Tonight I'll have a new script for everyone to try that corrects this long-standing flaw.  |
MellowB Voice
Joined: 23 Jan 2008 Posts: 24 Location: Germany
Posted: Sun Jan 27, 2008 4:34 pm
Hooray! Good news.
Looking forward to that and huge thanks for your awesome work there. <3 _________________ On the keyboard of life, always keep one finger on the ESC key. |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Mon Jan 28, 2008 2:54 pm
| Quote: | <speechles> !g HkHkHkHkHk
<sp33chy> Your search - HåkHåkHåkHåkHåk - did not match any documents.
<speechles> !g Hkon
<sp33chy> 300,000 Results | Lie, Håkon Wium @ http://people.opera.com/howcome/ | Haakon VII of Norway - Wikipedia, the @ http://en.wikipedia.org/wiki/Haakon_VII_of_Norway | Haakon County, South Dakota - Wikipedi @ http://en.wikipedia.org/wiki/Haakon_County,_South_Dakota | MySpace.com - Haakon Ellingsen - NO - @ http://www.myspace.com/haakonellingsen
<speechles> !w Hkon
<sp33chy> Haakon | Look up Haakon in Wiktionary, the free dictionary. Haakon (also spelled Hkon, Hakon, Hkon, Hkan or Haco) is the modern Norwegian form of the Old Norwegian masculine first name Hkon meaning "High Son" from h (high) and konr (son). Haakon was the name of seven kings of Norway (see Norwegian royalty). King Haakon I of Norway, Haakon the Good. King Haakon Magnusson of Norway. King
<sp33chy> Haakon II of Norway, Haakon Herdebrei. King Haakon III of Norway, Haakon Sverreson. King Haakon IV of Norway, Haakon the Old. King Haakon V of Norway, Haakon V Magnusson. King Haakon VI of Norway, Haakon VI Magnusson. King Haakon VII of Norway, Christian Frederik Carl Georg Valdemar Axel. Haakon, Crown Prince of Norway, @ http://en.wikipedia.org/wiki/H%C3%A5kon
<speechles> !w Hkon#HkonHkonHkonHkon
<sp33chy> Wikipedia Error: Manual Sub-tag (H.C3.A5konH.C3.A5konH.C3.A5konH.C3.A5kon) not found in body of html @ http://en.wikipedia.org/wiki/H%C3%A5kon . |
Finally, this script is able to cope with any language, for both input and output. You can tell by the 4th query in the quote above that I've also added the ability for wikipedia/wikimedia to seamlessly handle encodings on the fly. Subtag look-ups are now fully decoded when displayed. Now, when searching for a subtag, you can use the decoded text and the script will encode your subtag for comparison; this works beautifully. It is invisible and seamless to the user; the script is actually doing a lot of decoding/encoding behind the scenes.
Get the new script HERE <v1.9.8>.
If you spot any bugs or obvious shortcomings, please feel free to shout them out and get them corrected. Remember to 'Have a fun' and 'Enjoy'.
EDIT: For those curious, and for posterity.. here is the code added: | Code: | # URL Encode
# Encodes anything not a-zA-Z0-9 into %00 strings...
#
proc urlencode {text type} {
    set url ""
    foreach byte [split [encoding convertto utf-8 $text] ""] {
        scan $byte %c i
        # encode reserved chars, anything at or below the cutoff
        # (32 for type 0, 64 for type 1) and anything non-ascii
        if {[string match {[%<>"]} $byte] || $i <= [expr {32 + $type * 32}] || $i > 127} {
            append url [format %%%02X $i]
        } else {
            append url $byte
        }
    }
    if {$type == 1} {
        # wikipedia sub-tag form: keep - and . literal, use . in place of %
        return [string map {%2D - %2E . % .} $url]
    } else {
        return $url
    }
}

# Wikipedia/Wikimedia subtag-decoder...
# decodes those silly subtags
#
proc subtagDecode {text} {
    set url ""
    regsub -all {\+} $text { } text
    regsub -all {[][\\\$]} $text {\\&} text
    # turn .XX hex pairs back into the characters they encode
    regsub -all {\.([0-9a-fA-F][0-9a-fA-F])} $text {[format %c 0x\1]} text
    set text [subst $text]
    regsub -all "\r\n" $text "\n" text
    foreach byte [split [encoding convertto utf-8 $text] ""] {
        scan $byte %c i
        if { $i <= 32 } {
            append url [format %%%02X $i]
        } else {
            append url $byte
        }
    }
    return [string map {% .} $url]
} |
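If you want to see what the encoder produces, the two modes can be tried straight from a tclsh (outputs noted in the comments; this is just a quick illustration, not part of the script itself): | Code: | # \u00e5 is the letter å, written as an escape so the source encoding doesn't matter
puts [urlencode "H\u00e5kon" 0]   ;# H%C3%A5kon  - plain percent-encoded form
puts [urlencode "H\u00e5kon" 1]   ;# H.C3.A5kon  - dotted wikipedia sub-tag form |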
Last edited by speechles on Mon Jan 28, 2008 4:57 pm; edited 1 time in total |
Domin Halfop

Joined: 10 Jun 2006 Posts: 72
Posted: Mon Jan 28, 2008 3:51 pm
Would it be possible to have the script look in the English wiki when it doesn't find anything in the local language?
That would be a nice feature
Thanks for the great script  _________________ Regards
Domin @ efnet |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
Posted: Mon Jan 28, 2008 4:08 pm
| Domin wrote: | Would it be possible to have the script look in the English wiki when it doesn't find anything in the local language?
That would be a nice feature  |
Not sure that I understand what you mean. Do you mean that if the word isn't found in some language, it should instead look on the English Wikipedia? You realize this would take a very, very long time, don't you? Because Wikipedia involves traversal and redirect hunting, it isn't practical for me, unless I am misunderstanding what you mean. Perhaps you can clarify?
Keep in mind you can already search any Wikipedia language using the country switch, which is why I'm not sure what you mean.. heh  |
Domin Halfop

Joined: 10 Jun 2006 Posts: 72
Posted: Tue Jan 29, 2008 2:33 pm
Well, what I meant was that if I set the local switch for wiki to "da", which is my native language, and then do:
!wiki tcpip
there will be no results, but if I just set it to "en" it will find English descriptions.
What I was suggesting was that it could use the local language first and, if nothing is found, display the English version if that exists, something like the rough sketch below.
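Just to show what I mean (rough pseudo-Tcl only; fetch_wiki is a made-up helper, and I'm only guessing at how the wiki_country setting would actually be read): | Code: | # idea only - not real code from the script
proc wiki_lookup {query} {
    # try the configured local wiki first (e.g. "da")
    set result [fetch_wiki $::wiki_country $query]   ;# fetch_wiki is hypothetical
    if {$result eq ""} {
        # nothing found locally, so fall back to the English wiki
        set result [fetch_wiki "en" $query]
    }
    return $result
} |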
But I have no idea how big a task this is; it was just a suggestion on my part  _________________ Regards
Domin @ efnet |