| Author |
Message |
chronoz Voice
Joined: 19 May 2011 Posts: 2
|
Posted: Thu May 19, 2011 4:21 pm Post subject: Issues with TCL encoding on Eggdrop |
|
|
I have installed Eggdrop on a new Debian server, but it keeps mangling special characters.
Eggdrop is running with utf-8. I have even manually forced the Tcl encoding to utf-8 in the script, and I have tried recompiling Eggdrop following the instructions at http://eggwiki.org/Utf-8.
| Code: | 22:00 <@me> !tr fr I have prepared lots of cookies for the entire family.
22:00 <@bot> J'ai préparé beaucoup de biscuits pour toute la famille.
22:00 <@me> !tr ar The special characters are processed.
22:00 <@bot> êêÃE ÃEùçÃDìé çÃDãÃÂñÃA çÃDîçõé. |
(See also my earlier question, which did not get solved: http://stackoverflow.com/questions/6008280/issues-with-tcl-encoding-on-eggdrop)
| Code: | namespace eval gTranslator {
    # Factor this out into a helper
    proc getJson url {
        set tok [http::geturl $url]
        set res [json::json2dict [http::data $tok]]
        http::cleanup $tok
        return $res
    }
    # How to decode _decimal_ entities; WARNING: high magic factor within!
    proc decodeEntities str {
        set str [string map {\[ {\[} \] {\]} \$ {\$} \\ \\\\} $str]
        subst [regsub -all {&#(\d+);} $str {[format %c \1]}]
    }
    bind pub - !tr gTranslator::translate
    proc translate { nick uhost handle chan text } {
        package require http
        package require json
        set lngto [string tolower [lindex [split $text] 0]]
        set text [http::formatQuery q [join [lrange [split $text] 1 end]]]
        set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=$text"
        set lng [dict get [getJson $dturl] responseData language]
        if { $lng == $lngto } {
            putserv "PRIVMSG $chan :\002Error\002 translating $lng to $lngto."
            return 0
        }
        set trurl "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=$lng%7c$lngto&$text"
        putlog $trurl
        set res [getJson $trurl]
        putlog $res
        #putserv "PRIVMSG $chan :Language detected: $lng"
        set translated [decodeEntities [dict get $res responseData translatedText]]
        putserv "PRIVMSG $chan :[encoding convertto utf-8 $translated]"
    }
} |
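For anyone puzzling over the "high magic" in decodeEntities, here is a standalone sketch of the same proc run on sample input. The input strings are made up for illustration and are not real API responses; only the proc body comes from the script above. The point is that Tcl's special characters are escaped first, so subst only ever runs the [format %c ...] calls that regsub puts in.

```tcl
# Same body as decodeEntities in the script above, defined outside the namespace.
proc decodeEntities str {
    # Escape [, ], $ and \ so subst cannot run anything found in the input.
    set str [string map {\[ {\[} \] {\]} \$ {\$} \\ \\\\} $str]
    # Turn each &#NNN; into [format %c NNN], then let subst evaluate it.
    subst [regsub -all {&#(\d+);} $str {[format %c \1]}]
}

# Hypothetical inputs, not taken from a real API response.
set plain  [decodeEntities "J&#39;ai pr&#233;par&#233;"]
set tricky [decodeEntities {[bad] $x &#233;}]

puts $plain    ;# entities 39 and 233 become an apostrophe and e-acute
puts $tricky   ;# brackets and the dollar sign pass through as literal text
```

Only decimal entities are handled; hexadecimal (&#x...;) and named entities are left as they are.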
|
|
|
 |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Fri May 20, 2011 4:54 pm Post subject: |
|
|
| Quote: | <speechl3s> !tr @fr I am a stranger --swapo
<sp33chy> Google: (auto->fr) Romanian to French translation
<sp33chy> Je suis un étranger
<speechl3s> !tr @fr I am a stranger
<sp33chy> Google: (auto->fr) Romanian to French translation
<sp33chy> Je suis un étranger
<speechl3s> !tr @zh I am a stranger
<sp33chy> Google: (auto->zh) Malay to Chinese translation
<sp33chy> 我是一个蠕虫
<speechl3s> !tr en@zh I am a stranger
<sp33chy> Google: (en->zh) English to Chinese translation
<sp33chy> 我是个陌生人
<speechl3s> !tr @ar I am a stranger
<sp33chy> Google: (auto->ar) English to Arabic translation
<sp33chy> أنا غريب
<speechl3s> !tr @fr I have prepared lots of cookies for the entire family.
<sp33chy> Google: (auto->fr) English to French translation
<sp33chy> J'ai préparé beaucoup de biscuits pour toute la famille.
<speechl3s> !tr @fr I have prepared lots of cookies for the entire family. --swapo
<sp33chy> Google: (auto->fr) English to French translation
<sp33chy> J'ai préparé beaucoup de biscuits pour toute la famille.
<speechl3s> !tr @ar The special characters are processed.
<sp33chy> Google: (auto->ar) English to Arabic translation
<sp33chy> تتم معالجة الأحرف الخاصة.
<speechl3s> !tr تتم معالجة الأحرف الخاصة
<sp33chy> Google: (auto->en) Arabic to English translation
<sp33chy> Special characters are processed |
Let me (the google magician) explain a few things. Notice that my translate script works correctly, and that I can also break it so it behaves like yours does.
So let me explain what you are doing wrong...
You've patched your bot, hopefully correctly. Let's move on.
| Code: | | set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=$text" |
First off, the URL has no Google encoding parameters in it. Why do you not use &ie= and &oe= to tell Google your input and output encodings? Also, your http::formatQuery call is given "q" and $text, so its result already starts with q=. Why do you then add q=$text, which sends q=q=...??! O_o;;?!
Change that to look like it does below...
| Code: | | set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&ie=utf-8&oe=utf-8&$text" |
Then... | Code: | | set text [http::formatQuery q [join [lrange [split $text] 1 end]]] |
You call http::formatQuery before you've ever configured your URL encoding to utf-8. WHY?
Change it to look like below...
| Code: | http::config -urlencoding {utf-8}
set text [http::formatQuery q [join [lrange [split $text] 1 end]]] |
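A minimal sketch of how the two fixes above could be combined, assuming http 2.5 or later is available. The variable name phrase and its value are hypothetical stand-ins for the text typed after the language code. Passing every parameter through http::formatQuery means no "&" or "q=" has to be typed by hand, which avoids the doubled q= from the original script.

```tcl
package require http 2.5

# Encode query values as UTF-8 before percent-escaping them.
http::config -urlencoding utf-8

# Hypothetical input standing in for the text after the language code.
set phrase "caf\u00e9"

# formatQuery takes alternating keys and values and joins the pairs with "&".
set query [http::formatQuery v 1.0 ie utf-8 oe utf-8 q $phrase]
set dturl "http://ajax.googleapis.com/ajax/services/language/detect?$query"

puts $dturl
```

The e-acute should come out as its two UTF-8 bytes, percent-encoded. How punctuation such as "." and "-" is escaped, and whether the hex digits are upper or lower case, can differ between http package versions.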
Finally, and this is the one likely causing your issue...
| Code: | | putserv "PRIVMSG $chan :[encoding convertto utf-8 $translated]" |
Why are you converting the encoding at all? Your bot's system encoding is already utf-8 if you've properly patched your eggdrop. Converting again causes a double conversion and mangles your characters.
Change this as well to look like below:
| Code: | | putserv "PRIVMSG $chan :$translated" |
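To see what a double conversion does, here is a small sketch in plain Tcl, with no eggdrop involved. It assumes, as described above, that the patched bot already encodes outgoing text as utf-8, so a manual encoding convertto acts as a second pass. The sample string is an assumption chosen for illustration.

```tcl
# "prepare" with two e-acute characters, as an ordinary Tcl string.
set translated "pr\u00e9par\u00e9"

# First pass: characters become UTF-8 bytes (e-acute takes two bytes).
set once [encoding convertto utf-8 $translated]

# Second pass: each byte from the first pass is treated as a character in
# the range U+0000..U+00FF and encoded again, so e-acute grows to four bytes.
set twice [encoding convertto utf-8 $once]

puts [string length $translated]   ;# 7 characters
puts [string length $once]         ;# 9 bytes
puts [string length $twice]        ;# 13 bytes
```

A client that decodes the 13-byte version as utf-8 gets two junk characters in place of each e-acute.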
Finally, change this line below | Code: | | package require http |
to this one.. | Code: | | package require http 2.5 |
The -urlencoding option was added in http version 2.5.
Make all of these changes and you will notice your translate script now works. No thanks required. I found out how to make perfect utf-8 several years ago. |
|
|
 |
chronoz Voice
Joined: 19 May 2011 Posts: 2
|
Posted: Sat May 21, 2011 7:54 pm Post subject: |
|
|
Thanks! Removing the [encoding convertto utf-8 $translated] indeed solved the issue. I added it before recompiling eggdrop for utf-8 to see if it would solve the issue.
Recompiling eggdrop for utf-8 actually solved the issue, while adding [encoding convertto utf-8 $translated] broke it again.
Appreciate the thorough response! |
|
|
 |
speechles Revered One

Joined: 26 Aug 2006 Posts: 1398 Location: emerald triangle, california (coastal redwoods)
|
Posted: Sat May 28, 2011 8:16 pm Post subject: |
|
|
Also, seems your script needs food badly, it is about to die. Remember, don't shoot food. Red elf needs food badly. Red elf is dead. When does this happen you ask? Well... deprecation has already begun in the form of rate limits. On December 1st, 2011 it's put to rest entirely.
See here -> http://code.google.com/apis/language/translate/overview.html
It's dead, Jim.
But keep in mind, that there is another way...
"It is important to stress that the Translate API is not the Google Translate web site, nor the Google Translate Web element."
So Incith:Google's translate function will continue to function as it always has...
/me smiles gleefully and feels a bit sad for chronoz at the same time.. |
|
|
 |
|