UTF-8 input being garbled [solved]

artaslove
Voice
Posts: 4
Joined: Tue Jun 14, 2016 3:43 pm

Post by artaslove »

Hello,

I'm running eggdrop 1.8, patched to include UTF-8 support. My system LANG is set to UTF-8. My chat client supports UTF-8. The server I'm connecting to supports UTF-8. I have no troubles at all sending any sort of valid UTF-8 character to the channel my bot is connected to.

I have a simple script that connects to Microsoft Translate and returns the translated result. It already works for many languages. However, when translating from Russian, Japanese, Chinese, and possibly others, it runs into a problem.

In my proc, I experimented with "putlog $text" at the very beginning of the proc, where $text is everything the bot thinks the user has entered for that binding.

For example:

User enters: "!trans ru|en очень хорошо"
logfile shows: "ru|en >G5=L E>@>H>"

Naturally this does not work out to be the correct translation.
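For what it's worth, the garbled log line is consistent with each character being cut down to the low byte of its Unicode code point. That is an observation about the bytes, not a confirmed diagnosis of what eggdrop does internally. A minimal Python sketch (the helper name `low_bytes` is made up for illustration) reproduces the logged string:

```python
def low_bytes(text: str) -> str:
    """Keep only the low 8 bits of each code point (hypothetical helper)."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)

# Cyrillic letters sit in U+0400..U+04FF, so dropping the high byte
# leaves plain ASCII characters behind.
garbled = low_bytes("очень хорошо")
print(garbled)  # >G5=L E>@>H>
```

If that is what is happening, the damage occurs before the script ever sees $text, so no amount of re-encoding inside the proc can recover the original characters.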

If I urlencode очень хорошо to %D0%BE%D1%87%D0%B5%D0%BD%D1%8C%20%D1%85%D0%BE%D1%80%D0%BE%D1%88%D0%BE and send it to the translator I get the expected result of "very well".
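The percent-encoded string above is simply the UTF-8 bytes of the phrase, each written as %XX. A short Python sketch using the standard library's `urllib.parse.quote` (Python is used here only for illustration, it is not what the bot script runs) shows the same result:

```python
from urllib.parse import quote

# quote() encodes the text as UTF-8 by default and percent-escapes
# every byte that is not an unreserved ASCII character.
encoded = quote("очень хорошо")
print(encoded)
# %D0%BE%D1%87%D0%B5%D0%BD%D1%8C%20%D1%85%D0%BE%D1%80%D0%BE%D1%88%D0%BE
```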

For some reason, some UTF-8 characters work. For example:
User enters: "!trans fr|en très bien"
logfile shows: "fr|en tr▒s bien"

However, the character is correctly percent-encoded to %C3%A8 in that case, and I get the expected result of "very well".
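One possible explanation for the ▒ in the log line above, assuming the bot wrote the single byte 0xE8 for è (this is a guess, not something confirmed in the thread): that byte on its own is not valid UTF-8, so a UTF-8 terminal shows a placeholder glyph. A small Python sketch of that assumption:

```python
# Assumption: the log holds è as the single Latin-1 byte 0xE8.
raw = "très bien".encode("latin-1")

try:
    raw.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    # 0xE8 announces a 3-byte UTF-8 sequence, but "s" is not a
    # continuation byte, so strict decoding fails.
    valid = False

# A lenient decoder substitutes U+FFFD, much like the terminal's ▒.
shown = raw.decode("utf-8", errors="replace")
print(raw, valid, shown)
```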

Thanks for any insight.
Last edited by artaslove on Tue Jun 14, 2016 9:32 pm, edited 1 time in total.

Post by artaslove »

Solved by http://forum.egghelp.org/viewtopic.php?t=18879

I was trying to use the solution at http://eggwiki.org/Bugs/Utf-8 which did not work in my case.

Post by artaslove »

Upon further investigation, while the script posted above did solve the issue with UTF-8 input being garbled, it introduced some other problems with UTF-8 output that I am still working out.

Post by artaslove »

I ended up getting the latest eggdrop 1.8 from GitHub, which doesn't require the script above.

Everything is working well now.