UTF-8 fix script

Support & discussion of released scripts, and announcements of new releases.
Johannes13
Halfop
Posts: 46
Joined: Sun Oct 10, 2010 11:38 am

Post by Johannes13 »

OK, this is for people who can't use the UTF-8 patch because they can't compile Eggdrop themselves (usually on shell providers):

Code: Select all

######
# Copyright Johannes Kuhn <#John @ quakenet>
# This fixes the utf-8 issue on an eggdrop without patch.
# Feel free to distribute and or use.
# No warranty.
#

# Background:
#
# The problem is that eggdrop sometimes treats things as utf-8 strings
# and sometimes as simple byte arrays.
# Almost every string is passed to the Tcl interp,
# which calls an eggdrop command, and this in turn calls the interp again.
# When eggdrop passes a string to the interp, it calls Tcl_Eval.
# Tcl_Eval treats the input string as utf-8,
# but when an eggdrop command is called, it only uses the lower 8 bits
# of each character. This leads to data loss.
#
#
# This script converts all data that is passed to an eggdrop command
# to utf-8, so only the lowest 8 bits are used. When Tcl_Eval is called again,
# it can convert the data back from utf-8.


package require Tcl 8.5

encoding system utf-8
# Ok, here is a problem:
# We need all eggdrop commands.
# The good thing is that all the eggdrop commands are in the global namespace.
# The difficulty is to distinguish between eggdrop commands
# and Tcl commands.
# To find out whether something is a Tcl command, I just create another interp,
# look at the commands there and skip them.
# To make sure that this works, source this script as the first script.
# Otherwise there might be extra commands in the global namespace that we don't know about.
proc initUtf8 {} {
	rename initUtf8 {}
	set i [interp create]
	set tcmds [interp eval $i {info commands}]
	interp delete $i
	set procs [info procs]
	foreach cmd [info commands] {
		if {	$cmd ni $tcmds && $cmd ni $procs
			&& "${cmd}_orig" ni [info commands] 
			&& ![string match *_orig $cmd]
		} {
			# Eggdrop command.
			rename $cmd ${cmd}_orig
			interp alias {} $cmd {} fixutf8 ${cmd}_orig
		}
	}
}
initUtf8
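proc fixutf8 args {
	# Encode every word (command name and arguments) to utf-8 bytes.
	set cmd {}
	foreach arg $args {
		lappend cmd [encoding convertto utf-8 $arg]
	}
	# Run the original command and hand its result, error or return code
	# back to the caller as if the command had been called directly.
	catch {{*}$cmd} res opt
	dict incr opt -level
	return -options $opt $res
}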
It requires at least Tcl 8.5 (I can write a version for 8.4). The script needs to be sourced after all the modules you want to load have been loaded and before any other script is loaded.

It will increase the CPU and memory usage a bit, but better than no utf-8 patch, right?

PS: Modules should not be loaded or unloaded at runtime, because commands that appear later are not replaced.

OK, fixed some things: the script no longer replaces *_orig commands with *_orig_orig. That should fix the .rehash problem.
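
The round trip described in the script's background comments can be tried in a plain tclsh, outside Eggdrop. This is only a minimal sketch of the encoding step, not of Eggdrop's internals; the variable names are made up for the illustration.

```tcl
package require Tcl 8.5

# U+00D1 (N with tilde), written as an escape so the file encoding does not matter.
set text "\u00d1"

# Encode to utf-8: every character of the result is a byte value (0-255).
set bytes [encoding convertto utf-8 $text]
puts [string length $bytes]   ;# 2 (the bytes C3 91)

# Check that no character of the encoded string needs more than 8 bits.
set fits 1
foreach ch [split $bytes ""] {
	if {[scan $ch %c] > 255} { set fits 0 }
}
puts $fits   ;# 1

# Decoding the bytes as utf-8 restores the original character.
set back [encoding convertfrom utf-8 $bytes]
puts [expr {$back eq $text}]   ;# 1
```

Because the encoded string contains only byte-sized characters, nothing is lost when a command keeps just the lower 8 bits of each character.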
Last edited by Johannes13 on Sat Jul 13, 2013 6:15 pm, edited 3 times in total.
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

The script works like a charm my friend and I personally thank you for it :)

The only issue I have with it is that when I rehash the bot, it gives me this error and crashes the bot:

Code: Select all

[16:10] #spithash# rehash
Rehashing.
[16:10] Rehashing ...
[16:10] AllProtection v4.7 successfully unloaded...
[16:10] Désallocation des ressources de Public Quotes System...
[16:10] Listening at telnet port 2600 (all).
[16:10] Loading language "en" from language/gseen.en.lang...
[16:10] Tcl error in file 'eggdrop.conf':
[16:10] can't rename to "dcclist_orig": command already exists
    while executing
"rename $cmd ${cmd}_orig"
    (procedure "initUtf8" line 12)
    invoked from within
"initUtf8"
    (file "scripts/utf.tcl" line 53)
    invoked from within
"source scripts/utf.tcl"
    (file "eggdrop.conf" line 1390)
[16:10] * CONFIG FILE NOT LOADED (NOT FOUND, OR ERROR)
It works fine with a .restart though ;)
Libera ##rtlsdr & ##re - Nick: spithash
Click here for troll.tcl
Johannes13
Halfop
Posts: 46
Joined: Sun Oct 10, 2010 11:38 am

Post by Johannes13 »

Yeah, sorry about that.

I changed the script, which should fix that issue.

Note: you will get problems with other scripts that create Tcl commands that are not procs (with TclOO, namespace ensemble, interp alias, etc.).
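
The reason for that note can be sketched in a plain tclsh: the test in initUtf8 treats every global command that is neither a built-in nor a proc as an Eggdrop command. The names demo_proc, demo_alias and looksLikeEggdropCmd are hypothetical and only used for this illustration.

```tcl
package require Tcl 8.5

# A fresh interpreter holds only the built-in Tcl commands.
set i [interp create]
set tcmds [interp eval $i {info commands}]
interp delete $i

# Hypothetical commands that some other script might create.
proc demo_proc {} { return "proc" }
interp alias {} demo_alias {} demo_proc

# Same test as in initUtf8: not a built-in and not a proc.
proc looksLikeEggdropCmd {cmd tcmds} {
	expr {$cmd ni $tcmds && $cmd ni [info procs]}
}

puts [looksLikeEggdropCmd demo_proc $tcmds]    ;# 0, procs are skipped
puts [looksLikeEggdropCmd demo_alias $tcmds]   ;# 1, the alias would be wrapped
```

So an alias (or any other non-proc command) that exists when the script is sourced would be renamed and wrapped like an Eggdrop command.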
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

That's great news :)

Where exactly is the fixed version of it?
caesar
Mint Rubber
Posts: 3776
Joined: Sun Oct 14, 2001 8:00 pm
Location: Mint Factory

Post by caesar »

He edited his first post, so it's safe to assume he overwrote the previous one. :)
Once the game is over, the king and the pawn go back in the same box.
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

Oh I see, thanks for the heads up ;)

Indeed, it works like a charm :)
Madalin
Master
Posts: 310
Joined: Fri Jun 24, 2005 11:36 am
Location: Constanta, Romania

Post by Madalin »

There seems to be a problem, I think. I loaded this Tcl script for UTF-8 support before all my scripts (it works), but if I rehash, the UTF-8 character is replaced by something else.

For example, I added ''Ñ''. I used list and it was OK, yet after a rehash I had something like ''Ã&#131;Â&#131;Ã&#130;Â&#131;Ã&#131;Â&#130;Ã&#130;Â&#131;Ã&#131;Â&#131;Ã&#130;Â&#130;Ã&#131;Â&#130;Ã&#130;Â&#131;Ã&#131;Â&#131;Ã&#130;Â&#131;Ã&#131;Â&#130;Ã&#130;Â&#130;Ã&#131;Â&#131;Ã&#130;Â&#130;Ã&#131;Â&#130;Ã&#130;Â&#145;'', and at every rehash it was getting bigger and bigger. Does anyone know what the problem is?

Because of this, I'm thinking more and more about compiling an Eggdrop with UTF-8 support instead of using this fix script.
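
The growing string looks consistent with the same text being UTF-8 encoded again and again. The following is a minimal sketch of that effect in a plain tclsh, assuming repeated encoding is what happens; it does not show why a rehash would trigger the extra pass.

```tcl
package require Tcl 8.5

# Start with U+00D1 (N with tilde).
set s "\u00d1"
set lengths {}
for {set pass 0} {$pass < 4} {incr pass} {
	# Each pass treats the previous bytes as characters and encodes them again.
	set s [encoding convertto utf-8 $s]
	lappend lengths [string length $s]
}
puts $lengths   ;# 2 4 8 16

# The last byte stays 0x91 (145), like the &#145; at the end of the quoted string.
puts [scan [string index $s end] %c]   ;# 145
```

Every byte above 127 turns into two bytes on the next pass, so the length doubles each time.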
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

Yes, you always have to .restart.

Let's wait and see whether anyone is available to help with this issue.
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

I think that after I updated my 1.8 source to Eggdrop v1.8.0+preinit, the script stopped working because Eggdrop now fully supports UTF-8.

The thing is that I needed that script to enable UTF-8, because that way I didn't have bolding issues with the incith-google script :(
spithash
Master
Posts: 248
Joined: Thu Jul 12, 2007 9:21 am
Location: Libera

Post by spithash »

Well, I talked to thommey and he gave me this snapshot, which is from before the UTF-8 change in the source. I tried it, and with this script I have UTF-8 again and bolding in incith-google. For whoever is interested in fetching it, here it is:

http://lib.so/eggdrop-1.8.prebytearray.tar.gz (dead link)

Enjoy.
CrazyCat
Revered One
Posts: 1215
Joined: Sun Jan 13, 2002 8:00 pm
Location: France

Post by CrazyCat »

Coming really late...

This script seems really useful. Can I post it on http://blog.eggdrop.fr? A lot of users are bothered by the UTF-8 troubles.

@spithash: you seem to be French-speaking, so why aren't you on my forum? :D