JPB Voice
Joined: 04 Jul 2011 Posts: 2
|
Posted: Mon Jul 04, 2011 1:45 pm Post subject: More info on the crash "glibc detected" |
|
|
Folks -
Here's more info on the crash in eggdrop. What you are seeing is 'glibc detected', meaning that the C library you use is detecting a memory corruption issue. This is a good thing, because memory should not be corrupted.
I have a system built from scratch, and I can recreate the crashing bots at will by using Tcl/Tk 8.5.10. If I back out to Tcl/Tk 8.5.9, everything works fine. It takes only minutes for me to swap back and forth: compiling and installing Tcl/Tk, then rebuilding Eggdrop. I have confirmed that my header files are updating properly. There appears to be some sort of issue between Eggdrop and the latest Tcl; what exactly, I do not know.
But this is why some people see a problem and some don't: many haven't upgraded to Tcl 8.5.10 yet. |
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Mon Jul 04, 2011 2:12 pm Post subject: |
|
|
Hi JPB,
Could you try to get a coredump and do a backtrace on the crash?
See if the crash occurs within the add_builtins function in tclhash.c. If it does, something must have been broken in the Tcl_ScanElement/Tcl_ConvertElement function pair (which seems to have been heavily rewritten in 8.5.10). _________________ NML_375, idling at #eggdrop@IrcNET |
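A side note for anyone following along: core dumps are often disabled by default because the soft RLIMIT_CORE limit is 0, in which case no core file is written on a crash. Normally you would raise that limit from the shell before starting the bot. The same step can be sketched in C with the POSIX getrlimit/setrlimit calls. This is a generic illustration, not Eggdrop code, and the function name enable_core_dumps is made up for the example.

```c
#include <sys/resource.h>

/* Hypothetical helper (not part of Eggdrop): raise the soft core-file
 * size limit to the hard limit so that a crash can leave a core file.
 * Returns 0 on success, -1 on failure. */
static int enable_core_dumps(void)
{
  struct rlimit rl;

  if (getrlimit(RLIMIT_CORE, &rl) != 0)
    return -1;
  rl.rlim_cur = rl.rlim_max;    /* soft limit up to the hard limit */
  return setrlimit(RLIMIT_CORE, &rl);
}
```

Once a core file exists, a debugger such as gdb can load it together with the binary and print the backtrace. If the hard limit itself is 0, this helper succeeds but still leaves core dumps disabled.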
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Mon Jul 04, 2011 2:40 pm Post subject: |
|
|
One more thing,
Could you try editing the add_builtins function (tclhash.c) as shown below, and see if that sorts out the issue with tcl8.5.10?
Code: | void add_builtins(tcl_bind_list_t *tl, cmd_t *cc)
{
  int k, i;
  char p[1024], *l;
  cd_tcl_cmd table[2];
  table[0].name = p;
  table[0].callback = tl->func;
  table[1].name = NULL;
  for (i = 0; cc[i].name; i++) {
    egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                 cc[i].funcname ? cc[i].funcname : cc[i].name);
    k = TCL_DONT_USE_BRACES;
    l = nmalloc(Tcl_ScanElement(p, &k));
    Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
    table[0].cdata = (void *) cc[i].func;
    add_cd_tcl_cmds(table);
    bind_bind_entry(tl, cc[i].flags, cc[i].name, l);
    nfree(l);
  }
} |
|
JPB Voice
Joined: 04 Jul 2011 Posts: 2
|
Posted: Mon Jul 04, 2011 3:33 pm Post subject: I had the same stack trace.... |
|
|
in tclHash as the other users have reported.
I tried your change; it did not help. It still crashes in the call to nfree in add_builtins, which you already know about.
Want any more data? |
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Mon Jul 04, 2011 4:34 pm Post subject: |
|
|
Unfortunately, I still can't reproduce this with tcl8.5.10.
Could you once again modify the add_builtins function as below, and then post the added debug output here?
Code: | void add_builtins(tcl_bind_list_t *tl, cmd_t *cc)
{
  int k, i, size;
  char p[1024], *l;
  cd_tcl_cmd table[2];
  table[0].name = p;
  table[0].callback = tl->func;
  table[1].name = NULL;
  for (i = 0; cc[i].name; i++) {
    egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                 cc[i].funcname ? cc[i].funcname : cc[i].name);
    size = Tcl_ScanElement(p, &k);
    putlog(LOG_MISC, "*", "Allocating %u bytes for builtin \"%s\", flags: %u", size, p, k);
    l = nmalloc(size);
    Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
    table[0].cdata = (void *) cc[i].func;
    add_cd_tcl_cmds(table);
    bind_bind_entry(tl, cc[i].flags, cc[i].name, l);
    nfree(l);
  }
} |
|
thommey Halfop
Joined: 01 Apr 2008 Posts: 74
|
Posted: Fri Jul 08, 2011 5:53 pm Post subject: |
|
|
Hey,
I tracked down the bug and it happens because of a behavioural change between Tcl8.5.9 and Tcl8.5.10.
(If you care for details: Tcl_ScanElement used to overestimate the required space; it was rewritten and apparently no longer always does. The issue is whether the terminating '\0' of the string is included in the estimate: eggdrop's code assumes it is, while the actual return values of Tcl_ScanElement indicate otherwise.)
Here's a patch (patch -p1 < this.patch) to fix the issue:
Code: |
diff -urN eggdrop1.6.20/src/tclhash.c eggdrop1.6.20.fix/src/tclhash.c
--- eggdrop1.6.20/src/tclhash.c 2010-06-29 17:52:24.000000000 +0200
+++ eggdrop1.6.20.fix/src/tclhash.c 2011-07-08 23:45:37.000000000 +0200
@@ -1264,7 +1264,7 @@
   for (i = 0; cc[i].name; i++) {
     egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                  cc[i].funcname ? cc[i].funcname : cc[i].name);
-    l = nmalloc(Tcl_ScanElement(p, &k));
+    l = nmalloc(Tcl_ScanElement(p, &k)+1);
     Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
     table[0].cdata = (void *) cc[i].func;
     add_cd_tcl_cmds(table);
@@ -1282,7 +1282,7 @@
   for (i = 0; cc[i].name; i++) {
     egg_snprintf(p, sizeof p, "*%s:%s", table->name,
                  cc[i].funcname ? cc[i].funcname : cc[i].name);
-    l = nmalloc(Tcl_ScanElement(p, &k));
+    l = nmalloc(Tcl_ScanElement(p, &k)+1);
     Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
     Tcl_DeleteCommand(interp, p);
     unbind_bind_entry(table, cc[i].flags, cc[i].name, l);
|
This has been fixed in Eggdrop 1.6.21; please upgrade instead.
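For anyone curious why a single byte matters, here is a minimal standalone sketch of the same off-by-one. It is not Eggdrop's or Tcl's actual code: scan_len and convert are hypothetical stand-ins for Tcl_ScanElement and Tcl_ConvertElement, written under the assumption described above (the scan step reports the character count without the terminating '\0', while the convert step writes the terminator as well). The sample name "*dcc:help" is just an illustration of the "*%s:%s" pattern.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Tcl_ScanElement: reports how many characters
 * the converted string needs, NOT counting the terminating '\0'. */
static size_t scan_len(const char *src)
{
  return strlen(src);
}

/* Hypothetical stand-in for Tcl_ConvertElement: copies src into dst and
 * appends '\0', so it writes scan_len(src) + 1 bytes in total. */
static size_t convert(const char *src, char *dst)
{
  size_t n = strlen(src);

  memcpy(dst, src, n + 1);      /* n characters plus the terminator */
  return n + 1;                 /* bytes actually written */
}

/* Allocate with the extra byte, as the patch does. The caller frees. */
static char *convert_alloc(const char *src)
{
  size_t need = scan_len(src) + 1;   /* +1 for the terminating '\0' */
  char *dst = malloc(need);

  if (dst == NULL)
    return NULL;
  size_t written = convert(src, dst);
  assert(written <= need);           /* would fail without the +1 */
  return dst;
}
```

Calling convert_alloc("*dcc:help") returns a heap copy that the caller releases with free(). Without the +1, the convert step would write one byte past the end of the allocation, which is the kind of heap overrun that glibc reports when the buffer is later freed.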
Last edited by thommey on Mon Nov 07, 2011 8:20 pm; edited 1 time in total |
|
LadyCuddles Voice
Joined: 12 Jul 2011 Posts: 3 Location: SLC, Utah, USA
|
Posted: Tue Jul 12, 2011 5:57 pm Post subject: |
|
|
Can someone post the deb package with the patch already in it? That way those of us who prefer not to make/make install, and download the -dev packages, can just get the updated bot as we usually do...
Thanks for any, and all, help  |
|
thommey Halfop
Joined: 01 Apr 2008 Posts: 74
|
Posted: Tue Jul 12, 2011 10:02 pm Post subject: |
|
|
Both Debian (unstable) and Ubuntu (oneiric) are still serving an Eggdrop version that was superseded by a new release about a year ago. The packages seem to be unmaintained, or at least the maintainers are not interested in keeping them up to date. Are there unofficial sources of ".deb"s for Eggdrop in its latest stable release (1.6.20)? Otherwise you'd just be patching an old Eggdrop version with one fix, and you'd probably have to keep doing that over and over for bugs that are already fixed in newer Eggdrop releases. Maybe a better place to ask for an updated and/or patched version is the Debian bug tracker? (There is a higher chance of success there, and a greater benefit, as everyone using their package will hit this bug with the new Tcl release.) |
|
LadyCuddles Voice
Joined: 12 Jul 2011 Posts: 3 Location: SLC, Utah, USA
|
Posted: Thu Jul 14, 2011 7:07 am Post subject: |
|
|
thommey, I don't profess to be a guru when it comes to compiling, nor a C coder, nor for that matter a master of deciphering diff output, but your fix is made in the rem_ routine and not the add_ routine, am I correct? And from what I can tell, the only change is made in the rem_ routine, by adding the "+1", right?
I am going to try the source tarball, since, as you said, the deb packages are almost a full minor version behind. |
|
thommey Halfop
Joined: 01 Apr 2008 Posts: 74
|
|
fatalerror Voice
Joined: 24 Jul 2011 Posts: 1
|
Posted: Sun Jul 24, 2011 1:01 pm Post subject: Debian packages |
|
|
Hi there!
I am the Debian developer in charge of the eggdrop Debian package. I'm terribly sorry it took me so long to package eggdrop 1.6.20 and to notice this bug in particular. Please do not take this as a lack of interest; life hasn't been easy for the last couple of years, but I intend to keep in closer contact from now on.
The x86 .deb files for eggdrop 1.6.20 can already be downloaded from http://people.debian.org/~gpastore
These packages have just been uploaded to the Debian Archive and should land on unstable/sid shortly. They've also been uploaded with the urgency attribute set to 'high', so that the fix reaches testing/wheezy soon enough. |
|
Rynet Voice
Joined: 12 Jun 2007 Posts: 4
|
Posted: Fri Sep 30, 2011 9:30 pm Post subject: |
|
|
Run "export MALLOC_CHECK_=4" and it will work. |
|
nml375 Revered One
Joined: 04 Aug 2006 Posts: 2857
|
Posted: Sat Oct 01, 2011 10:21 am Post subject: |
|
|
Actually, that's just sweeping the problem under the rug, and hoping things won't break later on. The bug is well known, and it has been patched/fixed.
Telling malloc/free to ignore the issue will cause crashes further down the execution on certain system setups (see http://forum.egghelp.org/viewtopic.php?t=18528#97131).
Just to emphasize what thommey already pointed out: the Eggdrop packages provided by Ubuntu are still at 1.6.19 (I haven't checked the status of Debian, though), and there have been quite a few other bugfixes since then. If a patched 1.6.20 package is not available for your distribution, you'd almost always be better off compiling the bot yourself (with the patch applied). |
|
neofutur Voice

Joined: 02 Oct 2009 Posts: 6 Location: irc://chat.freenode.net#bitcoin-hosting
|
Posted: Mon May 28, 2012 4:37 pm Post subject: bugreport on gentoo |
|
|
I just hit this bug and filed a bug report in the Gentoo bug tracker:
eggdrop crash after upgrading to tcl-8.5.10-r1
1.6.21 is already available on Gentoo:
http://packages.gentoo.org/package/net-irc/eggdrop
but it is still marked as unstable.
If anyone hits the problem on Gentoo, there are two solutions:
* downgrade tcl to dev-lang/tcl-8.5.9 for eggdrop-1.6.19
* unmask the "unstable" eggdrop-1.6.21
Feel free to post on the Gentoo bug report to have the package maintainers bump the stable version to eggdrop-1.6.21. |
|