egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

More info on the crash "glibc detected"

 
JPB
Voice


Joined: 04 Jul 2011
Posts: 2

Posted: Mon Jul 04, 2011 1:45 pm    Post subject: More info on the crash "glibc detected"

Folks -

Here's more info on the crash in eggdrop. What you are seeing is 'glibc detected', meaning that the C library you use has detected memory corruption. That detection is a good thing, because corrupted memory should never go unnoticed.

I have a system built from scratch, and I can recreate the crashing bots at will by using Tcl/Tk 8.5.10. If I back out to Tcl/Tk 8.5.9, everything works fine and dandy. It takes me only minutes to swap back and forth: compiling and installing Tcl/Tk, then rebuilding Eggdrop. I do know that my header files are updating properly, etc. There appears to be some sort of issue between Eggdrop and the latest Tcl; what it is, I do not know.

But this is why some people see a problem, and some don't. Many haven't upgraded to Tcl 8.5.10 yet.
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2835

Posted: Mon Jul 04, 2011 2:12 pm

Hi JPB,
Could you try to get a coredump and do a backtrace on the crash?
See if the crash occurs within the add_builtins function in tclhash.c. If it does, then something must have been broken in the Tcl_ScanElement/Tcl_ConvertElement function pair of v8.5.10 (which seems to have been heavily rewritten in that release).
_________________
NML_375, idling at #eggdrop@IrcNET
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2835

Posted: Mon Jul 04, 2011 2:40 pm

One more thing,
Could you try editing the add_builtins function (tclhash.c) as shown below, and see if that sorts out the issue with tcl8.5.10?

Code:
void add_builtins(tcl_bind_list_t *tl, cmd_t *cc)
{
  int k, i;
  char p[1024], *l;
  cd_tcl_cmd table[2];

  table[0].name = p;
  table[0].callback = tl->func;
  table[1].name = NULL;
  for (i = 0; cc[i].name; i++) {
    egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                 cc[i].funcname ? cc[i].funcname : cc[i].name);
    k = TCL_DONT_USE_BRACES;
    l = nmalloc(Tcl_ScanElement(p, &k));
    Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
    table[0].cdata = (void *) cc[i].func;
    add_cd_tcl_cmds(table);
    bind_bind_entry(tl, cc[i].flags, cc[i].name, l);
    nfree(l);
  }
}

JPB
Voice


Joined: 04 Jul 2011
Posts: 2

Posted: Mon Jul 04, 2011 3:33 pm    Post subject: I had the same stack trace....

in tclHash as the other users have reported.

I tried your change; it did not help. It still crashes in the call to nfree in add_builtins, which you already know about.

Want any more data?
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2835

Posted: Mon Jul 04, 2011 4:34 pm

Unfortunately, I still can't reproduce this with tcl8.5.10.
Could you once again modify the add_builtins function as below, and then post the added debug output here?

Code:
void add_builtins(tcl_bind_list_t *tl, cmd_t *cc)
{
  int k, i, size;
  char p[1024], *l;
  cd_tcl_cmd table[2];

  table[0].name = p;
  table[0].callback = tl->func;
  table[1].name = NULL;
  for (i = 0; cc[i].name; i++) {
    egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                 cc[i].funcname ? cc[i].funcname : cc[i].name);
    size = Tcl_ScanElement(p, &k);
    putlog(LOG_MISC, "*", "Allocating %d bytes for builtin \"%s\", flags: %d", size, p, k);
    l = nmalloc(size);
    Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
    table[0].cdata = (void *) cc[i].func;
    add_cd_tcl_cmds(table);
    bind_bind_entry(tl, cc[i].flags, cc[i].name, l);
    nfree(l);
  }
}
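
A related check, sketched here for readers, is to compare the size the scan step reported with the bytes the converted string really occupies. This is only an illustration in plain C: missing_bytes and report_estimate are hypothetical helpers, not part of Eggdrop or Tcl, and they assume the converted string was first written into a scratch buffer known to be large enough. "*dcc:+bot" is just an example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Eggdrop or Tcl code): given the size an
 * estimator reported and the string a converter actually produced,
 * return how many bytes the estimate is short by. The terminating
 * '\0' is counted as a required byte. */
static size_t missing_bytes(size_t reported, const char *converted)
{
  size_t required;

  assert(converted != NULL);
  required = strlen(converted) + 1; /* characters plus terminator */
  return required > reported ? required - reported : 0;
}

/* Hypothetical helper: format a one-line report into out.
 * Returns 1 if the reported size is too small, 0 otherwise. */
static int report_estimate(char *out, size_t outlen, const char *name,
                           size_t reported, const char *converted)
{
  size_t missing = missing_bytes(reported, converted);

  snprintf(out, outlen,
           "builtin \"%s\": reported %zu, required %zu, short by %zu",
           name, reported, strlen(converted) + 1, missing);
  return missing != 0;
}
```

A non-zero result for any builtin would suggest that a buffer sized with the reported value alone is too small for the string plus its terminator.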

thommey
Halfop


Joined: 01 Apr 2008
Posts: 73

Posted: Fri Jul 08, 2011 5:53 pm

Hey,

I tracked down the bug; it is caused by a behavioural change between Tcl8.5.9 and Tcl8.5.10.
(If you care for the details: Tcl_ScanElement used to overestimate the required space; it was rewritten and apparently no longer always does so. The issue is whether the terminating '\0' of the string is included in the estimate: eggdrop's code assumes it is, while the actual return values of Tcl_ScanElement indicate otherwise.)
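
For readers following along, the size accounting described above can be sketched in plain C. The sketch does not call Tcl: demo_scan and demo_convert are hypothetical stand-ins that only mimic the contract discussed here (the scan step reports the converted characters without the terminating '\0', the convert step writes those characters plus the terminator), and the escaping rule is an invented simplification, not Tcl's real quoting logic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Invented, simplified escaping rule for this sketch only. */
static int needs_escape(char c)
{
  return strchr(" \\{}[]\"$;", c) != NULL;
}

/* Stand-in for the scan step: number of characters the converted
 * string will contain, NOT counting the terminating '\0'. */
static size_t demo_scan(const char *src)
{
  size_t n = 0;

  for (; *src; src++)
    n += needs_escape(*src) ? 2 : 1;
  return n;
}

/* Stand-in for the convert step: writes the escaped characters
 * followed by '\0' and returns the character count (without '\0'). */
static size_t demo_convert(const char *src, char *dst)
{
  char *p = dst;

  for (; *src; src++) {
    if (needs_escape(*src))
      *p++ = '\\';
    *p++ = *src;
  }
  *p = '\0'; /* this byte is the one the estimate does not cover */
  return (size_t) (p - dst);
}

/* Allocate estimate + 1 so the terminator has a byte of its own.
 * The caller frees the result. Returns NULL if malloc fails. */
static char *demo_quote(const char *src)
{
  size_t n = demo_scan(src);
  char *dst = malloc(n + 1);

  if (dst == NULL)
    return NULL;
  assert(demo_convert(src, dst) == n);
  return dst;
}
```

Under these assumptions, demo_quote("a b") yields the 4-character string a\ b in a 5-byte allocation. Allocating only the reported 4 bytes would leave no room for the terminator, a one-byte overrun of the kind that can surface later when the buffer is freed.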

Here's a patch (patch -p1 < this.patch) to fix the issue:

Code:

diff -urN eggdrop1.6.20/src/tclhash.c eggdrop1.6.20.fix/src/tclhash.c
--- eggdrop1.6.20/src/tclhash.c   2010-06-29 17:52:24.000000000 +0200
+++ eggdrop1.6.20.fix/src/tclhash.c   2011-07-08 23:45:37.000000000 +0200
@@ -1264,7 +1264,7 @@
   for (i = 0; cc[i].name; i++) {
     egg_snprintf(p, sizeof p, "*%s:%s", tl->name,
                  cc[i].funcname ? cc[i].funcname : cc[i].name);
-    l = nmalloc(Tcl_ScanElement(p, &k));
+    l = nmalloc(Tcl_ScanElement(p, &k)+1);
     Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
     table[0].cdata = (void *) cc[i].func;
     add_cd_tcl_cmds(table);
@@ -1282,7 +1282,7 @@
   for (i = 0; cc[i].name; i++) {
     egg_snprintf(p, sizeof p, "*%s:%s", table->name,
                  cc[i].funcname ? cc[i].funcname : cc[i].name);
-    l = nmalloc(Tcl_ScanElement(p, &k));
+    l = nmalloc(Tcl_ScanElement(p, &k)+1);
     Tcl_ConvertElement(p, l, k | TCL_DONT_USE_BRACES);
     Tcl_DeleteCommand(interp, p);
     unbind_bind_entry(table, cc[i].flags, cc[i].name, l);


This has been fixed in Eggdrop 1.6.21; please upgrade instead.


Last edited by thommey on Mon Nov 07, 2011 8:20 pm; edited 1 time in total
LadyCuddles
Voice


Joined: 12 Jul 2011
Posts: 3
Location: SLC, Utah, USA

Posted: Tue Jul 12, 2011 5:57 pm

Can someone post the deb package with the patch already in it? That way those of us who prefer not to make/make install, and download the -dev packages, can just get the updated bot as we usually do...

Thanks for any, and all, help :)
thommey
Halfop


Joined: 01 Apr 2008
Posts: 73

Posted: Tue Jul 12, 2011 10:02 pm

Both Debian (unstable) and Ubuntu (oneiric) are still serving an Eggdrop version that was superseded by a new release about a year ago. The packages seem to be unmaintained, or the maintainers are at least not interested in keeping them up to date. Are there unofficial sources for ".deb" packages of Eggdrop's latest stable release (1.6.20)? Otherwise you'd just be patching an old Eggdrop version with one fix, and you would probably have to keep doing that over and over for bugs that get fixed in Eggdrop releases. Maybe the Debian bugtracker is a better place to ask for an updated and/or patched version? (There is a higher chance of success there, and a greater benefit, as everyone using their package will hit this bug with the new Tcl release.)
LadyCuddles
Voice


Joined: 12 Jul 2011
Posts: 3
Location: SLC, Utah, USA

Posted: Thu Jul 14, 2011 7:07 am

thommey, I don't profess to be a guru when it comes to compiling, nor a C coder, nor, for that matter, a master of deciphering diff output, but your fix is made in the rem_ routine and not the add_ routine, am I correct? And from what I can tell, the only change is made in the rem_ routine, by adding the "+1", right???

I am trying the source tarball, since, as you said, the deb packages are almost a full minor version behind.
thommey
Halfop


Joined: 01 Apr 2008
Posts: 73

Posted: Thu Jul 14, 2011 12:47 pm

The fix has to be applied to both functions, add_ and rem_. The full diff, with enough context to see the function names, is visible here (it is against eggdrop1.8cvs, so don't worry about slight differences and line numbers being off):

http://cvs.eggheads.org/viewvc/eggdrop1.8/src/tclhash.c?r1=1.3.2.3&r2=1.3.2.4&diff_format=l

The fix is indeed just adding +1 in those two spots.
fatalerror
Voice


Joined: 24 Jul 2011
Posts: 1

Posted: Sun Jul 24, 2011 1:01 pm    Post subject: Debian packages

Hi there!

I am the Debian developer in charge of the eggdrop Debian package. I'm terribly sorry it took me so long to package eggdrop 1.6.20 and to notice this bug in particular. Please do not take this as a lack of interest; life hasn't been easy for the last couple of years, but I intend to keep in closer contact from now on.

The x86 .deb files for eggdrop 1.6.20 can already be downloaded from http://people.debian.org/~gpastore

These packages have just been uploaded to the Debian Archive and should land on unstable/sid shortly. They've also been uploaded with the urgency attribute set to 'high', so that the fix reaches testing/wheezy soon enough.
Rynet
Voice


Joined: 12 Jun 2007
Posts: 4

Posted: Fri Sep 30, 2011 9:30 pm

Run "export MALLOC_CHECK_=4" and it will work.
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2835

Posted: Sat Oct 01, 2011 10:21 am

Actually, that's just sweeping the problem under the rug, and hoping things won't break later on. The bug is well known, and it has been patched/fixed.
Telling malloc/free to ignore the issue will cause crashes further down the execution on certain system setups (see http://forum.egghelp.org/viewtopic.php?t=18528#97131).

Just to emphasize what thommey already pointed out, the eggdrop package provided by Ubuntu is still 1.6.19 (I haven't checked the status of Debian, though), and there have been quite a few other bugfixes since then. If a patched 1.6.20 package is not available for your distribution, you'd almost always be better off compiling the bot yourself (with the patch applied).
neofutur
Voice


Joined: 02 Oct 2009
Posts: 6
Location: irc://chat.freenode.net#bitcoin-hosting

Posted: Mon May 28, 2012 4:37 pm    Post subject: bugreport on gentoo

I just hit this bug and filed a bug report on the Gentoo bug tracker:

eggdrop crash after upgrading to tcl-8.5.10-r1

1.6.21 is already available on Gentoo:
http://packages.gentoo.org/package/net-irc/eggdrop
but it is still marked as unstable.

If anyone hits the problem on Gentoo, there are 2 solutions:

* downgrade tcl to dev-lang/tcl-8.5.9 for eggdrop-1.6.19
* unmask the "unstable" eggdrop-1.6.21

Feel free to post on the Gentoo bug report to have the package maintainers bump the stable version to eggdrop-1.6.21 ;)
All times are GMT - 4 Hours