egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Bots are not coming back from ping timeout
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Wed Sep 03, 2008 3:39 am    Post subject: Bots are not coming back from ping timeout

My bots are not coming back after a ping timeout, even though I set up "autobotchk". 45 minutes have passed and they still have not rejoined. Why?

*** Quits: Bot1 (~Bot1@Bot1.users.undernet.org) (Ping timeout)
*** Quits: Bot2 (~Bot2@Bot2.users.undernet.org) (Ping timeout)

Here is what is in the ".autobotchk703288692.32227" file:

Code:
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (.autobotchk2921987196.13566 installed on Fri Aug  1 13:44:55 2008)
# (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (.autobotchk4016117852.13588 installed on Sat May 24 15:17:50 2008)
# (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
0,30 * * * * /home/egg/eggdrop/Bo1.botchk
0,10,20,30,40,50 * * * * /home/egg/eggdrop/Bot2.botchk


What's wrong?
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Wed Sep 03, 2008 3:39 am

I have now killed the PIDs of both bots and started them manually :?
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Wed Sep 03, 2008 9:47 am

Sounds like you've got a broken script.
_________________
NML_375, idling at #eggdrop@IrcNET
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Wed Sep 03, 2008 1:59 pm

nml375 wrote:
Sounds like you've got a broken script.


What should I do? :shock:
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Wed Sep 03, 2008 6:47 pm

Does this happen on a regular basis?
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Thu Sep 04, 2008 2:07 am

nml375 wrote:
Does this happen on a regular basis?

Rarely. I have seen it 3-4 times in 3 months...
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Thu Sep 04, 2008 9:45 am

That makes it rather hard to track down the source of this behaviour :/
At best, try to spot a pattern in the disconnects, i.e. some script or function that is used every time just before the bot disconnects.

I did write a watchdog-enabled botchk script a long time ago, which should be able to handle "frozen" bots. I'll see if I can find those files lying around somewhere...
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Fri Sep 05, 2008 3:22 am

nml375 wrote:
I did write a watchdog-enabled botchk script a long time ago, which should be able to handle "frozen" bots. I'll see if I can find those files lying around somewhere...


I'd really appreciate it ;)
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Sat Sep 06, 2008 7:46 am

I was able to find the watchdog-enabled bot checker.
The watchdog.tcl script should be loaded in your bot. It simply updates the atime (access time) of the pidfile once every minute.

The botchk.tcl script should be invoked from crontab or similar. It runs a series of tests to determine whether your eggdrop is still running or has frozen, and tries to take appropriate measures.
Unfortunately, reading a file also updates its atime, so this script itself alters the timestamp whenever it reads or copies the pidfile. It must therefore not be run more often than the age threshold it checks against (290 seconds in the script). I recommend an interval of 6+ minutes; I use 10 myself. You might get away with 5 minutes.
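
To make the timing constraint concrete, here is a minimal sketch of the age test and of why the cron interval has to be longer than the threshold. The proc names pidfile_age and is_stale and the scratch file are made up for illustration and are not part of botchk.tcl. It also assumes the filesystem actually records access times (for example, that it is not mounted with noatime).

```tcl
# Sketch only: pidfile_age, is_stale and the scratch file are illustrative
# names, not part of botchk.tcl.

# Seconds between the file's last access and 'now' (defaults to the current time).
proc pidfile_age {path {now ""}} {
    if {$now eq ""} { set now [clock seconds] }
    return [expr {$now - [file atime $path]}]
}

# 1 when the age exceeds the threshold (290 seconds in botchk.tcl), else 0.
proc is_stale {path {threshold 290} {now ""}} {
    return [expr {[pidfile_age $path $now] > $threshold}]
}

# Demo: pretend the checker itself read the pidfile at time t0 and the bot is
# frozen, so nothing touches the file afterwards.
set demo [file join [pwd] "pid.demo.[pid]"]
close [open $demo w]
set t0 [file atime $demo]
puts "next run after 4 minutes,  stale: [is_stale $demo 290 [expr {$t0 + 240}]]"
puts "next run after 10 minutes, stale: [is_stale $demo 290 [expr {$t0 + 600}]]"
file delete $demo
```

With a 4-minute interval the pidfile never looks old twice in a row, because the checker's own read refreshed the atime less than 290 seconds earlier. With a 10-minute interval it does.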

There is a tweak at the end of the script that tries to restore the timestamp, but it depends on an external application (touch) and is therefore commented out by default. If touch is available, you can enable it and run the script more often.
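
As a possible alternative to the external touch, Tcl's own "file atime name time" form (present in Tcl 8.4 and later, as far as I know) can set the access time directly. The following is only a sketch under that assumption; read_keep_atime is a hypothetical helper and not part of botchk.tcl.

```tcl
# Sketch only: read_keep_atime is a hypothetical helper, not part of botchk.tcl.
# It reads the first line of a file and then puts the access time back to the
# value it had before the read, using "file atime name time".
proc read_keep_atime {path} {
    set before [file atime $path]
    set fh [open $path r]
    gets $fh line
    close $fh
    # Raises an error if the time cannot be set (e.g. the file is not ours).
    file atime $path $before
    return $line
}

# Demo on a scratch file whose access time is set one hour into the past.
set demo [file join [pwd] "pid.demo.[pid]"]
set fh [open $demo w]
puts $fh 12345
close $fh
set old [expr {[clock seconds] - 3600}]
file atime $demo $old
puts "pid read: [read_keep_atime $demo]"
puts "atime restored: [expr {[file atime $demo] == $old}]"
file delete $demo
```

If this works on your system, the same call could stand in for the commented-out touch line, with $atime being the value the script recorded before it read the pidfile.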

watchdog.tcl
Code:
#Watchdog part of botchk.tcl
#Simply makes eggdrop update the access time of its pidfile
#on a regular basis... (kinda like "touch")

#Settings:
# botpid [path/]pid.bot
#  Tells the script which file that is the pidfile...
#  This should be autodetected by the script at startup,
#  but under certain circumstances, it might fail..
#  IF it does fail, just set this var manually, and it'll
#  work just fine :)
#  PS. The name of the pidfile is pid.$botnet-nick, or; if
#  botnet-nick isn't set; pid.$nick   DS

#set botpid pid.lamestbot



if {![info exists botpid]} {
 if {[info exists botnet-nick]} {
  set botpid "pid.${botnet-nick}"
  putlog "Setting botpid to $botpid using \$botnet-nick"
 } elseif {[info exists nick] && $nick != ""} {
  set botpid "pid.$nick"
  putlog "Setting botpid to $botpid using \$nick"
 } else {
  error "Unable to determine the name of the pid-file!\nPlease check your config-file or watchdog.tcl script..."
 }
}
proc touch {file} {
 set fileID [open $file "RDONLY CREAT"]
 catch {gets $fileID}
 close $fileID
}

proc watchdog_touch {min hour day month year} {
 global botpid
 touch $botpid
# putlog "Touching $botpid"
}

bind time - "* * * * *" watchdog_touch


botchk.tcl
Code:
#!/usr/bin/tclsh
### settings ###
#pidfile "pidfile of bot"
#set pidfile pid.botname

#userfile "userfile of bot"
#set userfile botname.user

#configfile "configfile of bot"
#set configfile botconfig

#botdir "home of your eggdrop"
#set botdir /home/somewhere

#silent 0/1
# 0 = write eggdrop's startup output to stdout, 1 = stay quiet
set silent 0

#lockfile "file to prevent start of your eggdrop"
# Use this whenever you want to stop your bot for a longer while...
set lockfile /home/somewhere/some.lock

### Code ###
#Lets just check that our friendly user supplied all required settings :)
foreach {var what} [list pidfile "Name of pidfile" userfile "Name of userfile" configfile "Name of configfile" botdir "home of your eggdrop" silent "Whether or not to write any output from eggdrop to stdout during startup of bot" lockfile "file to prevent start of your eggdrop"] {
 if {![info exists $var]} {
  puts stdout "Hey buddy!\nYou messed up while configuring the botchk.tcl script!"
  puts stdout "Variable not set: $var - Explanation: $what"
  exit 1
 }
}

#Lets check whether we should check the bot at all...
if {[file exists $lockfile]} {
 exit 0
}

#proc: start_bot
#args: none
#desc: Checks whether the userfile exists
#      (if not, it will try to restore it from
#      backups, etc...) and then start up
#      the bot again.
#      Writes the output from eggdrop to stdout...
#      (if selected)

proc start_bot {} {
 global pidfile userfile configfile silent
 if {![file exists $userfile]} {
  if {[file exists "${userfile}~new"]} {
   file copy "${userfile}~new" $userfile
  } elseif {[file exists "${userfile}~bak"]} {
   file copy "${userfile}~bak" $userfile
  } else {
   puts stdout "Error: Can't find any userfile or backupfile!"
   exit 1
  }
 }
 if {[file executable $configfile]} {
  catch {exec ./${configfile}} temp
 } {
  catch {exec ./eggdrop $configfile} temp
 }
 if {$silent == 0} {
  puts stdout $temp
 }
}

#proc: restart_bot
#args: none
#desc: Checks if there's a bot running (suspected zombie),
#      if so it kills it, then calls start_bot

proc restart_bot {} {
 global pidfile userfile configfile
 set fileID [open $pidfile "RDONLY"]
 set temp [gets $fileID botpid]
 close $fileID
 file delete $pidfile
 if {[file exists ${pidfile}~]} {
  file delete ${pidfile}~
 }
 if {$temp > 0} {
  puts -nonewline stdout "Read pid: $botpid Status: "
  if {[file exists "/proc/$botpid"] && [file owned "/proc/$botpid"]} {
   puts stdout "Exists!\nChecking if it matches our eggdrop: "
   set fileID [open "/proc/${botpid}/cmdline" "RDONLY"]
   gets $fileID temp
   close $fileID
   if {[string compare "$temp" "eggdrop\000./${configfile}"] == 0} {
    puts stdout "Matched! - killing..."
    puts stdout [exec kill -9 $botpid]
   }
  } {
   puts stdout "no such pid!"
  }
 }
 start_bot
}

#Lets go to bot's home...
cd $botdir

#Check if the pidfile exists, if not, call start_bot
if {![file exists $pidfile]} {
 puts stdout "Pidfile $pidfile does not exist!\nGuess bot is not running... Better start her up..."
 start_bot
} {
#Good, the pidfile is there...  Lets check how old it is..
#(4 minutes and 50 seconds should be enough...)
 if {[set time [expr {[clock seconds] - [set atime [file atime $pidfile]]}]] > 290} {

#Too old for comfort...  Lets see if this is the second time in a row that file is too old...
#(to prevent any problems caused by ex. change of the system clock...)
#(if pidfile~ exists, it's the second time in a row that the pid-file is too old...)

  puts -nonewline stdout "Pidfile $pidfile is old... (Hasn't been touched in [expr {$time/60}] minutes)\nChecking for ${pidfile}~... "
  if {[file exists "${pidfile}~"]} {

#It's there... Lets see if it's "real" (should contain the same pid as the real pidfile)
#Danger: reading the pidfile will change its access time, so for this script to work, the "age" check
#        must be less than the interval botchk.tcl is called (currently 10 secs lower than 5 minutes,
#        change "> 290" some lines above to something lower if you get problems)
#        (To put it in other words, don't call botchk.tcl more often than 5 minutes (unless you decrease
#        the "> 290"...)

   puts -nonewline stdout "Found!\nValidating ${pidfile}~... "
   set fileID [open "${pidfile}~" "RDONLY"]
   gets $fileID pid1
   close $fileID
   set fileID [open $pidfile "RDONLY"]
   gets $fileID pid2
   close $fileID


#It's valid... lets restart bot...

   if {$pid1 == $pid2} {
    puts stdout "Valid!\nRestarting..."
    restart_bot
   } {

#It's not valid... better remove it...
    puts -nonewline stdout "Not valid - removing... "
    file delete -- "${pidfile}~"
   }
  } {

#It doesn't exist... lets create it so that we know we've already had this
#problem the next time we check...

   puts -nonewline stdout "Not found!\nCreating new ${pidfile}~... "
   file copy -- $pidfile "${pidfile}~"

  }
 } {
#pidfile is current, lets remove any stray pidfile~...
  if {[file exists ${pidfile}~]} {
   file delete -- "${pidfile}~"
  }
 }
}

#tweak to restore the timestamp after reading...
#depends on touch (haven't found anything in tcl that'll do the trick :/ )
#catch {exec touch -t [clock format $atime -format "%Y%m%d%H%M.%S"] $pidfile} msg
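
The check in restart_bot compares the whole contents of /proc/<pid>/cmdline against one exact string, so it only matches if the bot was started in exactly that form. On Linux that file holds the arguments separated and terminated by NUL bytes. A more tolerant comparison could be sketched as below; all helper names (basename, split_cmdline, read_cmdline, cmdline_matches) are hypothetical and not part of botchk.tcl, and the rule "program is called eggdrop and the config file is one of its arguments" is my assumption about what a match should mean.

```tcl
# Sketch only: these helpers are hypothetical and not part of botchk.tcl.

# Last path component, e.g. "./eggdrop" -> "eggdrop".
proc basename {path} {
    return [lindex [split $path /] end]
}

# Split the raw contents of /proc/<pid>/cmdline into a list of arguments.
proc split_cmdline {raw} {
    return [split [string trimright $raw "\000"] "\000"]
}

# Linux only: read the raw command line of a process.
proc read_cmdline {pid} {
    set fh [open "/proc/$pid/cmdline" r]
    fconfigure $fh -translation binary
    set raw [read $fh]
    close $fh
    return $raw
}

# 1 if the program is named "eggdrop" and the config file is among its
# arguments (directory prefixes are ignored on both sides), else 0.
proc cmdline_matches {raw configfile} {
    set argv [split_cmdline $raw]
    if {[basename [lindex $argv 0]] ne "eggdrop"} { return 0 }
    set wanted [basename $configfile]
    foreach arg [lrange $argv 1 end] {
        if {[basename $arg] eq $wanted} { return 1 }
    }
    return 0
}

# Demo with literal strings instead of a live process.
puts [cmdline_matches "./eggdrop\000bot.conf\000" bot.conf]
puts [cmdline_matches "eggdrop\000./bot.conf\000" ./bot.conf]
puts [cmdline_matches "/bin/sleep\00060\000" bot.conf]
```

In restart_bot this would be called as cmdline_matches [read_cmdline $botpid] $configfile before killing the process.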

eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Sat Sep 06, 2008 10:54 am

Thanks nml375. So I need to use both Tcl scripts?
What about autobotchk? Should I stop using it?
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Sat Sep 06, 2008 11:54 am

Yup, you use both scripts. As I said, watchdog.tcl should be loaded in your eggdrop.

The botchk.tcl replaces the botchk/autobotchk script and should be run from crontab or such.
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Sun Sep 07, 2008 5:30 am

nml375 wrote:
Yup, you use both scripts. As said, watchdog.tcl should be loaded in your eggdrop.

The botchk.tcl replaces the botchk/autobotchk script and should be run from crontab or such.


ok, thanks.
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Thu Sep 11, 2008 2:03 am

Thanks a lot again, nml375. The scripts you gave me work perfectly. Yesterday my DSL at work died (I don't know why), so the internet connection went down as well. When I came in and restarted the DSL and the connection came back, the bots reconnected within a few seconds, just as I wanted. ;)
moff
Voice


Joined: 24 Jul 2008
Posts: 27

Posted: Thu Sep 11, 2008 8:06 pm

Dumb question, sorry, but how do I call botchk.tcl from cron?

with 0,10,20,30 * * * * tclsh /home/blah/botchk.tcl
or
just 0,10,20,30 * * * * /home/blah/botchk.tcl
?
eXtremer
Op


Joined: 07 May 2008
Posts: 138

Posted: Fri Sep 12, 2008 3:15 am

I did not change anything in cron, just added the *.tcl and that's it.
All times are GMT - 4 Hours
Page 1 of 2