egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Bots are not coming back from ping timeout
moff
Voice


Joined: 24 Jul 2008
Posts: 27

Posted: Fri Sep 12, 2008 5:16 am

k, thx :)
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Tue Sep 16, 2008 4:19 pm

moff wrote:
dumb question, sorry, but how do I call botchk.tcl from cron?

with 0,10,20,30 * * * * tclsh /home/blah/botchk.tcl
or
just 0,10,20,30 * * * * /home/blah/botchk.tcl
?


On most systems the latter should work, since the script starts with a #! line (#!/usr/bin/tclsh) and can be run directly once it has execute permission. If tclsh is installed in a different location, you'll have to use the first form.
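
A quick way to see which crontab form applies is to inspect the script's #! line from tclsh. This is only a minimal sketch and not part of the original scripts; the proc name shebang_check and its messages are made up for illustration:

```tcl
# Hypothetical helper: report whether a script can be started directly from
# cron (second crontab form) or needs an explicit "tclsh" prefix (first form).
proc shebang_check {script} {
    set fh [open $script r]
    set first [gets $fh]
    close $fh
    # Without a "#!" line the kernel cannot pick an interpreter.
    if {![string match "#!*" $first]} {
        return "use tclsh: no #! line"
    }
    # The first word after "#!" is the interpreter path.
    set interp [lindex [split [string trim [string range $first 2 end]]] 0]
    if {![file executable $interp]} {
        return "use tclsh: $interp not found or not executable"
    }
    if {![file executable $script]} {
        return "use tclsh: script lacks execute permission (chmod +x)"
    }
    return "direct call should work"
}
```

For example, `puts [shebang_check /home/blah/botchk.tcl]` would print which of the two forms to put into the crontab.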
_________________
NML_375, idling at #eggdrop@IrcNET
desired
Voice


Joined: 12 Sep 2011
Posts: 31

Posted: Fri Sep 16, 2011 12:08 pm

nml375 wrote:
Unfortunately, due to the way filesystem operation works, this script will also alter the atime timestamp whenever accessing the pidfile, and thus must not be called more often than the hung/gone/dead-time. Recommended is 6+ minutes, I use 10 myself. You might get away with 5 minutes.

I was thinking about a way to fix this... What about...

Instead of just opening and closing the file, you could write something into it, e.g. a timestamp.

Could it work?
_________________
eggdrop running on Android powered mobile phone - Yes, it is possible! - Very reliable.
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Fri Sep 16, 2011 4:32 pm

Sure, as long as you don't use the actual pid-file, and also remember to include the actual pid number within the file.
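
A minimal sketch of that idea, assuming a separate heartbeat file (the proc names and the file layout are invented for illustration and are not part of the existing scripts). The file holds the pid and a timestamp on one line, so a checker can read it as often as it likes without affecting the result:

```tcl
# Write "<pid> <unix-time>" to a separate heartbeat file (NOT the real pidfile).
proc heartbeat_write {file} {
    set fh [open $file w]
    puts $fh "[pid] [clock seconds]"
    close $fh
}

# Return a list {pid age-in-seconds}. The age comes from the stored
# timestamp, so reading the file never changes it.
proc heartbeat_read {file} {
    set fh [open $file r]
    set line [gets $fh]
    close $fh
    lassign $line pid stamp
    return [list $pid [expr {[clock seconds] - $stamp}]]
}
```

Inside the bot, heartbeat_write would be called once a minute from a time bind; the cron-side checker would call heartbeat_read and compare the age against its limit.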

Another way would be to alter the scripts to use mtime instead of atime, though this would require a write to the pid-file as opposed to a simple read...
This is something I wrote ages ago, and it's worked well enough for me all this time. If you'd like to modify it, by all means go ahead.
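
To illustrate the mtime variant: Tcl's file mtime returns the last modification time, which a plain read leaves untouched, so the check itself does not reset the clock. A small sketch with made-up proc names; the limit passed in is whatever age you consider "hung":

```tcl
# Age of a file in seconds, based on its modification time.
proc file_age {file} {
    return [expr {[clock seconds] - [file mtime $file]}]
}

# 1 if the file has not been written to for more than maxage seconds.
proc is_stale {file maxage} {
    return [expr {[file_age $file] > $maxage}]
}
```

The price, as noted, is that the bot has to write to the file regularly instead of merely reading it.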
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Thu Sep 07, 2017 10:32 am

Well, I ran into the same issue as the original poster (my eggdrop freezes from time to time and will neither reconnect nor die), so I tried using nml375's scripts. The problem I ran into is that I found no way to alter the pid file's atime, so I modified watchdog.tcl to read the pid and write it back into the pid file, and then modified botchk.tcl to check the file's mtime. Here's watchdog.tcl:
Code:
#Watchdog part of botchk.tcl
#Simply makes eggdrop update the modification time of its pidfile
#on a regular basis by rewriting it... (kinda like "touch")

#Settings:
# botpid [path/]pid.bot
#  Tells the script which file that is the pidfile...
#  This should be autodetected by the script at startup,
#  but under certain circumstances, it might fail..
#  IF it does fail, just set this var manually, and it'll
#  work just fine :)
#  PS. The name of the pidfile is pid.$botnet-nick, or; if
#  botnet-nick isn't set; pid.$nick   DS

#set botpid pid.lamestbot

if {![info exists botpid]} {
 if {[info exists botnet-nick] && ${botnet-nick} != ""} {
  set botpid "pid.${botnet-nick}"
  putlog "Setting botpid to $botpid using \$botnet-nick"
 } elseif {[info exists nick] && $nick != ""} {
  set botpid "pid.$nick"
  putlog "Setting botpid to $botpid using \$nick"
 } else {
  putlog "Unable to determine the name of the pid-file!\nPlease check your config-file or watchdog.tcl script..."
 }
}

proc touch {file} {
 set in [open $file r]
 set pid [gets $in]
 close $in
 set out [open $file w]
 puts $out $pid
 close $out
}

proc watchdog {min hour day month year} {
 global botpid
 touch $botpid
}

foreach bind [binds watchdog] {lassign $bind type flags mask num proc; unbind $type $flags $mask $proc}
bind time - "* * * * *" watchdog

putlog "Loading bot watchdog"
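
As an alternative sketch (untested inside eggdrop): Tcl's file atime and file mtime accept an optional time argument that sets the timestamp, so the pidfile could be touched without rewriting its contents. This assumes a Tcl version that supports the two-argument form (8.3 or later, as far as I know); touch_times is an illustrative name, not part of the script above:

```tcl
# Set both timestamps of an existing file to "now" without changing its contents.
proc touch_times {file} {
    set now [clock seconds]
    file atime $file $now
    file mtime $file $now
}
```

In watchdog.tcl this could stand in for the body of touch, which avoids the brief moment where the pidfile is truncated before the pid is written back.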
And here's the modified botchk.tcl:
Code:
#!/usr/bin/tclsh
### settings ###
#pidfile "pidfile of bot"
#set pidfile pid.botname

#userfile "userfile of bot"
#set userfile botname

#configfile "configfile of bot"
#set configfile eggdrop.conf

#botdir "home of your eggdrop"
#set botdir /home/user/eggdrop

#silent 0/1
# Should we write eggdrop's output from its startup to stdout?
set silent 0

#lockfile "file to prevent start of your eggdrop"
# Use this whenever you want to stop your bot for a longer while...
set lockfile /home/somewhere/some.lock

### Code ###
#Lets just check that our friendly user supplied all required settings :)
foreach {var what} [list pidfile "Name of pidfile" userfile "Name of userfile" configfile "Name of configfile" botdir "home of your eggdrop" silent "Whether or not to write any output from eggdrop to stdout during startup of bot" lockfile "file to prevent start of your eggdrop"] {
 if {![info exists $var]} {
  puts stdout "Hey buddy!\nYou messed up while configuring the botchk.tcl script!"
  puts stdout "Variable not set: $var - Explanation: $what"
  exit 1
 }
}

#Lets check whether we should check the bot at all...
if {[file exists $lockfile]} {
 exit 0
}

#proc: start_bot
#args: none
#desc: Checks whether the userfile exists
#      (if not, it will try to restore it from
#      backups, etc...) and then start up
#      the bot again.
#      Writes the output from eggdrop to stdout...
#      (if selected)

proc start_bot {} {
 global pidfile userfile configfile silent
 if {![file exists $userfile]} {
  if {[file exists "${userfile}~new"]} {
   file copy "${userfile}~new" $userfile
  } elseif {[file exists "${userfile}~bak"]} {
   file copy "${userfile}~bak" $userfile
  } else {
   puts stdout "Error: Can't find any userfile or backupfile!"
   exit 1
  }
 }
 if {[file executable $configfile]} {
  catch {exec ./${configfile}} temp
 } {
  catch {exec ./eggdrop $configfile} temp
 }
 if {$silent == 0} {
  puts stdout $temp
 }
}

#proc: restart_bot
#args: none
#desc: Checks if there's a bot running (suspected zombie),
#      if so it kills it, then calls start_bot

proc restart_bot {} {
 global pidfile userfile configfile
 set fileID [open $pidfile "RDONLY"]
 set temp [gets $fileID botpid]
 close $fileID
 file delete $pidfile
 if {[file exists ${pidfile}~]} {
  file delete ${pidfile}~
 }
 if {$temp > 0} {
  puts -nonewline stdout "Read pid: $botpid Status: "
  if {[file exists "/proc/$botpid"] && [file owned "/proc/$botpid"]} {
   puts stdout "Exists!\nChecking if it matches our eggdrop: "
   set fileID [open "/proc/${botpid}/cmdline" "RDONLY"]
   gets $fileID temp
   close $fileID
   set temp [split $temp "\000"]
   set executable [file tail [lindex $temp 0]]
   set config [file tail [lindex $temp 1]]
   if {$executable eq "eggdrop" && $config eq $configfile} {
    puts stdout "Matched! - killing..."
    puts stdout [exec kill -9 $botpid]
   }
  } {
   puts stdout "no such pid!"
  }
 }
 start_bot
}

#Lets go to bot's home...
cd $botdir

#Check if the pidfile exists, if not, call start_bot
if {![file exists $pidfile]} {
 puts stdout "Pidfile $pidfile does not exist!\nGuess bot is not running... Better start her up..."
 start_bot
} {
#Good, the pidfile is there...  Lets check how old it is..
#(4 minutes and 50 seconds should be enough...)
 if {[set time [expr [clock seconds] - [set mtime [file mtime $pidfile]]]] > 290} {

#Too old for comfort...  Lets see if this is the second time in a row that file is too old...
#(to prevent any problems caused by ex. change of the system clock...)
#(if pidfile~ exists, it's the second time in a row that the pid-file is too old...)

  puts -nonewline stdout "Pidfile $pidfile is old... (Havn't been touched in [expr {$time/60}] minutes)\nChecking for ${pidfile}~... "
  if {[file exists "${pidfile}~"]} {

#It's there... Lets see if it's "real" (should contain the same pid as the real pidfile)
#Note: the original atime-based version had to be called less often than the "age" limit,
#      because reading the pidfile changed its access time. This version checks mtime,
#      which a read does not change, so only watchdog.tcl's once-a-minute write resets the age.
#      Keep the "> 290" some lines above comfortably larger than that one-minute interval.

   puts -nonewline stdout "Found!\nValidating ${pidfile}~... "
   set fileID [open "${pidfile}~" "RDONLY"]
   gets $fileID pid1
   close $fileID
   set fileID [open $pidfile "RDONLY"]
   gets $fileID pid2
   close $fileID


#It's valid... lets restart bot...

   if {$pid1 == $pid2} {
    puts stdout "Valid!\nRestarting..."
    restart_bot
   } {

#It's not valid... better remove it...
    puts -nonewline stdout "Not valid - removing... "
    file delete -- "${pidfile}~"
   }
  } {

#It doesn't exist... lets create it so that we know we've already had this
#problem the next time we check...

   puts -nonewline stdout "Not found!\nCreating new ${pidfile}~... "
   file copy -- $pidfile "${pidfile}~"

  }
 } {
#pidfile is current, lets remove any stray pidfile~...
  if {[file exists ${pidfile}~]} {
   file delete -- "${pidfile}~"
  }
 }
}

#tweak to restore the timestamp after reading...
#depends on touch (havn't found anything in tcl that'll do the trick :/ )
#catch {exec touch -t [clock format $mtime -format "%Y%m%d%H%M.%S"] $pidfile} msg
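
The two-strike logic in the last part of botchk.tcl can be summarised as a small decision function. This is only a sketch of the control flow with invented names (next_action and the action words); it does not touch any files, and the missing-pidfile case, which starts the bot directly, is left out:

```tcl
# Decide what botchk should do, given the observed state.
#   age        - seconds since the pidfile was last modified
#   maxage     - staleness threshold (290 in the script)
#   has_marker - 1 if pidfile~ exists from the previous run
#   same_pid   - 1 if pidfile~ holds the same pid as pidfile
proc next_action {age maxage has_marker same_pid} {
    if {$age <= $maxage} {
        # Fresh pidfile: the bot is alive, drop any stray marker.
        if {$has_marker} {
            return "clear-marker"
        }
        return "nothing"
    }
    if {!$has_marker} {
        # First strike: remember it and wait for the next cron run.
        return "create-marker"
    }
    if {$same_pid} {
        # Second strike in a row for the same process.
        return "restart"
    }
    # The marker belongs to some other pid, so it is not valid.
    return "clear-marker"
}
```

With the crontab interval at 10 minutes, a frozen bot therefore gets restarted on the second run after it stopped updating its pidfile.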
Finally, I added the following line to my crontab:
Code:
0,10,20,30,40,50 * * * * tclsh /home/user/eggdrop/botchk.tcl >/dev/null 2>&1


Last edited by CP1832 on Fri Oct 06, 2017 3:19 pm; edited 1 time in total
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Tue Sep 12, 2017 10:46 am

I can confirm that the watchdog script is working: it notices when the eggdrop is frozen and starts a new instance, but for some reason it doesn't kill the previous instance. Does anyone have a clue why this section of the script isn't killing the frozen eggdrop?
Code:
proc restart_bot {} {
 global pidfile userfile configfile
 set fileID [open $pidfile "RDONLY"]
 set temp [gets $fileID botpid]
 close $fileID
 file delete $pidfile
 if {[file exists ${pidfile}~]} {
  file delete ${pidfile}~
 }
 if {$temp > 0} {
  puts -nonewline stdout "Read pid: $botpid Status: "
  if {[file exists "/proc/$botpid"] && [file owned "/proc/$botpid"]} {
   puts stdout "Exists!\nChecking if it matches our eggdrop: "
   set fileID [open "/proc/${botpid}/cmdline" "RDONLY"]
   gets $fileID temp
   close $fileID
   if {[string compare "$temp" "eggdrop\000./${configfile}"] == 0} {
    puts stdout "Matched! - killing..."
    puts stdout [exec kill -9 $botpid]
   }
  } {
   puts stdout "no such pid!"
  }
 }
 start_bot
}
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Tue Sep 12, 2017 2:05 pm

The console output from running the botchk.tcl script would be very helpful, as well as a detailed process listing (ps -l) to see the status of the process in question.
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Wed Sep 13, 2017 9:44 am

nml375 wrote:
The console output from running the botchk.tcl script would be very helpful, as well as a detailed process listing (ps -l) to see the status of the process in question.
I've already modified my crontab to be:
Code:
0,10,20,30,40,50 * * * * tclsh /home/user/eggdrop/botchk.tcl >> /home/user/eggdrop/botchk.log 2>&1
So the next time my bot gets brain freeze I'll be logging the output.
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Fri Sep 15, 2017 10:07 am

Here's the output from my crontab's log file:
Code:
Pidfile pid.user is old... (Havn't been touched in 10 minutes)
Checking for pid.user~... Not found!
Creating new pid.user~... Pidfile pid.user is old... (Havn't been touched in 20 minutes)
Checking for pid.user~... Found!
Validating pid.user~... Valid!
Restarting...
Read pid: 15720 Status: Exists!
Checking if it matches our eggdrop:

Eggdrop v1.8.0+infiniteinfo (C) 1997 Robey Pointer (C) 2010 Eggheads
Here's the ps -l output:
Code:
ps -l
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S  1483 16799 16628  0  82   2 -  1000 -      pts/256  00:00:00 bash
0 R  1483 19387 16799  0  82   2 -   726 -      pts/256  00:00:00 ps
And here's the ps aux output:
Code:
ps aux | grep -i user
user    2605  0.0  0.1  15684  5036 ?        SN   13:20   0:01 ./eggdrop eggdrop.conf
user   15720  0.0  0.1  24676  6200 ?        SN   Sep13   0:35 ./eggdrop eggdrop.conf
user   16799  0.0  0.0   4000  2036 pts/256  SN   14:02   0:00 -bash
user   20312  0.0  0.0   3140  1000 pts/256  RN+  14:05   0:00 ps aux
user   20313  0.0  0.0   2744   580 pts/256  SN+  14:05   0:00 grep --colour=auto -i user
user   24480  0.0  0.2  18684  8264 ?        SN   Sep14   0:40 ./eggdrop eggdrop.conf
nml375
Revered One


Joined: 04 Aug 2006
Posts: 2829

Posted: Fri Sep 15, 2017 1:39 pm

Hi again,
Thanks for the input (and for correcting my incorrect options for "ps", must have been thinking of "ls").
From what I can tell, the following test is failing:
Code:
set fileID [open "/proc/${botpid}/cmdline" "RDONLY"]
gets $fileID temp
close $fileID
if {[string compare "$temp" "eggdrop\000./${configfile}"] == 0} {

It's a check that we're actually killing the correct process, by investigating the command line that started the process. Looking at the output from "ps aux", the command line is "./eggdrop eggdrop.conf", whereas the botchk-script expects it to be "eggdrop ./eggdrop.conf".

I do have an idea to improve this test, though it is untested:
Code:
set fileID [open "/proc/${botpid}/cmdline" "RDONLY"]
gets $fileID temp
close $fileID
set temp [split $temp "\000"]
set executable [file tail [lindex $temp 0]]
set config [file tail [lindex $temp 1]]

if {$executable eq "eggdrop" && $config eq $configfile} {

This should be more tolerant, as it strips any path from both the command and its argument (the config file) before comparing.
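
To show the effect outside the script, here is the same comparison wrapped in a proc and fed with hand-written command lines instead of a real /proc entry. The proc name cmdline_matches and the sample strings are made up for illustration:

```tcl
# Return 1 if a raw /proc/<pid>/cmdline string looks like "eggdrop <configfile>",
# ignoring any directory part on either word.
proc cmdline_matches {raw configfile} {
    set argv [split $raw "\000"]
    set executable [file tail [lindex $argv 0]]
    set config     [file tail [lindex $argv 1]]
    return [expr {$executable eq "eggdrop" && $config eq $configfile}]
}

# Started as "./eggdrop eggdrop.conf" (the case from the ps aux output)
puts [cmdline_matches "./eggdrop\000eggdrop.conf\000" eggdrop.conf]     ;# prints 1
# Started as "eggdrop ./eggdrop.conf" (what the old test expected)
puts [cmdline_matches "eggdrop\000./eggdrop.conf\000" eggdrop.conf]     ;# prints 1
# Some unrelated process that happens to have the pid
puts [cmdline_matches "/usr/bin/tclsh\000botchk.tcl\000" eggdrop.conf]  ;# prints 0
```

Both ways of starting the bot now match, while an unrelated process that reused the pid is still left alone.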
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Mon Sep 18, 2017 9:39 am

Thanks again nml375, I've already modified my .tcl accordingly and will let you know how it works out once the bot gets stoned.
CP1832
Halfop


Joined: 09 Oct 2014
Posts: 51

Posted: Wed Sep 20, 2017 12:26 am

Now the script is working and killing the stoned eggdrop.
Code:
Pidfile pid.user is old... (Havn't been touched in 10 minutes)
Checking for pid.user~... Not found!
Creating new pid.user~... Pidfile pid.user is old... (Havn't been touched in 20 minutes)
Checking for pid.user~... Found!
Validating pid.user~... Valid!
Restarting...
Read pid: 8467 Status: Exists!
Checking if it matches our eggdrop:
Matched! - killing...
All times are GMT - 4 Hours