egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Trying to get some user stats of a forum with regexp. SOLVED

 
arcanedreams
Voice


Joined: 06 Sep 2007
Posts: 9

Posted: Thu Sep 06, 2007 2:56 am    Post subject: Trying to get some user stats of a forum with regexp. SOLVED

OK, so the user types "!joined <username>".

Then the bot should check the site and get the user's join date. In the source code for the page that the query is set to, the join date appears like this:

Code:
<div style="padding:3px">
               Join Date: <strong>09-30-2005</strong>
            </div>



So, this is what I came up with. It doesn't work at all. Perhaps you guys can tell me where I went wrong?

This script is eventually going to be expanded to get the user's post count, posts per day, birthday, etc.


Code:
bind pub - !joined joined
proc joined {nick host handle chan text} {


set query "http://forums.shooshtime.com/member.php?username=$text"
regexp {Join Date: <strong>(.*?)</strong>} $data - join

putserv "PRIVMSG $chan :$text joined on $join"
}



Thanks in advance!




It has come to my attention that the bot must log in to be able to see the info it needs. So that is yet another dilemma I have to overcome.


Last edited by arcanedreams on Thu Sep 06, 2007 9:35 pm; edited 1 time in total
rosc2112
Revered One


Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Thu Sep 06, 2007 12:09 pm

Where is the geturl call? There's nothing in the $data var because nothing is fetching it, like:

set data [::http::geturl $query]

As for how to authenticate using the facilities available in Tcl, I have not attempted it myself and don't know if it's even possible (aside from a plain query/post type login). Maybe someone else has some info about that.
arcanedreams
Voice


Joined: 06 Sep 2007
Posts: 9

Posted: Thu Sep 06, 2007 1:52 pm

Sorry, I'm new to the whole Tcl thing.

What I have been doing is picking apart other scripts and figuring out how they work, then writing my own stuff based on that.

So far it has worked... until now.

Could you explain what the bit of code you posted does?


From looking at it, I assume it sets the variable $data to whatever the command in the [] brackets returns.

I am guessing that [::http::geturl $query] is a command that actually accesses the web page named in the $query variable.

So I could do away with my $query variable altogether and simply put the URL in directly.


I am, however, still a little fuzzy on the regexp function. Is mine set up correctly?

The way I understand it, it searches the $data variable for the pattern in the {} braces.

It then puts whatever matched the (.*?) part into the variable $join.

Is (.*?) a standard wildcard that takes everything in that area, or do the * and the ? signify different things? I know * is generally a wildcard for any character, but I am not familiar with the . or the ?

So the correct code would be:

Code:

bind pub - !joined joined

proc joined {nick host handle chan text} {

set data [::http::geturl http://forums.shooshtime.com/member.php?username=$text]


regexp {Join Date: <strong>(.*?)</strong>} $data - join

putserv "PRIVMSG $chan :$text joined on $join"
}


If I'm wrong let me know. I won't have a chance to test it out until I get out of class.

Thanks for your help though!
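
On the (.*?) question above, a minimal sketch, assuming the page contains the snippet quoted in the first post. In Tcl regular expressions, . matches any single character, * means "zero or more of the preceding item", and a ? placed right after * makes the match non-greedy, so it stops at the first </strong> instead of the last. The proc name extract_join_date and the two-field sample line are made up for illustration.

```tcl
# Hypothetical helper: return the join date found in $html, or "" if absent.
proc extract_join_date {html} {
    # regexp returns 1 on a match and 0 otherwise; on 0 the variables are left untouched.
    if {[regexp {Join Date: <strong>(.*?)</strong>} $html -> date]} {
        return $date
    }
    return ""
}

# The snippet from the first post.
set sample "<div style=\"padding:3px\">\n   Join Date: <strong>09-30-2005</strong>\n</div>"
puts [extract_join_date $sample]

# Made-up line with two <strong> fields to compare greedy and non-greedy matching.
set line "Join Date: <strong>09-30-2005</strong> Posts: <strong>42</strong>"
regexp {Join Date: <strong>(.*)</strong>} $line -> greedy
regexp {Join Date: <strong>(.*?)</strong>} $line -> lazy
puts "greedy: $greedy"
puts "lazy:   $lazy"
```

Checking the return value of regexp before using the captured variable avoids a "no such variable" error when the pattern does not match.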
arcanedreams
Voice


Joined: 06 Sep 2007
Posts: 9

Posted: Thu Sep 06, 2007 2:04 pm

Oh, and if anyone wants to test something for me, they can try this version, which shouldn't require any type of login at all.


Code:

bind pub - !shooshstats shooshstats

proc shooshstats {nick host handle chan text} {

set data [::http::geturl http://forums.shooshtime.com/]


regexp {<div>Threads: (.*?),  Posts: (.*?), Members: (.*?)</div>} $data - threads posts members

putserv "PRIVMSG $chan :There are currently $threads total threads, $posts total posts, and $members total members."
}
r0t3n
Owner


Joined: 31 May 2005
Posts: 507
Location: UK

Posted: Thu Sep 06, 2007 3:09 pm

You should read the Tcl http docs. geturl returns a token, not the page itself: you need to get the body of the page from that token using http::data, and then free the token using http::cleanup.



Code:
package require http

bind pub - !joined joined

proc joined {nick host handle chan text} {
    # Encode the username; passing -query makes geturl send a POST request.
    set query [http::formatQuery username $text]
    set token [::http::geturl http://forums.shooshtime.com/member.php -query $query]
    set data [http::data $token]
    # http::status needs the token returned by geturl.
    if {[http::status $token] eq "error"} {
        putserv "PRIVMSG $chan :Error grabbing join data of $text."
    } else {
        set found 0
        # Scan the page line by line and stop at the first match.
        foreach line [split $data \n] {
            if {$line ne "" && [regexp -nocase {Join Date: <strong>(.*?)</strong>} $line full date]} {
                putserv "PRIVMSG $chan :$text joined on $date."
                set found 1
                break
            }
        }
        if {!$found} {
            putserv "PRIVMSG $chan :Could not find join date for $text."
        }
    }
    # Release the state kept for this request.
    http::cleanup $token
}


Not tested, give it a try.
_________________
r0t3n @ #r0t3n @ Quakenet
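
One difference from the first post is worth spelling out: the -query option turns the request into a POST, while the original URL passed the username in the query string (a GET). Whether the forum accepts either form is not known from this thread. A minimal sketch of building the GET form with the username escaped, assuming the URL from the first post; the proc name build_member_url is made up for illustration.

```tcl
package require http

# Hypothetical helper: build the GET URL used in the first post,
# with the username escaped for use in a query string.
proc build_member_url {username} {
    set base "http://forums.shooshtime.com/member.php"
    return "$base?[::http::formatQuery username $username]"
}

puts [build_member_url bob]
```

Passing the result to ::http::geturl without -query would send a GET request for that URL.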
arcanedreams
Voice


Joined: 06 Sep 2007
Posts: 9

Posted: Thu Sep 06, 2007 7:27 pm

What is all that extra stuff for? I understand the error checking, but a lot of the rest is new to me.

This is the finished, tested, and working code I came up with:

Code:
package require http

bind pub - !shooshstats shooshstats

proc shooshstats {nick host handle chan text} {

set query "http://forums.shooshtime.com"
set http [::http::geturl $query]
set html [::http::data $http]

regexp {<div>Threads: (.*?), Posts: (.*?), Members: (.*?)</div>} $html - threads posts members


putserv "PRIVMSG $chan :There are currently $threads total threads, $posts total posts, and $members total members."
}



Is there a reason I should do it the way you suggested over mine? Is it more stable, or faster, or what?

What happens if I don't use the cleanup function?
rosc2112
Revered One


Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Fri Sep 07, 2007 12:30 am

You have 2 threads with the same topic going, so pick one and merge them =p

If you don't clean up after each request, the leftovers pile up: http::cleanup frees the state the http package keeps for every token, and any sockets left open count against the OS limit (I assume most OSes impose a finite limit on the number of open sockets, as they do on open files). Eventually that will probably slow the system down, and the bot may even become unable to open new connections.

The other benefit of that type of code is that it handles http errors, so you know what's going on and don't just assume the bot is not working. It's good practice to test for error conditions and handle them gracefully.

And yes, the man pages will be very useful if you're taking bits and pieces of code from here and there. Understand what the code does, and you'll be writing your own code without looking at the docs in a very short time (regexp and the http & socket code are a good challenge to start with; everything after that is easy ;)
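
To illustrate the point about handling failures gracefully, here is a minimal sketch that checks whether the regexp matched before using the captured values. It assumes the front page contains a line in the format used by the working script; the proc name parse_forum_stats and the sample numbers are made up.

```tcl
# Hypothetical helper: return {threads posts members}, or an empty list
# when the expected line is not in $html.
proc parse_forum_stats {html} {
    set re {<div>Threads: (.*?), Posts: (.*?), Members: (.*?)</div>}
    if {![regexp $re $html -> threads posts members]} {
        return {}
    }
    return [list $threads $posts $members]
}

# Made-up sample values, not real figures from the forum.
set stats [parse_forum_stats "<div>Threads: 120, Posts: 3400, Members: 56</div>"]
if {[llength $stats] == 3} {
    # lassign needs Tcl 8.5 or later.
    lassign $stats threads posts members
    puts "There are currently $threads total threads, $posts total posts, and $members total members."
} else {
    puts "Could not find the forum stats."
}
```

Without the check, a page layout change would leave $threads unset and the proc would fail with a "no such variable" error instead of reporting the problem to the channel.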
arcanedreams
Voice


Joined: 06 Sep 2007
Posts: 9

Posted: Fri Sep 07, 2007 3:17 am

I put SOLVED in the title of this one, as this thread was made for problems with the actual regexp function.

My new thread is devoted to being able to log in and redirect without errors.

;)

Oh, and I have added the ::http::cleanup call, but I am not sure if it is working, because the value of the $http variable keeps increasing by 1 every time I make a call.

But that can be discussed in my other thread.
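
On the increasing $http value: that is expected. Each geturl call returns a new token named ::http::1, ::http::2, and so on; http::cleanup frees the state behind a token but does not reset the counter. A minimal sketch of a fetch that always releases its token, assuming a 10-second timeout is acceptable; the proc name fetch_body is made up for illustration.

```tcl
package require http

# Hypothetical helper: fetch a page body and always release the token.
proc fetch_body {url} {
    # geturl raises an error on connection problems, so guard it with catch.
    if {[catch {::http::geturl $url -timeout 10000} token]} {
        return -code error "request failed: $token"
    }
    set status [::http::status $token]
    set body [::http::data $token]
    # Frees the state array behind the token; the ::http::N counter still goes up.
    ::http::cleanup $token
    if {$status ne "ok"} {
        return -code error "request ended with status $status"
    }
    return $body
}
```

In a bot proc this could be called as catch {fetch_body $url} html, reporting an error to the channel when catch returns a non-zero value.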
All times are GMT - 4 Hours