
Parsing webpages made easy

sKy
Op
Posts: 194
Joined: Thu Apr 14, 2005 5:58 pm
Location: Germany

Post by sKy »

Quote:
Does somebody know how to load the core commands of eggdrop into a normal Tcl script, to then use them with egghttp.tcl (which needs the eggdrop core functions)?

what are you talking about?
I guess he means: how can he use eggdrop commands such as putlog (... connect, listen, ...) in a plain Tcl environment (perhaps tclsh)?

The answer:
As far as I know it's not possible. There is no "package provide Eggdrop" line. Eggdrop's core commands are written in C, not Tcl. However, eggdrop is open source, so it could be possible to extract them with some (endless) work.

The http package can be found here as well: http://www.tcl.tk/software/tcllib/
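If the goal is only to run a script's own logic under tclsh, one workaround is to define stand-ins for the simple output commands. This is a rough sketch, not something eggdrop provides: the stand-in bodies are made up for illustration, and it will not make egghttp.tcl work, since that script relies on the real connect and control commands.

```tcl
# Define stand-ins only when the real eggdrop commands are absent (i.e. under tclsh).
if {[llength [info commands putlog]] == 0} {
    # Stand-in for eggdrop's putlog: write the message to stdout.
    proc putlog {text} {
        puts "LOG: $text"
    }
}
if {[llength [info commands putserv]] == 0} {
    # Stand-in for eggdrop's putserv: show what would be sent to the server.
    proc putserv {text args} {
        puts "SERVER: $text"
    }
}

putlog "parser loaded"
putserv "PRIVMSG #test :hello"
```

Inside a running eggdrop the guards do nothing, so the same file can be loaded in both environments.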

Imho we should sort out this thread anyway.
demond: could you update your first post here with an updated website?
socketapi | Code less, create more.
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

demond wrote:
domme wrote: How can I do this with egghttp?
don't use egghttp, it is severely outdated; it had its use a long time ago, when Tcl still didn't have the built-in http package, which is superior in every way to egghttp
No, egghttp was written even when the http package for Tcl existed.
However, the http package is severely bloated for the function of simply grabbing a web page, not to mention that it does not provide async connections across all Tcl versions, and thus all eggdrop versions. Egghttp is still very much a practical utility today.
De Kus
Revered One
Posts: 1361
Joined: Sun Dec 15, 2002 11:41 am
Location: Germany

Post by De Kus »

Referring to the header of http.tcl v2.5.1 from 2005/01/06, http has supported callbacks (truly async) since version 2.1. I found 2.4.2, which was from 2004/04/05. You can probably guess how old 2.1 is...
The biggest difference between http and egghttp is that egghttp uses connect and control, while http of course uses socket and fileevent.
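For reference, the callback mode of the http package looks roughly like this. It is a minimal sketch: the proc names page_done and fetch_async and the 10 second timeout are made up for illustration, and the callback only fires while the Tcl event loop is being serviced.

```tcl
package require http

# Hypothetical callback: http invokes it with the request token when the transfer ends.
proc page_done {token} {
    if {[::http::status $token] eq "ok"} {
        puts "fetched [string length [::http::data $token]] bytes"
    } else {
        puts "request failed: [::http::status $token]"
    }
    # Always release the token's state once the result has been read.
    ::http::cleanup $token
}

# Hypothetical helper: returns immediately; page_done runs later from the event loop.
proc fetch_async {url} {
    return [::http::geturl $url -command page_done -timeout 10000]
}
```

Under tclsh you would call fetch_async with a URL and then enter the event loop (for example with vwait) so the callback can run.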
De Kus
StarZ|De_Kus, De_Kus or DeKus on IRC
Copyright © 2005-2009 by De Kus - published under The MIT License
Love hurts, love strengthens...
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

I didn't say the http package, I said older versions of Tcl itself and thus different eggdrop versions. (There is no event loop to facilitate the required callback mechanism of the http package in these versions)
sKy
Op
Posts: 194
Joined: Thu Apr 14, 2005 5:58 pm
Location: Germany

Post by sKy »

But you could always process the event loop yourself: call the update command every second via utimer.
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

Yes, you could call a utimer event for 'update' every second for certain eggdrop/Tcl version combinations; however, you will cause your bot to consume a lot of CPU by doing so. Calling update every second can also render your bot unresponsive for extended periods, negating the purpose of the async mode altogether.
Performance, needless to say, would be sluggish using such a method, and it would not be very practical. Calling 'update' every second (or even every few seconds) is not a viable solution.
sKy
Op
Posts: 194
Joined: Thu Apr 14, 2005 5:58 pm
Location: Germany

Post by sKy »

About the performance: calling a proc or the update command every second doesn't cause that much CPU usage. I tested it. What matters is that only one utimer runs at a time. If the timer is started again on every script load or rehash, it will cause high CPU usage for sure.

Update is simple. Do I have work to do? No? -> Return. This doesn't take long. And if it has work to do, it will block the bot only as long as the newer versions of eggdrop would block as well.
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

I'm not talking about executing just any utimer event every second, or every few seconds. We are talking specifically about a utimer event that calls the 'update' command. Update has to process any and all events in the event queue (not just what you think may be there), and it will not return until that has been completed, which, as I stated earlier, negates the purpose of async mode altogether. And no, newer versions of eggdrop would not be blocked, as their event queue is serviced on every iteration of the main program loop, so having a utimer event on those bots would be pointless: the event queue would almost certainly be empty already. I have tried this many times on versions that required manual updating, and it did indeed result in severe performance degradation.
Hence the production of the egghttp.tcl script, and also a patch for certain versions as an alternative solution.
flashy
Voice
Posts: 24
Joined: Mon May 01, 2006 3:38 am

Post by flashy »

man that's awesome, can someone make a little how-to guide on finding out the necessary things to parse something off a webpage? finding out table size, td, row, div and such using the DOM inspector?
demond
Revered One
Posts: 3073
Joined: Sat Jun 12, 2004 9:58 am
Location: San Francisco, CA

Post by demond »

flashy wrote: man that's awesome, can someone make a little how-to guide on finding out the necessary things to parse something off a webpage? finding out table size, td, row, div and such using the DOM inspector?
that would be your basic HTML primer
connection, sharing, dcc problems? click <here>
before asking for scripting help, read <this>
use the [code] tag when posting logs, code
salkkus
Voice
Posts: 1
Joined: Mon Oct 02, 2006 10:08 am

Post by salkkus »

Hello all :)

I am interested in trying this tclDOM, but my shell doesn't support compiling.
So, is there anything I should know when I compile it on my own machine and then copy to the shell?

Like, should the shell have all those libxml, etc installed? And if yes, how can I find out which modules are installed?

Lots of questions and too few answers :) ( I checked many websites regarding this, but couldn't find, sort of, basic info )
rosc2112
Revered One
Posts: 1454
Joined: Sun Feb 19, 2006 8:36 pm
Location: Northeast Pennsylvania

Post by rosc2112 »

Assuming your shell and your local machine are running the same OS on a similar platform (e.g., FreeBSD on an 80x86-based machine), you can compile the module statically if necessary. You can find out which libs it depends on when compiled dynamically (the usual/default method) by running 'ldd modulename' on it. If your shell isn't running on the same platform, you'll have to cross-compile, which is beyond my own experience, so I'm not much help there.
demond
Revered One
Posts: 3073
Joined: Sat Jun 12, 2004 9:58 am
Location: San Francisco, CA

Post by demond »

it's not tclDOM, it's tDOM; TclDOM is a completely different package that indeed requires TclXML; tDOM does not require any XML package, as it contains its own XML engine
johne
Voice
Posts: 29
Joined: Tue Jul 19, 2005 2:24 am

Post by johne »

some more working examples would be greatly appreciated :)
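One more small example in the same style, as a sketch only: it parses a made-up HTML snippet instead of a downloaded page, and the helper name cells_text is hypothetical. It assumes the tDOM package is installed.

```tcl
package require tdom

# Made-up sample markup standing in for a downloaded page.
set page {<html><body><table cellspacing="0"><tr><td>first</td><td>second</td></tr></table></body></html>}

# Hypothetical helper: return the text of every node matched by an XPath expression.
proc cells_text {html xpath} {
    set doc [dom parse -html $html]
    set root [$doc documentElement]
    set result {}
    foreach node [$root selectNodes $xpath] {
        # asText yields the text content of the node and its descendants.
        lappend result [string trim [$node asText]]
    }
    # Free the DOM tree once the values have been copied out.
    $doc delete
    return $result
}

puts [cells_text $page {//table[@cellspacing='0']//td}]
```

To adapt it, look up the element you want in the browser's DOM inspector and write the XPath from the attributes you see there.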
karodde
Voice
Posts: 10
Joined: Tue May 01, 2007 12:56 pm

Re: Parsing webpages made easy

Post by karodde »

The script works in my shell at home. Now I want my eggdrop to post this info when someone types !test.

I made this, but it doesn't work :)

Code: Select all

#!/bin/sh
# This line continues for Tcl, but is a single line for 'sh' \
exec tclsh8.4 "$0" ${1+"$@"}
package require tdom
package require http
set url "http://www.url.de/"
set page [::http::data [::http::geturl $url]]
set doc [dom parse -html $page]
set root [$doc documentElement]
set node [$root selectNodes {//table[@cellspacing=0]/tr[1]/td[1]}]
set text [[[lindex $node 0] childNodes] nodeValue]

bind pub - !test tester
proc tester { nick uhost hand chan args } {
putserv "PRIVMSG $chan : $text"
}
I got the following error message:
Tcl error [tester]: can't read "text": no such variable

Probably the variable text isn't visible inside the proc.
Can somebody help me please? Thanks :)


EDIT:
got it,

Code: Select all

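package require tdom
package require http

bind pub - !test thefunction

# Fetch the page and post the first cell of the matching table to the channel.
proc thefunction { nick uhost hand chan rest } {
    set url "http://www.url.de/"
    # Blocking fetch; release the token once the body has been copied out.
    set token [::http::geturl $url]
    set page [::http::data $token]
    ::http::cleanup $token

    set doc [dom parse -html $page]
    set root [$doc documentElement]
    set node [$root selectNodes {//table[@cellspacing=0]/tr[1]/td[1]}]
    set text [[[lindex $node 0] childNodes] nodeValue]
    # Free the DOM tree so repeated !test calls do not leak memory.
    $doc delete

    putserv "PRIVMSG $chan : $text"
}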
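For anyone hitting the same error: the first version most likely fails because text was set at global scope, while a proc body only sees its own local variables unless it declares global text (or reads $::text). A minimal sketch in plain Tcl, with made-up proc names broken and fixed:

```tcl
# A variable set outside any proc lives in the global scope.
set text "value fetched at load time"

# Hypothetical proc mirroring the failing handler: "text" is looked up as a local.
proc broken {} {
    return $text
}

# Same body, but the global variable is imported first.
proc fixed {} {
    global text
    return $text
}

# catch returns 1 when the script raises an error and stores the message in err.
set failed [catch {broken} err]
puts "broken failed: $failed ($err)"
puts "fixed returned: [fixed]"
```

Note the trade-off: with a global, the page is fetched once when the script loads, while fetching inside the proc gets fresh data on every !test.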