list specific chapter from a href tag?

Jagg
Halfop
Posts: 53
Joined: Sat Jan 24, 2004 11:32 am

list specific chapter from a href tag?

Post by Jagg »

Hi,

I have an HTML file. Its source contains many a href tags that look like this:

<a href = "http://www.url.com/e?t=WORD" target="_blank">WORD</a>

<a href = "http://www.url.com/e?t=WORD2" target="_blank">WORD2</a>

and so on....

How can I list all the WORD* values (the blue link text)?

Thanks
user
Posts: 1452
Joined: Tue Mar 18, 2003 9:58 pm
Location: Norway

Re: list specific chapter from a href tag?

Post by user »

Try this:

Code:

set words {}
# regexp -all -inline returns pairs: the full match, then the captured word
foreach {match word} [regexp -all -inline {"http://www\.url\.com/e\?t=([^"]+)"} $html] {
  lappend words $word
}
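For illustration, here is a minimal self-contained sketch of that snippet run against the two sample links from the question. The `html` value and the `extractWords` proc name are made up for this example. Note that it captures the value of the `t=` parameter in the URL, which in the sample is the same as the link text.

```tcl
# Sample input modelled on the links in the question (assumed for illustration).
set html {
<a href = "http://www.url.com/e?t=WORD" target="_blank">WORD</a>
<a href = "http://www.url.com/e?t=WORD2" target="_blank">WORD2</a>
}

# Hypothetical helper: collect the value of the t= parameter from each link.
proc extractWords {html} {
    set words {}
    # -all -inline returns a flat list: fullMatch capture fullMatch capture ...
    foreach {match word} [regexp -all -inline {"http://www\.url\.com/e\?t=([^"]+)"} $html] {
        lappend words $word
    }
    return $words
}

puts [extractWords $html]   ;# prints: WORD WORD2
```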
Have you ever read "The Manual"?

Post by Jagg »

THANKS A LOT!!!!!

I now get a list of all the WORDs, like:
WORD WORD2 WORD3 and so on...
How can I write a comma between these words?
strikelight
Owner
Posts: 708
Joined: Mon Oct 07, 2002 10:39 am

Post by strikelight »

By converting it to a string with "join" and telling it what to join with, i.e. a comma:

Code:

set text [join $yourlist ", "]
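As a quick sketch with made-up data (the contents of `yourlist` are an assumption for the example), the join works like this:

```tcl
# Assumed sample list, standing in for the words collected earlier.
set yourlist [list WORD WORD2 WORD3]

# join turns the list into one string, placing ", " between the elements.
set text [join $yourlist ", "]
puts $text   ;# prints: WORD, WORD2, WORD3
```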

Post by Jagg »

OK, it works perfectly now... BUT there is a problem :-?

The server allows only ~445 characters per line! Sometimes the output of the above script is longer than that, so it gets cut off after that many characters. :-(

Which command can I use to split the output into two or three lines when $a is longer than 445 characters?

Post by user »

Jagg wrote:With which command I can say/split the output in two, three lines when var $a is more than 445 characters?
There's no single command to do that :P

This is a pretty straightforward word wrapper. The optional third argument can be used to split on a character other than space (only one char, sorry :P). It returns a list of "lines" that you can do whatever you like with :)

Code:

proc wordwrap {str {len 70} {splitChr { }}} {
	set out [set cur {}]; set i 0
	# [set str][unset str] passes the value to split and drops the variable
	foreach word [split [set str][unset str] $splitChr] {
		if {[incr i [string length $word]] > $len} {
			# Line is full: flush it (unless it is still empty) and start a new one
			if {[llength $cur]} {lappend out [join $cur $splitChr]}
			set cur [list $word]
			set i [string length $word]
		} else {
			lappend cur $word
		}
		incr i ;# count the separator
	}
	lappend out [join $cur $splitChr]
}
PS: It won't split words even if they're longer than the max length. Feature or bug? You decide :P
PPS: It will return a list with one empty element if you feed it an empty string.
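To tie this back to the 445-character limit, here is a rough usage sketch. It includes a lightly tidied copy of the proc so it runs on its own; the sample words are made up, and `puts` is only a stand-in for whatever command actually sends a line to the server.

```tcl
# Lightly tidied copy of the wordwrap proc from this post.
proc wordwrap {str {len 70} {splitChr { }}} {
    set out [set cur {}]; set i 0
    foreach word [split [set str][unset str] $splitChr] {
        if {[incr i [string length $word]] > $len} {
            # Addition for this sketch: skip the flush when the line is still empty.
            if {[llength $cur]} {lappend out [join $cur $splitChr]}
            set cur [list $word]
            set i [string length $word]
        } else {
            lappend cur $word
        }
        incr i
    }
    lappend out [join $cur $splitChr]
}

# Assumed sample data: 120 made-up words joined with ", ".
set words {}
for {set n 1} {$n <= 120} {incr n} {lappend words WORD$n}
set text [join $words ", "]

# 445 is the per-line limit mentioned in the thread.
foreach line [wordwrap $text 445] {
    puts "[string length $line]: $line"
}
```

Because the text is split on spaces, each comma stays attached to the word before it, so no line starts with a comma.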

Post by strikelight »

What's the point of unsetting the local variable str? Sure you free up a little bit of memory, but with the command substitution you are adding cpu time... The local variable will be destroyed once the procedure has exited anyways, so for the little time it does spend inside the proc, plus the slight cpu overhead, there's no advantage to unsetting it.

Post by user »

strikelight wrote:What's the point of unsetting the local variable str? Sure you free up a little bit of memory, but with the command substitution you are adding cpu time... The local variable will be destroyed once the procedure has exited anyways, so for the little time it does spend inside the proc, plus the slight cpu overhead, there's no advantage to unsetting it.
The point is to not have two copies of the entire input stored in memory at any point. In my opinion, saving ~50% memory (for large amounts of data) outweighs the disadvantage of the tiny CPU usage added by the unset.

But it probably doesn't matter for what these folks will use it for anyway...and to be perfectly honest I did it to make the boring code have a slightly interesting part for us to talk about right now :P
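For anyone unsure what the idiom actually does, here is a small sketch of the mechanics only (it does not measure memory). The proc name `demo` and its input are made up for the example.

```tcl
# Hypothetical demo proc: shows what [set str][unset str] evaluates to.
proc demo {str} {
    # [set str] yields the value; [unset str] yields "" and removes the variable,
    # so split receives the original value while str itself is gone.
    set parts [split [set str][unset str] " "]
    # Report the split result and whether str still exists (0 = no).
    return [list $parts [info exists str]]
}

puts [demo "a b c"]   ;# prints: {a b c} 0
```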

Post by strikelight »

Nothing wrong with us having a discussion about it though :wink:

Code:

% proc a {mylist} {set i 0; foreach word $mylist {incr i}}
% proc b {mylist} {set i 0; foreach word [set mylist][unset mylist] {incr i}}
% time {a $mylist} 1000
26 microseconds per iteration
% time {b $mylist} 1000
81 microseconds per iteration
% time {a $mylist} 1000
36 microseconds per iteration
% time {b $mylist} 1000
91 microseconds per iteration
so while you are correct in stating that you save ~50% in memory,
you are also adding ~150% to ~210% in CPU time (about 2.5x to 3x the runtime) :wink:

Post by user »

strikelight wrote:

Code:

% proc a {mylist} {set i 0; foreach word $mylist {incr i}}
% proc b {mylist} {set i 0; foreach word [set mylist][unset mylist] {incr i}}
Your test is not fair, as 'unset' returns an empty STRING, so the list before it becomes a string that foreach has to convert back into a list. (Oh, the wonderful internals of Tcl ;P)
Try this instead and you'll see the unset adds very little to the total CPU usage :)

Code:

proc a it {set i 0; foreach word [split $it] {incr i}; set i}
proc b it {set i 0; foreach word [split [set it][unset it]] {incr i}; set i}
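A minimal harness for trying this yourself might look like the sketch below. The input string is an assumption (500 characters, so just over the 445 limit from the thread), and the timings will differ between machines and Tcl versions, so none are shown.

```tcl
proc a it {set i 0; foreach word [split $it] {incr i}; set i}
proc b it {set i 0; foreach word [split [set it][unset it]] {incr i}; set i}

# Assumed test input: "word " repeated 100 times = 500 characters.
set input [string repeat "word " 100]

# time runs the script 1000 times and reports the average per iteration.
puts "a: [time {a $input} 1000]"
puts "b: [time {b $input} 1000]"
```

Both procs return the same element count, so any difference in the reported times comes from the extra `[set it][unset it]` substitution.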

Post by strikelight »

You are correct...

Code:

% proc b {mylist} {set i 0; foreach word [split [set mylist][unset mylist]] {incr i}}
% proc a {mylist} {set i 0; foreach word [split $mylist] {incr i}}
% time {a $mylist} 1000
126 microseconds per iteration
% time {b $mylist} 1000
178 microseconds per iteration
Still a ~40% CPU increase though.

Post by user »

How long is the string you're testing it with? To be on-topic it should be > 445 bytes :P
Oh..and what's the point of 'i'?

Post by strikelight »

user wrote:How long is the string you're testing it with? To be on-topic it should be > 445 bytes :P
Oh..and what's the point of 'i'?
It was 271 bytes....
Now it is:

Code:

% string bytelength $mylist
547
'i' just gives the procs something to do to simulate that they are doing something meaningful ;x

Code:

% time {a $mylist} 1000
305 microseconds per iteration
% time {b $mylist} 1000
391 microseconds per iteration
% time {a $mylist} 1000
277 microseconds per iteration
% time {b $mylist} 1000
336 microseconds per iteration
So it's STILL ~20% - ~30% increase.

heh. I guess we can agree, like most things, it's a trade off of memory vs performance.
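For reference, the percentages can be worked out from the timings posted in this thread with a small helper. The proc name `overhead` is made up for this sketch; the number pairs are the microsecond figures quoted above.

```tcl
# Hypothetical helper: percentage increase of timing b over timing a.
proc overhead {a b} {
    expr {round(100.0 * ($b - $a) / $a)}
}

# Timing pairs (a, b) in microseconds, copied from the posts above.
foreach {a b} {26 81 36 91 126 178 305 391 277 336} {
    puts "$a -> $b us: +[overhead $a $b]%"
}
```

The last two pairs come out at roughly 28% and 21%, matching the ~20% to ~30% estimate.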

Post by user »

strikelight wrote:heh. I guess we can agree, like most things, it's a trade off of memory vs performance.
Yeah... it seems most Tcl versions are faster at setting the value to an empty string than at deleting the entire variable, so I guess I should make it

Code:

foreach word [split [set str][set str ""] $splitChr] {...}
:)
And to make it even faster I could replace all variable substitutions with [set ..], but then the code will be much harder to read.
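A small sketch of how this variant behaves, again showing only the mechanics (the proc name `demo2` and its input are made up): the value still reaches `split`, but the variable survives as an empty string instead of being deleted.

```tcl
# Hypothetical demo proc for the [set str][set str ""] variant.
proc demo2 {str} {
    # [set str] yields the value; [set str ""] yields "" and empties the variable.
    set parts [split [set str][set str ""] " "]
    # Report the split result, whether str exists (1 = yes), and its length.
    return [list $parts [info exists str] [string length $str]]
}

puts [demo2 "a b c"]   ;# prints: {a b c} 1 0
```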

Post by strikelight »

From my tests, variable substitutions seem to be on exact par with command substitutions. (except for those command substitutions which also involve variable substitutions at the same time, of course).