This is the new home of the egghelp.org community forum.
All data has been migrated (including user logins/passwords) to a new phpBB version.


For more information, see this announcement post. Click the X in the top right-corner of this box to dismiss this message.

Regexps?

Old posts that have not been replied to for several years.
Locked
I
Iridium

Regexps?

Post by Iridium »

Tutorials, people.

How did you learn to use them?

What can I use them for?

How do they return?


I've looked through the re_syntax on tcl.tk / tcl commands and I can't make heads or tails of it. Thanks for your help :)
User avatar
stdragon
Owner
Posts: 959
Joined: Sun Sep 23, 2001 8:00 pm
Contact:

Post by stdragon »

1. I learned to use them by experimentation in tclsh. If you're in a hurry and don't want to 'learn as you go' then sit down for an hour and play with every special character you see in re_sytnax until you know what it does.

2. You can use them for complex string matching and substitution. That's about it, but that encompasses a lot, since most things in life can be represented as a string. Usually people use it for syntax checking (e.g. "Is the input a valid email address?" or "Is this sentence 'bad' as defined by this list of user rules?"). Other common uses are getting rid of color/control codes, extracting parts of a line into variables, and performing substitutions into kick messages,etc (like the person's nick replaces %nick in the kick message).

3. Regexp returns 0 for no match and 1 for a match. Regsub returns the number of substitutions, e.g. 0 for no match and non-zero for a match.

Looking through re_syntax is good, but it's better to find some example scripts. Depending on what you want to do, most regular expressions are very simple. There are only a few special chars you have to escape (like (), |, ., *, {}, +, erm, maybe some more..).

Here's a quick mini tutorial:

( ) is used for match reporting.. regexp lets you specify 'match variables' that get filled in with what exactly matched. The matches within parenthesis are what get reported. Also it's just like math, they allow you to group other operations. An example of match reporting would be: regexp {(.*)!(.*)@(.*)} $from match nick user host

| means "or". It lets your regexp match more than one thing. For instance, "hello there|hi there" would match either "hello there" or "hi there". Using ( ) for grouping, you could say "(hello|hi) there" and it would be the same thing.

. means "any single character". So the regex "..." would match any 3 letters (including spaces). To match an actual period, escape it with a backslash \.

* means "any number of the previous thing, including 0." So if you have "baa*", what is the previous thing? "a". So that would match "ba" (0), "baa", "baaaaaaaaa", etc. However, using grouping ( ) you can match multiple things: "(baa)*" would match "baabaabaa". If you notice, .* will match anything, because . means "any char" and * means "any number". So any easy way to translate between dos-style wildcards and regexps is to replace ?'s with single dots, and * with ".*"

+ is exactly the same as *, but it requires at least 1. So "baa+" would not match "ba" anymore.

{ } is a range operator. It's just like * and + but it lets you specify how many repeats are acceptable. For instance, "ba{2,10}" would match from "baa" (2 a's) to "baaaaaaaaaa" (10 a's).

There are a few more ones that are either less useful or way more complicated. For instance the section on negative lookaheads meant very little to me until the other day. It's a silly name, what it really is is a "and not" operator. For instance, if you want to match something that "contains an a and no b" you could use a negative lookahead. (Yes for that example you can use the [ ] operator to match for ^b, but [ ] can't contain a full regular expression, whereas negative lookaheads can.)

So those are the basics. Anything more advanced would require exponentially more amounts of text I think. If you have specific questions feel free to ask.
I
Iridium

Kickass

Post by Iridium »

Image Thanks *loads* for taking the time to help me out there, really appreciated.
User avatar
z_one
Master
Posts: 269
Joined: Mon Jan 14, 2002 8:00 pm
Location: Canada

Post by z_one »

stdragon let me commend you on this excellent mini-tutorial and I mean it !
In just a few words you have managed to make me understand more than I ever knew about "regexp". You're da man ! 8)

- z_one -
Locked