entrapmen Voice
Joined: 08 Jul 2003 Posts: 27 Location: TR
Posted: Fri Aug 19, 2005 8:55 am Post subject: some regexp problems

Hi,
I was working on spammers a while ago and I'm back at it again, but the spammers have evolved and now use some really weird methods.
Here are some examples:
ºwºwºwº
I¤R¤C
(They are sent with color codes, so they look like "w w w" and "I R C".)
There are lots of things like that. I think they can all be caught with a regexp, but I couldn't figure it out. I don't want to keep adding every spam word to a list, which is what I have been doing until now, because the special characters (, . ? ' ( ) and so on) and the words are always changing. I want to catch a line if it contains "w w w" anywhere in the sentence, no matter where the letters are (ex: "where is will wonder" <<-- that person must be caught).
I've got another problem: the channel I'm protecting has about 300 people and is very active. I'm using a greet message to catch the kind of spammers that use the "away message" method, so I have to message every user, but when I give that job to one bot it gets lagged or "eflood"ed. So I want to cut the work in half: if the nick starts with "a-k", message it, otherwise don't. The other half would be: if the nick starts with "k-z" or a special character, message it, otherwise don't.
Sorry for the bad English, I tried my best. Thanks.
_________________
<@ll the world is about smiles and cries>
 |
greenbear Owner
Joined: 24 Sep 2001 Posts: 733 Location: Norway
Posted: Fri Aug 19, 2005 10:51 am

You could remove everything that's not a normal character or a number from the string before doing any regexp matches on it.
Code:
regsub -nocase -all {[^a-z0-9]} $line {} line
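
As a rough illustration of that suggestion, the sketch below squeezes a line down to letters and digits and then looks for the keywords from the first post. The proc name `looks_obfuscated` and the keyword list `www|irc` are assumptions made for this example, not something given in the thread.

```tcl
# Hypothetical helper (name and keyword list are assumptions):
# returns 1 if the line contains "www" or "irc" once every character
# that is not an ASCII letter or digit has been removed, else 0.
proc looks_obfuscated {line} {
    # Drop separators such as º, ¤, spaces, punctuation and control codes.
    regsub -nocase -all {[^a-z0-9]} $line {} squeezed
    # Match the squeezed text against the keywords, ignoring case.
    return [regexp -nocase {www|irc} $squeezed]
}
```

With this, `looks_obfuscated "ºwºwºwº"` and `looks_obfuscated "I¤R¤C"` both report a match. The price is false positives: an ordinary word such as "circle" also contains "irc", so a real script would probably need a stricter keyword list.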
 |
Alchera Revered One
Joined: 11 Aug 2003 Posts: 3344 Location: Ballarat Victoria, Australia
Posted: Fri Aug 19, 2005 7:39 pm

Quote:
| stripcodes <strip-flags> <string>
| Description: strips specified control characters from the string given.
| strip-flags can be any combination of the following:
| b - remove all boldface codes
| c - remove all color codes
| r - remove all reverse video codes
| u - remove all underline codes
| a - remove all ANSI codes
| g - remove all ctrl-g (bell) codes
| Returns: the stripped string.
| Module: core
NB: eggdrop v1.6.17.0
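
A minimal sketch of how the quoted command might be called, assuming the script should also load outside eggdrop. The proc name `plain_text` is hypothetical, and the fallback expression is only a rough stand-in written for this example; it covers bold, color, reverse, underline and plain codes, not ANSI or bell codes.

```tcl
# Hypothetical helper: returns the text without formatting codes.
proc plain_text {text} {
    if {[llength [info commands stripcodes]] > 0} {
        # Inside eggdrop: pass all six flags from the quoted documentation.
        return [stripcodes bcruag $text]
    }
    # Outside eggdrop (assumption: rough stand-in, not eggdrop's own logic).
    # Removes \003 with optional "fg" or "fg,bg" digits, plus bold (\002),
    # plain (\017), reverse (\026) and underline (\037).
    regsub -all {\003(\d{1,2}(,\d{1,2})?)?|[\002\017\026\037]} $text {} text
    return $text
}
```

A handler would then run its spam checks on `[plain_text $text]` instead of the raw line.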
 |
demond Revered One
Joined: 12 Jun 2004 Posts: 3073 Location: San Francisco, CA
Posted: Fri Aug 19, 2005 7:59 pm

/me makes a note to himself to check that (source code for stripping) out
 |
De Kus Revered One
Joined: 15 Dec 2002 Posts: 1361 Location: Germany
Posted: Sat Aug 20, 2005 7:25 am

greenbear wrote:
| regsub -nocase -all {[^a-z0-9]} $line {} line

Just think what will happen to "vist us on http://www.mystupid.ad and our #stupid-channel". Yes, it will become "vistusonhttpwwwmystupidadandourstupidchannel". Good luck.
You won't be able to find advertising spam that way; you should keep at least the characters " ", "#", "/", ":" and ".". So I would prefer: [^a-z0-9\./:# ]
However, for general purposes stripcodes is the best and probably the fastest solution.
_________________
De Kus
StarZ|De_Kus, De_Kus or DeKus on IRC
Copyright © 2005-2009 by De Kus - published under The MIT License
Love hurts, love strengthens...
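
To make the difference concrete, here is a small sketch that runs both character classes over the example sentence from this post. The variable names `strict` and `loose` are only illustrative, and the class is written without the backslash before the dot, which should behave the same inside a bracket expression.

```tcl
set line "vist us on http://www.mystupid.ad and our #stupid-channel"

# greenbear's class: keep only letters and digits.
regsub -nocase -all {[^a-z0-9]} $line {} strict

# De Kus's class: also keep the space, "#", "/", ":" and ".".
regsub -nocase -all {[^a-z0-9./:# ]} $line {} loose

puts $strict
puts $loose
```

The first result is the run-together string quoted above. The second loses only the "-" in the channel name, so URL and channel patterns can still be matched afterwards.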
 |
demond Revered One
Joined: 12 Jun 2004 Posts: 3073 Location: San Francisco, CA
Posted: Sat Aug 20, 2005 12:56 pm

This will strip mIRC color and control codes:
Code:
regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
And this will match a hotlink (clickable URL) or chan ad:
Code:
regexp {(?i)(http://|www\.|irc\.|\s#)} $str
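
The two expressions above can be chained, stripping first and matching second. The sketch below wraps them in a proc; the name `is_ad` is hypothetical and the expressions themselves are copied unchanged from this post.

```tcl
# Hypothetical wrapper around the two expressions quoted above.
# Returns 1 if the stripped line looks like a link or channel ad, else 0.
proc is_ad {str} {
    # Strip bold, plain, reverse, underline and mIRC color codes first,
    # so that codes placed between letters cannot hide a URL.
    regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
    # Then look for a URL prefix, an IRC server name or a channel name.
    return [regexp {(?i)(http://|www\.|irc\.|\s#)} $str]
}
```

One detail worth knowing: the `\s#` branch needs whitespace in front of the `#`, so "join #help" matches while a line that begins with "#help" does not.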
 |
entrapmen Voice
Joined: 08 Jul 2003 Posts: 27 Location: TR
Posted: Sun Aug 21, 2005 11:42 pm

demond wrote:
| this will strip mIRC color and control codes:
| Code:
| regsub -all {([\002\017\026\037]|[\003]{1}[0-9]{0,2}[\,]{0,1}[0-9]{0,2})} $str {} str
| and this will match a hotlink (clickable URL) or chan ad:
| Code:
| regexp {(?i)(http://|www\.|irc\.|\s#)} $str

That's the problem: I know those expressions and was using them, but unfortunately they no longer work, because the bots don't advertise clickable links anymore.
 |
entrapmen Voice
Joined: 08 Jul 2003 Posts: 27 Location: TR
Posted: Mon Aug 22, 2005 12:02 am

Can someone help me with the second problem?
I think one if/else should be enough: if the nick starts with a letter from a to q, do something, otherwise do nothing...
 |
demond Revered One
Joined: 12 Jun 2004 Posts: 3073 Location: San Francisco, CA
Posted: Mon Aug 22, 2005 12:32 am

entrapmen wrote:
| That's the problem: I know those expressions and was using them, but unfortunately they no longer work, because the bots don't advertise clickable links anymore.

Really? Give me just ONE clickable link that these don't match.
 |
entrapmen Voice
Joined: 08 Jul 2003 Posts: 27 Location: TR
Posted: Mon Aug 22, 2005 8:01 am

demond wrote:
| Really? Give me just ONE clickable link that these don't match.

I'm saying they really do work for clickable links, but the bots (spammers) don't use that method anymore. That's what I meant. Sorry for the misunderstanding and the bad English.
 |
demond Revered One
Joined: 12 Jun 2004 Posts: 3073 Location: San Francisco, CA
Posted: Mon Aug 22, 2005 12:09 pm

So why do you care if they use another method with no clickable links? Nobody will bother to manually strip the spam and paste it into their browser. People are lazy and spammers know that; spam without clickable links is harmless (annoying, yes, but that's all).
 |
entrapmen Voice
Joined: 08 Jul 2003 Posts: 27 Location: TR
Posted: Mon Aug 22, 2005 1:28 pm

demond wrote:
| So why do you care if they use another method with no clickable links? Nobody will bother to manually strip the spam and paste it into their browser. People are lazy and spammers know that; spam without clickable links is harmless (annoying, yes, but that's all).

As you say, they are annoying. I'm trying to stop them before they message the users. The users are lazy too: if they set themselves +R with //mode, they would not see the spammers...
Anyway, demond, can you help me with the second problem?
 |
demond Revered One
Joined: 12 Jun 2004 Posts: 3073 Location: San Francisco, CA
Posted: Mon Aug 22, 2005 2:34 pm

If you've managed to write a script that satisfies some of your needs, you should be able to grasp the [regexp] concept and filter out the nick patterns that are of no interest to you. It's not that hard.
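
For readers who land here with the same question, one possible reading of that hint is sketched below: a single anchored regexp decides which half a nick belongs to. The proc name `should_greet` and the labels "first" and "second" are assumptions for this example, and the eggdrop join handler that would call it is not shown.

```tcl
# Hypothetical helper: decides whether this bot should greet the nick.
# half is "first" (nicks starting with a-k) or "second" (everything else,
# including l-z, digits and special characters such as [ or _).
proc should_greet {nick half} {
    # 1 if the first character is a letter from a to k, ignoring case.
    set in_first [regexp -nocase {^[a-k]} $nick]
    if {$half eq "first"} {
        return $in_first
    }
    # The second bot takes every nick the first bot skips.
    return [expr {!$in_first}]
}
```

Each bot would call it with its own half, for example `if {[should_greet $nick first]} { ... }`. Defining the second half as "everything else" avoids the overlap on "k" in the original description, so no nick is greeted twice or missed.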