egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

Dictionary.com script (finished/final)
rosc2112
Revered One


Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Tue Oct 10, 2006 11:22 am    Post subject: Dictionary.com script (finished/final)

I'd like to announce a test release of a dictionary.com script I'm working on. The script works to retrieve definitions from dictionary.com's unabridged dictionary v1.0.1. Other dictionaries are available from their site and will be added to the script, with the option of searching them in particular, or the default as a fall-back. It can show suggested spellings if you misspell a word. Shows results including the pronunciation key, word forms, each definition, word origin, synonyms and antonyms.

Currently, the options available are:

Commandline options (privmsg $botnick or typed in channel):
To look up a word, simply use : .dict <word>
To show only the word origin : .dict dcorigin <word>
To show only synonyms/antonyms: .dict dcsyn <word>

Script Configure Options:

# Channels where we allow public use
set dcomchans "#mychan #chan2 #etc"

# Channels that only respond via privmsg
set dcquietchans "#chan2 #etc"

# Timeout for geturl
set dcomtimeout "30000"

# If you want to limit output, set the line-limit here
#(this will truncate results.) Set to 0 for no-limit.
set dclinelimit 0

# Show Word Origins when available? 1 == yes, 0 == no
set dcorigin 1

# Show Synonyms and Antonyms for words? 1 == yes, 0 == no
set dcsynant 1
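
For illustration only, here is a minimal sketch of how a line limit such as dclinelimit might be applied to a list of output lines. The proc name dc_limit_lines is hypothetical and is not taken from the script; the wording of the notice is an assumption modeled on the bot's "Output limit reached" message.

```tcl
# Hypothetical helper, not part of dictcom.tcl: truncate a list of
# output lines to at most $limit entries. A limit of 0 means no limit.
proc dc_limit_lines {lines limit} {
    if {$limit <= 0 || [llength $lines] <= $limit} {
        return $lines
    }
    # Keep the first $limit lines, then append a notice (assumed wording).
    set out [lrange $lines 0 [expr {$limit - 1}]]
    lappend out "Output limit reached \[$limit lines max\]"
    return $out
}

set dclinelimit 2
foreach line [dc_limit_lines {one two three four} $dclinelimit] {
    puts $line
}
```

With dclinelimit left at 0, the same call would return all four lines unchanged.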

Dictionary.com provides the following additional databases, which will be incorporated into this script:
American Heritage Dictionary
Webster's 1913 Dictionary
WordNet v2.0
American Heritage Stedman's Medical Dictionary
Merriam Webster Medical Dictionary
Free On-line Dictionary of Computing (FOLDOC)
Internet Jargon File
Wall Street Words
Investopedia
Merriam-Webster's Dictionary of Law

I'd appreciate people testing the script and sending me words that produce errors, for example results showing html codes or truncated output. Dictionary.com tends to use a LOT of unicode chars, and short of adding thousands of them to the string map, I've been adding them as I see them.

Keep in mind this is a preview of the script; the other databases are not yet incorporated, although the regexps are in place (I just need to finish them to format the results) ;)

Check the url for updates, there will no doubt be many :)
http://members.dandy.net/~fbn/dictcom.tcl.txt


Last edited by rosc2112 on Thu Oct 12, 2006 1:58 am; edited 4 times in total

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Wed Oct 11, 2006 5:35 pm    Post subject: v0.01c

History
- Oct. 08 2006 - Initial conception
- Oct. 11 2006 - Added more db's, added ability for user to specify line-limit (still respecting admin's choice of max line-limit, and added option for admin to allow user to override that limit or not), etc.
- Removed dcorigin/dcsynant options, added combined commandline options.

Databases: Dictionary.com Unabridged Dictionary; American Heritage Dictionary; Webster's 1913 Dictionary; American Heritage Stedman's Medical Dictionary; Merriam-Webster Medical Dictionary; Investopedia

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Thu Oct 12, 2006 2:12 am    Post subject: v0.01d

- Added a 'dbmatch' option to show which databases a word is found in.
- Added the rest of the db's (if dictionary.com provided the VERA, GCIDE and Ambrose Bierce's Devil's dictionaries, this script would be complete compared to db's available from dict.org.)
- Consolidated redundant regexps into a separate proc.

I found a few more db's in addition to the ones I mentioned earlier. Here's a complete list:

Databases:
Dictionary.com Unabridged Dictionary;
Webster's 1913 Unabridged Dictionary;
Webster's New Millennium Dictionary;
WordNet Dictionary;
American Heritage Dictionary;
American Heritage Stedman's Medical Dictionary;
American Heritage Dictionary of Idioms;
Merriam-Webster Medical Dictionary;
Merriam-Webster Law Dictionary;
Investopedia;
Wall Street Words;
Easton's 1897 Bible Dictionary;
Hitchcock's Bible Names Dictionary;
Free On-line Dictionary of Computing;
Jargon File;
US Gazetteer 1990 Census;
CIA 1995 World Factbook;
Atomic Elements Database

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Sat Oct 14, 2006 2:02 pm    Post subject: v0.02a

Oct. 12 2006 - Minor changes to dbnames.
Oct. 13 2006 - Fixed a mistake in the variable used for selecting a database.
- Added a string map for 'dbmatch' to show the dbnames as used in the script (rather than the names as known to dictionary.com)
- Added an additional error msg when the user selects a database and the word is not found.
- Changed the test in proc dictmsg to check whether the user is either on the channel or a validuser; if neither, the script quietly returns (unknown users outside of channels cannot use it.)
- Made a configuration option for limiting input length.

v00j00
Voice

Joined: 18 Dec 2005
Posts: 4

Posted: Wed Nov 15, 2006 3:58 pm

Thank you!

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Wed Nov 15, 2006 4:29 pm

Welcome :)

I just made a minor update to the script, added more unicode chars to the string list. Same url as above.

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Sun Dec 03, 2006 6:21 am    Post subject: v0.02k

Dictionary.com changed their html a bit, a fix has been uploaded (same url as above.) There are also some new db's that I'll be adding in the next few days.

cache
Master

Joined: 10 Jan 2006
Posts: 306
Location: Mass

Posted: Mon Dec 18, 2006 10:10 pm

did html change again? just wonder since I just added this and see odd characters and no spaces when it lists them like...

1.whatever
2.whatever

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Mon Dec 18, 2006 11:06 pm

Which word, and what commandline options did you look up? I just checked and don't see any changes in the html being used. And yes, it will normally list definitions 1 at a time for the default dictionary.

cache
Master

Joined: 10 Jan 2006
Posts: 306
Location: Mass

Posted: Mon Dec 18, 2006 11:46 pm

This is how I see it; if this is how it's supposed to run, that's fine. And thanks for all these new scripts you've been making :)

Code:

It shows this  and no spaces by numbers while it lists..

<Bot> Dictionary.com Unabridged: Results for 'chat' - [pronunciation key: chat ]
<Bot> verb (used without object)
<Bot> 1.to converse in a familiar or informal manner.
<Bot> noun
<Bot> 2.informal conversation: We had a pleasant chat.
<Bot> 3.any of several small Old World thrushes, esp. of the genus Saxicola, having a chattering cry.
<Bot> 4.yellow-breasted chat.
<Bot> Verb phrase
<Bot> Output limit reached [6 lines max]
<Bot> Origin: 140050; late ME; short for chatter
<Bot> Synonyms: 1, 2. talk, chitchat, gossip, visit.
<Bot> [End Dictionary.com Unabridged - 'chat']

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Tue Dec 19, 2006 2:05 am

The binary codes are unicode chars (EN DASH and EM DASH), and unfortunately I'm not able to cut/paste them into the string map to translate them. I also tried using the hex code for them, but that doesn't work on my platform either. Curiously, when I save the page in Mozilla, I get this char:

â

because the first part of the hex for 'EN DASH' is 0xE2 (the full hex code for it is 0xE2 0x80 0x93 (e28093))

If anyone has a clue about how to add these pesky unicode chars into string map, I'd appreciate a hint (I'm not even able to reproduce the chars using bash's \x codes, I don't know what the octals for the chars are so I didn't try that..)

Here's info about the chars:

http://www.fileformat.info/info/unicode/char/2013/index.htm
http://www.fileformat.info/info/unicode/char/2014/index.htm

Needless to say, dictionary.com is turning into a real pain in the rump cos they keep changing these silly little things (they were originally using the html codes for dash..)
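
One possible approach, sketched under assumptions: Tcl's \uXXXX escapes let you name these characters in a string map without pasting them into the file. If the page text arrives as raw UTF-8 bytes (which the 0xE2 0x80 0x93 sequence suggests), encoding convertfrom utf-8 turns those bytes into real characters first. The proc name dc_fix_dashes and the sample text are made up for illustration, and whether the fetched data really is raw bytes depends on how the script calls geturl.

```tcl
# Hypothetical helper: map EN DASH (U+2013) and EM DASH (U+2014) to ASCII.
proc dc_fix_dashes {text} {
    return [string map [list \u2013 "-" \u2014 "--"] $text]
}

# Simulate raw UTF-8 bytes for "1400<EN DASH>50" as they might come off the wire.
set raw "1400[binary format H* e28093]50"

# Decode the bytes into characters, then map the dashes to plain ASCII.
set decoded [encoding convertfrom utf-8 $raw]
puts [dc_fix_dashes $decoded]
```

Because the escapes are plain ASCII in the source file, the script survives being copied between editors and platforms.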

BeBoo
Halfop

Joined: 26 Sep 2007
Posts: 42

Posted: Tue Nov 20, 2007 2:31 pm

I'm getting the following error when loading it:

Code:
can't read "dcomdef": no such variable
    while executing
"regexp {class="sectionLabel">.+?Synonyms.*?</span>(.*?)</div>} $dcomdef match dcsynon"
    invoked from within
"if {[regexp {class="sectionLabel">.+?Synonyms.*?</span>(.*?)</div>} $dcomdef match dcsynon]} {
                        regsub -all -nocase {<b>.*?</b>} $dcsynon } dcsynon"


Any ideas?

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Tue Nov 20, 2007 4:28 pm

No clue. Read here to get proper debug output:

http://forum.egghelp.org/viewtopic.php?p=63899#63899

and

http://forum.egghelp.org/viewtopic.php?t=10215

rosc2112
Revered One

Joined: 19 Feb 2006
Posts: 1454
Location: Northeast Pennsylvania

Posted: Sat Apr 05, 2008 8:26 am

I'm assuming there's a version incompatibility between the version of tcl I'm using and the version used by the people getting that "dcomdef" var error, something about the regexp, since obviously $dcomdef IS defined and the other regexps don't throw any errors about it. I'm using tcl 8.4.11 and eggdrop 1.6.18.

Or perhaps it's a unix-vs-windrop incompatibility (if you're using windrop and get that error, I can only suggest using a real unix system, or at the least, eggdrop & tcl compiled under cygwin.) I don't do windoze =)

speechles
Revered One

Joined: 26 Aug 2006
Posts: 1398
Location: emerald triangle, california (coastal redwoods)

Posted: Sun Apr 06, 2008 1:43 am

rosc2112, it's this part:
Code:
regsub -all -nocase {<b>.*?</b>} $dcsynon {[*control-code 002 is here*]&[*control-code 002 is here*]} dcsynon


You've embedded unescaped control-codes to handle the bold rather than the proper escape sequence \002. Synonyms and antonyms suffer from this. Those who aren't saving the link directly, and instead copying and resaving in their editor of choice will most certainly have problems.. It's a simple fix.
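
As a rough sketch of that fix (the sample string is made up, and only the regsub line is modeled on the script), the bold code can be written as the \002 escape inside a double-quoted replacement, so the file itself contains only printable ASCII:

```tcl
# Made-up sample input standing in for the synonyms text.
set dcsynon "<b>talk</b>, chitchat, <b>gossip</b>"

# "\002" is the octal escape for the bold control code (0x02).
# Double quotes let Tcl expand it; & still means "the whole match" to regsub.
regsub -all -nocase {<b>.*?</b>} $dcsynon "\002&\002" dcsynon

# Show the control code visibly for the demo.
puts [string map [list \002 "^B"] $dcsynon]
```

The same pattern would apply to each of the other regsub calls that wrap a match in bold.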

The problem becomes noticeable in BeBoo's error message:
Code:
can't read "dcomdef": no such variable
    while executing
"regexp {class="sectionLabel">.+?Synonyms.*?</span>(.*?)</div>} $dcomdef match dcsynon"
    invoked from within
"if {[regexp {class="sectionLabel">.+?Synonyms.*?</span>(.*?)</div>} $dcomdef match dcsynon]} {
                        regsub -all -nocase {<b>.*?</b>} $dcsynon } dcsynon"

Notice particularly the last line above, with the strange brace all alone; this is where the embedded control-codes are causing conflict.

Edit: Appears this embedding happens in other spots as well, but it's the same exact 'regsub' concerning bold, except each is done on a different variable. When this control-character crosses platforms it causes unexpected problems. This directly stems from using *.tcl.txt to name the script, the browser will display these itself. Named as *.tcl the browser would only ask to save them or open them with another program, not display them. But....This would not even be an issue if people would stop embedding in the first place and properly generate their characters using escape sequences or similar means of creation. Calling windrop inferior rather than investigate your own code is just silly. Most windrop related script issues are usually: 1) the script author not following accepted standards or 2) dependencies on incompatible/missing modules (read this as the user now has to fully install cygwin to create it).
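
To see where a script embeds raw control codes, something like the following sketch could help. The proc name dc_find_ctrl is hypothetical; it simply reports which lines contain control characters other than tab, newline and carriage return.

```tcl
# Hypothetical helper: return the 1-based numbers of lines that contain
# raw control characters (excluding tab, LF and CR).
proc dc_find_ctrl {text} {
    set hits {}
    set n 0
    foreach line [split $text "\n"] {
        incr n
        if {[regexp {[\u0001-\u0008\u000b\u000c\u000e-\u001f]} $line]} {
            lappend hits $n
        }
    }
    return $hits
}

# Demo on an in-memory sample; line 2 carries two embedded \002 characters.
set sample "set a 1\nset b \"\002bold\002\"\nset c 3"
puts [dc_find_ctrl $sample]
```

To run it against a saved copy of a script, the file contents could be read into a variable with open and read and passed in place of the sample.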