We have a cleartext list of up to 20,000 words. We need a developer to write a script that retrieves the definitions of these words from the Wordnet DB. Wordnet is accessible from
<[login to view URL]>
There are also packages for Wordnet in most Linux distributions. We are currently using the 'wordnet', 'wordnet-gui', and 'wordnet-base' packages in Debian, which allow us to retrieve definitions for words entered manually in the GUI.
There is a package in Debian called 'libwordnet-querydata-perl', a Perl interface to the Wordnet DB that was designed specifically to facilitate batch text searching. For programmers familiar with Perl, this will probably be the fastest way to code up a solution.
We would prefer a generic solution that allows the user to load a txt file in the format: "line number" <TAB> "word"
and will write out to a file in the format:
"line number" <TAB> "word" <TAB> "definition"
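To illustrate the input and output formats above, here is a minimal Python sketch (Python is just one option; a Perl solution is equally welcome). The function name `annotate` and the `lookup` parameter are hypothetical, and the lookup used in the demo is a stand-in that does not query Wordnet.

```python
import io
from typing import Callable, TextIO


def annotate(src: TextIO, dst: TextIO, lookup: Callable[[str], str]) -> int:
    """Read 'line number<TAB>word' rows from src and write
    'line number<TAB>word<TAB>definition' rows to dst.
    Returns the number of rows written."""
    written = 0
    for raw in src:
        parts = raw.rstrip("\r\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        line_no, word = parts[0].strip(), parts[1].strip()
        dst.write(f"{line_no}\t{word}\t{lookup(word)}\n")
        written += 1
    return written


# Tiny in-memory demo with a stand-in lookup (not real Wordnet output).
demo_src = io.StringIO("1\tdog\n2\tcat\n")
demo_dst = io.StringIO()
annotate(demo_src, demo_dst, lambda word: f"<definition of {word}>")
print(demo_dst.getvalue(), end="")
```

With real files, the same function could be called with two handles opened via `open(path, encoding="utf-8")`, and `lookup` replaced by whatever Wordnet interface the developer chooses.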
## Deliverables
We have attached a sample tab-delimited text file to give an example of the source data.
-When data is retrieved, all senses of the word should be returned (if a word has more than one meaning).
-There should be a <TAB> delimiter between each definition. Ex: Defn 1. ... <TAB> Defn 2. ... <TAB> etc.
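The two requirements above could be met along the following lines. This is a sketch only: `format_senses` and `wordnet_definitions` are hypothetical names, and the lookup assumes the NLTK package and its 'wordnet' corpus data are installed, which is one possible interface rather than a requirement. Whitespace inside a definition is collapsed so that a stray tab or newline cannot shift the spreadsheet columns.

```python
import re
from typing import List


def format_senses(definitions: List[str]) -> str:
    """Join all senses as 'Defn 1. ...<TAB>Defn 2. ...', one cell per sense."""
    cells = []
    for number, text in enumerate(definitions, start=1):
        # Collapse tabs/newlines inside a definition to single spaces.
        clean = re.sub(r"\s+", " ", text).strip()
        cells.append(f"Defn {number}. {clean}")
    return "\t".join(cells)


def wordnet_definitions(word: str) -> List[str]:
    """Assumed lookup through NLTK's WordNet corpus reader.
    Returns one definition per sense, or an empty list if the word is unknown."""
    # Imported lazily so format_senses can be used without NLTK installed.
    from nltk.corpus import wordnet
    return [synset.definition() for synset in wordnet.synsets(word)]
```

A word with no Wordnet entry yields an empty definition field here; whether to leave that cell blank or write a marker such as "not found" is a choice we are open to discussing.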
The txt file will eventually be imported into a spreadsheet program, so suggestions on data schema are welcome.