Circle Noetics Services natural language technologies


InOtherWords™ lexical database

While computing has been primarily concerned with the presentation of data, little has been developed to date to make the actual information content generally available for electronic processing. Extensive information about language itself has been relatively unavailable.

Uses of InOtherWords™

Information content and linguistic data are important in a number of current fields.

CNS has developed an expert system for “the English that everyone knows.” It is a linguistic tool which will be a step in solving technical problems like the following:

Database Query

The computer does not in general know that coal, gas, and oil are fuels, and that airlines operate airplanes that fly, take off, land, carry passengers and use fuel. So the computer can generally not assist the user in finding all articles in a database about air travel if (s)he specifies only the word airline in the query. A human assistant would also pick out the articles that talk about aircraft, airports, jets, and other related concepts.
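As a rough illustration of how a semantic net could help here, the sketch below expands a query term to its related concepts before searching. The relation table, the function names, and the sample documents are all made up for the example; none of it is InOtherWords™ data or its actual interface.

```python
from collections import deque

# Hypothetical, hand-built relation table (assumption for illustration only).
RELATED = {
    "airline": ["airplane", "airport", "passenger"],
    "airplane": ["aircraft", "jet", "fuel"],
    "fuel": ["coal", "gas", "oil"],
}

def expand_query(term, max_hops=2):
    """Return the term plus every concept reachable within max_hops links."""
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        word, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for neighbour in RELATED.get(word, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen

def search(documents, term, max_hops=2):
    """Return indices of documents that mention the term or a related concept."""
    terms = expand_query(term, max_hops)
    return [i for i, text in enumerate(documents)
            if terms & set(text.lower().split())]

docs = [
    "The jet landed late",
    "Coal prices rose again",
    "The airline added routes",
]
print(search(docs, "airline"))  # [0, 2]
```

The hop limit is a design choice: with two hops the query for airline reaches jet but stops short of coal, which keeps the expansion close to the subject of air travel.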

Database Indexing

Most current PC applications store only an index of all the words in all the documents. It would be helpful also to automatically cross-reference documents according to their aboutness, or to be able to immediately access all other words in the relevant semantic domain.

Automatic Translation

An automatic translation system from Russian to English needs to know that, although both wide and broad are usually translated as shiroko, English has a set of adjectives which involve a boundary and can be used with measurements: 5 feet wide, but not 5 feet broad; 6 pounds heavy, but not 6 pounds fat. It needs to know that many English words ending in -er or -le involve repetition, such as battle vs. fight, batter vs. beat. It needs to know that although Russian boltat’ can be translated as chat, chatter, gab, or shoot the breeze, the first three of these verbs in English typically have female subjects. It sounds funny to say He was chattering away.

Speech Recognition and OCR

If these programs had an idea what the message was about, and could guess which words were likely to appear in a text, they would be able to sort out ambiguities with far better speed and accuracy. If such a program knew that an article was about writing, it could prioritize a reading of word over work. If it knew by a syntactic parse, however, that the word had to be a verb modified by hand, it might still prioritize work over word.
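The word/work case can be sketched as a small re-ranking step. Everything here is an assumption made for illustration: the recognizer scores, the topic vocabulary, the simplified part-of-speech table (which treats word as a noun only), and the size of the topic bonus.

```python
# Hypothetical lexical data (assumptions for illustration only).
TOPIC_VOCAB = {"writing": {"word", "sentence", "essay"}}
PARTS_OF_SPEECH = {"word": {"noun"}, "work": {"noun", "verb"}}
TOPIC_BONUS = 0.2  # arbitrary illustrative weight

def pick_reading(candidates, topic, required_pos=None):
    """Choose among recognizer candidates, given as {word: raw score}.

    A candidate in the topic vocabulary gets a bonus. If the parse demands
    a part of speech, candidates that cannot fill it are filtered out first.
    """
    def score(word):
        bonus = TOPIC_BONUS if word in TOPIC_VOCAB.get(topic, set()) else 0.0
        return candidates[word] + bonus

    allowed = [w for w in candidates
               if required_pos is None
               or required_pos in PARTS_OF_SPEECH.get(w, set())]
    # Fall back to all candidates if the syntactic filter removes everything.
    pool = allowed or list(candidates)
    return max(pool, key=score)

candidates = {"word": 0.48, "work": 0.52}
print(pick_reading(candidates, "writing"))          # word
print(pick_reading(candidates, "writing", "verb"))  # work
```

Applying the syntactic filter before the topic bonus mirrors the order of reasoning in the paragraph above: topic knowledge sets a preference, while the parse can rule a reading out entirely.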

Text Compression

For archiving and telecommunications, it’s important to store as much data as possible in as little space as possible, preferably without losing access speed. Compression is essentially the removal of redundancy. If A can be predicted from B, one need not store A. Thus the trick to linguistic compression is to have access to as many generalizations about language as possible.
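A minimal sketch of the "if A can be predicted from B, one need not store A" idea follows. It assumes a toy next-word table shared by the encoder and the decoder; the table and the function names are hypothetical, and a real system would draw its predictions from far richer linguistic generalizations.

```python
# Hypothetical next-word predictions, known to both encoder and decoder.
NEXT_WORD = {"united": "states", "new": "york"}

PREDICTED = None  # marker stored in place of a correctly predicted word

def encode(tokens):
    """Replace each token the table predicts from its predecessor with a marker."""
    out, prev = [], None
    for tok in tokens:
        out.append(PREDICTED if NEXT_WORD.get(prev) == tok else tok)
        prev = tok
    return out

def decode(codes):
    """Rebuild the token list by re-running the same predictions."""
    out, prev = [], None
    for code in codes:
        tok = NEXT_WORD[prev] if code is PREDICTED else code
        out.append(tok)
        prev = tok
    return out

tokens = "flights from new york to the united states".split()
codes = encode(tokens)
print(codes)
print(decode(codes) == tokens)  # True
```

In this toy encoding the marker still occupies a slot; the saving would come from a later stage that stores the marker in far fewer bits than a full word.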

Size and content

About 40,000 pages of linguistic information have been compiled and entered into the database up to now. Up to 300 categories of information are available for over 100,000 words of the English language, each of which is divided into an average of three to four senses. The complete set of relations and structures is a network of millions of concepts and specifications. To do this, CNS has invented hundreds of proprietary concepts and technologies.

For example, IOW knows that monitor is a person who monitors a situation; monitor is an object, as in a video monitor; monitor is an object, as in a regulator; monitor is an object, as a general device for the observance of events or situations; Monitor is the name of the famous gunboat in the Civil War; and monitor is a name for a species of reptiles. Each of these senses of monitor comes with a full entry of specifications and relations. This includes, but is not limited to:

  • Syntax: For example, you can say, "I like running", "I like to run", "I enjoy running", but not "I enjoy to run."
  • Semantic Net: Includes Made of, Purpose, Result, Cause, Part of, Is, Shape, Texture, Color, Has, Situation, Field, Synonyms, Antonyms, and numerous other relations.
  • Semantic Constraints: For example, the word "leash" implies a dog; the verb "eat", unless used metaphorically, must have an animal for a subject and some kind of food for the object; "planting" occurs in earth; etc.
  • Morphology: For example, under((e/valu)ate)
  • Idioms, Cliches, Quotes, Cultural Literacy
  • Pronunciation
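To make the Semantic Constraints item above more concrete, here is a minimal sketch of checking the "eat" example against an Is hierarchy. The hierarchy, the class, and the function names are invented for the illustration and say nothing about how InOtherWords™ actually stores its entries.

```python
from dataclasses import dataclass

# Hypothetical "Is" relation: each word points to its parent category.
ISA = {
    "dog": "animal", "cat": "animal", "animal": "thing",
    "bone": "food", "food": "thing", "rock": "thing",
}

def is_a(word, category):
    """Walk up the hierarchy from word and report whether category is reached."""
    while word is not None:
        if word == category:
            return True
        word = ISA.get(word)
    return False

@dataclass
class VerbSense:
    lemma: str
    subject_type: str  # category the subject must belong to
    object_type: str   # category the object must belong to

EAT = VerbSense("eat", subject_type="animal", object_type="food")

def literal_reading_ok(verb, subject, obj):
    """True if subject and object both satisfy the verb's constraints."""
    return is_a(subject, verb.subject_type) and is_a(obj, verb.object_type)

print(literal_reading_ok(EAT, "dog", "bone"))   # True
print(literal_reading_ok(EAT, "rock", "bone"))  # False
```

A failed check does not have to mean the sentence is wrong; following the bullet above, it could instead flag the usage as metaphorical.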

InOtherWords™ Market

CNS is planning to release a number of products for the OEM market that will be based on its proprietary InOtherWords™ technology, but will not rule out licensing the InOtherWords™ lexicon itself on an OEM basis with certain restrictions.

How to get in touch

For more information, contact:

Circle Noetic Services, Inc.
5 Pine Knoll Drive
Mont Vernon, NH 03057

(603) 283-6462 /voice/

(603) 672-8025 /fax/

e-mail: Admin@CircleNoetics.com
web: www.CircleNoetics.com

Pricing & Licensing

For pricing and licensing information, call CNS at (603) 283-6462 or e-mail Admin@CircleNoetics.com.


© 2002 Circle Noetic Services, Inc. All rights reserved. Last Modified 07/26/07