Our Information Extraction (IE) technology complements our PowerAnswer
Question Answering (QA) approach perfectly. While PowerAnswer provides
an open-domain solution for direct, relevant answers from
heterogeneous document collections, IE is ideally suited for customers
interested in retrieving highly relevant information for
domain-specific topics. We do not rely on keyword or simple pattern
matching; instead, we provide deep syntactic and semantic understanding of
domain information. Our IE technology, implemented in the Cicero
Information Extraction system, extracts and formats information from
virtually any domain of interest. Typical
applications range from market analysis (e.g., which stocks changed and
how) to intelligence applications, such as monitoring bombing
events in the Middle East.
The Information Extraction technology behind Cicero was developed by
our team of renowned researchers. Technical
details about our technology were published in May 2002 in the
proceedings of the prestigious "Human Language Technology"
conference. The complete paper is available
here.
Our prominent position in Information Extraction was recently recognized
with significant research contracts for further technology
development. Information Extraction is a complex task that encompasses
many Natural Language Processing techniques, such as syntactic
parsing, entity and event coreference, and information fusion. Our
unique approaches to these tasks have raised the accuracy of the
Cicero IE system well into the 80% range, leaving competing systems
well behind, with accuracy rates in the low 60% range. Language Computer
Corporation has developed an open-domain IE framework, which includes
open-domain parsing, coreference resolution, and named-entity recognition. Fast
customization to different domains is guaranteed by our proprietary
grammar development environment, which includes a grammar
specification language and a run-time grammar execution environment
that address the ambiguities of natural language.