Information extraction in a natural language understanding system

US9454525B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9454525-B2
Application number  US-201313897780-A
Country             US
Kind code           B2
Filing date         May 20, 2013
Priority date       Jun 18, 2007
Publication date    Sep 27, 2016
Grant date          Sep 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of extracting information from text within a natural language understanding system can include processing a text input through at least one statistical model for each of a plurality of features to be extracted from the text input. For each feature, at least one value can be determined, at least in part, using the statistical model associated with the feature. One value for each feature can be combined to create a complex information target. The complex information target can be output.
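The flow described in the abstract (one statistical model per feature, one value chosen per feature, values combined into a complex information target) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the feature names (`action`, `object`), the keyword-counting stand-in for a statistical model, and the names `keyword_model` and `extract_target` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A candidate is a (value, confidence score) pair.
Candidate = Tuple[str, float]
# A "model" is any callable that maps text to candidates. A real system
# would use trained classifiers; the keyword scorer below is a
# hypothetical stand-in.
Model = Callable[[str], List[Candidate]]


def keyword_model(keywords: Dict[str, List[str]]) -> Model:
    """Build a toy model that scores each value by keyword hits in the text."""
    def model(text: str) -> List[Candidate]:
        tokens = text.lower().split()
        hits = {v: sum(tokens.count(k) for k in ks) for v, ks in keywords.items()}
        total = sum(hits.values())
        if total == 0:
            return []  # no evidence for any value of this feature
        # Normalize hit counts so the scores behave like confidences.
        return [(v, n / total) for v, n in hits.items() if n > 0]
    return model


def extract_target(text: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Run one model per feature and combine one value per feature."""
    target: Dict[str, str] = {}
    for feature, model in models.items():
        candidates = model(text)
        if candidates:
            # Keep the candidate with the highest confidence score.
            target[feature] = max(candidates, key=lambda c: c[1])[0]
    return target


# Hypothetical features and vocabularies, for illustration only.
MODELS: Dict[str, Model] = {
    "action": keyword_model({"cancel": ["cancel", "stop"], "order": ["order", "buy"]}),
    "object": keyword_model({"phone": ["phone"], "internet": ["internet", "broadband"]}),
}

result = extract_target("I want to cancel my internet service", MODELS)
print(result)
```

In this sketch the complex information target is simply a dictionary with one value per feature; the abstract does not prescribe a particular data structure.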

First claim

Opening claim text (preview).

What is claimed is:

1. A method of extracting information from a text input received by a natural language understanding system, comprising: parsing the text input to extract a plurality of features from the text input; processing each of the plurality of features through a plurality of statistical models to obtain at least one value; searching the text input for at least one named entity; determining a value for a feature based upon the at least one named entity located within the text input; combining, via a processor, one value for each of the plurality of features to create a complex information target; and outputting the complex information target, wherein the complex information target indicates a meaning for the text input.

2. The method of claim 1, wherein the plurality of statistical models output a plurality of candidate values for a particular feature, a value having a highest confidence score is selected, for the particular feature, from the plurality of candidate values, and the value having the highest confidence score is the one value, for the particular feature, used to create the complex information target.

3. The method of claim 1, further comprising: determining a value for a first feature in the text input; and selecting one of the plurality of statistical models based upon the value determined for the first feature.

4. The method of claim 3, wherein the selected statistical model is built using a subset of the plurality of features that correspond to the determined value for the first feature.

5. The method of claim 1, further comprising: comparing the complex information target with a plurality of allowable complex information targets; and determining whether the complex information target is allowable based upon the comparison.

6. The method of claim 5, further comprising, selecting an alternate value for at least one feature of the complex information target from a plurality of candidate values based upon a confidence score.

7. The method of claim 6, wherein the alternate value conforms to an allowable complex information target.

8. A method of extracting information from a text input having a first feature and a second feature using a natural language understanding system, comprising: determining a first value for the first feature using a selected text processing technique; selecting a statistical model from a plurality of statistical models associated with a second feature based upon the first value; determining a second value for the second feature using the selected statistical model; forming, using a processor, a complex information target by combining the first and second values; and outputting the complex information target.

9. The method of claim 8, wherein the selected text processing technique is a statistical model associated with the first feature.

10. The method of claim 8, wherein the selected text processing technique is a named entity feature extraction technique.

11. The method of claim 8, further comprising: comparing the complex information target with a plurality of allowable complex information targets; and determining whether the complex information target is valid based upon the comparison.

12. A computer program product, comprising: a computer usable hardware storage device having stored therein computer usable program code for extracting information from a text input received by a natural language understanding system, the computer usable program code, which when executed by a computer hardware system, causes the computer hardware system to perform: parsing the text input to extract a plurality of features from the text input; processing each of the plurality of features through a plurality of statistical models to obtain at least one value; searching the text input for at least one named entity; determining a value for a feature based upon the at least one named entity located within the text input; combining, via a processor, one value for each of the plurality of features to create a complex information target; and outputting the complex information target, wherein the complex information target indicates a meaning for the text input.

13. The computer program product claim 12, wherein the plurality of statistical models output a plurality of candidate values for a particular feature, a value having a highest confidence score is selected, for the particular feature, from the plurality of candidate values, and the value having the highest confidence score is the one value, for the particular feature, used to create the complex information target.

14. The computer program product claim 12, wherein the computer usable program code further causes the computer hardware system to perform: determining a value for a first feature in the text input; and selecting one of the plurality of statistical models based upon the value determined for the first feature.

15. The computer program product claim 12, wherein the computer usable program code further causes the computer hardware system to perform: comparing the complex information target with a plurality of allowable complex information targets; and determining whether the complex information target is allowable based upon the comparison.

16. The computer program product claim 12, wherein the computer usable program code further causes the computer hardware system to perform: selecting an alternate value for at least one feature of the complex information target from a plurality of candidate values based upon a confidence score.

17. The computer program product claim 16, wherein the alternate value conforms to an allowable complex information target.
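Two ideas from the dependent claims lend themselves to a short sketch: choosing the model for a second feature based on the value found for a first feature (claims 3 and 8), and checking the combined target against a set of allowable targets, falling back to alternate candidate values by confidence score when it is not allowable (claims 5 to 7). The Python below is a hedged illustration only. The candidate lists, the feature names, the `resolve_target` function, and the choice to rank alternates by the product of the two confidence scores are all assumptions made for this example; the claims do not specify a ranking rule.

```python
from typing import Dict, List, Optional, Set, Tuple

Candidate = Tuple[str, float]  # (value, confidence score)

# Hypothetical candidate lists standing in for statistical model output.
ACTION_CANDIDATES: List[Candidate] = [("order", 0.6), ("cancel", 0.4)]

# One hypothetical "object" model per action value: the model consulted
# for the second feature depends on the value of the first feature.
OBJECT_MODELS: Dict[str, List[Candidate]] = {
    "order": [("refund", 0.7), ("phone", 0.3)],
    "cancel": [("internet", 0.9), ("phone", 0.1)],
}

# Hypothetical set of allowable complex information targets.
ALLOWABLE: Set[Tuple[str, str]] = {("order", "phone"), ("cancel", "internet")}


def best(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest confidence score."""
    return max(candidates, key=lambda c: c[1])


def resolve_target(
    action_candidates: List[Candidate],
    object_models: Dict[str, List[Candidate]],
    allowable: Set[Tuple[str, str]],
) -> Optional[Tuple[str, str]]:
    # Step 1: determine the first feature's value.
    action, _ = best(action_candidates)
    # Step 2: select the second feature's model based on that value.
    obj, _ = best(object_models[action])
    target = (action, obj)
    # Step 3: compare against the allowable targets.
    if target in allowable:
        return target
    # Step 4: try alternates, ranked by joint confidence (an assumption).
    ranked = sorted(
        (
            (a, o, a_conf * o_conf)
            for a, a_conf in action_candidates
            for o, o_conf in object_models[a]
        ),
        key=lambda t: t[2],
        reverse=True,
    )
    for a, o, _ in ranked:
        if (a, o) in allowable:
            return (a, o)
    return None  # no allowable combination exists among the candidates


target = resolve_target(ACTION_CANDIDATES, OBJECT_MODELS, ALLOWABLE)
print(target)
```

With these made-up scores, the top-scoring combination ("order", "refund") is not allowable, so the sketch falls back to the best allowable alternate. A different ranking rule, such as changing only one feature at a time, would be an equally plausible reading of the claims.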

Assignees

IBM

Inventors

Classifications

  • G06F40/216 · Primary

    using statistical methods · CPC title

  • Validation · CPC title

  • G06F40/295 · Primary

    Named entity recognition · CPC title

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • Semantic analysis · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9454525B2 cover?
A method of extracting information from text within a natural language understanding system can include processing a text input through at least one statistical model for each of a plurality of features to be extracted from the text input. For each feature, at least one value can be determined, at least in part, using the statistical model associated with the feature. One value for each feature…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/216. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 27, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).