Generating natural language dialog using a questions corpus

US10049152B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10049152-B2
Application number  US-201514864057-A
Country             US
Kind code           B2
Filing date         Sep 24, 2015
Priority date       Sep 24, 2015
Publication date    Aug 14, 2018
Grant date          Aug 14, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generating a natural language dialog by finding missing semantic information in a user question by comparing it to the closest question available in a question corpus. Incrementally improved question precision is targeted during each round of the natural language dialog by generating follow-up questions that clarify semantic and syntactic characteristics of the user question. The follow-up questions are derived from analysis of the user question to identify areas of improvement on the user question.
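The round-by-round refinement described in the abstract can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions, not the patented implementation: the function name `refine_question`, the idea of representing a question as a dictionary of filled "slots", the fixed question template, and the `answer_fn` callback standing in for the human user are all hypothetical.

```python
def refine_question(known, required, answer_fn, max_rounds=5):
    """Ask one follow-up per round until nothing required is missing.

    known     -- dict of information already present in the user question
    required  -- slots present in the closest corpus question (assumed given)
    answer_fn -- callable that takes a follow-up question and returns the
                 user's answer (hypothetical stand-in for the dialog UI)
    """
    known = dict(known)          # work on a copy, leave the caller's dict intact
    transcript = []
    for _ in range(max_rounds):
        missing = [slot for slot in required if slot not in known]
        if not missing:
            break                # the question is as precise as the reference
        slot = missing[0]
        question = f"Could you specify the {slot}?"   # illustrative template
        answer = answer_fn(question)
        transcript.append((question, answer))
        if answer:
            known[slot] = answer  # each round adds one piece of information
    return known, transcript


# Usage with canned answers in place of a live user.
canned = {
    "Could you specify the device?": "laser printer",
    "Could you specify the error code?": "E52",
}
refined, log = refine_question(
    {"action": "fix"}, ["action", "device", "error code"], canned.get
)
```

The loop stops either when the user question covers everything the reference question covers or when the round limit is reached, which keeps the dialog bounded if the user cannot answer.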

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a target question in a natural language dialog from a human user; computing a set of target syntax features based on the target question; identifying a first set of reference syntax features based on a first reference question; identifying a second set of reference syntax features based on a second reference question; selecting the first set of reference syntax features based, at least in part, on the set of target syntax features being more similar to the first set of reference syntax features when compared to the second set of reference syntax features; determining a set of missing syntax features based at least in part on a comparison of the set of target syntax features and the first set of reference syntax features; computing a first weight for a first missing syntax feature of the set of missing syntax features and a second weight for a second missing syntax feature of the set of missing syntax features; responsive to the first weight being higher than the second weight, determining a follow-up question based at least in part on a semantic analysis of the target question and the first missing syntax feature; and displaying, by a computer, the follow-up question to the human user.

2. The method of claim 1, wherein the set of target syntax features, the first set of reference syntax features, and the second set of reference syntax features are each in the form of a parse tree.

3. The method of claim 1, wherein selecting the first set of reference syntax features includes: determining a first count of reference syntax features for the first set of reference syntax features that match at least one target syntax feature of the set of target syntax features; and determining a second count of reference syntax features for the second set of reference syntax features that match at least one target syntax feature of the set of target syntax features; wherein: the set of target syntax features are more similar to the first set of reference syntax features than the second set of reference syntax features at least because the first count of reference syntax features is greater than the second count of reference syntax features.

4. The method of claim 1, wherein selecting the first set of reference syntax features includes: determining a first count of missing reference syntax features for the first set of reference syntax features that are not present in the set of target syntax features; and determining a second count of missing reference syntax features for the second set of reference syntax features that are not present in the set of target syntax features; wherein: the set of target syntax features are more similar to the first set of reference syntax features than the second set of reference syntax features at least because the first count of missing reference syntax features is smaller than the second count of missing reference syntax features.

5. The method of claim 1, wherein: the first weight is determined according to a pre-determined weight of a first word type corresponding to the first missing syntax feature; and the second weight is determined according to a pre-determined weight of a second word type corresponding to the second missing syntax feature.

6. The method of claim 5, wherein the word type is selected from the group consisting of a noun, a determiner, a pronoun, a verb, an adjective, an adverb, a preposition, and a conjunction.

7. The method of claim 1, wherein: the first reference question and the second reference question are identified in a historical user query store as well-formulated questions; and the historical user query store includes questions previously received from users including the first reference question and the second reference question.

8. A computer program product comprising a non-transitory computer-readable storage medium having a set of instructions stored therein which, when executed by a processor, causes the processor to perform the following steps: receiving a target question in a natural language dialog from a human user; computing a set of target syntax features based on the target question; identifying a first set of reference syntax features based on a first reference question; identifying a second set of reference syntax features based on a second reference question; selecting the first set of reference syntax features based, at least in part, on the set of target syntax features being more similar to the first set of reference syntax features when compared to the second set of reference syntax features; determining a set of missing syntax features based at least in part on a comparison of the set of target syntax features and the first set of reference syntax features; computing a first weight for a first missing syntax feature of the set of missing syntax features and a second weight for a second missing syntax feature of the set of missing syntax features; responsive to the first weight being higher than the second weight, determining a follow-up question based at least in part on a semantic analysis of the target question and the first missing syntax feature; and displaying the follow-up question to the human user.

9. The computer program product of claim 8, wherein the set of target syntax features, the first set of reference syntax features, and the second set of reference syntax features are each in the form of a parse tree.

10. The computer program product of claim 8, wherein selecting the first set of reference syntax features includes: determining a first count of reference syntax features for the first set of reference syntax features that match at least one target syntax feature of the set of target syntax features; and determining a second count of reference syntax features for the second set of reference syntax features that match at least one target syntax feature of the set of target syntax features; wherein: the set of target syntax features are more similar to the first set of reference syntax features than the second set of reference syntax features at least because the first count of reference syntax features is greater than the second count of reference syntax features.

11. The computer program product of claim 8, wherein selecting the first set of reference syntax features includes: determining a first count of missing reference syntax features for the first set of reference syntax features that are not present in the set of target syntax features; and determining a second count of missing reference syntax features for the second set of reference syntax features that are not present in the set of target syntax features; wherein: the set of target syntax features are more similar to the first set of reference syntax features than the second set of reference syntax features at least because the first count of missing reference syntax features is smaller than the second count of missing reference syntax features.

12. The computer program product of claim 8, wherein: the first weight is determined according to a pre-determined weight of a first word type corresponding to the first missing syntax feature; and the second weight is determined according to a pre-determined weight of a second word type corresponding to the second missing syntax feature.

13. The computer program product of claim 12, wherein the word type is selected from the group consisting of a noun, a determiner, a pronoun, a verb, an adjective, an adverb, a preposition, and a conjunction.

14. A computer system comprising: a processor set; and a computer readable storage med
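To make the steps of claim 1 easier to follow, here is a small illustrative sketch of the selection and weighting logic, with the matching-count comparison of claims 3 and 10 and the word-type weights of claims 5 and 12. Everything specific in it is an assumption made for illustration: syntax features are flattened to `(word_type, slot)` pairs rather than parse trees (claims 2 and 9), the numeric weights are invented, and the follow-up question comes from a fixed template rather than from the semantic analysis the claim requires.

```python
# Hypothetical pre-determined weights per word type; the values are invented.
WORD_TYPE_WEIGHTS = {
    "noun": 1.0, "verb": 0.9, "adjective": 0.6, "adverb": 0.5,
    "pronoun": 0.4, "preposition": 0.3, "determiner": 0.2, "conjunction": 0.1,
}


def match_count(target, reference):
    """Count reference features that also appear in the target (claim 3 style)."""
    return sum(1 for feature in reference if feature in target)


def missing_count(target, reference):
    """Count reference features absent from the target (claim 4 style)."""
    return sum(1 for feature in reference if feature not in target)


def select_reference(target, references):
    """Pick the reference feature set most similar to the target."""
    return max(references, key=lambda ref: match_count(target, ref))


def rank_missing(target, reference):
    """Order the missing features by the weight of their word type, highest first."""
    missing = [feature for feature in reference if feature not in target]
    return sorted(
        missing,
        key=lambda feature: WORD_TYPE_WEIGHTS.get(feature[0], 0.0),
        reverse=True,
    )


def follow_up(target, references):
    """Return a follow-up question for the highest-weighted missing feature."""
    ranked = rank_missing(target, select_reference(target, references))
    if not ranked:
        return None              # nothing is missing, no follow-up needed
    _word_type, slot = ranked[0]
    return f"Could you specify the {slot}?"   # template stands in for semantic analysis


# Illustrative data: a terse user question and two reference questions.
target = {("verb", "action"), ("noun", "device")}
ref1 = {("verb", "action"), ("noun", "device"),
        ("noun", "model"), ("determiner", "device")}
ref2 = {("verb", "action"), ("adverb", "time"), ("pronoun", "subject")}

question = follow_up(target, [ref1, ref2])
```

In this example `ref1` shares two features with the target and `ref2` only one, so `ref1` is selected; of its two missing features the noun outweighs the determiner, so the follow-up asks about the model. Swapping `match_count` for a minimum over `missing_count` would give the alternative comparison of claims 4 and 11.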

Assignees

Inventors

Classifications

  • Semantic analysis · CPC title

  • Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • Natural language query formulation · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10049152B2 cover?
Generating a natural language dialog by finding missing semantic information in a user question by comparing it to the closest question available in a question corpus. Incrementally improved question precision is targeted during each round of the natural language dialog by generating follow-up questions that clarify semantic and syntactic characteristics of the user question. The follow-up ques…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/3329. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 14, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).