Conversational query answering system

US11120059B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11120059-B2
Application number: US-201816020328-A
Country: US
Kind code: B2
Filing date: Jun 27, 2018
Priority date: Jun 27, 2018
Publication date: Sep 14, 2021
Grant date: Sep 14, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques of directing a user to content based on a semantic interpretation of a query input by the user involves generating links to specific content in a collection of documents in response to user string query, the links being generated based on an answer suggestion lookahead index. The answer suggestion lookahead index references a mapping between a plurality of groups of semantically equivalent terms and a respective link to specific content of the collection of documents. These techniques are useful for the generalized task of natural language question answering.
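The mapping described in the abstract can be pictured with a minimal sketch. This is an illustration only, not the patented implementation: the term groups, the example.com links, and the names `LOOKAHEAD_INDEX`, `normalize`, and `suggest_links` are all hypothetical, and the groups of semantically equivalent terms are written by hand here rather than derived from a document collection. Prefix matching stands in for the "lookahead" behavior.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical index: each entry pairs a group of semantically equivalent
# terms with one link to specific content. Terms and URLs are invented.
LOOKAHEAD_INDEX: List[Tuple[FrozenSet[str], str]] = [
    (frozenset({"crop image", "trim picture", "cut photo"}),
     "https://example.com/help/crop"),
    (frozenset({"resize image", "scale picture"}),
     "https://example.com/help/resize"),
]


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def suggest_links(query: str) -> List[str]:
    """Return links whose term group has a term starting with the typed query."""
    prefix = normalize(query)
    if not prefix:
        return []
    return [
        link
        for terms, link in LOOKAHEAD_INDEX
        if any(term.startswith(prefix) for term in terms)
    ]


if __name__ == "__main__":
    # "trim pic" and "crop im" both resolve to the same link because their
    # terms sit in the same equivalence group.
    print(suggest_links("Trim pic"))
    print(suggest_links("crop im"))
```

Because every term in a group shares one link, differently worded queries for the same content resolve to the same destination, which is the property the abstract attributes to the index.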

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving document data representing a collection of documents, each document of the collection of documents including a respective topic and content pertaining to the topic; for each document of the collection of documents, generating a respective document object model (DOM) of a set of DOMs, the respective DOM corresponding to each document, the DOM corresponding to each document including a topic, a set of subtopics, and links to the content of each document; aggregating the set of DOMs to generate a master link list for a topic knowledge graph, the master link list including a plurality of unique links to the content of the collection of documents; generating answer suggestion lookahead index data based on the collection of documents, the answer suggestion lookahead index data representing a plurality of term/link pairs, wherein the answer suggestion lookahead index data comprises the topic knowledge graph and the master link list; receiving a user query string; in response to receiving the user query string, locating a semantically equivalent term of a term/link pair of the plurality of term/link pairs, the semantically equivalent term being located based a semantic equivalence to the user query string; and outputting a representation of the link of the located term/link pair to an output device.

2. The method of claim 1, wherein generating the answer suggestion lookahead index data includes: generating the topic knowledge graph data based on the collection of documents, the topic knowledge graph data representing a topic knowledge graph that includes (i) the respective topic and the set of subtopics of each of the collection of documents, and (ii) the links to the content of the collection of documents, each of the links corresponding to one of the respective topic or subtopic of the set of subtopics of each of the collection of documents.

3. The method of claim 2, wherein generating the answer suggestion lookahead index data further includes: generating semantic annotation data representing respective semantic annotations to the topic and set of subtopics of each of the collection of documents, each of the semantic annotations including a respective predicate of a set of predicates and a respective object of a set of objects; identifying, for each predicate of the set of predicates, at least one object of the set of objects that, when combined with that predicate, correspond to one of a topic or a subtopic of the set of subtopics of each of the collection of documents; and identifying, for each object of the set of objects, at least one predicate of the set of predicates that, when combined with that object, correspond to one of a topic or a subtopic of the set of subtopics of each of the collection of documents.

4. The method of claim 2, wherein generating the topic knowledge graph data includes: formatting the set of DOMs to produce an aggregate flattened knowledge graph formatted for a natural language processing (NLP) pipeline.

5. The method of claim 4, wherein the aggregate flattened knowledge graph includes at least one of the topic and set of subtopics of each DOM of the set of DOMs, the NLP pipeline being configured to produce <S,P,O> triplets consisting of subject, predicate, and object for each of the topic and set of subtopics of each of the set of DOMs.

6. The method of claim 4, wherein generating the topic knowledge graph data further includes: splitting the aggregate flattened knowledge graph to produce a plurality of aggregate flattened knowledge graph parts; and performing natural language processing by the NLP pipeline on each of the plurality of aggregate flattened knowledge graph parts to produce the knowledge graph, the natural language processing being performed on each of the plurality of aggregate flattened knowledge graph parts in parallel.

7. The method of claim 4, wherein generating the DOM of the set of DOMs includes: identifying non-informative content of each document of the collection of documents; and removing the non-informative content of that document to produce the topic, the set of subtopics, and the links to the content of that document, the removing including applying a term frequency inverse document frequency (TF-IDF) algorithm to that document.

8. The method of claim 4, wherein generating the DOM of the set of DOMs includes: reformatting each document of the collection of documents to produce the document formatted in a Markdown Markup language (MDML).

9. A computer program product comprising a nontransitory storage medium, the computer program product including code that, when executed by processing circuitry of a computer configured to direct a user to content based on a semantic interpretation of a query input by the user, causes the processing circuitry to perform a method, the method comprising: receiving document data representing a collection of documents, each document of the collection of documents including a respective topic and content pertaining to the topic; obtaining the search query log data, the search query log data representing a mapping between user query data and links to content of the collection of documents, the user query data representing a plurality of user queries; performing a text mining operation on the search query log data to produce the set of common user queries for one of a topic or a respective subtopic of a set of subtopics of a document of the collection of documents, each of the topic and set of subtopics being associated with respective content corresponding to a respective link; generating answer suggestion lookahead index data based on the collection of documents, the answer suggestion lookahead index data representing a plurality of term/link pairs, each term/link pair of the plurality of term/link pairs including a link to content of the collection of documents, wherein the answer suggestion lookahead index data is generated based at least in part on a set of common user queries obtained using search query log data; receiving a user query string; in response to receiving the user query string, locating a semantically equivalent term of a term/link pair of the plurality of term/link pairs, the semantically equivalent term being located based a semantic equivalence to the user query string; and outputting a representation of the link of the located term/link pair to an output device.

10. The computer program product of claim 9, wherein generating the answer suggestion lookahead index data includes: generating topic knowledge graph data based on the collection of documents, the topic knowledge graph data representing a topic knowledge graph that includes (i) a respective topic and a set of subtopics of each of the collection of documents, and (ii) links to the content of the collection of documents, each of the links corresponding to one of a respective topic or subtopic of the set of subtopics of each of the collection of documents.

11. The computer program product of claim 10, wherein generating the answer suggestion lookahead index data further includes: forming pairs of (i) a respective user query of the set of common user queries and (ii) a respective link to content of the collection of documents, each pair based on annotated topics and sets of subtopics of the topic knowledge graph, the annotated topics and sets of subtopics including topic titles and <S,P,O> triplets consisting of subject, predicate, and object for each of the topics and sets of subtopics, and wherein producing the link to specific content in the collection of documents includes: identifying a pair of a user query and a link to the c
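The aggregation step in claim 1 (per-document models combined into a master list of unique links, then paired with terms) can be sketched as follows. This is a rough illustration under stated assumptions, not the claimed implementation: the `DocumentModel` class, its fields, both helper functions, and the sample headings and example.com links are hypothetical, and a plain dictionary from heading to link stands in for whatever structure the patent's DOM actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DocumentModel:
    """Hypothetical stand-in for one per-document DOM: topic, subtopics, links."""
    topic: str
    subtopics: List[str] = field(default_factory=list)
    links: Dict[str, str] = field(default_factory=dict)  # heading -> link


def build_master_link_list(doms: List[DocumentModel]) -> List[str]:
    """Aggregate all DOMs into a list of unique links, keeping first-seen order."""
    seen = set()
    master: List[str] = []
    for dom in doms:
        for link in dom.links.values():
            if link not in seen:
                seen.add(link)
                master.append(link)
    return master


def build_term_link_pairs(doms: List[DocumentModel]) -> List[Tuple[str, str]]:
    """Pair each topic or subtopic heading with its link, when one exists."""
    pairs: List[Tuple[str, str]] = []
    for dom in doms:
        for heading in [dom.topic, *dom.subtopics]:
            link = dom.links.get(heading)
            if link is not None:
                pairs.append((heading, link))
    return pairs


if __name__ == "__main__":
    doms = [
        DocumentModel(
            topic="Cropping",
            subtopics=["Crop to a ratio"],
            links={
                "Cropping": "https://example.com/crop",
                "Crop to a ratio": "https://example.com/crop#ratio",
            },
        ),
        DocumentModel(
            topic="Editing basics",
            subtopics=["Cropping"],
            links={"Cropping": "https://example.com/crop"},  # duplicate link
        ),
    ]
    print(build_master_link_list(doms))
    print(build_term_link_pairs(doms))
```

The duplicate link in the second sample document appears only once in the master list, mirroring the claim's requirement that the list hold unique links, while the term/link pairs still record every heading that points to it.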

Assignees

Adobe Inc

Inventors

Classifications

  • using natural language analysis · CPC title

  • G06F16/22 · Primary

    Indexing; Data structures therefor; Storage structures · CPC title

  • Selection or weighting of terms for indexing · CPC title

  • Presentation of query results · CPC title

  • using system suggestions (G06F16/3325 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11120059B2 cover?
Techniques of directing a user to content based on a semantic interpretation of a query input by the user involves generating links to specific content in a collection of documents in response to user string query, the links being generated based on an answer suggestion lookahead index. The answer suggestion lookahead index references a mapping between a plurality of groups of semantically equi…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 14, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).