Conversational query answering system

US2020004873A1 · US · A1

Patent metadata
Publication number: US-2020004873-A1
Application number: US-201816020328-A
Country: US
Kind code: A1
Filing date: Jun 27, 2018
Priority date: Jun 27, 2018
Publication date: Jan 2, 2020
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques of directing a user to content based on a semantic interpretation of a query input by the user involve generating links to specific content in a collection of documents in response to a user string query, the links being generated based on an answer suggestion lookahead index. The answer suggestion lookahead index references a mapping between a plurality of groups of semantically equivalent terms and a respective link to specific content of the collection of documents. These techniques are useful for the generalized task of natural language question answering.
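
To make the mapping in the abstract concrete, here is a minimal sketch of a lookahead index that maps groups of equivalent terms to a link and returns suggestions as the user types. It is an illustration under stated assumptions, not the method disclosed in the patent: the names `normalize`, `build_index`, `suggest`, and the sample links are hypothetical, the groups are hand-written rather than derived from documents, and plain prefix matching stands in for whatever semantic matching the publication actually uses.

```python
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(text.lower().split())


def build_index(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Flatten {link: [equivalent terms]} into term/link pairs keyed by term."""
    index: Dict[str, str] = {}
    for link, terms in groups.items():
        for term in terms:
            index[normalize(term)] = link
    return index


def suggest(index: Dict[str, str], query: str, limit: int = 5) -> List[Tuple[str, str]]:
    """Return up to `limit` (term, link) pairs whose term starts with the query."""
    prefix = normalize(query)
    if not prefix:
        return []
    matches = sorted((term, link) for term, link in index.items()
                     if term.startswith(prefix))
    return matches[:limit]


# Hypothetical sample data: each link has a group of terms treated as equivalent.
GROUPS = {
    "docs/crop.html#crop-an-image": [
        "crop an image", "trim a picture", "cut out part of a photo",
    ],
    "docs/layers.html#add-a-layer": [
        "add a layer", "create a new layer",
    ],
}
INDEX = build_index(GROUPS)

if __name__ == "__main__":
    for term, link in suggest(INDEX, "Trim a"):
        print(f"{term} -> {link}")
```

In this sketch, different phrasings in the same group resolve to the same link, which is the property the abstract attributes to the index.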

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving document data representing a collection of documents, each document of the collection of documents including a respective topic and content pertaining to the topic; generating answer suggestion lookahead index data based on the collection of documents, the answer suggestion lookahead index data representing a plurality of term/link pairs, each term/link pair of the plurality of term/link pairs including a semantically equivalent term and a corresponding link to content of the collection of documents; receiving a user query string; in response to receiving the user query string, locating a semantically equivalent term of a term/link pair of the plurality of term/link pairs, the semantically equivalent term being located based on a semantic equivalence to the user query string; and outputting a representation of the link of the located term/link pair to an output device.

2. The method of claim 1, wherein generating the answer suggestion lookahead index data includes: generating topic knowledge graph data based on the collection of documents, the topic knowledge graph data representing a topic knowledge graph that includes (i) a respective topic and a set of subtopics of each of the collection of documents, and (ii) links to the content of the collection of documents, each of the links corresponding to one of a respective topic or subtopic of the set of subtopics of each of the collection of documents.

3. The method of claim 2, wherein generating the answer suggestion lookahead index data further includes: generating semantic annotation data representing respective semantic annotations to the topic and set of subtopics of each of the collection of documents, each of the semantic annotations including a respective predicate of a set of predicates and a respective object of a set of objects; identifying, for each predicate of the set of predicates, at least one object of the set of objects that, when combined with that predicate, correspond to one of a topic or a subtopic of the set of subtopics of each of the collection of documents; and identifying, for each object of the set of objects, at least one predicate of the set of predicates that, when combined with that object, correspond to one of a topic or a subtopic of the set of subtopics of each of the collection of documents.

4. The method of claim 2, wherein generating the topic knowledge graph data includes: for each document of the collection of documents, generating a respective document object model (DOM) of a set of DOMs, the respective DOM corresponding to each document, the DOM corresponding to each document including the topic, the set of subtopics, and the links to the content of each document; and aggregating the set of DOMs to generate a master link list for the topic knowledge graph, the master link list including a plurality of unique links to the content of the collection of documents.

5. The method of claim 2, wherein generating the topic knowledge graph data includes: for each document of the collection of documents, generating a respective document object model (DOM) of a set of DOMs, the respective DOM corresponding to that document, the DOM corresponding to that document including the topic, the set of subtopics, and the links to the content of that document; and formatting the set of DOMs to produce an aggregate flattened knowledge graph formatted for a natural language processing (NLP) pipeline.

6. The method of claim 5, wherein the aggregate flattened knowledge graph includes at least one of the topic and set of subtopics of each DOM of the set of DOMs, the NLP pipeline being configured to produce <S,P,O> triplets consisting of subject, predicate, and object for each of the topic and set of subtopics of each of the set of DOMs.

7. The method of claim 5, wherein generating the topic knowledge graph data further includes: splitting the aggregate flattened knowledge graph to produce a plurality of aggregate flattened knowledge graph parts; and performing natural language processing by the NLP pipeline on each of the plurality of aggregate flattened knowledge graph parts to produce the knowledge graph, the natural language processing being performed on each of the plurality of aggregate flattened knowledge graph parts in parallel.

8. The method of claim 5, wherein generating the DOM of the set of DOMs includes: identifying non-informative content of each document of the collection of documents; and removing the non-informative content of that document to produce the topic, the set of subtopics, and the links to the content of that document, the removing including applying a term frequency inverse document frequency (TF-IDF) algorithm to that document.

9. The method of claim 5, wherein generating the DOM of the set of DOMs includes: reformatting each document of the collection of documents to produce the document formatted in a Markdown Markup language (MDML).

10. A computer program product comprising a nontransitory storage medium, the computer program product including code that, when executed by processing circuitry of a computer configured to direct a user to content based on a semantic interpretation of a query input by the user, causes the processing circuitry to perform a method, the method comprising: receiving document data representing a collection of documents, each document of the collection of documents including a respective topic and content pertaining to the topic; generating answer suggestion lookahead index data based on the collection of documents, the answer suggestion lookahead index data representing a plurality of term/link pairs, each term/link pair of the plurality of term/link pairs including a semantically equivalent term and a corresponding link to content of the collection of documents; receiving a user query string; in response to receiving the user query string, locating a semantically equivalent term of a term/link pair of the plurality of term/link pairs, the semantically equivalent term being located based on a semantic equivalence to the user query string; and outputting a representation of the link of the located term/link pair to an output device.

11. The computer program product of claim 10, wherein generating the answer suggestion lookahead index data includes: generating topic knowledge graph data based on the collection of documents, the topic knowledge graph data representing a topic knowledge graph that includes (i) a respective topic and a set of subtopics of each of the collection of documents, and (ii) links to the content of the collection of documents, each of the links corresponding to one of a respective topic or subtopic of the set of subtopics of each of the collection of documents.

12. The computer program product of claim 11, wherein generating the answer suggestion lookahead index data includes: obtaining search query log data, the search query log data representing a mapping between user query data and links to content of the collection of documents, the user query data representing a plurality of user queries; and performing a text mining operation on the search query log data to produce a set of common user queries for one of a topic or a respective subtopic of a set of subtopics of a document of the collection of documents, each of the topic and set of subtopics being associated with respective content corresponding to a respective link.

13. The computer program product of claim 12, wherein generating the answer suggestion lookahead index data further includes: forming pairs of (i

Assignees

Adobe Inc

Inventors

Classifications

  • Natural language query formulation · CPC title

  • Selection or weighting of terms for indexing · CPC title

  • using system suggestions (G06F16/3325 takes precedence) · CPC title

  • G06F16/22 · Primary

    Indexing; Data structures therefor; Storage structures · CPC title

  • Presentation of query results · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020004873A1 cover?
Techniques of directing a user to content based on a semantic interpretation of a query input by the user involve generating links to specific content in a collection of documents in response to a user string query, the links being generated based on an answer suggestion lookahead index. The answer suggestion lookahead index references a mapping between a plurality of groups of semantically equi…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/3329. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 2, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).