Information extraction from question and answer websites

US10452694B2 · US · B2

Patent metadata
Publication number: US-10452694-B2
Application number: US-201715849212-A
Country: US
Kind code: B2
Filing date: Dec 20, 2017
Priority date: Mar 25, 2015
Publication date: Oct 22, 2019
Grant date: Oct 22, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus for obtaining a resource, identifying a first portion of text of the resource that is characterized as a question, and a second part of text of the resource that is characterized as an answer to the question, identifying an entity that is referenced by one or more terms of the text that is characterized as the question, a relationship type that is referenced by one or more other terms of the text that is characterized as the question, and an entity that is referenced by the text that is characterized as the answer to the question, and adjusting a score for a relationship of the relationship type for the entity that is referenced by the one or more terms of the text that is characterized as the question and the entity that is referenced by the text that is characterized as the answer to the question.
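The flow in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the alias table, the cue phrases, the function names, and the fixed score increment are all hypothetical stand-ins for the entity linking and relationship detection that the abstract describes only in general terms.

```python
from collections import defaultdict

# Hypothetical lookup tables (assumptions for illustration only).
# A real system would rely on an entity linker and a trained model.
ENTITY_ALIASES = {
    "barack obama": "Barack Obama",
    "michelle obama": "Michelle Obama",
}
RELATION_CUES = {
    "married to": "spouse",
    "born in": "place_of_birth",
}


def find_entity(text):
    """Return the first known entity mentioned in the text, or None."""
    lowered = text.lower()
    for alias, entity in ENTITY_ALIASES.items():
        if alias in lowered:
            return entity
    return None


def find_relation(question):
    """Return the relationship type suggested by cue words, or None."""
    lowered = question.lower()
    for cue, relation in RELATION_CUES.items():
        if cue in lowered:
            return relation
    return None


def adjust_score(scores, question, answer, increment=1.0):
    """Raise the score of (question entity, relation, answer entity)."""
    subject = find_entity(question)
    relation = find_relation(question)
    obj = find_entity(answer)
    if subject and relation and obj:
        scores[(subject, relation, obj)] += increment
    return scores


scores = defaultdict(float)
adjust_score(scores, "Who is Barack Obama married to?", "Michelle Obama")
print(dict(scores))
```

Each additional question and answer pair that supports the same triple would raise its score again, so repeated evidence accumulates across resources.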

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining a resource; identifying (i) a first portion of text of the resource that is characterized as a question, and (ii) a second portion of text of the resource that is characterized as an answer to the question; identifying, (i) an entity that is referenced by the first portion of text that is characterized as the question, and (ii) an entity that is referenced by the second portion of text that is characterized as the answer to the question; determining, by a machine learned classifier, one or more candidate relationship types that are referenced by the first portion of text that is characterized as the question and the second portion of text that is characterized as the answer to the question, wherein each of the one or more candidate relationship types is associated with a respective probability, determined by the machine learned classifier, of the candidate relationship type being a proper relationship type between the entity that is referenced by the first portion of text that is characterized as the question and the entity that is referenced by the second portion of text that is characterized as the answer to the question; selecting a particular relationship type from among the one or more candidate relationship types based at least on the one or more probabilities; and adjusting a score associated with a relationship of the particular relationship type for the entity that is referenced by the first portion of text that is characterized as the question and the entity that is referenced by the second portion of text that is characterized as the answer to the question.

2. The computer-implemented method of claim 1, wherein the resource is a question and answer (Q&A) website resource.

3. The computer-implemented method of claim 1, wherein determining the one or more candidate relationship types and the one or more probabilities comprises: comparing the first portion of the text that is characterized as the question and one or more templates that are each associated with a respective relationship type; and determining the one or more candidate relationship types and the one or more probabilities based at least on the comparison of the first portion of the text that is characterized as the question and the one or more templates that are each associated with a respective relationship type indicating a match with one or more particular templates.

4. The computer-implemented method of claim 3, wherein each of the one or more templates is one of a surface-based template or a parser-based template.

5. The computer-implemented method of claim 1, wherein determining the one or more candidate relationship types and the one or more probabilities comprises: determining an entity class corresponding to the entity that is referenced by the first portion of the text that is characterized as the question and an entity class corresponding to the entity that is referenced by the second portion of the text that is characterized as the answer to the question; and determining the one or more candidate relationship types and the one or more probabilities based at least on the entity class corresponding to the entity that is referenced by the first portion of the text that is characterized as the question and the entity class corresponding to the entity that is referenced by the second portion of the text that is characterized as the answer to the question.

6. The computer-implemented method of claim 1, wherein determining the one or more candidate relationship types and the one or more probabilities comprises: determining a parse path from a head token identified from the first portion of the text that is characterized as the question to the entity that is referenced by the second portion of the text that is characterized as the answer to the question, wherein the parse path indicates a syntactic dependency between the head token and the entity that is referenced by the second portion of the text that is characterized as the answer to the question; and determining the one or more candidate relationship types and the one or more probabilities based at least on the parse path.

7. The computer-implemented method of claim 1, wherein determining the one or more candidate relationship types and the one or more probabilities comprises: determining one or more first terms that are adjacent to one or more terms of the first portion of text that is characterized as the question that reference the entity that is referenced by the first portion of text that is characterized as the question; determining one or more second terms that are adjacent to one or more terms of the second portion of text that is characterized as the answer to the question that reference the entity that is referenced by the second portion of text that is characterized as the answer to the question; and determining the one or more candidate relationship types and the one or more probabilities based at least on the one or more first terms and the one or more second terms.

8. The computer-implemented method of claim 1, comprising: aggregating the score associated with the relationship of the particular relationship type for the entity that is referenced by the first portion of text that is characterized as the question and the entity that is referenced by the second portion of text that is characterized as the answer to the question and one or more other scores that are each associated with a relationship of the particular relationship type for the entity that is referenced by the first portion of text that is characterized as the question and another entity; comparing the score associated with the relationship of the particular relationship type for the entity that is referenced by the first portion of text that is characterized as the question and the entity that is referenced by the second portion of text that is characterized as the answer to the question and the one or more other scores that are each associated with a relationship of the particular relationship type for the entity that is referenced by the first portion of text that is characterized as the question and another entity; and establishing, at an entity relationship model and based at least on the comparison, a relationship of the particular relationship type between the entity that is referenced by the first portion of text that is characterized as the question and the entity that is referenced by the second portion of text that is characterized as the answer to the question.

9. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining a resource; identifying (i) a first portion of text of the resource that is characterized as a question, and (ii) a second portion of text of the resource that is characterized as an answer to the question; identifying, (i) an entity that is referenced by the first portion of text that is characterized as the question, and (ii) an entity that is referenced by the second portion of text that is characterized as the answer to the question; determining, by a machine learned classifier, one or more candidate relationship types that are referenced by the first portion of text that is characterized as the question and the second portion of text that is characterized as the answer to the question, wherein each of the one or more candidate relationship types is associated with a respective probability, determined by the machine learned classifier, of the candidate relationship type being a proper relationship type between the entity that i
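Two steps in the claims lend themselves to a short illustration: selecting a relationship type from the classifier's probabilities (claim 1) and comparing aggregated scores before establishing a relationship (claim 8). The sketch below is one possible reading, not the claimed method. The function names, the highest-probability selection rule, and the `min_share` threshold are assumptions; the claims say only that selection is "based at least on" the probabilities and that establishment is "based at least on the comparison".

```python
def select_relationship_type(candidates):
    """Pick the candidate relationship type with the highest probability.

    `candidates` maps relationship type -> classifier probability.
    Taking the maximum is an assumed selection rule.
    """
    return max(candidates, key=candidates.get)


def establish_relationship(scores, subject, relation, min_share=0.5):
    """Return the answer entity to link to `subject`, or None.

    `scores` maps (subject, relation, object) triples to accumulated
    scores. The scores of all objects competing for the same subject
    and relation are aggregated, and the best object is accepted only
    if it holds at least `min_share` of the total (assumed threshold).
    """
    competing = {
        obj: score
        for (subj, rel, obj), score in scores.items()
        if subj == subject and rel == relation
    }
    total = sum(competing.values())
    if total <= 0:
        return None
    best = max(competing, key=competing.get)
    return best if competing[best] / total >= min_share else None


# Hypothetical classifier output and accumulated scores.
relation = select_relationship_type({"spouse": 0.7, "sibling": 0.2})
scores = {
    ("Entity A", "spouse", "Entity B"): 3.0,
    ("Entity A", "spouse", "Entity C"): 1.0,
}
print(relation, establish_relationship(scores, "Entity A", relation))
```

Comparing a score against its competitors, rather than against a fixed absolute value, keeps a frequently asked but contested question from establishing a relationship on volume alone.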

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10452694B2 cover?
Methods, systems, and apparatus for obtaining a resource, identifying a first portion of text of the resource that is characterized as a question, and a second part of text of the resource that is characterized as an answer to the question, identifying an entity that is referenced by one or more terms of the text that is characterized as the question, a relationship type that is referenced by o…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/3322. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 22, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).