System and method for automatically detecting and interactively displaying information about entities, activities, and events from multiple-modality natural language sources

US10698964B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10698964-B2
Application number: US-201715419615-A
Country: US
Kind code: B2
Filing date: Jan 30, 2017
Priority date: Jun 11, 2012
Publication date: Jun 30, 2020
Grant date: Jun 30, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
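The grouping and ordering steps in the abstract can be illustrated with a small sketch. It is not the patented implementation: the `Item` type, the `equivalence_key` rule (same relation label and same entity set), the shortest-span representative, and the order by class size are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One extracted mention: a text span, its entities, and a relation label."""
    span: str
    entities: tuple
    relation: str


def equivalence_key(item):
    # Assumption: two items are equivalent when they share the same relation
    # label and the same set of entities, ignoring case and entity order.
    return (item.relation.lower(), frozenset(e.lower() for e in item.entities))


def group_items(items):
    # Group identified information into equivalence classes.
    classes = defaultdict(list)
    for item in items:
        classes[equivalence_key(item)].append(item)
    return list(classes.values())


def representative(cls):
    # Assumption: the shortest span is used as the displayable representation.
    return min(cls, key=lambda item: len(item.span))


def combined_view(items):
    classes = group_items(items)
    # Assumption: classes supported by more mentions are displayed first.
    ordered = sorted(classes, key=len, reverse=True)
    # The combined representation respects the computed order.
    return [representative(cls).span for cls in ordered]


items = [
    Item("Jane Doe was named CEO of Acme Corp on Monday.",
         ("Jane Doe", "Acme Corp"), "employed_by"),
    Item("Acme Corp CEO Jane Doe", ("Acme Corp", "Jane Doe"), "employed_by"),
    Item("Acme Corp acquired Widget Inc.", ("Acme Corp", "Widget Inc"), "acquired"),
]
print(combined_view(items))
```

The first two items fall into one class because they name the same entities and relation, so only one of them is displayed. A real system would need a far richer notion of equivalence than exact label matching.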

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing information by a processing device, the method comprising: receiving a user query; inferring a user query intention from the user query to develop an inferred user intention; detecting a query type of the inferred user intention selected from a set of query types comprising a person type, an organization type, and an event type; identifying a plurality of information categories based on the query type; searching for information within each of the plurality of information categories based on identifying the plurality of information categories, wherein the searching is further based on natural language processing of a corpus of documents having multiple modalities comprising at least one of text, audio and video; identifying a plurality of items in the searched information for each of the plurality of information categories, wherein each of the plurality of items comprises a span of text from the corpus of documents including two or more entities and a relation mention explicitly describing a relation between the two or more entities; organizing the items into a plurality of equivalence classes, wherein each item having a same equivalence class comprises an equivalent relation mention; selecting a representative item from the plurality of items for each of the equivalence classes; and automatically generating a page in response to the user query by adaptively building a template with a plurality of page elements that correspond to the plurality of information categories based on the inferred user intention, wherein each of the plurality of page elements displays the searched information for a single information category from the plurality of information categories based on the selected representative items.

2. The method of claim 1, further comprising: when the user query selects a person who has a political status, detecting the political status, and identifying the information categories comprising at least one of an election campaign, public appearances, statements, and public service history based on detecting the political status.

3. The method of claim 1, further comprising: when the user query selects a company identifying the information categories comprising at least one of recent news about the company, information on the company's top officials, and press releases for the company based on the user query selecting the company.

4. The method of claim 1, further comprising: when the user query selects an event identifying the information categories comprising at least one of news items about the event and reactions to the event based on the user query selecting the event.

5. The method of claim 4, wherein entities in the event are identified and relevant information about the entities is searched.

6. A non-transitory computer program storage device embodying instructions executable by a processor, the non-transitory computer program storage device comprising storage memory configured to store: program code that receives a user query; program code that infers a user query intention from the user query to develop an inferred user intention; program code that detects a query type of the inferred user intention selected from a set of query types comprising a person type, an organization type, and an event type; program code that identifies a plurality of information categories based on the query type; program code that searches for information within each of the plurality of information categories based on identifying the plurality of information categories, wherein the searching is further based on natural language processing of a corpus of documents having multiple modalities comprising at least one of text, audio and video; program code that identifies a plurality of items in the searched information for each of the plurality of information categories, wherein each of the plurality of items comprises a span of text from the corpus of documents including two or more entities and a relation mention explicitly describing a relation between the two or more entities; program code that organizes the items into a plurality of equivalence classes, wherein each item having a same equivalence class comprises an equivalent relation mention; program code that selects a representative item from the plurality of items for each of the equivalence classes; and program code that automatically generates a page in response to the user query by adaptively building a template with a plurality of page elements that correspond to the plurality of information categories based on the inferred user intention, wherein each of the plurality of page elements displays the searched information for a single information category from the plurality of information categories based on the selected representative items.

7. The non-transitory computer program storage device of claim 6, further comprising: program code that, when the user query selects a person who has a political status, detects the political status, identifies the information categories comprising at least one of an election campaign, public appearances, statements, and public service history based on detecting the political status.

8. The non-transitory computer program storage device of claim 6, further comprising program code that, when the user query selects a company, identifies the information categories comprising at least one of recent news about the company, information on the company's top officials, and press releases for the company based on the user query selecting the company.

9. The non-transitory computer program storage device of claim 6, further comprising program code that, when the user query selects an event, identifies the information categories comprising at least one of news items about the event and reactions to the event based on the user query selecting the event.

10. The non-transitory computer program storage device of claim 9, wherein entities in the event are identified and relevant information about the entities is searched.

11. A method for processing information by a processing device, the method comprising: receiving a user query; inferring a user query intention from the user query to develop an inferred user intention; detecting a query type based on the inferred user intention; searching for a plurality of information categories in a corpus of documents based on the query type; identifying a plurality of items for each of the plurality of information categories based on the search, wherein each of the plurality of items comprises a span of text from the corpus of documents including two or more entities and a relation mention explicitly describing a relation between the two or more entities; organizing the items into a plurality of equivalence classes, wherein each item having a same equivalence class comprises an equivalent relation mention; selecting a representative item from the plurality of items for each of the plurality of equivalence classes; and generating a page based on the query type, the page comprising a plurality of tabs, wherein each tab contains the representative items corresponding to the plurality of equivalence classes of an information category from the plurality of information categories.

12. The method of claim 11, further comprising: scoring each of the equivalence classes; and prioritizing the equivalence classes based on the scoring.
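The page-generation step of claims 11 and 12 (one tab per information category, with equivalence classes scored and prioritized inside each tab) can be sketched as follows. This is a hedged illustration only: the `CATEGORIES` table, the `build_page` function, the use of class size as the score, and the choice of the first span as the representative are assumptions. The category names are borrowed from claims 2 to 4, and the query type is taken as given rather than inferred.

```python
# Hypothetical lookup from query type to information categories.
# Claim 2 lists the "person" categories for a person with a political status.
CATEGORIES = {
    "person": ["election campaign", "public appearances",
               "statements", "public service history"],
    "organization": ["recent news", "top officials", "press releases"],
    "event": ["news items", "reactions"],
}


def build_page(query_type, classes_by_category, score):
    """Return a list of (tab title, representative spans), one tab per category.

    classes_by_category maps a category name to a list of equivalence
    classes; each class is a list of text spans judged equivalent.
    """
    page = []
    for category in CATEGORIES[query_type]:
        classes = classes_by_category.get(category, [])
        # Score each class and prioritize higher-scoring classes.
        ranked = sorted(classes, key=score, reverse=True)
        # Assumption: the first span of each class is its representative.
        page.append((category, [cls[0] for cls in ranked]))
    return page


classes_by_category = {
    "recent news": [
        ["Acme opens Berlin office"],
        ["Acme posts record profit", "Record profit at Acme"],
    ],
    "press releases": [["Acme announces new CFO"]],
}

# Assumption: a class is scored by how many mentions support it.
page = build_page("organization", classes_by_category, score=len)
for title, spans in page:
    print(title, spans)
```

Categories with no search results still produce an empty tab here; a real system might instead drop them when it adapts the template.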

Assignees

Inventors

Classifications

  • G06F40/295 · Primary

    Named entity recognition · CPC title

  • Clustering; Classification · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • using natural language analysis · CPC title

  • Creation of semantic tools, e.g. ontology or thesauri · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10698964B2 cover?
A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Ide…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/295. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 30, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).