Automated assistance for generating relevant and valuable search results for an entity of interest

US11210350B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11210350-B2
Application number: US-201916261250-A
Country: US
Kind code: B2
Filing date: Jan 29, 2019
Priority date: May 2, 2017
Publication date: Dec 28, 2021
Grant date: Dec 28, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for identifying relevant information for an entity, referred to as a seed entity. A plurality of search queries can be generated each comprising a property of a seed entity or one of the entities associated with the seed entity (seed-linked entities). Preferably, a collection of search queries includes ones representing different properties of the seed entity and properties of different seed-linked entities. Optionally, the collection of search queries is optimized to reduce search burden. Searches can then be conducted with the search queries in one or more data sources to obtain a plurality of search results, wherein each search result comprises a hit entity and one or more entities associated with the hit entity (hit-linked entity). For each of the search results, a score can be determined taking as input (a) likelihood of match between the seed entity and the hit entity or between a seed-linked entity and a hit-linked entity, (b) presence of a new entity in the search result not present in the search queries or a difference between the new entity and an entity present in the search queries, and (c) characteristic of the new entity in the search result. Based on the scores, high priority search results can be presented to a user for further analysis.
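
The scoring step in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the abstract names the three inputs (a), (b), and (c) but does not say how they are combined, so the product form, the `CHARACTERISTIC_VALUE` weights, the `entity_kind` classifier, and all names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    """One search result: a hit entity plus the entities linked to it."""
    hit_entity: str
    hit_linked: list = field(default_factory=list)
    match_likelihood: float = 0.0  # input (a), assumed precomputed in [0, 1]


# Hypothetical weights for input (c); these values are not from the patent.
CHARACTERISTIC_VALUE = {"email": 0.9, "phone": 0.7, "other": 0.5}


def entity_kind(entity: str) -> str:
    """Crude, illustrative classifier for an entity string."""
    if "@" in entity:
        return "email"
    if entity.replace("-", "").isdigit():
        return "phone"
    return "other"


def score(result: SearchResult, queried: set) -> float:
    """Combine inputs (a), (b), and (c) into one number (assumed product form)."""
    # Input (b): entities in the result that were not part of any search query.
    new_entities = [
        e for e in [result.hit_entity, *result.hit_linked] if e not in queried
    ]
    if not new_entities:
        return 0.0  # nothing new to surface
    # Input (c): value of the most interesting new entity.
    best_value = max(CHARACTERISTIC_VALUE[entity_kind(e)] for e in new_entities)
    return result.match_likelihood * best_value


def prioritize(results, queried, top_k=2):
    """Return the top_k results by descending score."""
    return sorted(results, key=lambda r: score(r, queried), reverse=True)[:top_k]


if __name__ == "__main__":
    queried = {"Jane Doe", "12 Main St"}
    results = [
        SearchResult("Jane Doe", ["12 Main St"], 0.9),
        SearchResult("Jane Doe", ["jane@example.com"], 0.8),
    ]
    for r in prioritize(results, queried):
        print(r.hit_entity, r.hit_linked, score(r, queried))
```

In this sketch a result that only repeats what was already queried scores zero even when its match likelihood is high, which reflects the abstract's emphasis on new entities.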

First claim

Opening claim text (preview).

The invention claimed is:

1. A system for identifying relevant information for an entity comprising: one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the system to: conduct a pre-search related to a seed entity provided by a user in one or more data sources to obtain one or more entities associated with the seed entity; and generate a plurality of search queries comprising the seed entity and the one or more entities associated with the seed entity based on the pre-search, the generation comprising: determining a second entity validated to be linked to the seed entity, the second entity and the seed entity forming a seed cluster; identifying properties associated with the second entity and the seed entity; generating a search query that is associated with a subset of the identified properties; determining that the seed entity is associated with a third entity; in response to the determination that the seed entity is associated with the third entity: determining a likelihood of a match randomly occurring between: the seed entity or an entity in the seed cluster; and the third entity; and creating a second search query based on the determined likelihood.

2. The system of claim 1, wherein at least one of the search queries comprises a fourth entity associated with one of the one or more entities associated with the seed entity, wherein the fourth entity is not known to be associated with the seed entity.

3. The system of claim 2, wherein the fourth entity is identified from the pre-search based on a search query that comprises the seed entity.

4. The system of claim 1, wherein the seed entity or the one or more entities associated with the seed entity are represented by their respective properties, wherein the properties are selected from a group comprising name, address, date of birth, social security number, city of birth, image, social networking account, phone number and email address.

5. The system of claim 1, wherein the instructions further cause the system to: eliminate search queries from the search queries that are less likely to return desired search results.

6. The system of claim 1, wherein the instructions, when executed, further cause the system to: conduct searches, based on the search queries, in the one or more data sources to obtain a plurality of search results, wherein each of the search results comprises a hit entity and one or more entities associated with the hit entity.

7. The system of claim 6, wherein the instructions, when executed, further cause the system to: determine a score for each of the search results based on (a) a likelihood of a match between the seed entity and the hit entity or between an entity associated with the seed entity and an entity associated with the hit entity, (b) a presence of a new entity in the search result not present in the search queries or a difference between the new entity and an entity present in the search queries, and (c) a characteristic of the new entity in the search result.

8. The system of claim 7, wherein the seed entity is associated with a person, and the likelihood of the match is determined based on a frequency of use of a name of the person.

9. The system of claim 7, wherein the characteristic of the new entity is compared to a predefined list of characteristics of entities to determine a value of the characteristic.

10. A computer-implemented method comprising: conducting a pre-search related to a seed entity provided by a user in one or more data sources to obtain one or more entities associated with the seed entity; and generating a plurality of search queries comprising the seed entity and the one or more entities associated with the seed entity based on the pre-search, the generation comprising: determining a second entity validated to be linked to the seed entity, the second entity and the seed entity forming a seed cluster; identifying properties associated with the second entity and the seed entity; generating a search query that is associated with a subset of the identified properties; determining that the seed entity is associated with a third entity; in response to the determination that the seed entity is associated with the third entity: determining a likelihood of a match randomly occurring between: the seed entity or an entity in the seed cluster; and the third entity; and creating a second search query based on the determined likelihood.

11. The method of claim 10, wherein at least one of the search queries comprises a fourth entity associated with one of the one or more entities associated with the seed entity, wherein the fourth entity is not known to be associated with the seed entity.

12. The method of claim 10, wherein the seed entity or the one or more entities associated with the seed entity are represented by their respective properties, wherein the properties are selected from a group comprising name, address, date of birth, social security number, city of birth, image, social networking account, phone number and email address.

13. The method of claim 10, further comprising: eliminate search queries from the search queries that are less likely to return desired search results.

14. The method of claim 10, further comprising: conducting searches, based on the search queries, in the one or more data sources to obtain a plurality of search results, wherein each of the search results comprises a hit entity and one or more entities associated with the hit entity.

15. The method of claim 14, further comprising: determining a score for each of the search results based on (a) a likelihood of a match between the seed entity and the hit entity or between an entity associated with the seed entity and an entity associated with the hit entity, (b) a presence of a new entity in the search result not present in the search queries or a difference between the new entity and an entity present in the search queries, and (c) a characteristic of the new entity in the search result.

16. The method of claim 15, wherein the seed entity is associated with a person, and the likelihood of the match is determined based on a frequency of use of a name of the person.

17. The method of claim 15, wherein the characteristic of the new entity is compared to a predefined list of characteristics of entities to determine a value of the characteristic.

18. A non-transitory computer readable medium of a computing system comprising instructions that, when executed, cause one or more processors of the computing system to perform: conducting a pre-search related to a seed entity provided by a user in one or more data sources to obtain one or more entities associated with the seed entity; and generating a plurality of search queries comprising the seed entity and the one or more entities associated with the seed entity based on the pre-search, the generation comprising: determining a second entity validated to be linked to the seed entity, the second entity and the seed entity forming a seed cluster; identifying properties associated with the second entity and the seed entity; generating a search query that is associated with a subset of the identified properties; determining that the seed entity is associated with a third entity; in response to the determination that the seed entity is associated with the third entity: determining a likelihood of a match randomly occurring between: the seed entity or an entity in the seed cluster; and the third entity; and cr
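
The query-generation steps recited in the independent claims (collect properties of a seed cluster, form queries from subsets of those properties, and take into account the likelihood of a match occurring at random) might be sketched as follows. This is an illustrative reading, not the claimed method: the frequency table, the independence assumption used to multiply frequencies, the `max_random_match` threshold, and every function name are hypothetical.

```python
from itertools import combinations

# Hypothetical relative frequencies of property values in some population.
# Claims 8 and 16 mention name frequency; these numbers are invented.
VALUE_FREQUENCY = {
    "John Smith": 0.01,
    "Zebulon Quirk": 0.000001,
    "Springfield": 0.02,
}
DEFAULT_FREQUENCY = 0.001  # assumed fallback for values with no known frequency


def random_match_likelihood(values):
    """Chance that an unrelated record shares all values, assuming independence."""
    likelihood = 1.0
    for value in values:
        likelihood *= VALUE_FREQUENCY.get(value, DEFAULT_FREQUENCY)
    return likelihood


def generate_queries(seed_cluster, max_random_match=1e-4, max_size=2):
    """Build queries from subsets of the seed cluster's property values.

    seed_cluster maps an entity label to a dict of {property name: value}.
    A subset becomes a query only if a coincidental match on it is unlikely
    enough, which also prunes queries that would mostly return noise.
    """
    values = sorted({v for props in seed_cluster.values() for v in props.values()})
    queries = []
    for size in range(1, max_size + 1):
        for subset in combinations(values, size):
            if random_match_likelihood(subset) <= max_random_match:
                queries.append(subset)
    return queries


if __name__ == "__main__":
    cluster = {
        "seed": {"name": "John Smith", "city": "Springfield"},
        "linked": {"name": "Zebulon Quirk"},
    }
    for query in generate_queries(cluster):
        print(query)
```

Under these assumed frequencies, a common name on its own is dropped because coincidental matches are too likely, while any query that includes the rare name is kept.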

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • G06F16/38 · Primary

    Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title

  • G06F16/951 · Primary

    Indexing; Web crawling techniques · CPC title

  • Presentation of query results · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11210350B2 cover?
Systems and methods are provided for identifying relevant information for an entity, referred to as a seed entity. A plurality of search queries can be generated each comprising a property of a seed entity or one of the entities associated with the seed entity (seed-linked entities). Preferably, a collection of search queries includes ones representing different properties of the seed entity an…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/38. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 28, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).