Generating data clusters

US10216801B2 · US · B2

Patent metadata
Publication number: US-10216801-B2
Application number: US-201514819272-A
Country: US
Kind code: B2
Filing date: Aug 5, 2015
Priority date: Mar 15, 2013
Publication date: Feb 26, 2019
Grant date: Feb 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for prioritizing a plurality of clusters. Prioritizing clusters may generally include identifying a scoring strategy for prioritizing the plurality of clusters. Each cluster is generated from a seed and stores a collection of data retrieved using the seed. For each cluster, elements of the collection of data stored by the cluster are evaluated according to the scoring strategy and a score is assigned to the cluster based on the evaluation. The clusters may be ranked according to the respective scores assigned to the plurality of clusters. The collection of data stored by each cluster may include financial data evaluated by the scoring strategy for a risk of fraud. The score assigned to each cluster may correspond to an amount at risk.
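The scoring and ranking flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Cluster` class, the `amount_at_risk` strategy, and the `amount` and `flagged` fields are assumptions chosen only to show one way a pluggable scoring strategy might be applied to each cluster before ranking.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Cluster:
    """Hypothetical cluster: a seed plus the data retrieved using that seed."""
    seed: str
    elements: list = field(default_factory=list)
    score: float = 0.0


def amount_at_risk(elements: list) -> float:
    """Assumed scoring strategy: sum the amounts of elements flagged as risky."""
    return sum(e["amount"] for e in elements if e.get("flagged"))


def rank_clusters(clusters: list, strategy: Callable[[list], float]) -> list:
    """Score every cluster with the given strategy, then sort highest score first."""
    for cluster in clusters:
        cluster.score = strategy(cluster.elements)
    return sorted(clusters, key=lambda c: c.score, reverse=True)


# Illustrative data only; the values are made up for this sketch.
clusters = [
    Cluster("acct-1", [{"amount": 100.0, "flagged": True},
                       {"amount": 50.0, "flagged": False}]),
    Cluster("acct-2", [{"amount": 400.0, "flagged": True}]),
    Cluster("acct-3", [{"amount": 900.0, "flagged": False}]),
]
ranked = rank_clusters(clusters, amount_at_risk)
ranked_seeds = [c.seed for c in ranked]
```

Passing the strategy as a function keeps the ranking step independent of how risk is measured, which loosely mirrors the abstract's separation between identifying a scoring strategy and applying it to each cluster.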

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: by one or more hardware computer processors configured with specific computer executable instructions: accessing one or more electronic data stores, the one or more electronic data stores storing a plurality of data entities and respective data entity attributes; applying a clustering strategy to generate a data entity cluster by at least: designating a seed data entity, from the plurality of data entities, as the data entity cluster; accessing, based on the clustering strategy, one or more search protocols; performing first growth of the data entity cluster by executing at least a first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity; adding the one or more data entities to the data entity cluster; performing second growth of the data entity cluster by executing at least a second of the one or more search protocols on the one or more electronic data stores to identify one or more additional data entities related to the one or more added data entities, the second search protocol different than the first search protocol; and adding the one or more additional data entities to the data entity cluster; and storing the data entity cluster in at least one of the one or more electronic data stores.

2. The computer-implemented method of claim 1, wherein executing at least the first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity further comprises: by the one or more hardware computer processors configured with specific computer executable instructions: identifying at least one data entity attribute associated with the seed data entity; and evaluating the plurality of data entities to determine the one or more data entities sharing the at least one data entity attribute with the seed data entity.

3. The computer-implemented method of claim 2, wherein executing at least the first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity further comprises: by the one or more hardware computer processors configured with specific computer executable instructions: applying a filter to the at least one data entity attribute associated with the seed data entity, the filter selected based on the clustering strategy.

4. The computer-implemented method of claim 1 further comprising: by the one or more hardware computer processors configured with specific computer executable instructions: comparing data entities associated with the data entity cluster to data entities associated with a second data entity cluster; and in response to determining that at least one data entity associated with the data entity cluster shares an attribute with and/or is related to at least one data entity associated with the second data entity cluster, merging the data entity cluster and the second data entity cluster.

5. The computer-implemented method of claim 1, wherein the first search protocol searches for data entities in a first electronic data store and the second search protocol searches for data entities in a second electronic data store.

6. The computer-implemented method of claim 1, wherein the data entity cluster is iteratively generated by further: by the one or more hardware computer processors configured with specific computer executable instructions: executing at least a third of the one or more search protocols on the one or more electronic data stores to identify yet one or more additional data entities related to the one or more additional data entities; and adding the yet one or more additional data entities to the data entity cluster.

7. The computer-implemented method of claim 1 further comprising: by the one or more hardware computer processors configured with specific computer executable instructions: causing a ranking score to be assigned to the data entity cluster; and ordering a listing of the data entity cluster and other data entity clusters relative to a one another.

8. A computer-implemented method of accessing one or more electronic data sources, the method comprising: by one or more hardware computer processors configured with specific computer executable instructions: accessing one or more electronic data stores, the one or more electronic data stores storing: a plurality of data entities and respective data entity attributes, and a plurality of data entity clusters; and causing access of a data entity cluster of the plurality of data entity clusters, wherein the data entity cluster is related to a clustering strategy, and wherein the data entity cluster has been iteratively generated by: designating a seed data entity, from the plurality of data entities, as the data entity cluster; accessing, based on the clustering strategy, one or more search protocols; performing first growth of the data entity cluster by executing at least a first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity; adding the one or more data entities to the data entity cluster; performing second growth of the data entity cluster by executing at least a second of the one or more search protocols on the one or more electronic data stores to identify one or more additional data entities related to the one or more added data entities, the second search protocol different than the first search protocol; and adding the one or more additional data entities to the data entity cluster.

9. The computer-implemented method of claim 8, wherein executing at least the first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity further comprises: by the one or more hardware computer processors configured with specific computer executable instructions: identifying at least one data entity attribute associated with the seed data entity; and evaluating the plurality of data entities to determine the one or more data entities sharing the at least one data entity attribute with the seed data entity.

10. The computer-implemented method of claim 9, wherein executing at least the first of the one or more search protocols on the one or more electronic data stores to identify one or more data entities related to the seed data entity further comprises: by the one or more hardware computer processors configured with specific computer executable instructions: applying a filter to the at least one data entity attribute associated with the seed data entity, the filter selected based on the clustering strategy.

11. The computer-implemented method of claim 8 further comprising: by the one or more hardware computer processors configured with specific computer executable instructions: accessing, from the one or more electronic data stores, a scoring strategy for prioritizing the plurality of data entity clusters relative to one another; for each particular data entity cluster of the plurality of data entity clusters: evaluating, based on the scoring strategy, the particular data entity cluster; and assigning, based on the evaluation, a score to the particular data entity cluster; and ranking the plurality of data entity clusters according to the respective assigned scores.

12. The computer-implemented method of claim 11, wherein the score assigned to each data entity cluster corresponds to an amount at risk.
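To make the seed-and-growth steps of claim 1 more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the entity layout (`id` plus an `attrs` dictionary), the `shared_attribute` protocol factory, and the `grow_cluster` function are all hypothetical names. Each search protocol is modeled as a function that looks for entities sharing an attribute value with the entities added in the previous growth step, in the spirit of claim 2.

```python
def shared_attribute(attr):
    """Build a hypothetical search protocol that matches on one attribute."""
    def protocol(store, frontier):
        # Collect the attribute values present on the most recently added entities.
        values = {e["attrs"].get(attr) for e in frontier} - {None}
        return [e for e in store if e["attrs"].get(attr) in values]
    return protocol


def grow_cluster(store, seed_id, protocols):
    """Designate the seed as the cluster, then grow it once per search protocol."""
    seed = next(e for e in store if e["id"] == seed_id)
    cluster = {seed["id"]: seed}   # dict preserves insertion order
    frontier = [seed]              # entities the next protocol searches from
    for protocol in protocols:
        found = [e for e in protocol(store, frontier) if e["id"] not in cluster]
        for entity in found:
            cluster[entity["id"]] = entity
        frontier = found           # next growth relates to the newly added entities
    return list(cluster)


# Illustrative in-memory "data store"; all values are made up.
store = [
    {"id": "c1", "attrs": {"phone": "555-0100"}},
    {"id": "c2", "attrs": {"phone": "555-0100", "card": "4111"}},
    {"id": "c3", "attrs": {"card": "4111"}},
    {"id": "c4", "attrs": {"phone": "555-0199", "card": "9999"}},
]
members = grow_cluster(store, "c1",
                       [shared_attribute("phone"), shared_attribute("card")])
```

In this sketch the first growth adds `c2` (same phone as the seed) and the second growth adds `c3` (same card as `c2`), while `c4` shares nothing and stays out. Using a different protocol at each step reflects the claim's requirement that the second search protocol differ from the first.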

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • Credit; Loans; Processing thereof

  • Grouping and aggregation

  • Product, service or business identity fraud

  • H04L63/145 (Primary): the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms

  • Clustering; Classification

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10216801B2 cover?
Techniques are disclosed for prioritizing a plurality of clusters. Prioritizing clusters may generally include identifying a scoring strategy for prioritizing the plurality of clusters. Each cluster is generated from a seed and stores a collection of data retrieved using the seed. For each cluster, elements of the collection of data stored by the cluster are evaluated according to the scori…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/145. Mapped technology areas include Electricity.
When was this patent published?
Publication date Feb 26, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).