Centroid detection for clustering

US9280593B1 · US · B1

Patent metadata
Publication number: US-9280593-B1
Application number: US-201313949526-A
Country: US
Kind code: B1
Filing date: Jul 24, 2013
Priority date: Jul 24, 2013
Publication date: Mar 8, 2016
Grant date: Mar 8, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of categorizing data points is described which, when combined with a clustering algorithm, provides groupings of data points that have an improved confidence interval. The method can be used to find an optimal number of groupings for a dataset, which in turn allows a user to categorize a group of data points for processing. In some examples, a dataset containing a number of data points may be accessed. Additionally, in some aspects, groupings of data points within the dataset may be grouped based at least in part on similarities between the data. Further, a number of groupings of data points may be adjusted so that the distance between the data points within one or more groupings of data points may fit within a confidence level.
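The adjustment described in the abstract, changing the number of groupings until the distances within each grouping are acceptable, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the function names `kmeans`, `max_spread`, and `fit_groupings`, the `max_k` cap, the choice to seed each new centroid at the worst-fitting point, and above all the use of a plain distance threshold (`max_distance`) as a stand-in for the "confidence level" the abstract mentions.

```python
import math

def kmeans(points, centroids, repeats=20):
    """Plain k-means passes: assign points to the nearest centroid, then recenter."""
    for _ in range(repeats):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the per-axis mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids

def max_spread(points, centroids):
    """Largest distance from any point to its nearest centroid."""
    return max(min(math.dist(p, c) for c in centroids) for p in points)

def fit_groupings(points, max_distance, max_k=10):
    """Add centroids until every point lies within max_distance of one.

    Assumption: max_distance is a simple stand-in for a confidence level.
    """
    centroids = [points[0]]
    while True:
        centroids = kmeans(points, centroids)
        spread = max_spread(points, centroids)
        if spread <= max_distance or len(centroids) >= max_k:
            return centroids, spread
        # Hypothetical seeding rule: place the new centroid on the worst-fitting point.
        worst = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids = centroids + [worst]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, spread = fit_groupings(points, max_distance=1.0)
print(len(centroids), sorted(centroids), round(spread, 3))
```

On the two well-separated groups in the example, a single centroid leaves points far from the center, so the sketch adds a second centroid and then stops. A real implementation would need a statistically meaningful acceptance test rather than a fixed threshold.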

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for categorizing data points, comprising: identifying a first number of centroids indicating how many centroids are to be used in evaluating a dataset; selecting a location for the identified first number of centroids within the dataset; performing a clustering procedure, comprising: repeating a second number of times: assigning, to data points within the dataset, a cluster based at least in part on a centroid location; determining a center point of at least one cluster of the data points; and moving the centroid location to the center point of its respective cluster; adjusting the first number of centroids in the dataset and repeating the clustering procedure based at least in part on the movement of at least one centroid location by a delta amount; and identifying at least one final centroid location.

2. The computer-implemented method of claim 1, wherein the final centroid location is identified when the movement of all of the centroid locations is less than a delta amount.

3. The computer-implemented method of claim 1, wherein the centroid locations are initially selected randomly.

4. The computer-implemented method of claim 1, wherein the center point is determined based at least in part on a meta-clustering technique.

5. The computer-implemented method of claim 1, wherein the delta identifies a confidence interval associated with the final centroid location.

6. The computer-implemented method of claim 1, wherein adjusting the first number of centroids in the dataset results in a new location of the identified first number of centroids.

7. The computer-implemented method of claim 6, wherein the new location of the identified first number of centroids is different from a previous location of the identified first number of centroids.

8. A computer-implemented method of categorizing data points, comprising: selecting a number of centroids; assigning, to data points, a cluster based at least in part on a location of the centroid; determining a center point of the cluster of data points; determining a difference between the location of the centroid and the center point of the cluster; adjusting the number of centroids based at least in part on the difference between the location of the centroid and the center point of the cluster; and identifying a final a centroid location based at least in part on the difference between the location of the centroid and the center point of the cluster.

9. The computer-implemented method of claim 8, wherein the data points are assigned to the cluster based at least in part on a vector distance from the centroid location.

10. The computer-implemented method of claim 8, wherein the cluster's center point along an axis of a dataset is determined to be a mean average of all of that cluster's data points along that axis.

11. The computer-implemented method of claim 8, wherein the cluster's center point along an axis of a dataset is determined to be a median of all of that cluster's data points along that axis.

12. The computer-implemented method of claim 8, wherein the number of centroids is adjusted by adding or removing one or more centroid locations.

13. The computer-implemented method of claim 8, further comprising, reporting a confidence level for a centroid location based at least in part on the difference between the location of the centroid and the center point of the cluster.

14. The computer-implemented method of claim 8, further comprising, causing the final centroid locations to be displayed to a device associated with a user.

15. The computer-implemented method of claim 8, wherein the cluster assignment is repeated a number of times.

16. The computer-implemented method of claim 15, wherein the number of times that the cluster assignment is repeated is chosen to provide an optimization level.
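The inner clustering procedure of claim 1 (assign each data point to a cluster, determine the cluster's center point, move the centroid there, repeated a set number of times) can be sketched roughly as follows, with the mean and median center points of claims 10 and 11 as options. This is one possible reading rather than the patented implementation. The function names, the `repeats` and `use_median` parameters, and the use of Euclidean distance for the "vector distance" of claim 9 are assumptions made for illustration. The outer step that adjusts the number of centroids is deliberately left out.

```python
import math
import statistics

def assign_clusters(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]

def center_point(members, use_median=False):
    """Per-axis mean (as in claim 10) or per-axis median (as in claim 11)."""
    aggregate = statistics.median if use_median else statistics.fmean
    return tuple(aggregate(axis) for axis in zip(*members))

def clustering_procedure(points, centroids, repeats, use_median=False):
    """Run `repeats` assign/recenter/move passes.

    Returns the centroid locations and the largest centroid movement
    observed in the final pass, which a caller could compare to a delta.
    """
    max_move = 0.0
    for _ in range(repeats):
        labels = assign_clusters(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # An empty cluster keeps its previous centroid location.
            new_centroids.append(center_point(members, use_median) if members else old)
        max_move = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
    return centroids, max_move

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
final, move = clustering_procedure(points, [(0, 0), (11, 11)], repeats=3)
print(final, move)
```

A caller following claim 2 might treat the centroids as final once the returned movement falls below a chosen delta, and otherwise change the number of centroids and run the procedure again.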

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9280593B1 cover?
A method of categorizing data points is described which, when combined with a clustering algorithm, provides groupings of data points that have an improved confidence interval. The method can be used to find an optimal number of groupings for a dataset, which in turn allows a user to categorize a group of data points for processing. In some examples, a dataset containing a number of data points…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/285. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 8, 2016 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).