What technology area does this patent fall under?

Primary CPC classification G06F17/30333. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Mar 22 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Data distribution/retrieval using multi-dimensional index

US9292559B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9292559-B2
Application number	US-201313743988-A
Country	US
Kind code	B2
Filing date	Jan 17, 2013
Priority date	Jan 19, 2012
Publication date	Mar 22, 2016
Grant date	Mar 22, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A distributed data storage system uses a data distribution and location algorithm based on distance functions and hyper-spheres in a multi-dimensional space. The distributed data storage system uses the algorithm to maintain, over time, a balanced distribution across a number of computers interconnected by a network of a varying set of data items. Each data item includes one or more key fields. The system also includes an efficient partial-match and exact-match search across a whole set of data items using as search criteria the values of any or all of the sought data item's key field(s).

First claim

Opening claim text (preview).

The invention claimed is: 1. A method of data distribution in a distributed data storage system, the distributed data storage system storing a set of data items across a plurality of physically distinct computers, each physically distinct computer managing a subset of the data items, the method comprising: defining at least one metric space associated with the distributed data storage system; defining within the one metric space a number of hyper-spheres, each of the defined hyper-spheres having a center and radius, data points representing corresponding data items, and an association with one or more of the physically distinct computers; calculating a distance from a specified data point, corresponding to a specified data item to be stored in the distributed data storage system, to a center of each of the defined hyper-spheres; and selecting at least one physically distinct computer for storage of the specified data item based on which of the hyper-sphere centers is closest to the specified data point; wherein each data item includes one or more key fields, each key field representing a defining field of the data item within the set of data items; wherein every value of a key field is mapped to a real number; and wherein the set of real numbers to which the key fields of a data item are mapped comprises a data point in the metric space. 2. The method of claim 1 , further comprising: calculating a distance from a specified data point, corresponding to a specified data item to be retrieved from the distributed data storage system, to a center of each of the defined hyper-spheres; and selecting at least one physically distinct computer for data retrieval of the specified data item based on which of the hyper-sphere centers is closest to the specified data point and the association of the hyper-sphere to such physically distinct computer. 3. The method of claim 1 : wherein each subset includes one or more key fields, each key field representing a defining field of one item within the set of data items; the method further comprising retrieving data items when only a subset of the key field values defining the data items are known, by matching the subset of the whole set of data items whose mapped points are located on a hyper-plane that intersects one or more of the defined hyper-spheres, the hyper-plane defined by the coordinates to which the known key field values map. 4. The method of claim 3 , wherein a data retrieval request is addressed simultaneously to each physically distinct computer associated with each of the hyper-spheres intersected by the hyper-plane. 5. The method of claim 1 , wherein for numeric data, the mapping of numeric key fields is based on an identity function f(x)=x. 6. The method of claim 1 , wherein, for character data, the mapping of character strings is based on Levenshtein distance. 7. The method of claim 1 , wherein, for numeric data, the mapping of numeric key fields is based on a compressing function f(x) such that f(x min )>>x min and f(x max )>>x max . 8. The method of claim 1 , further comprising using a shared state containing the hyper-sphere centers and radii, and an identifier of every computer that is associated with every hyper-sphere of the defined at least one metric space and associated hyper- spheres, on reception of a data retrieval request from an external entity. 9. The method of claim 1 , wherein the defined hyper-sphere centers are fixed. 10. The method of claim 1 , wherein the defined hyper-sphere centers are dynamically updated based on historical data. 11. The method of claim 1 , further comprising dynamically splitting or joining defined hyper-spheres based on clustering of data points. 12. The method of claim 11 , wherein splitting defined hyper-spheres comprises reallocating data points within a portion of a first hyper-sphere to a neighboring second hyper-sphere. 13. The method of claim 12 , wherein reallocating data points includes transferring the corresponding data items from a first physically distinct computer associated with the first hyper-sphere to a second physically distinct computer associated with the neighboring second hyper-sphere. 14. The method of claim 11 , wherein joining defined hyper-spheres comprises joining two defined hyper-spheres into a single defined hyper-sphere containing all data points of the two joined hyper-spheres and retaining all corresponding data items in a physically distinct computer associated with the single defined hyper-sphere. 15. The method of claim 1 , further comprising: redistributing the set of data items across the distributed data storage system when one or more physically distinct computers are added to or removed from the plurality of physically distinct computers by creating or destroying respective hyper-spheres associated with each of the added or removed physically distinct computers followed by transferring of corresponding data items. 16. The method of claim 1 , wherein the distributed data storage system comprises a cloud based storage system. 17. The method of claim 1 , wherein the distributed data storage system comprises a telecommunications storage system and the set of data items includes telecommunications-related data. 18. A distributed data storage system for storing a set of data items across a plurality of physically distinct computers interconnected by a network, each physically distinct computer managing a subset of the data items, each computer of the plurality of computers configured as a query processor and/or data storage device, the query processor and/or data storage device comprising a data manager implemented using one or more of the physically distinct computers and configured to: define at least one metric space associated with the distributed data storage system; define within the at least one metric space a number of hyper-spheres, each of the hyper-spheres having a center and radius, data points representing corresponding data items, and an association with one or more of the physically distinct computers; and determine a location to store a specified data item in the distributed data storage system by: calculating a distance of the corresponding specified data point to a center of each of the defined hyper-spheres; and selecting at least one physically distinct computer for storage of the specified data item based on which associated hyper-sphere's center is closest to the corresponding specified data point; wherein each data item includes one or more key fields, each key field representing a defining field of the data item within the set of data items; wherein every value of a key field is mapped to a real number; and wherein the set of real numbers to which the key fields of a data item are mapped comprises a data point in the metric space. 19. The distributed data storage system of claim 18 , wherein the data manager is further configured to determine a location to retrieve a specified data item in the distributed data storage system by: calculating a distance of the corresponding specified data point to a center of each of the defined hyper-spheres; and selecting at least one physically distinct computer for retrieval of the specified data item based on which associated hyper-sphere's center is closest to the corresponding specified data point. 20. The distributed data storage system of claim 18 , wherein the data manager is further configured to redistribute data points within the hyper-spheres and corresponding data items across associated physically

Assignees

Ericsson Telefon Ab L M

Inventors

Arauz-Rosado Jesus-Javier

Classifications

G06F16/278
Data partitioning, e.g. horizontal or vertical partitioning · CPC title
G06F17/30283
Physics · mapped topic
G06F17/30333Primary
Physics · mapped topic
G06F16/27Primary
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
G06F16/2264Primary
Multidimensional index structures · CPC title

Patent family

Related publications grouped by family.

View patent family 45999753

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9292559B2 cover?: A distributed data storage system uses a data distribution and location algorithm based on distance functions and hyper-spheres in a multi-dimensional space. The distributed data storage system uses the algorithm to maintain, over time, a balanced distribution across a number of computers interconnected by a network of a varying set of data items. Each data item includes one or more key fields.…
Who is the assignee on this patent?: Ericsson Telefon Ab L M
What technology area does this patent fall under?: Primary CPC classification G06F17/30333. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Mar 22 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).