Adaptable data caching mechanism for in-memory cluster computing

US2019050334A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2019050334-A1
Application number: US-201816159662-A
Country: US
Kind code: A1
Filing date: Oct 13, 2018
Priority date: Dec 16, 2014
Publication date: Feb 14, 2019
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An in-memory cluster computing framework node is described. The node includes storage devices having various priorities. The node also includes a resource monitor to monitor the operation of the storage devices. The node also includes a resource scheduler. When the resource monitor indicates that a storage device is at or approaching saturation, the resource scheduler can migrate data from that storage device to another storage device of lower priority.
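The interaction the abstract describes between the resource monitor and the resource scheduler can be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the class names, the block-count notion of "saturation", the 0.75 threshold, and the convention that priority 0 is highest are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    name: str
    priority: int   # assumption: 0 is the highest priority
    capacity: int   # assumption: block count stands in for "saturation"
    blocks: dict = field(default_factory=dict)

    def utilization(self) -> float:
        return len(self.blocks) / self.capacity


class ResourceMonitor:
    """Reports whether a device is at or approaching saturation."""

    def __init__(self, threshold: float = 0.75):
        self.threshold = threshold  # hypothetical cutoff, not from the patent

    def near_saturation(self, device: StorageDevice) -> bool:
        return device.utilization() >= self.threshold


class ResourceScheduler:
    """Moves cached blocks to the next lower-priority device when needed."""

    def __init__(self, devices, monitor):
        # Order devices from highest to lowest priority.
        self.devices = sorted(devices, key=lambda d: d.priority)
        self.monitor = monitor

    def rebalance(self):
        moved = []
        for src, dst in zip(self.devices, self.devices[1:]):
            while (src.blocks
                   and self.monitor.near_saturation(src)
                   and len(dst.blocks) < dst.capacity):
                key = next(iter(src.blocks))  # first-inserted block
                dst.blocks[key] = src.blocks.pop(key)
                moved.append((key, src.name, dst.name))
        return moved


# Usage: a full high-priority device sheds blocks to the lower-priority one.
dram = StorageDevice("dram", priority=0, capacity=4)
ssd = StorageDevice("ssd", priority=1, capacity=8)
for i in range(4):
    dram.blocks[f"rdd_{i}"] = b"partition bytes"

scheduler = ResourceScheduler([ssd, dram], ResourceMonitor(threshold=0.75))
moved = scheduler.rebalance()
print(moved)
```

In this sketch the scheduler keeps moving blocks until the monitor no longer reports the source device as near saturation, or until the destination is full. A real node would presumably monitor richer signals than a block count.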

First claim

Opening claim text (preview).

What is claimed is:

1. An in-memory cluster computing framework node, comprising: a processor; a first storage device storing cached data, the first storage device having a first priority ranking the first storage device according to at least one metric; a second storage device having a second priority ranking the second storage device according to the at least one metric; a resource monitor operative to monitor the first storage device; and a resource scheduler operative to migrate the cached data from the first storage device to the second storage device if the resource monitor indicates that the first storage device is approaching a performance characteristic limit according to the at least one metric, wherein the first priority for the first storage device and the second priority for the second storage device are determined without reference to an application and the application's data.

2. The in-memory cluster computing framework node according to claim 1, wherein the resource monitor is operative to determine the capabilities of the first storage device.

3. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to select the first storage device to initially cache the data based on information provided by an application that uses the data.

4. The in-memory cluster computing framework node according to claim 1, wherein: the first priority is higher than the second priority; and the resource scheduler is operative to select the first storage device to initially cache the data as a higher priority device.

5. The in-memory cluster computing framework node according to claim 4, wherein the resource scheduler is operative to select the second storage device for future data caching if the resource monitor indicates that the first storage device is approaching the performance characteristic limit according to the at least one metric.

6. The in-memory cluster computing framework node according to claim 1, further comprising a replicator to replicate the cached data on a third storage device.

7. The in-memory cluster computing framework node according to claim 6, wherein the third storage device is in a second in-memory cluster computing framework node.

8. The in-memory cluster computing framework node according to claim 1, wherein the data includes a resilient distributed dataset (RDD) on the first storage device.

9. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to migrate all data from the first storage device to the second storage device.

10. The in-memory cluster computing framework node according to claim 1, wherein the resource scheduler is operative to migrate an oldest data from the first storage device to the second storage device.

11. The in-memory cluster computing framework node according to claim 1, wherein the at least one metric is drawn from a set including latency and bandwidth.

12. A method for caching data in an in-memory cluster computing framework, comprising: caching a data on a first storage device with a first priority in a cluster node, the first priority ranking the first storage device according to at least one metric; monitoring the operation of the first storage device; and if the first storage device is approaching a performance characteristic limit according to the at least one metric, migrating the cached data to a second storage device with a second priority, the second priority ranking the second storage device according to the at least one metric, wherein the first priority for the first storage device and the second priority for the second storage device are determined without reference to an application and the application's data.

13. The method according to claim 12, wherein monitoring the operation of the first storage device includes determining a capability of the first storage device.

14. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching the data on the first storage device in the cluster node, the first storage device selected by an application using the data.

15. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching the data on the first storage device in the cluster node, the first storage device having a higher priority among a plurality of devices.

16. The method according to claim 15, further comprising, if the first storage device is approaching the performance characteristic limit according to the at least one metric, re-directing future cache requests for the first storage device in the cluster node in the cluster node to the second storage device.

17. The method according to claim 12, further comprising replicating the cached data on a third storage device.

18. The method according to claim 17, wherein replicating the cached data on a third storage device includes replicating the cached data on the third storage device in a second cluster node.

19. The method according to claim 12, wherein caching a data on a first storage device with a first priority in a cluster node includes caching a resilient distributed dataset (RDD) on the first storage device.

20. The method according to claim 12, wherein migrating the cached data to a second storage device with a second priority includes migrating all data on the first storage device to the second storage device.

21. The method according to claim 12, wherein migrating the cached data to a second storage device with a second priority includes migrating an oldest data on the first storage device to the second storage device.

22. The method according to claim 12, wherein the at least one metric is drawn from a set including latency and bandwidth.
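Three ideas from the dependent claims lend themselves to a short sketch: ranking devices by a device-level metric such as latency or bandwidth (claims 11 and 22), redirecting future cache requests away from a device that is approaching its limit (claims 5 and 16), and migrating the oldest data first (claims 10 and 21). The claims do not say how metrics are combined into a ranking, so the rule below (lowest latency first, bandwidth as tie-breaker) is an assumption, as are all function names and sample figures.

```python
from collections import OrderedDict


def rank_devices(metrics):
    """Order device names from highest to lowest priority.

    `metrics` maps a device name to (latency_us, bandwidth_mbps).
    Only device measurements are used; no application data is consulted.
    Assumed rule: lower latency ranks first, higher bandwidth breaks ties.
    """
    return sorted(metrics, key=lambda n: (metrics[n][0], -metrics[n][1]))


def pick_cache_target(ranking, near_limit):
    """Choose the highest-priority device that is not approaching its limit."""
    for name in ranking:
        if name not in near_limit:
            return name
    return ranking[-1]  # every device is constrained: fall back to the lowest


def migrate_oldest(src, dst):
    """Move the oldest cached entry from src to dst and return its key."""
    key, value = src.popitem(last=False)  # OrderedDict keeps insertion order
    dst[key] = value
    return key


# Hypothetical measurements, for illustration only.
metrics = {
    "ssd": (100.0, 500.0),
    "dram": (0.1, 20000.0),
    "hdd": (5000.0, 150.0),
}
ranking = rank_devices(metrics)

# New cache requests normally go to the top-ranked device, but are
# redirected once the monitor flags it as approaching its limit.
normal_target = pick_cache_target(ranking, near_limit=set())
redirected_target = pick_cache_target(ranking, near_limit={"dram"})

dram_cache = OrderedDict([("part_a", 1), ("part_b", 2)])
ssd_cache = OrderedDict()
migrated_key = migrate_oldest(dram_cache, ssd_cache)
print(ranking, normal_target, redirected_target, migrated_key)
```

Using insertion order as a proxy for data age is a simplification; a real cache might track timestamps or access recency instead.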

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • for performance assessment · CPC title

  • Error detection or correction of the data by redundancy in operations (error detection or correction of the data by redundancy in hardware G06F11/16) · CPC title

  • Migration mechanisms · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019050334A1 cover?
An in-memory cluster computing framework node is described. The node includes storage devices having various priorities. The node also includes a resource monitor to monitor the operation of the storage devices. The node also includes a resource scheduler. When the resource monitor indicates that a storage device is at or approaching saturation, the resource scheduler can migrate data from that…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F12/0806. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 14, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).