Adaptable data caching mechanism for in-memory cluster computing

US2016170882A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016170882-A1
Application number  US-201514712895-A
Country             US
Kind code           A1
Filing date         May 14, 2015
Priority date       Dec 16, 2014
Publication date    Jun 16, 2016
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An in-memory cluster computing framework node is described. The node includes storage devices having various priorities. The node also includes a resource monitor to monitor the operation of the storage devices. The node also includes a resource scheduler. When the resource monitor indicates that a storage device is at or approaching saturation, the resource scheduler can migrate data from that storage device to another storage device of lower priority.
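
The behaviour summarized in the abstract can be illustrated with a small sketch. This is not the patent's implementation, which this page does not show. The class names (StorageDevice, ResourceMonitor, ResourceScheduler), the saturation threshold, the choice of which item to move first, and the convention that a larger number means a higher priority are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    """Hypothetical storage tier; a larger `priority` means a more preferred device."""
    name: str
    priority: int
    capacity: int  # assumed unit: number of cached items
    items: dict = field(default_factory=dict)

    def utilization(self) -> float:
        return len(self.items) / self.capacity


class ResourceMonitor:
    """Reports a device as saturated once utilization reaches an assumed threshold."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold

    def is_saturated(self, device: StorageDevice) -> bool:
        return device.utilization() >= self.threshold


class ResourceScheduler:
    """Moves cached items from a saturated device to the next lower-priority one."""

    def __init__(self, devices, monitor):
        # Highest-priority device first.
        self.devices = sorted(devices, key=lambda d: d.priority, reverse=True)
        self.monitor = monitor

    def rebalance(self):
        moved = []
        for upper, lower in zip(self.devices, self.devices[1:]):
            while upper.items and self.monitor.is_saturated(upper):
                if len(lower.items) >= lower.capacity:
                    break  # no room below; leave the data where it is
                key = next(iter(upper.items))  # dicts preserve insertion order
                lower.items[key] = upper.items.pop(key)
                moved.append((key, upper.name, lower.name))
        return moved


memory = StorageDevice("memory", priority=2, capacity=4)
ssd = StorageDevice("ssd", priority=1, capacity=100)
for i in range(4):
    memory.items[f"block-{i}"] = b"data"

scheduler = ResourceScheduler([memory, ssd], ResourceMonitor(threshold=0.75))
moved = scheduler.rebalance()
```

With these assumed numbers, the scheduler keeps moving items out of memory until its utilization drops below the 0.75 threshold, so two of the four blocks end up on the lower-priority device.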

First claim

Opening claim text (preview).

What is claimed is:

1. An in-memory cluster computing framework node (305), comprising: a processor (315); a first storage device (220, 225, 230, 235, 130) storing cached data, the first storage device (220, 225, 230, 235, 130) having a first priority (240, 245, 250, 255, 260); a second storage device (220, 225, 230, 235, 130) having a second priority (240, 245, 250, 255, 260); a resource monitor (205) operative to monitor the first storage device (220, 225, 230, 235, 130); and a resource scheduler (210) operative to migrate the cached data from the first storage device (220, 225, 230, 235, 130) to the second storage device (220, 225, 230, 235, 130) if the resource monitor (205) indicates that the first storage device (220, 225, 230, 235, 130) is saturated.

2. An in-memory cluster computing framework node (305) according to claim 1, wherein the resource monitor (205) is operative to determine the capabilities of the first storage device (220, 225, 230, 235, 130).

3. An in-memory cluster computing framework node (305) according to claim 1, wherein the resource scheduler (210) is operative to select the first storage device (220, 225, 230, 235, 130) to initially cache the data based on information provided by an application that uses the data.

4. An in-memory cluster computing framework node (305) according to claim 1, wherein: the first priority (240, 245, 250, 255, 260) is higher than the second priority (240, 245, 250, 255, 260); and the resource scheduler (210) is operative to select the first storage device (220, 225, 230, 235, 130) to initially cache the data as a higher priority (240, 245, 250, 255, 260) device.

5. An in-memory cluster computing framework node (305) according to claim 4, wherein the resource scheduler is operative to select the second storage device (220, 225, 230, 235, 130) for future data caching if the resource monitor (205) indicates that the first storage device (220, 225, 230, 235, 130) is saturated.

6. An in-memory cluster computing framework node (305) according to claim 1, further comprising a replicator (310) to replicate the cached data on a third storage device (220, 225, 230, 235, 130).

7. An in-memory cluster computing framework node (305) according to claim 6, wherein the third storage device (220, 225, 230, 235, 130) is in a second in-memory cluster computing framework node (305).

8. An in-memory cluster computing framework node (305) according to claim 1, wherein the data includes a resilient distributed dataset (RDD) on the first storage device (220, 225, 230, 235, 130).

9. An in-memory cluster computing framework node (305) according to claim 1, wherein the resource scheduler (210) is operative to migrate all data from the first storage device (220, 225, 230, 235, 130) to the second storage device (220, 225, 230, 235, 130).

10. An in-memory cluster computing framework node (305) according to claim 1, wherein the resource scheduler (210) is operative to migrate an oldest data from the first storage device (220, 225, 230, 235, 130) to the second storage device (220, 225, 230, 235, 130).

11. A method for caching data in an in-memory cluster computing framework, comprising: caching (510) a data on a first storage device (220, 225, 230, 235, 130) with a first priority (240, 245, 250, 255, 260) in a cluster node (305); monitoring (520) the operation of the first storage device (220, 225, 230, 235, 130); and if the first storage device (220, 225, 230, 235, 130) is saturated, migrating (530) the cached data to a second storage device (220, 225, 230, 235, 130) with a second priority (240, 245, 250, 255, 260).

12. A method according to claim 11, wherein monitoring (520) the operation of the first storage device (220, 225, 230, 235, 130) includes determining (505) a capability of the first storage device (220, 225, 230, 235, 130).

13. A method according to claim 11, wherein caching (510) a data on a first storage device (220, 225, 230, 235, 130) with a first priority (240, 245, 250, 255, 260) in a cluster node (305) includes caching (510) the data on the first storage device (220, 225, 230, 235, 130) in the cluster node (305), the first storage device (220, 225, 230, 235, 130) selected by an application using the data.

14. A method according to claim 11, wherein caching (510) a data on a first storage device (220, 225, 230, 235, 130) with a first priority (240, 245, 250, 255, 260) in a cluster node (305) includes caching (510) the data on the first storage device (220, 225, 230, 235, 130) in the cluster node (305), the first storage device (220, 225, 230, 235, 130) having a higher priority (240, 245, 250, 255, 260) among a plurality of devices.

15. A method according to claim 14, further comprising, if the first storage device (220, 225, 230, 235, 130) is saturated, re-directing (535) future cache requests for the first storage device (220, 225, 230, 235, 130) in the cluster node (305) to the second storage device (220, 225, 230, 235, 130).

16. A method according to claim 11, further comprising replicating (515) the cached data on a third storage device (220, 225, 230, 235, 130).

17. A method according to claim 16, wherein replicating (515) the cached data on a third storage device (220, 225, 230, 235, 130) includes replicating (515) the cached data on the third storage device (220, 225, 230, 235, 130) in a second cluster node (305).

18. A method according to claim 11, wherein caching (510) a data on a first storage device (220, 225, 230, 235, 130) with a first priority (240, 245, 250, 255, 260) in a cluster node (305) includes caching a resilient distributed dataset (RDD) on the first storage device (220, 225, 230, 235, 130).

19. A method according to claim 11, wherein migrating (530) the cached data to a second storage device (220, 225, 230, 235, 130) with a second priority (240, 245, 250, 255, 260) includes migrating (705) all data on the first storage device (220, 225, 230, 235, 130) to the second storage device (220, 225, 230, 235, 130).

20. A method according to claim 11, wherein migrating (530) the cached data to a second storage device (220, 225, 230, 235, 130) with a second priority (240, 245, 250, 255, 260) includes migrating (710) an oldest data on the first storage device (220, 225, 230, 235, 130) to the second storage device (220, 225, 230, 235, 130).
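
The method claims distinguish two migration options (all data, or only the oldest data) and a redirection of future cache requests while the first device is saturated. A minimal sketch of those three behaviours follows. It is illustrative only: the Tier class, the function names, the entry-count capacity, and the reading of "oldest" as "earliest inserted" are assumptions, not details taken from the patent.

```python
from collections import OrderedDict


class Tier:
    """Hypothetical storage tier; entries are kept in insertion (age) order."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumed unit: number of entries
        self.entries = OrderedDict()

    def saturated(self):
        return len(self.entries) >= self.capacity


def migrate_all(first, second):
    """In the spirit of claims 9 and 19: move every entry to the second tier."""
    while first.entries:
        key, value = first.entries.popitem(last=False)
        second.entries[key] = value


def migrate_oldest(first, second):
    """In the spirit of claims 10 and 20: move only the oldest entry."""
    key, value = first.entries.popitem(last=False)  # last=False pops the earliest insert
    second.entries[key] = value
    return key


def cache(key, value, first, second):
    """In the spirit of claims 5 and 15: redirect new requests while `first` is saturated."""
    target = second if first.saturated() else first
    target.entries[key] = value
    return target.name


memory = Tier("memory", capacity=2)
ssd = Tier("ssd", capacity=10)

# The third request is redirected because memory is full after two entries.
placements = [cache(k, v, memory, ssd) for k, v in [("a", 1), ("b", 2), ("c", 3)]]

oldest = migrate_oldest(memory, ssd)      # frees one slot in memory
after_oldest = list(ssd.entries)
placed_d = cache("d", 4, memory, ssd)     # memory accepts new data again
migrate_all(memory, ssd)                  # drain memory completely
```

Migrating only the oldest entry keeps most of the working set on the faster device, while migrating everything frees the device in one step; the claims cover both options without stating a preference.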

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Migration mechanisms

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

  • where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097)

  • Saving storage space on storage systems

  • Monitoring storage devices or systems

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016170882A1 cover?
An in-memory cluster computing framework node is described. The node includes storage devices having various priorities. The node also includes a resource monitor to monitor the operation of the storage devices. The node also includes a resource scheduler. When the resource monitor indicates that a storage device is at or approaching saturation, the resource scheduler can migrate data from that…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F12/0806. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 16, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).