Assignment of data within file systems

US10127237B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10127237-B2
Application number: US-201514974477-A
Country: US
Kind code: B2
Filing date: Dec 18, 2015
Priority date: Dec 18, 2015
Publication date: Nov 13, 2018
Grant date: Nov 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The embodiments relate to assigning data to processors of a file system. Metadata associated with respective blocks of data is gathered, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
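
To make the "predicted load" idea concrete, here is a minimal sketch of a per-node linear regression. It is not taken from the patent. It assumes, purely for illustration, that each node's read cost grows linearly with the number of blocks assigned to it and that the model is fitted by ordinary least squares. The names `fit_line` and `predicted_load` and the sample data are hypothetical.

```python
from typing import List, Tuple

Model = Tuple[float, float]  # (intercept, slope); hypothetical representation


def fit_line(samples: List[Tuple[float, float]]) -> Model:
    """Fit y = intercept + slope * x by ordinary least squares.

    Each sample is (blocks_assigned, observed_read_cost). The choice of
    features is an assumption; the abstract only says "node data".
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    # With no spread in x the slope is undefined; fall back to a flat line.
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predicted_load(model: Model, blocks_after_assignment: float) -> float:
    """Predicted cost for a node if it held the given number of blocks."""
    intercept, slope = model
    return intercept + slope * blocks_after_assignment


# Hypothetical samples for one node: (blocks assigned, observed read cost).
node_samples = [(1, 3.0), (2, 5.0), (3, 7.0)]
node_model = fit_line(node_samples)
next_load = predicted_load(node_model, 4)
```

In this sketch a scheduler would compute `next_load` for every node before deciding where the next unassigned block goes.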

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: gathering metadata associated with respective blocks of data; assigning an initial batch of the blocks to nodes of a file system based on the gathered metadata; selectively assigning unassigned blocks to one or more of the nodes, the selective assignment comprising: constructing a linear regression model based on node data; and for each node, determining a value based on the linear regression model, wherein each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks; and initializing a control factor associated with the selective assignment, wherein the control factor controls a maximum permitted block load per node, and wherein the selective assignment of the unassigned blocks further comprises performing a first assignment circuit, including: comparing a first predicted value associated with a first node to the control factor; in response to determining that the predicted value associated with the first node is less than the control factor, assigning the new assignment to the first node; and in response to determining that the predicted value associated with the first node exceeds the control factor, comparing a second predicted value associated with a second node to the control factor to determine if the new assignment is assignable to the second node.

2. The method of claim 1, further comprising adjusting the control factor in response to determining that the predicted value for each node exceeds the control factor, and performing a second assignment circuit to the nodes based on the adjusted control factor.

3. The method of claim 1, wherein constructing the model comprises: initializing the model based on a set of initial values associated with predicted costs; collecting one or more read metric samples for each of the nodes; and dynamically refining the model based on the collected samples.

4. The method of claim 1, further comprising refining the model with a decay factor, wherein the decay factor governs relativity of weight between past samples and a new sample.

5. The method of claim 1, further comprising controlling operation of the selective assignment based on a threshold associated with a time budget, including comparing a remaining time to the threshold, and selecting a mode of operation of the selective assignment based on the comparison.

6. A computer program product comprising a computer readable hardware storage device having program code embodied therewith, the program code executable by a processing unit to: gather metadata associated with respective blocks of data; assign an initial batch of the blocks to nodes of a file system based on the gathered metadata; selectively assign unassigned blocks to one or more of the nodes, including initialize a control factor associated with the selective assignment, wherein the control factor controls a maximum permitted block load per node, the selective assignment comprising program code to: construct a linear regression model based on node data; and each node, determine a value based on the linear regression model, wherein each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks; and wherein the selective assignment of the unassigned blocks further comprises program code to perform a first assignment circuit, including program code to: compare a first predicted value associated with a first node to the control factor; in response to determining that the predicted value associated with the first node is less than the control factor, assign the new assignment to the first node; and in response to determining that the predicted value associated with the first node exceeds the control factor, compare a second predicted value associated with a second node to the control factor to determine if the new assignment is assignable to the second node.

7. The computer program product of claim 6, further comprising program code to adjust the control factor in response to determining that the predicted value for each node exceeds the control factor, and perform a second assignment circuit to the nodes based on the adjusted control factor.

8. The computer program product of claim 6, wherein constructing the model comprises program code to: initialize the model based on a set of initial values; collect one or more read metric samples for each of the nodes; and dynamically refine the model based on the collected samples.

9. The computer program product of claim 6, further comprising program code to refine the model with a decay factor, wherein the decay factor governs a relativity of weight between past samples and a new sample.

10. The computer program product of claim 6, further comprising program code to control operation of the selective assignment based on a threshold associated with a time budget, including program code to compare a remaining time to the threshold, and select a mode of operation of the selective assignment of the unassigned blocks based on the comparison.

11. A system comprising: a processing unit in communication with memory; a plurality of nodes, each node having local persistent storage and a local processor; and a tool in communication with the nodes, the tool to: gather metadata associated with respective blocks of data; assign an initial batch of the blocks to nodes of a file system based on the gathered metadata; and selectively assign unassigned blocks to one or more of the nodes, the selective assignment comprising the tool to: initialize a control factor associated with the selective assignment, wherein the control factor controls a maximum permitted block load per node, and wherein the selective assignment of the unassigned blocks further comprises the tool to perform a first assignment circuit, including the tool to: construct a linear regression model based on node data; for each node, determine a value based on the linear regression model, wherein each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks; compare a first predicted value associated with a first node to the control factor; in response to determining that the predicted value associated with the first node is less than the control factor, assign the new assignment to the first node; and in response to determining that the predicted value associated with the first node exceeds the control factor, compare a second predicted value associated with a second node to the control factor to determine if the new assignment is assignable to the second node, adjust the control factor, and perform a second assignment circuit to the nodes based on the adjusted control factor.

12. The system of claim 11, wherein the plurality of nodes are comprised in a distributed file system.

13. The system of claim 11, wherein constructing the model comprises the tool to: initialize the model based on a set of initial values; collect one or more read metric samples for each of the nodes; and dynamically refine the model based on the collected samples, including the tool to refine the model with a decay factor, wherein the decay factor governs a relativity of weight between past samples and a new sample.

14. The system of claim 11, further comprising the tool to control operation of the selective assignment based on a threshold associated with a time budget, including the tool to compare a remaining time to the threshold, and select a mode of operation of the selective assignment of the unassigned blocks based on the comparison.
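
The assignment circuit of claims 1 and 2 and the decay factor of claim 4 can be sketched as follows. This is an illustrative reading, not the patented implementation. Several details are assumptions that the claims do not specify: the predicted value is modeled as a per-block cost times the node's resulting block count, the control factor is raised by a fixed `step` when no node qualifies, a predicted value equal to the control factor is treated as not assignable, and the decay update is an exponentially weighted average. All function and variable names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def assignment_circuit(node_ids: List[str],
                       predict: Callable[[str], float],
                       control_factor: float) -> Optional[str]:
    """Visit nodes in order; return the first one whose predicted value is
    less than the control factor, or None if no node qualifies."""
    for node in node_ids:
        if predict(node) < control_factor:
            return node
    return None


def assign_blocks(blocks: List[str],
                  node_ids: List[str],
                  cost_per_block: Dict[str, float],
                  control_factor: float,
                  step: float = 1.0) -> Tuple[Dict[str, str], float]:
    """Place each block on a node, relaxing the control factor when needed.

    Assumes positive costs and a positive step so the loop terminates.
    """
    loads = {node: 0 for node in node_ids}
    placement: Dict[str, str] = {}
    for block in blocks:
        def predict(node: str) -> float:
            # Assumed linear model: cost if this node took one more block.
            return cost_per_block[node] * (loads[node] + 1)

        node = assignment_circuit(node_ids, predict, control_factor)
        while node is None:
            # Every node is at or over the limit: adjust the control factor
            # and run another circuit (assumed fixed-step adjustment).
            control_factor += step
            node = assignment_circuit(node_ids, predict, control_factor)
        loads[node] += 1
        placement[block] = node
    return placement, control_factor


def refine_with_decay(old_estimate: float, new_sample: float,
                      decay: float) -> float:
    """Blend past samples with a new one; a higher decay favors the past."""
    return decay * old_estimate + (1.0 - decay) * new_sample


placement, final_cf = assign_blocks(
    ["b1", "b2", "b3", "b4"], ["a", "b"], {"a": 1.0, "b": 2.0},
    control_factor=2.5)
```

With these hypothetical inputs, node `a` takes the first two blocks, node `b` takes the third, and the fourth block forces one adjustment of the control factor before it lands on `a`.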

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10127237B2 cover?
The embodiments relate to assigning data to processors of a file system. Metadata associated with respective blocks of data is gathered, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determi…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30097. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).