Identifying groups of similar data portions

US9753987B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9753987-B1
Application number: US-201313870262-A
Country: US
Kind code: B1
Filing date: Apr 25, 2013
Priority date: Apr 25, 2013
Publication date: Sep 5, 2017
Grant date: Sep 5, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for grouping data portions are disclosed. Each group includes data portions determined to exhibit similar behavior. The techniques may include determining whether an affinity measurement with respect to two groups exceeds an affinity threshold; merging the two groups into a single group responsive to the affinity measurement exceeding the affinity threshold; modeling movement of at least one data portion of the single group between two storage tiers at a particular time of day using predicted workload metrics; and performing the data movement of the at least one data portion between the two storage tiers. Predicted workload metrics may be determined by revising first modeled workload metrics using a bias value, where bias values are associated with different times of day, and the bias value is selected based on the particular time of day that the predicted workload metrics are modeling.
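The merge and bias-selection steps summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `maybe_merge`, `time_step_for`, and `predicted_metric` are hypothetical, the group-to-group affinity function is supplied by the caller because the abstract does not define it, and applying the bias multiplicatively is an assumption (the abstract says only that the modeled metrics are "revised" using the bias value).

```python
from typing import Callable, List, Sequence, Set

Group = Set[str]  # hypothetical: a group is a set of data-portion identifiers


def maybe_merge(
    group_a: Group,
    group_b: Group,
    affinity: Callable[[Group, Group], float],
    affinity_threshold: float,
) -> List[Group]:
    """Merge two groups into one when their affinity exceeds the threshold."""
    if affinity(group_a, group_b) > affinity_threshold:
        return [group_a | group_b]
    return [group_a, group_b]


def time_step_for(hour_of_day: int, steps_per_day: int) -> int:
    """Map an hour (0-23) to a time-step index.

    Assumption: the time period is one day split into equal steps.
    """
    return hour_of_day * steps_per_day // 24


def predicted_metric(
    modeled_metric: float,
    bias_by_step: Sequence[float],
    hour_of_day: int,
) -> float:
    """Revise a modeled workload metric with the bias for the given time of day.

    Assumption: the bias is applied as a multiplier.
    """
    step = time_step_for(hour_of_day, len(bias_by_step))
    return modeled_metric * bias_by_step[step]
```

For example, with four six-hour time steps and biases `[0.5, 1.0, 2.0, 1.0]`, a modeled metric of 100 at 14:00 falls in the third step and is revised to 200 under the multiplicative assumption.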

First claim

Opening claim text (preview).

What is claimed is:

1. A method of grouping data portions comprising:
    receiving a set of groups, each group of the set including a plurality of data portions that are from a plurality of logical devices and are determined to exhibit similar behavior, during a same time interval, with respect to a change in I/O workload;
    determining, using a processor, whether an affinity measurement with respect to two groups of the set exceeds an affinity threshold, said affinity measurement representing a measure of similarity in behavior between the two groups with respect to a change in I/O workload during a time interval;
    merging, using the processor, the two groups of the set into a single group responsive to the affinity measurement exceeding the affinity threshold;
    modeling, using the processor, movement of at least one data portion of the single group between two storage tiers at a particular time of day, said modeling using predicted workload metrics for data portions of the single group at the particular time of day, said predicted workload metrics being determined by revising first modeled workload metrics using a bias value for the single group, wherein a plurality of bias values are associated with different times of day and the bias value is selected from the plurality of bias values based on the particular time of day that the predicted workload metrics are modeling, wherein said first modeled workload metrics are determined with respect to a time period partitioned into a plurality of time steps, one of a plurality of sets of bias values is determined for each group of the set of groups, and each of the bias values in said one set corresponds to a different one of the plurality of time steps in the time period, wherein a first of the sets of bias values is determined for the single group and includes the plurality of bias values, wherein revised values for said first modeled workload metrics for the single group are determined at the particular time of day being modeled, the particular time of day corresponding to one of the plurality of time steps, said revised values being determined in accordance with the bias value of the first set of bias values for the single group, said bias value corresponding to said one time step; and
    performing the data movement of the at least one data portion between the two storage tiers.

2. The method of claim 1, further comprising:
    selecting a first group of the set and a first data portion from the plurality of data portions of the first group;
    receiving a first set of values for the first data portion, the first set of values including a time step rate denoting an average rate of a first metric for the first data portion during a single time interval corresponding to a single one of the plurality of time steps, a persistence rate denoting a rate of the first metric for the first data portion based on the time period, and a first activity ratio for the first data portion that is a ratio of the time step rate for the first data portion relative to the persistence rate for the first data portion;
    receiving a first group activity ratio for the first group, said first group activity ratio being a ratio of a plurality of time step rates for the plurality of data portions of the first group relative to a plurality of persistence rates for the plurality of data portions of the first group; and
    determining, for a current time step, a first affinity measurement for the first data portion with respect to the first group where the first affinity measurement is a metric representing a measurement of similarity of behavior of the first data portion with respect to behavior of the first group, wherein determining the first affinity measurement includes determining a current affinity measurement for the current time step that is a minimum of the first group activity ratio and the first activity ratio for the first data portion.

3. The method of claim 2, wherein the first data portion is removed from the first group if the first affinity measurement is below an affinity measurement threshold, and wherein the first group is dissolved if a number of data portions in the first group falls below a minimum threshold number of group members.

4. The method of claim 2, wherein said determining the first affinity measurement further comprises:
    determining whether any of the first group activity ratio and said first activity ratio for the first data portion exceed a minimum activity ratio threshold; and
    if none of the first group activity ratio and said first activity ratio for the first data portion exceed a minimum activity ratio threshold, determining not to update the first affinity measurement for the current time step.

5. The method of claim 4, wherein if one or more of the first group activity ratio and said first activity ratio for the first data portion exceed a minimum activity ratio threshold, performing first processing comprising:
    determining the current affinity measurement for the first data portion for the current time step having a time span of the single time interval; and
    determining the first affinity measurement in accordance with the current affinity measurement and one or more previous affinity measurements obtained in connection with previous time steps with respect to the first data portion and the first group.

6. The method of claim 5, wherein the first affinity measurement for the current time step, Asmooth, is a smoothed affinity value for the first data portion determined as:
    Asmooth = Aprev + alpha * (Acalc − Aprev)
wherein Aprev represents the smoothed affinity value for the first data portion from a previous time step immediately prior to said current time step; alpha is a numeric value representing a weight; and Acalc is the current affinity measurement.

7. The method of claim 6, wherein alpha is a real number having a value between 0 and 1.

8. The method of claim 2, wherein the first affinity measurement is any of a time-smoothed or weighted affinity value determined using one or more affinity measurements obtained in connection with multiple time steps with respect to the first data portion and the first group.

9. The method of claim 2, wherein the first activity ratio for the first data portion is determined by performing first processing comprising:
    determining whether both of the time step rate and the persistence rate are less than a minimum rate threshold; and
    if it is determined that both the time step rate and the persistence rate are less than the minimum rate threshold, determining that the first activity ratio for the first data portion is one (1), and otherwise determining a first result that is a ratio of the time step rate to the persistence rate for the first data portion.

10. The method of claim 9, wherein if both the time step rate and the persistence rate are not less than the minimum rate threshold, performing second processing comprising:
    determining whether the first result exceeds a maximum value; and
    if it is determined that the first result exceeds the maximum value, determining that the first activity ratio for the first data portion is the maximum value, and otherwise determining that the first activity ratio for the first data portion is the first result.

11. The method of claim 2, wherein the first group activity ratio is determined by performing first processing comprising:
    determining a first result that is a sum of time step rates for all data portions in the first group;
    determining a second result that is a sum of persistence rates for all data portions in the first group;
    determining whether both of the first result and the second result are less than a minimum rate threshold; a
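The per-portion affinity update described in claims 2, 4 through 7, 9, and 10 can be illustrated with a short Python sketch. It is a hedged reading of the claim language, not the patentee's code: the function and parameter names are hypothetical, the group activity ratio is taken as an input (as claim 2 recites) because the text of claim 11 is cut off here, and the division-by-zero guard is an added assumption.

```python
def activity_ratio(
    step_rate: float,
    persistence_rate: float,
    min_rate: float,
    max_ratio: float,
) -> float:
    """Activity ratio of one data portion, following claims 9 and 10."""
    # Claim 9: if both rates are below the minimum rate threshold, the ratio is 1.
    if step_rate < min_rate and persistence_rate < min_rate:
        return 1.0
    # Assumption (not in the claims): treat a zero persistence rate as an
    # unbounded ratio, which claim 10's cap then limits to max_ratio.
    if persistence_rate == 0:
        return max_ratio
    # Claim 10: cap the ratio at the maximum value.
    return min(step_rate / persistence_rate, max_ratio)


def update_affinity(
    a_prev: float,
    group_ratio: float,
    portion_ratio: float,
    alpha: float,
    min_activity_ratio: float,
) -> float:
    """Smoothed affinity of a data portion to its group for the current time step."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1 (claim 7)")
    # Claim 4: if neither ratio exceeds the minimum activity ratio threshold,
    # do not update the affinity for this time step.
    if group_ratio <= min_activity_ratio and portion_ratio <= min_activity_ratio:
        return a_prev
    # Claim 2: the current affinity is the minimum of the two activity ratios.
    a_calc = min(group_ratio, portion_ratio)
    # Claim 6: Asmooth = Aprev + alpha * (Acalc - Aprev).
    return a_prev + alpha * (a_calc - a_prev)
```

With `a_prev = 2.0`, a group ratio of 3.0, a portion ratio of 4.0, `alpha = 0.5`, and a minimum activity ratio of 1.5, the current affinity is 3.0 and the smoothed value moves halfway toward it, to 2.5. A smaller `alpha` makes the affinity respond more slowly to a single unusual time step.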

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9753987B1 cover?
Techniques for grouping data portions are disclosed. Each group includes data portions determined to exhibit similar behavior. The techniques may include determining whether an affinity measurement with respect to two groups exceeds an affinity threshold; merging the two groups into a single group responsive to the affinity measurement exceeding the affinity threshold; modeling movement of at l…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F17/3053. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 5, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).