Learning machine to optimize random access in a storage system

US9250819B2 · US · B2

Patent metadata
Publication number: US-9250819-B2
Application number: US-201313784575-A
Country: US
Kind code: B2
Filing date: Mar 4, 2013
Priority date: Mar 4, 2013
Publication date: Feb 2, 2016
Grant date: Feb 2, 2016

Abstract

Official abstract text for this publication.

Mechanisms are provided for optimizing random access in a storage system. According to various embodiments, an access pattern may be identified for a plurality of data segments stored in a first arrangement on a storage medium. Each of the plurality of data segments may be stored at a respective first storage location on the storage medium in the first arrangement. The access pattern may indicate an order in which the data segments are likely to be retrieved from the storage medium. The plurality of data segments may be stored in a second arrangement on the storage medium based on the identified access pattern. Each of the plurality of data segments may be stored at a respective second storage location on the storage medium in the updated arrangement.
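
The rearrangement described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `seek_cost` and `rearrange`, the segment labels, the integer storage locations, and the toy cost model (distance between consecutive reads) are all assumptions made for illustration.

```python
from typing import Dict, List


def seek_cost(layout: Dict[str, int], pattern: List[str]) -> int:
    """Toy cost model (an assumption): total distance between consecutive reads."""
    return sum(abs(layout[b] - layout[a]) for a, b in zip(pattern, pattern[1:]))


def rearrange(layout: Dict[str, int], pattern: List[str]) -> Dict[str, int]:
    """Build a second arrangement that places segments in likely read order.

    The same set of storage locations is reused; only the assignment of
    segments to locations changes.
    """
    # Segments in order of first appearance in the access pattern.
    ordered = list(dict.fromkeys(pattern))
    # Segments that never appear in the pattern go after the patterned ones.
    ordered += [seg for seg in layout if seg not in ordered]
    slots = sorted(layout.values())
    return dict(zip(ordered, slots))


# First arrangement: segment -> storage location (hypothetical values).
first = {"A": 0, "B": 7, "C": 3, "D": 9, "E": 1}
# Order in which the segments are likely to be retrieved.
pattern = ["D", "B", "A", "C"]

second = rearrange(first, pattern)
print(seek_cost(first, pattern), seek_cost(second, pattern))
```

Under this toy model, reading the segments in the identified order costs less in the second arrangement than in the first, which is the effect the abstract describes.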

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying and recording an access pattern for a plurality of data segments stored in a first arrangement on a storage medium, the plurality of data segments including a designated set of data segments that have been deduplicated, each of the plurality of data segments being stored at a respective first storage location on the storage medium in the first arrangement, the identified access pattern indicating a sequence in which the data segments are likely to be retrieved from the storage medium, wherein recording the access pattern includes recording a frequency with which the access pattern has occurred, wherein recording the access pattern further includes identifying a designated content type associated with the plurality of data segments, the designated content type being one of a plurality of content types, and wherein the access pattern is identified at least in part based on at least a partial match with a typical access pattern associated with the designated content type; and storing the plurality of data segments in a second arrangement on the storage medium based on the identified access pattern, each of the plurality of data segments being stored at a respective second storage location on the storage medium in the second arrangement, the respective first storage location being different than the respective second storage location for one or more of the plurality of data segments, wherein storing the plurality of data segments includes storing a duplicate copy of at least one of the designated set of data segments that have been deduplicated.

2. The method recited in claim 1, wherein accessing the plurality of data segments substantially in accordance with the identified access pattern is faster under the second arrangement than under the first arrangement.

3. The method recited in claim 1, wherein identifying the access pattern comprises: analyzing first data access information describing one or more previous instances of retrieval of the plurality of data segments from the storage medium.

4. The method recited in claim 3, wherein identifying the access pattern further comprises: computing an estimate of the access pattern based on the first data access information.

5. The method recited in claim 4, wherein identifying the access pattern further comprises: analyzing second data access information describing one or more previous instances of retrieval of the plurality of data segments from the storage medium, and computing an updated estimate of the access pattern based on the second data access information as part of an iterative learning process.

6. The method recited in claim 1, wherein identifying the access pattern further comprises: receiving predetermined access pattern information associated with the identified content type.

7. The method recited in claim 6, wherein the predetermined access pattern information indicates one or more likely access patterns for the identified content types, the one or more likely access patterns including the identified access pattern.

8. The method recited in claim 1, wherein storing the plurality of data segments in the second arrangement comprises: moving each of the plurality of data segments from the respective first storage location to the respective second storage location when the respective first storage location is different than the respective second storage location.

9. A system comprising: a storage module configured to store a plurality of data segments stored in a first arrangement, the plurality of data segments including a designated set of data segments that have been deduplicated, each of the plurality of data segments being stored at a respective first storage location on the storage module in the first arrangement; and a processor configured to: identify and record an access pattern for the plurality of data segments, the access pattern indicating a sequence in which the data segments are likely to be retrieved from the storage module, wherein recording the access pattern includes recording a frequency with which the access pattern has occurred, wherein recording the access pattern further includes identifying a designated content type associated with the plurality of data segments, the designated content type being one of a plurality of content types, and wherein the access pattern is identified at least in part based on at least a partial match with a typical access pattern associated with the designated content type, and transmit an instruction to store the plurality of data segments in a second arrangement on the storage medium based on the identified access pattern, each of the plurality of data segments being stored at a respective second storage location on the storage medium in the second arrangement, the respective first storage location being different than the respective second storage location for one or more of the plurality of data segments, wherein storing the plurality of data segments includes storing a duplicate copy of at least one of the designated set of data segments that have been deduplicated.

10. The system recited in claim 9, wherein accessing the plurality of data segments substantially in accordance with the identified access pattern is faster under the second arrangement than under the first arrangement.

11. The system recited in claim 9, wherein identifying the access pattern comprises: analyzing data access information describing one or more previous instances of retrieval of the plurality of data segments from the storage module.

12. The system recited in claim 11, wherein identifying the access pattern further comprises: computing an estimate of the access pattern based on the data access information.

13. The system recited in claim 9, wherein identifying the access pattern further comprises: analyzing second data access information describing one or more previous instances of retrieval of the plurality of data segments from the storage module, and computing an updated estimate of the access pattern based on the second data access information as part of an iterative learning process.

14. The system recited in claim 9, wherein identifying the access pattern further comprises: receiving predetermined access pattern information associated with the identified content type.

15. The system recited in claim 14, wherein the predetermined access pattern information indicates one or more likely access patterns for the identified content types, the one or more likely access patterns including the identified access pattern.

16. One or more non-transitory computer readable media having instructions stored thereon for performing a method, the method comprising: identifying and recording an access pattern for a plurality of data segments stored in a first arrangement on a storage medium, the plurality of data segments including a designated set of data segments that have been deduplicated, each of the plurality of data segments being stored at a respective first storage location on the storage medium in the first arrangement, the identified access pattern indicating a sequence in which the data segments are likely to be retrieved from the storage medium, wherein recording the access pattern includes recording a frequency with which the access pattern has occurred, wherein recording the access pattern further includes identifying a designated content type associated with the plurality of data segments, the designated content type being one of a plurality of content types, and wherein the access pattern is identified at least in part based
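
The recording and storing steps recited in the independent claims can be sketched roughly as follows. This is an illustrative reading only, not the claimed implementation. The `TYPICAL` table, the class `PatternLog`, the functions `best_partial_match` and `layout_with_duplicates`, the 0.6 threshold, and the use of `difflib.SequenceMatcher` as the partial-match measure are all assumptions; the claims do not specify how a partial match is scored.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Sequence, Set

# Hypothetical typical access patterns per content type (illustrative only).
TYPICAL: Dict[str, Sequence[str]] = {
    "video": ("hdr", "idx", "s1", "s2", "s3"),
    "database": ("idx", "s3", "s1", "hdr", "s2"),
}


class PatternLog:
    """Records how often each (content type, access pattern) pair occurs."""

    def __init__(self) -> None:
        self.freq: Counter = Counter()

    def record(self, content_type: str, pattern: Sequence[str]) -> int:
        key = (content_type, tuple(pattern))
        self.freq[key] += 1
        return self.freq[key]


def best_partial_match(observed: Sequence[str],
                       threshold: float = 0.6) -> Optional[str]:
    """Return the content type whose typical pattern best matches, if any.

    Similarity is SequenceMatcher's ratio, chosen here purely as an example.
    """
    best_type, best_score = None, 0.0
    for ctype, typical in TYPICAL.items():
        score = SequenceMatcher(None, list(observed), list(typical)).ratio()
        if score > best_score:
            best_type, best_score = ctype, score
    return best_type if best_score >= threshold else None


def layout_with_duplicates(pattern: Sequence[str],
                           deduplicated: Set[str]) -> List[str]:
    """Lay segments out in read order for the second arrangement.

    A deduplicated segment that is read more than once gets a duplicate copy
    at each position; other segments are stored only once.
    """
    out: List[str] = []
    for seg in pattern:
        if seg in deduplicated or seg not in out:
            out.append(seg)
    return out


log = PatternLog()
observed = ["hdr", "idx", "s1", "s3"]
ctype = best_partial_match(observed)
if ctype is not None:
    log.record(ctype, observed)
    log.record(ctype, observed)

layout = layout_with_duplicates(["hdr", "x", "s1", "x", "s2"], {"x"})
print(ctype, log.freq, layout)
```

Storing the deduplicated segment `x` twice trades storage space for sequential reads, which matches the claims' step of storing a duplicate copy of a deduplicated segment in the second arrangement.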

Assignees

Dell Products Lp

Inventors

Classifications

  • G06F3/0622 · Primary

    in relation to access · CPC title

  • Plurality of storage devices · CPC title

  • De-duplication techniques · CPC title

  • User address space allocation, e.g. contiguous or non contiguous base addressing · CPC title

  • where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9250819B2 cover?
Mechanisms are provided for optimizing random access in a storage system. According to various embodiments, an access pattern may be identified for a plurality of data segments stored in a first arrangement on a storage medium. Each of the plurality of data segments may be stored at a respective first storage location on the storage medium in the first arrangement. The access pattern may indica…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F3/0622. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 2, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).