System and method for cache replacement using access-ordering lookahead approach

US9684469B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-9684469-B1
Application numberUS-201213460711-A
CountryUS
Kind codeB1
Filing dateApr 30, 2012
Priority dateMar 12, 2012
Publication dateJun 20, 2017
Grant dateJun 20, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data objects of a file are cached in a cache memory of a storage system. An access sequence of the cached data objects is determined based on metadata of the file. In response to a request for cache space reclamation, a data object is evicted from the cache memory whose next access is a farthest amongst the cached data objects based on the access sequence of the data objects.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for cache management of a storage system, the method comprising: in response to a request for accessing a file stored in a storage system, identifying and retrieving a plurality of data objects associated with the file from a storage device of the storage system, wherein the data objects constitute at least a portion of the file; caching the plurality of data objects of the file in a cache memory of the storage system; determining an access sequence of the cached data objects within the file based on metadata of the file, wherein the access sequence represents a sequential order in time of accessing the cached data objects within the file, wherein the data objects are deduplicated data objects within the file, and wherein at least one of the data objects appears more than once within the file; in response to a request for cache space reclamation, identifying one or more of the cached data objects whose next access is a farthest in time from a data object currently being accessed amongst the cached data objects within the file based on the access sequence of the data objects; and evicting the identified one or more data objects from the cache memory whose next access is a farthest amongst the cached data objects within the file, including accessing an eviction candidate data structure to identify an eviction candidate, wherein the eviction candidate data structure comprises a Max Heap data structure, and wherein a next data object to be evicted from the cache memory is identified by popping an entry having a highest sequence number from the Max Heap data structure. 2. The method of claim 1 , wherein determining the access sequence of the cached data objects is performed based on a sequential access pattern of the file. 3. The method of claim 1 , further comprising predicting an access pattern of the file, wherein determining the access sequence of the cached data objects is performed based on the predicted access pattern of the file. 4. The method of claim 1 , wherein the plurality of data objects are deduplicated data objects that are stored in a storage device of the storage system. 5. The method of claim 1 , further comprising: generating access information for each data object based on the metadata of the file, the access information including sequential access order of the data object within the file; and storing the access information in an access list associated with the data object, the access list including a list of sequence numbers, each sequence number representing a logical access time of the data object within the file. 6. The method of claim 5 , further comprising: in response to an access of a first data object, identifying a next sequence number from a first access list associated with the first data object, the next sequence number representing a next logical time at which the first data object is accessed again within the file; and storing the next sequence number in an entry associated with the first data object of the eviction candidate data structure. 7. The method of claim 6 , wherein the eviction candidate data structure includes one or more entries, each entry corresponding to one of the data objects. 8. The method of claim 6 , wherein evicting a data object from the cache memory comprises evicting a data object associated with an entry of the eviction candidate data structure that has a highest sequence number. 9. The method of claim 6 , wherein storing the next sequence number in an entry associated with the first data object of an eviction candidate data structure comprises: obtaining an index of the entry associated with the first data object by hashing a fingerprint of the first data object using a predetermined hash function; and storing the next sequence number in an entry of the eviction candidate data structure identified by the index. 10. The method of claim 1 , further comprising: generating access information for each data object based on the metadata of the file, the access information including sequential access order of the data object within the file; and storing the access information as an access list in an access information file, the access list including a list of object references and sequence numbers, each sequence number representing a logical access time of the data object subsequently occurring within the file. 11. The method of claim 10 , further comprising storing the access information with the metadata for the file, such that the access information is available when the file is accessed for reading. 12. A non-transitory computer-readable storage medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations for cache management of a storage system, the operations comprising: in response to a request for accessing a file stored in a storage system, identifying and retrieving a plurality of data objects associated with the file from a storage device of the storage system, wherein the data objects constitute at least a portion of the file; caching the plurality of data objects of the file in a cache memory of the storage system; determining an access sequence of the cached data objects within the file based on metadata of the file, wherein the access sequence represents a sequential order in time of accessing the cached data objects within the file, wherein the data objects are deduplicated data objects within the file, and wherein at least one of the data objects appears more than once within the file; in response to a request for cache space reclamation, identifying one or more of the cached data objects whose next access is a farthest in time from a data object currently being accessed amongst the cached data objects within the file based on the access sequence of the data objects; and evicting the identified one or more data objects from the cache memory whose next access is a farthest amongst the cached data objects within the file, including accessing an eviction candidate data structure to identify an eviction candidate, wherein the eviction candidate data structure comprises a Max Heap data structure, and wherein a next data object to be evicted from the cache memory is identified by popping an entry having a highest sequence number from the Max Heap data structure. 13. The non-transitory computer-readable storage medium of claim 12 , wherein determining the access sequence of the cached data objects is performed based on a sequential access pattern of the file. 14. The non-transitory computer-readable storage medium of claim 12 , wherein the operations further comprise predicting an access pattern of the file, and wherein determining the access sequence of the cached data objects is performed based on the predicted access pattern of the file. 15. The non-transitory computer-readable storage medium of claim 12 , wherein the plurality of data objects are deduplicated data objects that are stored in a storage device of the storage system. 16. The non-transitory computer-readable storage medium of claim 12 , wherein the operations further comprise: generating access information for each data object based on the metadata of the file, the access information including sequential access order of the data object within the file; and storing the access information in an access list associated with the data object, the access list including a list of sequence numbers, each sequence number representing a logical access time of the data object within the file. 17. The non-transitory computer-readable stora

Assignees

Inventors

Classifications

  • Generational garbage collection · CPC title

  • for peripheral storage systems, e.g. disk cache · CPC title

  • G06F3/0641Primary

    De-duplication techniques · CPC title

  • Allocation or management of cache space · CPC title

  • using de-duplication of the data · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9684469B1 cover?
Data objects of a file are cached in a cache memory of a storage system. An access sequence of the cached data objects is determined based on metadata of the file. In response to a request for cache space reclamation, a data object is evicted from the cache memory whose next access is a farthest amongst the cached data objects based on the access sequence of the data objects.
Who is the assignee on this patent?
Douglis Frederick, Hsu Windsor W, Qian Hangwei, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jun 20 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).