Techniques for memory de-duplication in a virtual system
US-9311250-B2 · Apr 12, 2016 · US
US9824018B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9824018-B2 |
| Application number | US-201514834157-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 24, 2015 |
| Priority date | Jan 27, 2012 |
| Publication date | Nov 21, 2017 |
| Grant date | Nov 21, 2017 |
Official abstract text for this publication.
A de-duplication cache is configured to cache data for access by a plurality of different storage clients, such as virtual machines. A virtual machine may comprise a virtual machine de-duplication module configured to identify data for admission into the de-duplication cache. Data admitted into the de-duplication cache may be accessible by two or more storage clients. Metadata pertaining to the contents of the de-duplication cache may be persisted and/or transferred with respective storage clients such that the storage clients may access the contents of the de-duplication cache after rebooting, being power cycled, and/or being transferred between hosts.
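As a rough illustration of the abstract, the Python sketch below models a de-duplication cache shared by several storage clients. The class name `DedupCache`, its methods, the client names, the use of SHA-256 as the data identifier, and the rule that a client may only read data it has admitted are all assumptions made for illustration; the patent text does not prescribe any of them.

```python
import hashlib


class DedupCache:
    """Hypothetical host-side cache shared by several storage clients (e.g. VMs)."""

    def __init__(self):
        self._blocks = {}   # data identifier -> cached bytes (one copy per identifier)
        self._clients = {}  # data identifier -> names of clients that admitted it

    def admit(self, client, data):
        """Admit data on behalf of a client and return its data identifier."""
        data_id = hashlib.sha256(data).hexdigest()  # assumed identifier scheme
        self._blocks.setdefault(data_id, data)      # identical data is stored once
        self._clients.setdefault(data_id, set()).add(client)
        return data_id

    def read(self, client, data_id):
        """Return cached data if this client admitted it, else None (assumed policy)."""
        if client in self._clients.get(data_id, ()):
            return self._blocks[data_id]
        return None

    def stored_copies(self):
        """Number of distinct data items held by the cache."""
        return len(self._blocks)


cache = DedupCache()
id_a = cache.admit("vm-a", b"shared library bytes")
id_b = cache.admit("vm-b", b"shared library bytes")
print(id_a == id_b, cache.stored_copies())
```

Two clients admitting identical data receive the same identifier, and the cache holds a single copy, which is the space saving the abstract describes.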
Opening claim text (preview).
We claim:
1. An apparatus, comprising: a driver configured to monitor requests within an input/output (I/O) stack of a virtual machine; and a cache manager configured for operation within the virtual machine, the cache manager to service a first request, of the monitored requests, using a de-duplication cache in response to associating the first request with a data identifier in cache metadata maintained within the virtual machine, the data identifier corresponding to data admitted into the de-duplication cache by the virtual machine, wherein: to service the first request, the cache manager sends the data identifier from the virtual machine to the de-duplication cache and the cache manager comprises one or more of a circuit, programmable logic, firmware, and instructions stored on a non-transitory storage medium.
2. The apparatus of claim 1, wherein, to admit a file into the de-duplication cache from the virtual machine, the cache manager is configured to: derive a data identifier from data of the file; and send an admission request from the virtual machine to the de-duplication cache, the admission request comprising the derived data identifier.
3. The apparatus of claim 2, wherein the cache manager is further configured to record an association between the file and the derived data identifier in the cache metadata maintained within the virtual machine in response to the data of the file being admitted into the de-duplication cache.
4. The apparatus of claim 2, wherein the cache manager is configured to derive the data identifier by one or more of: hashing, digesting, and computing a signature of the data of the file.
5. The apparatus of claim 2, wherein the de-duplication cache is configured to admit the file into the de-duplication cache in response to determining that the file is not associated with a data identifier in the cache metadata of the virtual machine.
6. The apparatus of claim 2, wherein the cache manager is configured to admit the file into the de-duplication cache in response to determining that the file satisfies a de-duplication policy.
7. The apparatus of claim 6, wherein determining that the file satisfies the de-duplication policy comprises comparing one or more of a name, an extension, a path, a volume, an attribute, and a hint associated to a file selection criterion.
8. The apparatus of claim 2, wherein: to admit the file into the de-duplication cache, the cache manager is further configured to access the file data by use of the I/O stack of the virtual machine; and the admission request comprises the file data.
9. An apparatus, comprising: a de-duplication manager configured for operation within a virtual machine hosted on a computing device, the de-duplication manager to identify I/O requests of the virtual machine pertaining to files that qualify for admission into a de-duplication cache shared by two or more virtual machines hosted on the computing device; and a de-duplication cache interface configured for operation within the virtual machine, the de-duplication cache interface to service the identified I/O requests using the de-duplication cache, the de-duplication manager comprising one or more of a circuit, programmable logic, and instructions stored on a non-transitory storage medium.
10. The apparatus of claim 9, wherein the de-duplication manager is configured to admit a file into the de-duplication cache by deriving a data identifier from data of the file at the virtual machine, and providing the data of the file and the derived data identifier to the de-duplication cache by use of the de-duplication cache interface.
11. The apparatus of claim 10, wherein: the de-duplication manager is configured to admit the file into the de-duplication cache in response to an I/O request pertaining to the file; and operations to admit the file into the de-duplication cache are performed on a separate thread from a thread performing operations to service the I/O request.
12. The apparatus of claim 9, wherein: the de-duplication manager is configured to associate names of files admitted into the de-duplication cache with respective data identifiers derived from data of the files in a de-duplication index maintained within the virtual machine; and the de-duplication manager is further configured to request data of files admitted into the de-duplication cache by use of the data identifiers associated with the files in the de-duplication index.
13. The apparatus of claim 12, wherein the de-duplication manager is configured to remove an association between a particular file and a data identifier from the de-duplication index in response to detecting an I/O request to modify the particular file.
14. The apparatus of claim 12, wherein the de-duplication manager is configured to write the de-duplication index to persistent storage and to load the de-duplication index into memory of the virtual machine from the persistent storage in response to one or more of restarting the virtual machine, rebooting the virtual machine, power cycling the virtual machine, and migrating the virtual machine to a different host.
15. The apparatus of claim 9, wherein the de-duplication manager identifies the I/O requests pertaining to files that qualify for admission into the de-duplication cache by use of file selection criteria, the file selection criteria based on one or more of a file name, a file extension, a file path, a file volume, a file attribute, and a hint.
16. A method, comprising: maintaining de-duplication metadata within a virtual machine operating on a host computing device, the de-duplication metadata to associate files of the virtual machine with respective data identifiers, the data identifiers derived from file data admitted into a de-duplication cache by the virtual machine; and servicing a request to read a particular file of the virtual machine by use of the de-duplication cache, wherein servicing the read request at the virtual machine comprises: using the de-duplication metadata maintained within the virtual machine to determine a data identifier associated with the particular file, the determined data identifier derived from file data admitted into the de-duplication cache by the virtual machine, and requesting the file data from the de-duplication cache by use of the determined data identifier.
17. The method of claim 16, further comprising: determining that a file identifier corresponding to a specified file of the virtual machine is not associated with a data identifier by the de-duplication metadata maintained within the virtual machine; receiving file data corresponding to the specified file by use of a storage stack of the virtual machine; calculating a data identifier from the received file data; instructing the de-duplication cache to admit the received file data; and recording an association between the file identifier corresponding to the specified file and the calculated data identifier in the de-duplication metadata maintained within the virtual machine.
18. The method of claim 17, wherein the received file data is admitted into a first de-duplication cache operating on a first host computing device, the method further comprising: retaining the association between the file identifier of the specified file and the calculated data identifier in the de-duplication metadata maintained within the virtual machine in response to the virtual machine migrating from the first host computing device to operate on a second host computing device.
19. The method of claim 18, f
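
The read, admit, invalidate, and persist steps recited in claims 12 to 14, 16, and 17 can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the class `VmDedupIndex`, its method names, the plain dictionaries standing in for the shared cache and the VM storage stack, SHA-256 as the data identifier, and JSON as the persistence format are all hypothetical and are not taken from the patent.

```python
import hashlib
import json


class VmDedupIndex:
    """Hypothetical VM-side metadata mapping file names to data identifiers."""

    def __init__(self, cache, storage):
        self.cache = cache      # stand-in for the shared cache: data identifier -> bytes
        self.storage = storage  # stand-in for the VM storage stack: file name -> bytes
        self.index = {}         # de-duplication index: file name -> data identifier

    def read(self, name):
        """Service a read from the cache when the index knows the file."""
        data_id = self.index.get(name)
        if data_id is not None and data_id in self.cache:
            return self.cache[data_id]
        # No usable association: read through the storage stack, then admit.
        data = self.storage[name]
        data_id = hashlib.sha256(data).hexdigest()  # assumed identifier scheme
        self.cache.setdefault(data_id, data)
        self.index[name] = data_id
        return data

    def write(self, name, data):
        """Modify a file and drop its association, as in claim 13."""
        self.storage[name] = data
        self.index.pop(name, None)

    def dumps(self):
        """Serialize the index so it can be written to persistent storage."""
        return json.dumps(self.index)

    def loads(self, text):
        """Reload the index, e.g. after a reboot or migration."""
        self.index = json.loads(text)


shared_cache = {}
vm = VmDedupIndex(shared_cache, {"/bin/tool": b"v1"})
vm.read("/bin/tool")       # first read admits the file data
saved = vm.dumps()         # persist the index before a restart

restarted = VmDedupIndex(shared_cache, {"/bin/tool": b"v1"})
restarted.loads(saved)     # the index survives the restart
```

Because the index is reloaded, the restarted instance can request the file data by identifier straight away instead of re-admitting it. In this sketch a migration to a host whose cache lacks the identifier simply falls back to the storage stack and re-admits the data; how the patented system handles that case is not shown in the truncated claims.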
Replacement control · CPC title
Management of files · CPC title
Saving storage space on storage systems · CPC title
Single storage device · CPC title
with a shared cache · CPC title