Virtual machine storage system for duplication avoidance

US12346717B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12346717-B2
Application number  US-202217727350-A
Country             US
Kind code           B2
Filing date         Apr 22, 2022
Priority date       Apr 22, 2022
Publication date    Jul 1, 2025
Grant date          Jul 1, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for duplication avoidance are disclosed. In one implementation, a VM can receive a request to perform a file access operation with respect to a file and determine a hash value corresponding to a content of the file. The VM can search the file identified by the hash value in a host file system. Responsive to failing to find the hash value in the host file system, the VM can search the hash value in a guest file system of the VM and responsive to finding the file identified by the hash value in the guest file system, can perform the file access operation with respect to the file.
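
The lookup order described in the abstract (host file system first, guest file system as fallback) can be illustrated with a minimal Python sketch. This is not the patented implementation: the use of SHA-256, the plain dictionaries standing in for the hash-to-content (HTC) tables, and the names `content_hash` and `locate` are all assumptions made for illustration.

```python
import hashlib
from typing import Optional

# Hypothetical stand-in for an HTC table: hash value -> content location.
HashToContent = dict[str, str]


def content_hash(data: bytes) -> str:
    """Hash the file content. SHA-256 is an assumption; the source names no algorithm."""
    return hashlib.sha256(data).hexdigest()


def locate(
    hash_value: str, host_htc: HashToContent, guest_htc: HashToContent
) -> Optional[tuple[str, str]]:
    """Return (file system, content location), searching the host table first."""
    location = host_htc.get(hash_value)
    if location is not None:
        return ("host", location)
    # Fall back to the guest table only after the host search fails.
    location = guest_htc.get(hash_value)
    if location is not None:
        return ("guest", location)
    return None
```

In this sketch a hit in the host table short-circuits the guest search, so content present in both tables resolves to the host copy.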

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, by a virtual machine (VM), a request to perform a file access operation with respect to a file; determining, by the VM, a hash value corresponding to a content of the file; searching, in a host file system of a host computer system, the file identified by the hash value by searching for the hash value in a first hash-to-content (HTC) table stored in the host file system; responsive to failing to find file identified by the hash value in the host file system, searching, in a guest file system of the VM, the file identified by the hash value by searching for the hash value in a second HTC table stored in the guest file system, wherein each HTC table comprises a plurality of records, and each of the plurality of records maps a particular hash value to a corresponding content location; and responsive to finding the file identified by the hash value in the guest file system, performing the file access operation with respect to the file.

2. The method of claim 1, further comprising: responsive to finding the file identified by the hash value in the host file system, performing the file access operation with respect to the file.

3. The method of claim 1, wherein determining, by the VM, the hash value corresponding to the content of the file includes referencing a guest identifier-to-hash (ITH) table stored in the guest file system of the VM, wherein the ITH table comprises a plurality of records, each record mapping an identifier of the file to a corresponding hash value.

4. The method of claim 1, further comprising: storing a plurality of files in the host file system on a physical storage device of the host computer system; and indexing the plurality of files by a respective content hash of a content of each file of the plurality of files.

5. The method of claim 1, further comprising: providing, by a hypervisor, access for the VM to a portion of the host file system, wherein the portion of the host file system comprises at least one ITH table and at least one HTC table.

6. The method of claim 1, further comprising: adding an entry in an HTC table stored in the host file system for each new file to which more than one VM has access.

7. A system comprising: a memory; a processing device operatively coupled to the memory, the processing device configured to: receive, via a virtual machine (VM), a request to perform a file access operation with respect to a file; determine a hash value corresponding to a content of the file; search, in a host file system of a host computer system, the file identified by the hash value by searching for the hash value in a first hash-to-content (HTC) table stored in the host file system; responsive to failing to find file identified by the hash value in the host file system, search, in a guest file system of the VM, the file identified by the hash value by searching for the hash value in a second HTC table stored in the guest file system, wherein each HTC table comprises a plurality of records, and each of the plurality of records maps a particular hash value to a corresponding content location; and responsive to finding the file identified by the hash value in the guest file system perform the file access operation with respect to the file.

8. The system of claim 7, wherein the processing device is further to: responsive to finding the file identified by the hash value in the host file system, perform the file access operation with respect to the file.

9. The system of claim 7, wherein determining the hash value corresponding to the content of the file includes referencing a guest identifier-to-hash (ITH) table stored in the guest file system of the VM, wherein the ITH table comprises a plurality of records, each record mapping an identifier of the file to a corresponding hash value.

10. The system of claim 7, wherein the processing device is further to: store a plurality of files in the host file system on a physical storage device of the host computer system; and index the plurality of files by a respective content hash of a content of each file of the plurality of files.

11. The system of claim 7, wherein the processing device is further to: provide, via a hypervisor, access for the VM to a portion of the host file system, wherein the portion of the host file system comprises at least one ITH table and at least one HTC table.

12. The system of claim 7, wherein the processing device is further to: add an entry in an first HTC table stored in the host file system for each new file to which more than one VM has access.

13. A non-transitory computer-readable media storing instructions that, when executed, cause a processing device to: receive, via a virtual machine (VM), a request to perform a file access operation with respect to a file; determine a hash value to a content of the file; search, in a host file system of a host computer system, the file identified by the hash value by searching for the hash value in a first hash-to-content (HTC) table stored in the host file system; responsive to failing to find the file identified by the hash value in the host file system, search, in a guest file system of the VM, the file identified by the hash value by searching for the hash value in a second HTC table stored in the guest file system, wherein each HTC table comprises a plurality of records, and each of the plurality of records maps a particular hash value to a corresponding content location; and responsive to finding the file identified by the hash value in the guest file system perform the file access operation with respect to the file.

14. The non-transitory computer-readable media of claim 13, wherein the instructions further cause the processing device to: responsive to finding the file identified by the hash value in the host file system, perform the file access operation with respect to the file.

15. The non-transitory computer-readable media of claim 13, wherein determining the hash value corresponding to the content of the file includes referencing a guest identifier-to-hash (ITH) table stored in the guest file system of the VM, wherein the ITH table comprises a plurality of records, each record mapping an identifier of the file to a corresponding hash value.

16. The non-transitory computer-readable media of claim 13, wherein the instructions further cause the processing device to: provide, via a hypervisor, access for the VM to a portion of the host file system, wherein the portion of the host file system comprises at least one ITH table and at least one HTC table.

17. The non-transitory computer-readable media of claim 13, wherein the instructions further cause the processing device to: store a plurality of files in the host file system on a physical storage device of the host computer system; index the plurality of files by a respective content hash of a content of each file of the plurality of files; and add an entry in a first HTC table stored in the host file system for each new file to which more than one VM has access.
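
The dependent claims mention two kinds of tables: an identifier-to-hash (ITH) table mapping a file identifier to a hash value, and an HTC table mapping a hash value to a content location, with host HTC entries added for files that more than one VM can access. The following Python sketch shows one way such bookkeeping could avoid storing duplicate content. The `Store` class, the `add_file` function, the `shared` flag, the `blob/...` location format, and the choice of SHA-256 are all hypothetical and do not appear in the patent text.

```python
import hashlib


class Store:
    """Hypothetical pair of tables kept by one file system (host or guest)."""

    def __init__(self) -> None:
        self.ith: dict[str, str] = {}  # file identifier -> hash value
        self.htc: dict[str, str] = {}  # hash value -> content location


def add_file(guest: Store, host: Store, file_id: str, data: bytes, shared: bool) -> str:
    """Record a new file for a guest; shared content is indexed in the host HTC table."""
    digest = hashlib.sha256(data).hexdigest()  # assumed hash function
    # The guest always keeps its own identifier-to-hash record.
    guest.ith[file_id] = digest
    # 'shared' models "a file to which more than one VM has access".
    target = host if shared else guest
    # setdefault leaves an existing entry untouched, so identical content
    # is registered only once no matter how many identifiers point at it.
    target.htc.setdefault(digest, f"blob/{digest[:12]}")
    return digest
```

With this layout, two guests that add byte-identical shared files end up with separate ITH records but a single host HTC entry, which is the duplication-avoidance effect the title refers to.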

Assignees

Red Hat Inc

Inventors

Classifications

  • using file content signatures, e.g. hash values · CPC title

  • Hash-based (content-based indexing of textual data G06F16/31) · CPC title

  • Memory management, e.g. access or allocation · CPC title

  • Hypervisor-specific management and integration aspects · CPC title

  • G06F16/188 · Primary

    Virtual file systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12346717B2 cover?
Systems and methods for duplication avoidance are disclosed. In one implementation, a VM can receive a request to perform a file access operation with respect to a file and determine a hash value corresponding to a content of the file. The VM can search the file identified by the hash value in a host file system. Responsive to failing to find the hash value in the host file system, the VM ca…
Who is the assignee on this patent?
Red Hat Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/45558. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 1, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).