What technology area does this patent fall under?

Primary CPC classification G06F16/9014. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Jun 09 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for data backup using data binning and deduplication

US10678654B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10678654-B2
Application number	US-201715793188-A
Country	US
Kind code	B2
Filing date	Oct 25, 2017
Priority date	Oct 26, 2016
Publication date	Jun 9, 2020
Grant date	Jun 9, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are methods and systems for performing data backup which implement data binning using log-structured merge (LSM) trees during deduplication. An exemplary method includes: calculating a reduced hash value (RHV) associated with each of a plurality of data blocks; partitioning the plurality of reduced hash values into groups; selecting a representative hash value for each group; determining whether the representative hash value occurs in a first LSM tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the RHVs in the representative hash value's group into volatile memory; comparing each of the RHVs to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, which references a data block in a segment store.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer-implemented method for data backup, comprising: calculating, by a processor, a reduced hash value associated with each of a plurality of data blocks; partitioning, by the processor, the plurality of reduced hash values into groups; selecting, by the processor, a representative hash value for each group; determining, by the processor, whether the representative hash value occurs in a first log-structured merge (LSM) tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the reduced hash values in the representative hash value's group into the volatile memory; comparing each of the reduced hash values to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, the segment ID referencing a data block in a segment store. 2. The method of claim 1 , wherein the representative hash value for each group is selected from the plurality of reduced hash values assigned to the group. 3. The method of claim 1 , wherein the segment ID is generated by incrementing a previous segment ID. 4. The method of claim 1 , further comprising: identifying the segment ID based on the determination that the representative hash value occurs in the first LSM tree, wherein the one or more reduced hash values in the second LSM tree are each associated with the segment ID. 5. The method of claim 1 , wherein calculating the reduced hash value comprises selecting a plurality of bits at a beginning of a first hash value as the reduced hash value. 6. The method of claim 1 , wherein the second LSM tree is stored in non-volatile memory. 7. The method of claim 1 , further comprising: when the representative hash value does not occur in the first LSM tree, storing each of the plurality of data blocks in the representative hash value's group in the segment store. 8. A system for data backup, comprising: one or more hardware or virtual processors configured to: calculate, for each of a plurality of data blocks, a reduced hash value associated with each of the data blocks; partition, by the processor, the plurality of reduced hash values into groups; select, by the processor, a representative hash value for each group; determine, by the processor, whether the representative hash value occurs in a first log-structured merge (LSM) tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: load the reduced hash values in the representative hash value's group into the volatile memory; compare each of the reduced hash values to one or more hash values in a second LSM tree to identify a matching hash value; and write a segment identifier (ID) corresponding to the matching hash value in an archive, the segment ID referencing a data block in a segment store. 9. The system of claim 8 , wherein the representative hash value for each group is selected from the plurality of reduced hash values assigned to the group. 10. The system of claim 8 , wherein the segment ID is generated by incrementing a previous segment ID. 11. The system of claim 8 , wherein the processor is further configured to: identify the segment ID based on the determination that the representative hash value occurs in the first LSM tree, wherein the one or more reduced hash values in the second LSM tree are each associated with the segment ID. 12. The system of claim 8 , wherein calculating the reduced hash value comprises selecting a plurality of bits at a beginning of a first hash value as the reduced hash value. 13. The system of claim 8 , wherein the second LSM tree is stored in non-volatile memory. 14. The system of claim 8 , wherein the processor is further configured to: when the representative hash value does not occur in the first LSM tree, store each of the plurality of data blocks in the representative hash value's group in the segment store. 15. A non-transitory computer readable medium comprising computer executable instructions for data backup, including instructions for: calculating, for each of a plurality of data blocks, a reduced hash value associated with each of the data blocks; partitioning, by the processor, the plurality of reduced hash values into groups; selecting, by the processor, a representative hash value for each group; determining, by the processor, whether the representative hash value occurs in a first log-structured merge (LSM) tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the reduced hash values in the representative hash value's group into the volatile memory; comparing each of the reduced hash values to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, the segment ID referencing a data block in a segment store. 16. The non-transitory computer readable medium of claim 15 , wherein the representative hash value for each group is selected from the plurality of reduced hash values assigned to the group. 17. The non-transitory computer readable medium of claim 15 , wherein the segment ID is generated by incrementing a previous segment ID. 18. The non-transitory computer readable medium of claim 15 , further comprising instructions for: identifying the segment ID based on the determination that the representative hash value occurs in the first LSM tree, wherein the one or more reduced hash values in the second LSM tree are each associated with the segment ID. 19. The non-transitory computer readable medium of claim 15 , wherein calculating the reduced hash value comprises selecting a plurality of bits at a beginning of a first hash value as the reduced hash value. 20. The non-transitory computer readable medium of claim 15 , wherein the second LSM tree is stored in non-volatile memory.

Assignees

Acronis Int Gmbh

Inventors

Classifications

G06F16/22
Indexing; Data structures therefor; Storage structures · CPC title
G06F2201/84
Using snapshots, i.e. a logical point-in-time copy of the data · CPC title
G06F16/27
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
G06F16/9014Primary
hash tables · CPC title
G06F11/1453Primary
using de-duplication of the data · CPC title

Patent family

Related publications grouped by family.

View patent family 61969635

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10678654B2 cover?: Disclosed are methods and systems for performing data backup which implement data binning using log-structured merge (LSM) trees during deduplication. An exemplary method includes: calculating a reduced hash value (RHV) associated with each of a plurality of data blocks; partitioning the plurality of reduced hash values into groups; selecting a representative hash value for each group; determin…
Who is the assignee on this patent?: Acronis Int Gmbh
What technology area does this patent fall under?: Primary CPC classification G06F16/9014. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Jun 09 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Method and apparatus for content derived data placement in memory

Joint de-duplication-erasure coded distributed storage

System and method for data deduplication using log-structured merge trees

File granular data de-duplication effectiveness metric for data de-duplication

Frequently asked questions