US9418133B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9418133-B2 |
| Application number | US-201414466745-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 22, 2014 |
| Priority date | Nov 14, 2008 |
| Publication date | Aug 16, 2016 |
| Grant date | Aug 16, 2016 |
Official abstract text for this publication.
Data replication with delta compression is disclosed. A primary system and a replica system are determined to both have an identical first data segment that is similar to a second data segment. The second data segment is encoded, wherein the encoding refers to the first data segment.
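The encoding step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the helper names (`fingerprint`, `delta_encode`, `delta_decode`), the copy/insert operation format, the use of SHA-256 as the reference to the shared segment, and the use of `difflib.SequenceMatcher` to find the difference are all assumptions chosen for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def fingerprint(segment: bytes) -> str:
    """Hypothetical reference to a segment: its SHA-256 hex digest."""
    return hashlib.sha256(segment).hexdigest()


def delta_encode(base: bytes, target: bytes) -> dict:
    """Encode `target` as a reference to `base` plus the bytes that differ."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes present in both segments: point into the base segment.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes only in the target: carry them literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no operation; those base bytes are simply skipped.
    return {"base_ref": fingerprint(base), "ops": ops}


def delta_decode(encoding: dict, store: dict) -> bytes:
    """Rebuild the target on the replica from its own copy of the base."""
    base = store[encoding["base_ref"]]
    out = bytearray()
    for op in encoding["ops"]:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Both systems already hold the first segment; only the delta is sent.
first = b"the quick brown fox jumps over the lazy dog"
second = b"the quick brown cat jumps over the lazy dog!"
replica_store = {fingerprint(first): first}
encoding = delta_encode(first, second)
restored = delta_decode(encoding, replica_store)
```

Decoding works only because the replica holds an identical copy of the first segment, which is the precondition the abstract states.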
Opening claim text (preview).
What is claimed is:

1. A system for processing data, comprising: one or more processors configured to: store a first data stream or a first data block in a primary system using a first plurality of segments, wherein the first plurality of segments includes a first data segment; select a second data segment on the primary system for replication; determine that the first data segment is similar to the second data segment using a sketch function; cause storing of a second data stream or a second data block in a replica system using a second plurality of segments, wherein the second plurality of segments includes a first data segment copy of the first data segment; determine an encoding of the second data segment, wherein the determining of the encoding of the second data segment comprises determining a difference between the first data segment and the second data segment, wherein the encoded second data segment comprises the determined difference between the first data segment and the second data segment and a reference to the first data segment copy of the first data segment; compare a size of the encoding of the second data segment with an original size of the second data segment; in the event that a difference in the size of the encoding of the second data segment and the original size of the second data segment is greater than or equal to a threshold, transmit the encoding of the second data segment to the replica system from the primary system, wherein the encoding of the second data segment is decoded for storage in the replica system; and in the event that the difference in the size of the encoding of the second data segment and the original size of the second data segment is less than a threshold, transmit the second data segment to the replica system for storage from the primary system; and one or more memories coupled to the one or more processors and configured to provide the one or more processors with instructions.
2. The system as in claim 1, wherein the encoding of the second data segment is compressed prior to transmitting.
3. The system as in claim 2, wherein the encoding of the second data segment comprises an indication of a set of data blocks in the second data segment not present in the first data segment and an indication of a set of data blocks present in both data segments.
4. The system as in claim 1, wherein the replica system decodes the encoding of the second data segment.
5. The system as in claim 1, wherein the replica system stores the encoding of the second data segment.
6. The system as in claim 1, wherein the replica system stores a decoding of the encoding of the second data segment.
7. The system as in claim 1, wherein the sketch function comprises a hash function.
8. The system as in claim 1, wherein the sketch function comprises a plurality of hash functions.
9. The system as in claim 1, wherein the sketch function comprises one or more functions that return a same value for similar data segments.
10. The system as in claim 1, wherein the sketch function comprises one or more functions that return a similar value for similar data segments.
11. The system as in claim 1, wherein the sketch function comprises one or more functions that may return a same value for similar data segments.
12. The system as in claim 1, wherein the sketch function comprises one or more functions that may return a similar value for similar data segments.
13. The system as in claim 12, wherein sketch function values are determined to be similar based on one or more of the following methods: numeric difference, hamming distance, locality-sensitive-hashing, or nearest-neighbor-search.
14. The system as in claim 1, wherein the first data segment is identified based at least in part on one or more of the following: temporal locality, spatial locality, ease of access, expected compression, or frequency of selection for other compressed segments.
15. The system as in claim 1, wherein the second data segment is similar to one or more data segments on both the primary and replica systems in addition to the first data segment.
16. The system as in claim 15, wherein the encoding of the second data segment is based at least in part on the first data segment and the one or more additional similar data segments.
17. The system as in claim 1, wherein the second data segment was stored as an encoding of a third data segment.
18. A method for processing data comprising: storing a first data stream or a first data block in a primary system using a first plurality of segments, wherein the first plurality of segments includes a first data segment; selecting a second data segment on the primary system for replication; determining that the first data segment is similar to the second data segment using a sketch function; causing storing of a second data stream or a second data block in a replica system using a second plurality of segments, wherein the second plurality of segments includes a first data segment copy of the first data segment; determining, using a processor, an encoding of the second data segment, wherein the determining of the encoding of the second data segment comprises determining a difference between the first data segment and the second data segment, wherein the encoded second data segment comprises the determined difference between the first data segment and the second data segment and a reference to the first data segment copy of the first data segment; comparing a size of the encoding of the second data segment with an original size of the second data segment; in the event that a difference in the size of the encoding of the second data segment and the original size of the second data segment is greater than or equal to a threshold, transmitting the encoding of the second data segment to the replica system for storage from the primary system, wherein the encoding of the second data segment is decoded for storage in the replica system; and in the event that the difference in the size of the encoding of the second data segment and the original size of the second data segment is less than a threshold, transmitting the second data segment to the replica system for storage from the primary system.
19. A computer program product for processing data, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: storing a first data stream or a first data block in a primary system using a first plurality of segments, wherein the first plurality of segments includes a first data segment; selecting a second data segment on the primary system for replication; determining that the first data segment is similar to the second data segment using a sketch function; causing storing of a second data stream or a second data block in a replica system using a second plurality of segments, wherein the second plurality of segments includes a first data segment copy of the first data segment; determining, using a processor, an encoding of the second data segment, wherein the determining of the encoding of the second data segment comprises determining a difference between the first data segment and the second data segment, wherein the encoded second data segment comprises the determined difference between the first data segment and the second data segment and a reference to the first data segment copy of the first data segment; comparing a size of the encoding of the second data segment with an original size of the second data segment; in
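
Two steps recited in the independent claims, similarity detection with a sketch function and the size comparison against a threshold, can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: the names `sketch`, `similar`, and `choose_payload`, the min-hash-over-byte-windows construction, the use of salted BLAKE2b as the plurality of hash functions, and every default parameter value are hypothetical.

```python
import hashlib


def sketch(segment: bytes, num_features: int = 4, window: int = 8) -> tuple:
    """Assumed sketch function: for each of several salted hash functions,
    keep the minimum hash over all sliding byte windows of the segment."""
    positions = range(max(1, len(segment) - window + 1))
    features = []
    for i in range(num_features):
        salt = i.to_bytes(2, "big")  # a different salt gives a different hash function
        features.append(min(
            hashlib.blake2b(segment[p:p + window], digest_size=8, salt=salt).digest()
            for p in positions
        ))
    return tuple(features)


def similar(a_sketch: tuple, b_sketch: tuple, min_matches: int = 1) -> bool:
    """Treat two segments as similar when enough sketch features are equal."""
    matches = sum(x == y for x, y in zip(a_sketch, b_sketch))
    return matches >= min_matches


def choose_payload(original: bytes, encoded: bytes, threshold: int) -> tuple:
    """Send the encoding only when it saves at least `threshold` bytes;
    otherwise send the original segment unchanged."""
    savings = len(original) - len(encoded)
    if savings >= threshold:
        return ("delta", encoded)
    return ("raw", original)


segment = b"hello world, this is a data segment"
features = sketch(segment)
decision = choose_payload(b"x" * 100, b"y" * 10, threshold=50)
```

With this construction, segments that share most of their byte windows are likely, but not guaranteed, to produce matching features, which fits the "may return a same value" wording of claim 11. The `>=` comparison in `choose_payload` follows the "greater than or equal to a threshold" wording of the claims.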
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
based on delta files · CPC title
for networked environments · CPC title
by selection of backup contents · CPC title
implemented as replicated file system · CPC title