Generating a data stream with a predictable change rate

US10853324B2 · US · B2

Patent metadata
Publication number: US-10853324-B2
Application number: US-201816140390-A
Country: US
Kind code: B2
Filing date: Sep 24, 2018
Priority date: Sep 17, 2014
Publication date: Dec 1, 2020
Grant date: Dec 1, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Generating a data stream with a predictable change rate is disclosed, including: receiving a change rate parameter; and using the change rate parameter to provide a modified data stream that differs from a corresponding unmodified non-deduplicatable data stream by an amount determined based at least in part on the change rate parameter, including by: modifying at least a portion of a plurality of data blocks associated with the non-deduplicatable data stream to obtain a corresponding portion of the modified data stream, wherein a data block of the plurality of data blocks is associated with a block size that is based on a segmenting attribute associated with a storage destination.
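
The idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes fixed-size blocks, a seeded pseudo-random byte stream standing in for the non-deduplicatable stream, and a one-byte change per selected block. The names `random_stream`, `modify_stream`, and `changed_fraction` are hypothetical and do not appear in the patent.

```python
import hashlib
import random


def random_stream(n_bytes: int, seed: int) -> bytes:
    """Stand-in for a non-deduplicatable stream (assumption: seeded PRNG bytes)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n_bytes))


def split_blocks(data: bytes, block_size: int) -> list:
    """Cut the stream into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def modify_stream(data: bytes, change_rate: float, block_size: int) -> bytes:
    """Change one byte in roughly `change_rate` of the blocks, evenly spaced."""
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    out = bytearray(data)
    n_blocks = (len(data) + block_size - 1) // block_size
    n_change = round(n_blocks * change_rate)
    for k in range(n_change):
        block_index = k * n_blocks // n_change  # distinct, evenly spread indices
        out[block_index * block_size] ^= 0xFF   # XOR guarantees the byte differs
    return bytes(out)


def changed_fraction(original: bytes, modified: bytes, block_size: int) -> float:
    """Fraction of modified-stream blocks whose hash is absent from the original."""
    known = {hashlib.sha256(b).digest() for b in split_blocks(original, block_size)}
    blocks = split_blocks(modified, block_size)
    if not blocks:
        return 0.0
    new = sum(1 for b in blocks if hashlib.sha256(b).digest() not in known)
    return new / len(blocks)


if __name__ == "__main__":
    base = random_stream(100 * 4096, seed=1)
    changed = modify_stream(base, change_rate=0.05, block_size=4096)
    print(changed_fraction(base, changed, block_size=4096))
```

In this sketch the block size is a free parameter. The abstract ties it to a segmenting attribute of the storage destination, so a test harness would presumably pick a value matching the block sizes that destination uses.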

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor configured to: store an unmodified non-deduplicatable data stream, wherein the non-deduplicatable data stream comprises a non-compressible data stream; receive a change rate parameter, wherein the change rate parameter indicates an amount by which the unmodified non-deduplicatable data stream is to be modified; and generate a modified data stream that differs from the unmodified non-deduplicatable data stream by the amount indicated by the change rate parameter; comparing the unmodified non-deduplicatable data stream with the modified data stream to identify a set of new data blocks; determine a percentage of the modified data stream to store; determine a deduplication result based on comparing the determined percentage to the change rate parameter; and in response to a determination that the determined percentage does not match the change rate parameter, reconfigure a deduplication technique; and a memory coupled to the processor and configured to provide the processor with instructions.

2. The system of claim 1, wherein the processor is further configured to use an initialization parameter to generate the non-deduplicatable data stream comprising a merge of a first sequence and a second sequence, wherein the first sequence is generated using a first prime number and the initialization parameter and the second sequence is generated using a second prime number and the initialization parameter.

3. The system of claim 2, wherein the first prime number and the second prime number are selected based at least in part on a revision parameter.

4. The system of claim 1, wherein the processor is further configured to select a first constrained prime number and a second constrained prime number.

5. The system of claim 4, wherein the first constrained prime number and the second constrained prime number are selected based at least in part on a revision parameter.

6. The system of claim 1, wherein the change rate parameter comprises one or more of the following: a percentage, a proportion, and a value in between 0 and 1.

7. The system of claim 1, wherein the processor is further configured to modify at least a portion of a plurality of data blocks associated with the non-deduplicatable data stream, wherein a data block of the plurality of data blocks is associated with a block size that is based on a segmenting attribute associated with a storage destination, wherein the segmenting attribute associated with the storage destination comprises a range of block sizes used by the storage destination.

8. The system of claim 7, wherein the block size is determined based at least in part on an average block size of the range of block sizes used by the storage destination.

9. The system of claim 7, wherein to modify the at least portion of the plurality of data blocks associated with the non-deduplicatable data stream, the processor is further configured to change at least one value associated with one location within a data block associated with the non-deduplicatable data stream.

10. The system of claim 7, wherein the processor is further configured to receive a change rate revision parameter corresponding to the change rate parameter and wherein to modify the at least portion of the plurality of data blocks associated with the non-deduplicatable data stream, the processor is further configured to: determine the at least portion of the plurality of data blocks associated with the non-deduplicatable data stream to modify based at least in part on the change rate parameter and the change rate revision parameter; determine a first data block of the at least portion of the plurality of data blocks to corrupt, wherein the first data block comprises a plurality of elements; determine an element of the plurality of elements to corrupt based at least in part on the change rate revision parameter; determine a corruption value based at least in part on the change rate revision parameter; and set the element to the corruption value.

11. The system of claim 10, wherein the non-deduplicatable data stream is configured to be stored and wherein the set of new data blocks of a plurality of data blocks associated with the modified data stream is identified relative to the plurality of data blocks associated with the unmodified non-deduplicatable data stream.

12. The system of claim 11, wherein a new data block from the set of new data blocks is determined to include the corruption value and wherein the change rate parameter and the change rate revision parameter are determined using at least in part a portion of the corruption value.

13. The system of claim 12, wherein the processor is further configured to: receive data associated with the identified set of new data blocks; determine the percentage of the modified data stream to store based at least in part on the identified set of new data blocks and the plurality of data blocks associated with the modified data stream; and determine the deduplication result based at least in part on comparing the percentage of the modified data stream to store to the change rate parameter.

14. The system of claim 11, wherein the processor is further configured to: receive restored data associated with the non-deduplicatable data stream; determine a first prime number based at least in part on a difference between a first pair of non-consecutive values from the restored data associated with the non-deduplicatable data stream; determine a second prime number based at least in part on a difference between a second pair of non-consecutive values from the restored data associated with the non-deduplicatable data stream; and use the first prime number and the second prime number to verify the restored data associated with the non-deduplicatable data stream.

15. A method, comprising: storing an unmodified non-deduplicatable data stream, wherein the non-deduplicatable data stream comprises a non-compressible data stream; receiving a change rate parameter, wherein the change rate parameter indicates an amount by which the unmodified non-deduplicatable data stream is to be modified; and generating, by a processor, a modified data stream that differs from the unmodified non-deduplicatable data stream by the amount indicated by the change rate parameter; comparing the unmodified non-deduplicatable data stream with the modified data stream to identify a set of new data blocks; determining a percentage of the modified data stream to store; determining a deduplication result based on comparing the determined percentage to the change rate parameter; and in response to determining that the determined percentage does not match the change rate parameter, reconfiguring a deduplication technique.

16. The method of claim 15, wherein generating the modified data stream includes changing at least one value associated with one location within a data block associated with the non-deduplicatable data stream.

17. The method of claim 15 further comprising receiving a change rate revision parameter corresponding to the change rate parameter and wherein generating the modified data stream includes: determining the at least portion of a plurality of data blocks associated with the non-deduplicatable data stream to modify based at least in part on the change rate parameter and the change rate revision parameter; determining a first data block of the at least portion of the plurality of data blocks to corrupt, wherein the first data block comprises a plurality of elements; determining an element of
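
Claims 2 and 14 describe a stream built by merging two prime-based sequences and later verified by recovering the primes from differences between non-consecutive values. The sketch below shows one way this could work. It is an illustration only: the claim text shown here does not say how each sequence is computed or how the merge is done, so the arithmetic progression modulo 2^32, the alternating interleave, and the names `sequence`, `merged_stream`, `recover_primes`, and `verify` are all assumptions.

```python
MOD = 2 ** 32  # assumption: values are 32-bit unsigned integers


def sequence(prime: int, seed: int, count: int) -> list:
    """Assumed generator: seed, seed + p, seed + 2p, ... modulo 2**32."""
    return [(seed + i * prime) % MOD for i in range(count)]


def merged_stream(seed: int, prime_a: int, prime_b: int, count: int) -> list:
    """Assumed merge: alternate values from the two sequences (a0, b0, a1, b1, ...)."""
    merged = []
    for x, y in zip(sequence(prime_a, seed, count), sequence(prime_b, seed, count)):
        merged.extend((x, y))
    return merged


def recover_primes(values: list) -> tuple:
    """Positions 0 and 2 come from the first sequence, 1 and 3 from the second,
    so the difference of each non-consecutive pair is one sequence's step."""
    prime_a = (values[2] - values[0]) % MOD
    prime_b = (values[3] - values[1]) % MOD
    return prime_a, prime_b


def verify(values: list) -> bool:
    """Regenerate the stream from the recovered primes and compare.
    Expects an even number of values, at least four."""
    prime_a, prime_b = recover_primes(values)
    expected = merged_stream(values[0], prime_a, prime_b, len(values) // 2)
    return values == expected


if __name__ == "__main__":
    restored = merged_stream(seed=12345, prime_a=1000003, prime_b=999983, count=1000)
    print(recover_primes(restored), verify(restored))
```

Under these assumptions the restored data carries enough information to check itself: no copy of the original stream or its parameters is needed, only the first few values.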

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641)

  • for test design, e.g. generating new test cases

  • using de-duplication of the data

  • Monitoring

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/1748. Mapped technology areas include Physics.
When was this patent published?
Published Dec 1, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).