Systems and methods for executing jump-based content-defined data chunking

US12517869B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12517869-B2
Application number: US-202318477853-A
Country: US
Kind code: B2
Filing date: Sep 29, 2023
Priority date: Sep 29, 2023
Publication date: Jan 6, 2026
Grant date: Jan 6, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are systems and method for executing jump-based content-defined data chunking. In one aspect, a method may generate, for a byte stream of data, a primary window and a secondary window that overlaps with the primary window. A method may scan the byte stream for data chunks using the primary window and the secondary window, wherein the scanning comprises (a) executing, based on minimum values in the secondary window when shifted along the primary window, a jump mechanism in which the primary window and the secondary window are shifted forward by a fixed amount of bytes on the byte stream and (b) marking cut-off points of detected data chunks based on maximum values within and outside the primary window. A method may output the cut-off points of the detected data chunks.
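
The scan summarized in the abstract can be illustrated with a minimal Python sketch. This is one possible reading of the abstract and claims 1 to 5, not the patented implementation. The function name `jump_cdc`, the parameter names, and the default sizes are hypothetical. The sketch assumes that a window's "minimum value" is its smallest byte value, that the secondary window slides one byte at a time, that the marked cut-off byte is the last byte of its chunk, and that scanning restarts immediately after each cut.

```python
def jump_cdc(data: bytes, primary: int = 64, secondary: int = 8,
             jump: int = 16) -> list[int]:
    """Return exclusive end offsets of chunks (hypothetical helper).

    Assumed reading of the abstract: a smaller minimum seen while the
    secondary window slides along the primary window triggers a jump;
    otherwise the first byte ahead of the primary window that exceeds
    the primary window's maximum becomes the cut-off point.
    """
    if not (0 < secondary < primary) or jump <= 0:
        raise ValueError("need 0 < secondary < primary and jump > 0")
    n = len(data)
    cuts: list[int] = []
    chunk_start = 0
    # Tail rule (claim 5): fewer than primary + 1 bytes left -> last chunk.
    while n - chunk_start >= primary + 1:
        win = chunk_start  # start of the primary window
        cut = None
        while win + primary < n:  # at least one byte lies ahead of the window
            first_min = min(data[win:win + secondary])
            jumped = False
            # Slide the secondary window along the primary window.
            for pos in range(win + 1, win + primary - secondary + 1):
                if min(data[pos:pos + secondary]) < first_min:
                    win += jump  # jump: both windows move forward
                    jumped = True
                    break
            if jumped:
                continue
            # No smaller minimum found: look ahead for a byte above the max.
            max_val = max(data[win:win + primary])
            for i in range(win + primary, n):
                if data[i] > max_val:
                    cut = i + 1  # assumed: cut-off byte ends the chunk
                    break
            break
        if cut is None:
            break  # no cut-off found; the rest is the last chunk
        cuts.append(cut)
        chunk_start = cut
    if chunk_start < n:
        cuts.append(n)
    return cuts


# Tiny windows so the trace can be followed by hand.
print(jump_cdc(bytes([5, 6, 7, 8, 1, 9, 0, 0]), primary=4, secondary=2, jump=2))  # [6, 8]
```

In the example, no later secondary-window position holds a value below 5, so no jump occurs. The primary window's maximum is 8, and the byte 9 at offset 5 exceeds it, so the first chunk ends after that byte. The two remaining bytes are fewer than the primary size plus one and form the last chunk.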

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for data chunking, the method comprising: generating, for a byte stream of data, a primary window and a secondary window that overlaps with the primary window; scanning the byte stream for data chunks using the primary window and the secondary window, wherein the scanning comprises: executing, based on minimum values in the secondary window when shifted along the primary window, a jump mechanism in which the primary window and the secondary window are shifted forward by a fixed amount of bytes on the byte stream; and marking cut-off points of detected data chunks based on maximum values within and outside the primary window; and outputting the cut-off points of the detected data chunks.

2. The method of claim 1, wherein executing the jump mechanism comprises: identifying, when the secondary window is in a first position on the byte stream, a first minimum value within bytes in the secondary window; sliding the secondary window to a second position on the byte stream; identifying, when the secondary window is in the second position, a second minimum value within bytes in the secondary window; in response to determining that the second minimum value is less than the first minimum value, shifting the primary window by the fixed amount of bytes on the byte stream; and shifting the secondary window to the start of the shifted primary window.

3. The method of claim 1, wherein marking the cut-off points comprises: identifying, when the secondary window is in a first position on the byte stream, a first minimum value within bytes in the secondary window; detecting that the secondary window has traversed all bytes in the primary window without detecting a value lower than the first minimum value; identifying a maximum value within the bytes of the primary window; comparing the maximum value to a respective value of each byte ahead of the primary window; and in response to detecting, based on the comparing, a byte with a value greater than the maximum value, marking the byte as a cut-off point of a respective data chunk.

4. The method of claim 1, wherein the primary window is of a first size and the secondary window is of a second size, wherein the first size is greater than the second size.

5. The method of claim 4, further comprising: subsequent to detecting a data chunk by marking a cut-off point, determining an amount of bytes remaining in the byte stream; in response to determining that the amount of bytes is less than the first size plus one byte, identifying the bytes remaining in the byte stream as a last data chunk in the byte stream.

6. The method of claim 5, further comprising: in response to determining that the amount of bytes is not less than the first size plus one byte, continuing to scan the byte stream for data chunks.

7. The method of claim 1, wherein the secondary window is nested in the primary window.

8. The method of claim 1, further comprising: identifying a desired throughput value; setting sizes of the primary window, the secondary window, and the fixed amount of bytes of the jump mechanism based on the desired throughput value.

9. The method of claim 8, wherein identifying the desired throughput value comprises: determining a size of the byte stream; receiving a threshold period of time to complete the data chunking; and calculating the desired throughput value based on the size of the byte stream and the threshold period of time.

10. A system for data chunking, comprising: at least one memory; at least one hardware processor coupled with the at least one memory and configured, individually or in combination, to: generate, for a byte stream of data, a primary window and a secondary window that overlaps with the primary window; scan the byte stream for data chunks using the primary window and the secondary window, wherein the scanning comprises: executing, based on minimum values in the secondary window when shifted along the primary window, a jump mechanism in which the primary window and the secondary window are shifted forward by a fixed amount of bytes on the byte stream; and marking cut-off points of detected data chunks based on maximum values within and outside the primary window; and output the cut-off points of the detected data chunks.

11. The system of claim 10, wherein the at least one hardware processor is configured to execute the jump mechanism by: identifying, when the secondary window is in a first position on the byte stream, a first minimum value within bytes in the secondary window; sliding the secondary window to a second position on the byte stream; identifying, when the secondary window is in the second position, a second minimum value within bytes in the secondary window; in response to determining that the second minimum value is less than the first minimum value, shifting the primary window by the fixed amount of bytes on the byte stream; and shifting the secondary window to the start of the shifted primary window.

12. The system of claim 10, wherein the at least one hardware processor is configured to mark the cut-off points by: identifying, when the secondary window is in a first position on the byte stream, a first minimum value within bytes in the secondary window; detecting that the secondary window has traversed all bytes in the primary window without detecting a value lower than the first minimum value; identifying a maximum value within the bytes of the primary window; comparing the maximum value to a respective value of each byte ahead of the primary window; and in response to detecting, based on the comparing, a byte with a value greater than the maximum value, marking the byte as a cut-off point of a respective data chunk.

13. The system of claim 10, wherein the primary window is of a first size and the secondary window is of a second size, wherein the first size is greater than the second size.

14. The system of claim 13, wherein the at least one hardware processor is configured to: subsequent to detecting a data chunk by marking a cut-off point, determine an amount of bytes remaining in the byte stream; in response to determining that the amount of bytes is less than the first size plus one byte, identify the bytes remaining in the byte stream as a last data chunk in the byte stream.

15. The system of claim 14, wherein the at least one hardware processor is configured to: in response to determining that the amount of bytes is not less than the first size plus one byte, continue to scan the byte stream for data chunks.

16. The system of claim 10, wherein the secondary window is nested in the primary window.

17. The system of claim 10, wherein the at least one hardware processor is configured to: identify a desired throughput value; set sizes of the primary window, the secondary window, and the fixed amount of bytes of the jump mechanism based on the desired throughput value.

18. The system of claim 17, wherein the at least one hardware processor is configured to identify the desired throughput value by: determining a size of the byte stream; receiving a threshold period of time to complete the data chunking; and calculating the desired throughput value based on the size of the byte stream and the threshold period of time.

19. A non-transitory computer readable medium storing thereon computer executable instructions for data chunking, including instructions for: generating, for a byte stream o
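
Two of the dependent claims reduce to simple arithmetic, sketched below in Python. The function names are hypothetical. Reading "based on the size of the byte stream and the threshold period of time" as a plain division is an assumption. The claims do not state how window and jump sizes are derived from the throughput value, so that step is left out.

```python
def desired_throughput(stream_size_bytes: int, threshold_seconds: float) -> float:
    """Claims 9 and 18 (assumed reading): bytes that must be processed per second."""
    if threshold_seconds <= 0:
        raise ValueError("threshold period must be positive")
    return stream_size_bytes / threshold_seconds


def remainder_is_last_chunk(remaining_bytes: int, primary_size: int) -> bool:
    """Claims 5 and 14: the remainder is the last chunk when fewer than
    primary_size + 1 bytes are left after a cut-off point."""
    return remaining_bytes < primary_size + 1


# A 1 GB stream that must be chunked within 4 seconds.
print(desired_throughput(1_000_000_000, 4.0))  # 250000000.0
print(remainder_is_last_chunk(64, 64))         # True
print(remainder_is_last_chunk(65, 64))         # False
```

The `primary_size + 1` bound matches the cut-off rule in claims 3 and 12: a cut-off point needs a full primary window plus at least one byte ahead of it to compare against.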

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12517869B2 cover?
Disclosed herein are systems and method for executing jump-based content-defined data chunking. In one aspect, a method may generate, for a byte stream of data, a primary window and a secondary window that overlaps with the primary window. A method may scan the byte stream for data chunks using the primary window and the secondary window, wherein the scanning comprises (a) executing, based on m…
Who is the assignee on this patent?
Acronis Int Gmbh
What technology area does this patent fall under?
Primary CPC classification G06F16/1752. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 6, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).