Large content file optimization

US12164386B2 · US · B2

Patent metadata
Publication number: US-12164386-B2
Application number: US-202318197491-A
Country: US
Kind code: B2
Filing date: May 15, 2023
Priority date: Jun 29, 2018
Publication date: Dec 10, 2024
Grant date: Dec 10, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A plurality of portions of a content file are stored. It is determined that the content file has a size that is greater than a threshold size. In response to determining that the content file has the size that is greater than the threshold size, a plurality of component file metadata structures are generated for each of the plurality of portions of the content file. A component file metadata structure of the plurality of component file metadata structures corresponds to one of the portions of the content file. Each of the plurality of component file metadata structures includes corresponding metadata that enables data chunks associated with a corresponding portion of the content file to be located.
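
The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the threshold, portion size, and chunk size are arbitrary values, every name (`THRESHOLD_SIZE`, `ComponentFileMetadata`, `build_metadata`) is hypothetical, and a sorted list of leaf entries stands in for a real tree structure.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

# All sizes are hypothetical values chosen only for illustration.
THRESHOLD_SIZE = 64   # files larger than this get several metadata structures
PORTION_SIZE = 32     # bytes covered by one component metadata structure
CHUNK_SIZE = 8        # bytes per data chunk


@dataclass
class ComponentFileMetadata:
    """Metadata for one portion of a content file.

    `leaves` holds (offset_in_portion, chunk_id) pairs in ascending order;
    the sorted list is a stand-in for the leaf nodes of a tree.
    """
    portion_index: int
    portion_start: int
    portion_len: int
    leaves: list = field(default_factory=list)

    def locate(self, offset_in_portion: int) -> str:
        """Return the id of the chunk that holds the given offset."""
        if not 0 <= offset_in_portion < self.portion_len:
            raise IndexError("offset outside this portion")
        starts = [off for off, _ in self.leaves]
        return self.leaves[bisect_right(starts, offset_in_portion) - 1][1]


def build_metadata(file_size: int) -> list:
    """Build one structure for a small file, one per portion for a large one."""
    # Only files above the threshold are split into fixed-size portions.
    portion = PORTION_SIZE if file_size > THRESHOLD_SIZE else max(file_size, 1)
    structures = []
    for index, start in enumerate(range(0, file_size, portion)):
        length = min(portion, file_size - start)
        meta = ComponentFileMetadata(index, start, length)
        for off in range(0, length, CHUNK_SIZE):
            meta.leaves.append((off, f"chunk-{index}-{off // CHUNK_SIZE}"))
        structures.append(meta)
    return structures
```

With these assumed sizes, a 100-byte file exceeds the threshold and yields four structures (three 32-byte portions and one 4-byte tail), while a 40-byte file stays in a single structure. Each structure resolves offsets only within its own portion, so work on different portions does not touch the same metadata.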

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: storing a plurality of portions of a content file; determining that the content file has a size that is greater than a threshold size; and based on determining that the content file has the size that is greater than the threshold size, generating a plurality of component file metadata structures for each of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures corresponds to a respective portion of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures comprises a tree structure including one or more leaf nodes that enable data chunks associated with the respective portion of the content file to be located.

2. The method of claim 1, further comprising receiving the plurality of portions of the content file.

3. The method of claim 1, wherein at least two of the plurality of portions of the content file have a same size.

4. The method of claim 1, wherein at least two of the plurality of portions of the content file have a different size.

5. The method of claim 1, wherein the plurality of portions of the content file are included in a full backup of a primary system.

6. The method of claim 1, wherein the plurality of portions of the content file are included in an incremental backup of a primary system.

7. The method of claim 1, wherein each of the one or more leaf nodes is associated with a corresponding data brick, wherein the corresponding data brick is an identifier for one or more of the data chunks.

8. The method of claim 7, wherein a last data brick of at least one of the one or more leaf nodes has a particular capacity, wherein the last data brick is brick aligned in the event the last data brick is associated with a data chunk of the data chunks having the particular capacity.

9. The method of claim 7, wherein a last data brick of at least one of the one or more leaf nodes has a particular capacity, wherein in the event the last data brick of the plurality of leaf nodes is not brick aligned, an unused portion of the last data brick is reserved for the content file.

10. The method of claim 1, further comprising generating a tree data structure that provides a view of a primary system, wherein generating the tree data structure includes generating the plurality of component file metadata structures for each of the plurality of portions of the content file.

11. The method of claim 1, wherein a first leaf node of a plurality of leaf nodes stores a first vector that indicates a size of corresponding content file data that is associated with the plurality of component file metadata structures.

12. The method of claim 11, wherein the first leaf node of the plurality of leaf nodes stores information that indicates which component file metadata structure of the plurality of component file metadata structures is associated with which portion of the content file.

13. The method of claim 11, wherein a plurality of sequential component file metadata structures associated with the content file have a same corresponding size.

14. The method of claim 13, wherein the first vector utilizes run length encoding for the plurality of sequential component file metadata structures associated with the content file that have the same corresponding size.

15. The method of claim 14, wherein the first leaf node of the plurality of leaf nodes stores a second vector that indicates a number of the plurality of sequential component file metadata structures that have the same corresponding size.

16. A computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: storing a plurality of portions of a content file; determining that the content file has a size that is greater than a threshold size; and based on determining that the content file has the size that is greater than the threshold size, generating a plurality of component file metadata structures for each of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures corresponds to a respective portion of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures comprises a tree structure including one or more leaf nodes that enable data chunks associated with the respective portion of the content file to be located.

17. The computer program product of claim 16, wherein at least two of the plurality of portions of the content file have a same size.

18. The computer program product of claim 16, wherein at least two of the plurality of portions of the content file have a different size.

19. A system, comprising: a processor configured to: store a plurality of portions of a content file; determine that the content file has a size that is greater than a threshold size; and based on determining that the content file has the size that is greater than the threshold size, generate a plurality of component file metadata structures for each of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures corresponds to a respective portion of the plurality of portions of the content file, wherein each of the plurality of component file metadata structures comprises a tree structure including one or more leaf nodes that enable data chunks associated with the respective portion of the content file to be located; and a memory coupled to the processor and configured to provide the processor with instructions.
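
Claims 11 through 15 describe a size vector that is run-length encoded together with a second vector of run counts, and claims 8 and 9 describe the unused tail of a last data brick. The sketch below shows one possible reading of both ideas. The function names (`encode_runs`, `find_component`, `reserved_tail`) and the byte-based interpretation of "brick aligned" are assumptions for illustration, not definitions taken from the patent.

```python
def encode_runs(portion_sizes):
    """Collapse consecutive equal sizes into (sizes, counts) vectors."""
    sizes, counts = [], []
    for size in portion_sizes:
        if sizes and sizes[-1] == size:
            counts[-1] += 1          # extend the current run
        else:
            sizes.append(size)       # start a new run
            counts.append(1)
    return sizes, counts


def find_component(sizes, counts, file_offset):
    """Map a file offset to (component index, offset inside that component)."""
    if file_offset < 0:
        raise IndexError("negative offset")
    first_index = 0   # index of the first component in the current run
    run_start = 0     # file offset at which the current run begins
    for size, count in zip(sizes, counts):
        run_len = size * count
        if file_offset < run_start + run_len:
            within_run = file_offset - run_start
            return first_index + within_run // size, within_run % size
        first_index += count
        run_start += run_len
    raise IndexError("offset is past the end of the file")


def reserved_tail(portion_len, brick_capacity):
    """Unused bytes in the last brick; 0 means the portion ends brick aligned.

    Assumption: alignment is measured in bytes against a fixed brick capacity.
    """
    return -portion_len % brick_capacity
```

For portion sizes `[32, 32, 32, 4]` the two vectors are `[32, 4]` and `[3, 1]`, so a file made of many equal portions needs only one entry per run rather than one per portion. Looking up an offset then costs one pass over the runs followed by a division, without visiting each component file metadata structure.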

Assignees

Cohesity Inc

Inventors

Classifications

  • Trees, e.g. B+trees

  • File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06)

  • Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion (error detection or correction of the data by redundancy in operations or in hardware G06F11/14, G06F11/16)

  • Using snapshots, i.e. a logical point-in-time copy of the data

  • Threshold

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Cohesity Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1458. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 10, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).