Generating and morphing a collection of files in a folder/sub-folder structure that collectively has desired dedupability, compression, clustering and commonality

US11455281B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11455281-B2
Application number: US-201916389741-A
Country: US
Kind code: B2
Filing date: Apr 19, 2019
Priority date: Apr 19, 2019
Publication date: Sep 27, 2022
Grant date: Sep 27, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One example method includes receiving a set of filesystem parameters, creating a simulated filesystem based on the filesystem parameters, receiving a set of target characteristics for a file collection, based on the target characteristics, slicing a datastream into a grouping of data slices, populating the simulated files with the data slices to create the file collection and forward or reverse morphing the file collection from one generation to another without rewriting the entire file collection.
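The slicing and populating steps in the abstract can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the function names `slice_datastream` and `populate`, the file paths, and the use of a plain `bytes` object as the datastream are all assumptions made for illustration. It shows only the structural idea that each simulated file receives exactly one slice whose length equals that file's size.

```python
from itertools import accumulate


def slice_datastream(stream: bytes, file_sizes: list[int]) -> list[bytes]:
    """Cut `stream` into consecutive slices, one per simulated file.

    Hypothetical helper: slice i has length file_sizes[i].
    """
    if sum(file_sizes) > len(stream):
        raise ValueError("datastream is shorter than the requested collection")
    # Running offsets, e.g. sizes [4, 2, 6] -> offsets [0, 4, 6, 12].
    offsets = [0, *accumulate(file_sizes)]
    return [stream[start:end] for start, end in zip(offsets, offsets[1:])]


def populate(paths: list[str], slices: list[bytes]) -> dict[str, bytes]:
    """Map each simulated file path to exactly one slice (kept in memory)."""
    if len(paths) != len(slices):
        raise ValueError("need exactly one slice per simulated file")
    return dict(zip(paths, slices))


# Hypothetical example: three simulated files in a small folder tree.
paths = ["root/a.dat", "root/sub/b.dat", "root/sub/c.dat"]
sizes = [4, 2, 6]
stream = bytes(range(16))  # stand-in for a generated datastream
collection = populate(paths, slice_datastream(stream, sizes))
```

Because the slices are taken consecutively and never overlap, the total size of the collection equals the total size of the slices consumed from the datastream.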

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory storage medium having stored therein instructions which are executable by one or more hardware processors to perform operations comprising: receiving a set of parameters for a simulated filesystem; creating the simulated filesystem based on the parameters, wherein the simulated filesystem comprises a plurality of simulated files that mimic real world files and data; receiving a set of target characteristics for a collection that comprises the simulated files; based on the set of target characteristics, slicing a datastream into a grouping of data slices, wherein each of the data slices is a size of a respective one of the plurality of simulated files; populating the plurality of simulated files with respective data slices of the grouping of data slices, wherein each of the plurality of simulated files contains a single data slice of the grouping of data slices; and testing and evaluating a backup application using the simulated files in the collection.

2. The non-transitory storage medium as recited in claim 1, wherein the set of target characteristics comprises one or more of dedupability, compressibility, commonality, and clustering.

3. The non-transitory storage medium as recited in claim 1, wherein the target characteristics are representative of characteristics present in the datastream.

4. The non-transitory storage medium as recited in claim 1, wherein the simulated files in the collection collectively possess the set of target characteristics.

5. The non-transitory storage medium as recited in claim 1, wherein the filesystem parameters comprise a configuration of a file structure of the simulated filesystem.

6. The non-transitory storage medium as recited in claim 1, further comprising receiving parameters of the collection and the parameters comprise: a total size of the collection; growth of the collection; and average size of each file in the collection.

7. The non-transitory storage medium as recited in claim 1, wherein receipt of parameters for the simulated filesystem is performed in parallel with receipt of the set of target characteristics for the collection.

8. The non-transitory storage medium as recited in claim 1, wherein both the creation of the simulated filesystem and slicing of the datastream are performed on a file level basis.

9. The non-transitory storage medium as recited in claim 1, wherein a collective size of the collection is the same size as a collective size of the data slices taken from the datastream.

10. A method, comprising the operations: receiving a set of parameters for a simulated filesystem; creating the simulated filesystem based on the parameters, wherein the simulated filesystem comprises a plurality of simulated files that mimic real world files and data; receiving a set of target characteristics for a collection that comprises the simulated files; based on the set of target characteristics, slicing a datastream into a grouping of data slices, wherein each of the data slices is a size of a respective one of the plurality of simulated files; populating the plurality of simulated files with respective data slices of the grouping of data slices, wherein each of the plurality of simulated files contains a single data slice of the grouping of data slices; and testing and evaluating a backup application using the simulated files in the collection.

11. The method as recited in claim 10, wherein the set of target characteristics comprise one or more of dedupability, compressibility, commonality, and clustering.

12. The method as recited in claim 10, wherein the simulated files in the collection collectively possess the set of target characteristics.

13. The method as recited in claim 10, wherein both the creation of the simulated filesystem and slicing of the datastream are performed on a file level basis.

14. A system, comprising: one or more hardware processors; and a non-transitory storage medium having stored therein instructions which are executable by the one or more hardware processors to perform operations comprising receiving a set of parameters for a simulated filesystem; creating the simulated filesystem based on the parameters, wherein the simulated filesystem comprises a plurality of simulated files that mimic real world files and data; receiving a set of target characteristics for a collection that comprises the simulated files; based on the set of target characteristics, slicing a datastream into a grouping of data slices, wherein each of the data slices is a size of a respective one of the plurality of simulated files; populating the plurality of simulated files with respective data slices of the grouping of data slices, wherein each of the plurality of simulated files contains a single data slice of the grouping of data slices; and testing and evaluating a backup application using the simulated files in the collection.

15. The system as recited in claim 14, wherein the set of target characteristics comprises one or more of dedupability, compressibility, commonality, and clustering.

16. The system as recited in claim 14, wherein the simulated files in the collection collectively possess the set of target characteristics.

17. The system as recited in claim 14, wherein both the creation of the simulated filesystem and slicing of the datastream are performed on a file level basis.
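Claim 6 names three collection parameters (total size, growth, and average file size), and claim 9 requires that the collection's collective size match that of the slices. The sketch below shows one way such parameters might be turned into per-file sizes. The class `CollectionParams`, the even split of bytes across files, and the reading of "growth" as a fractional increase per generation are all assumptions; the claims do not specify any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionParams:
    """Hypothetical container for the collection parameters of claim 6."""

    total_size: int         # bytes in the whole collection
    growth: float           # assumed: fractional growth per generation
    average_file_size: int  # bytes per file, on average

    def file_count(self) -> int:
        # Assumed rule: as many files as the average size allows, at least one.
        return max(1, round(self.total_size / self.average_file_size))

    def file_sizes(self) -> list[int]:
        # Assumed rule: spread bytes as evenly as possible so that the sizes
        # sum exactly to total_size (compare the size equality of claim 9).
        n = self.file_count()
        base, extra = divmod(self.total_size, n)
        return [base + 1 if i < extra else base for i in range(n)]

    def next_total_size(self) -> int:
        # Target size of the following generation under the assumed growth.
        return round(self.total_size * (1 + self.growth))


params = CollectionParams(total_size=1000, growth=0.10, average_file_size=300)
sizes = params.file_sizes()
```

The resulting `sizes` list could serve as the per-file slice lengths when cutting a datastream, since it always sums to `total_size`.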

Assignees

Inventors

Classifications

  • Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files · CPC title

  • File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title

  • Hybrid storage device · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • G06F3/0641 · Primary

    De-duplication techniques · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11455281B2 cover?
One example method includes receiving a set of filesystem parameters, creating a simulated filesystem based on the filesystem parameters, receiving a set of target characteristics for a file collection, based on the target characteristics, slicing a datastream into a grouping of data slices, populating the simulated files with the data slices to create the file collection and forward or reverse…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 27, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).