System and method for indirect data classification in a storage system operations

US11775193B2 · US · B2

Patent metadata
Publication number: US-11775193-B2
Application number: US-201916528620-A
Country: US
Kind code: B2
Filing date: Aug 1, 2019
Priority date: Aug 1, 2019
Publication date: Oct 3, 2023
Grant date: Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for managing data includes obtaining data from a host, wherein the data is associated with an object identifier (ID), initiating a classification mapping update to obtain a classification entry, applying an erasure coding procedure to the data to obtain a plurality of data chunks and at least one parity chunk, deduplicating the plurality of data chunks to obtain a plurality of deduplicated data chunks, generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk, generating an object entry associated with the plurality of data chunks, and the at least one parity chunk, wherein the object entry comprises the object ID and a classification ID, storing the storage metadata and the object entry in an accelerator pool, and storing the plurality of deduplicated data chunks and the at least one parity chunk.
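
The write path summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Store` class, the single XOR parity chunk standing in for the erasure coding procedure, SHA-256 as the chunk fingerprint, and the IDs `obj-1` and `cls-7`. The sketch also keeps metadata and chunks in one in-memory object, whereas the claims place metadata in an accelerator pool and chunks in a separate pool.

```python
import hashlib
from dataclasses import dataclass, field


def erasure_code(data: bytes, k: int = 3) -> tuple[list[bytes], bytes]:
    """Split data into k equal chunks and derive one XOR parity chunk."""
    chunk_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")   # zero-pad the tail
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = bytearray(chunk_len)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks, bytes(parity)


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint used as the deduplication key (SHA-256 is assumed)."""
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class Store:
    # Hypothetical in-memory stand-in for the storage system.
    chunks: dict[str, bytes] = field(default_factory=dict)            # fingerprint -> data chunk
    parity_chunks: dict[str, bytes] = field(default_factory=dict)     # object ID -> parity chunk
    storage_metadata: dict[str, list[str]] = field(default_factory=dict)
    object_entries: dict[str, dict[str, str]] = field(default_factory=dict)

    def write(self, object_id: str, classification_id: str,
              data: bytes, k: int = 3) -> int:
        """Erasure-code, deduplicate, and record metadata; return new chunk count."""
        data_chunks, parity = erasure_code(data, k)
        fingerprints = [fingerprint(c) for c in data_chunks]
        newly_stored = 0
        for fp, chunk in zip(fingerprints, data_chunks):
            if fp not in self.chunks:           # only unseen data chunks are stored
                self.chunks[fp] = chunk
                newly_stored += 1
        self.parity_chunks[object_id] = parity  # parity is kept per object, not deduplicated
        self.storage_metadata[object_id] = fingerprints
        self.object_entries[object_id] = {
            "object_id": object_id,
            "classification_id": classification_id,
        }
        return newly_stored


store = Store()
first = store.write("obj-1", "cls-7", b"AAAABBBBCCCC")
second = store.write("obj-2", "cls-7", b"AAAAXXXXCCCC")
print(first, second, len(store.chunks))  # 3 1 4
```

The second write stores only one new chunk because two of its three data chunks share fingerprints with chunks already held. A production erasure code would more likely be a Reed-Solomon variant; XOR parity is used here only because it fits in a few lines.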

First claim

Opening claim text (preview).

What is claimed is:

1. A method for managing data, the method comprising: obtaining, by a data classification engine executing on an accelerator pool, the data from a host, wherein the data is associated with an object identifier (ID) and an object, wherein the accelerator pool comprises a first plurality of data nodes; initiating a classification attributes update to obtain a classification mapping entry, wherein the classification mapping entry is associated with a classification ID, wherein a classification mapping comprises the classification mapping entry, wherein the classification mapping entry of comprises the object ID, the classification ID, and a plurality of classification attributes, and wherein the plurality of classification attributes comprises at least one of: a retention period for the data and a regulation associated with the data, wherein the regulation comprises a set of legal standards applied to a legal entity, and wherein the legal entity owns the data; applying an erasure coding procedure to the data to obtain a plurality of data chunks and at least one parity chunk; deduplicating the plurality of data chunks using a unique fingerprint generated for each of the plurality of data chunks to obtain a plurality of deduplicated data chunks; generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk; generating an object entry associated with the plurality of data chunks, and the at least one parity chunk, wherein the object entry comprises the object ID and the classification ID; storing the storage metadata and the object entry in the accelerator pool; and storing, across a plurality of fault domains of a non-accelerator pool, the plurality of deduplicated data chunks and the at least one parity chunk, wherein the non-accelerator pool comprises a second plurality of data nodes, and wherein the first plurality of data nodes has a higher performance relative to the second plurality of data nodes.

2. The method of claim 1, further comprising: making a determination that the classification mapping needs to be changed; and in response to the determination: updating the classification mapping to obtain an updated classification mapping; and updating the object entry based on the update to the classification mapping.

3. The method of claim 2, wherein initiating the classification attributes update comprises: obtaining the plurality of classification attributes; and storing the plurality of classification attributes in the classification mapping.

4. The method of claim 3, wherein the classification mapping is stored in the data classification engine.

5. The method of claim 1, further comprising: initiating metadata distribution of the storage metadata and the object entry across the plurality of fault domains.

6. A non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for managing data, the method comprising: obtaining the data from a host, wherein the data is associated with an object identifier (ID) and an object; initiating a classification attributes update to obtain a classification mapping entry, wherein the classification mapping entry is associated with a classification ID, wherein a classification mapping comprises the classification mapping entry, wherein the classification mapping entry of comprises the object ID, the classification ID, and a plurality of classification attributes, and wherein the plurality of classification attributes comprises at least one of: a retention period for the data and a regulation associated with the data, wherein the regulation comprises a set of legal standards applied to a legal entity, and wherein the legal entity owns the data; applying an erasure coding procedure to the data to obtain a plurality of data chunks and at least one parity chunk; deduplicating the plurality of data chunks using a unique fingerprint generated for each of the plurality of data chunks to obtain a plurality of deduplicated data chunks; generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk; generating an object entry associated with the plurality of data chunks, and the at least one parity chunk, wherein the object entry comprises the object ID and the classification ID; storing the storage metadata and the object entry in the accelerator pool, wherein the accelerator pool comprises a first plurality of data nodes; and storing, across a plurality of fault domains of a non-accelerator pool, the plurality of deduplicated data chunks and the at least one parity chunk, wherein the non-accelerator pool comprises a second plurality of data nodes, and wherein the first plurality of data nodes has a higher performance relative to the second plurality of data nodes.

7. The non-transitory computer readable medium of claim 6, the method further comprising: making a determination that the classification mapping needs to be changed; and in response to the determination: updating the classification mapping to obtain an updated classification mapping; and updating the object entry based on the update to the classification mapping.

8. The non-transitory computer readable medium of claim 7, wherein initiating the classification attributes update comprises: obtaining the plurality of classification attributes; and storing the plurality of classification attributes in the classification mapping.

9. The non-transitory computer readable medium of claim 8, wherein the classification mapping is stored in a data classification engine of the accelerator pool.

10. The non-transitory computer readable medium of claim 6, the method further comprising: initiating metadata distribution of the storage metadata and the object entry across the plurality of fault domains.

11. A data cluster, comprising: a host; and an accelerator pool comprising a plurality of data nodes, wherein a data node of the plurality of data nodes comprises a processor and memory comprising instructions, which when executed by the processor perform a method, the method comprising: obtaining data from the host, wherein the data is associated with an object identifier (ID) and an object; initiating a classification attributes update to obtain a classification mapping entry, wherein the classification mapping entry is associated with a classification ID, wherein a classification mapping comprises the classification mapping entry, wherein the classification mapping entry of comprises the object ID, the classification ID, and a plurality of classification attributes, and wherein the plurality of classification attributes comprises at least one of: a retention period for the data and a regulation associated with the data, wherein the regulation comprises a set of legal standards applied to a legal entity, and wherein the legal entity owns the data; applying an erasure coding procedure to the data to obtain a plurality of data chunks and at least one parity chunk; deduplicating the plurality of data chunks using a unique fingerprint generated for each of the plurality of data chunks to obtain a plurality of deduplicated data chunks; generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk; generating an object entry associated with the plurality of data chunks, and the at least one parity chunk, wherein the object entry comprises the object ID and the classification ID; storing the storage metadata and the obje
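
The indirection described in claims 1 and 2 (an object entry carries only an object ID and a classification ID, while the attributes live in a classification mapping) can be sketched as follows. This is one possible reading, not the patented implementation. The class names, method names, the `retention_days` field, and the placeholder values `cls-7`, `cls-9`, `regulation-A`, and `regulation-B` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassificationEntry:
    # One classification mapping entry: an ID, its attributes, and member objects.
    classification_id: str
    retention_days: int          # assumed unit for the retention period
    regulation: str              # placeholder label for the applicable regulation
    object_ids: set[str] = field(default_factory=set)


class ClassificationMapping:
    def __init__(self) -> None:
        self.entries: dict[str, ClassificationEntry] = {}
        # Object entries hold only the object ID -> classification ID link.
        self.object_entries: dict[str, str] = {}

    def classify(self, object_id: str, classification_id: str,
                 retention_days: int, regulation: str) -> None:
        """Create the mapping entry if needed and link the object to it."""
        entry = self.entries.setdefault(
            classification_id,
            ClassificationEntry(classification_id, retention_days, regulation),
        )
        entry.object_ids.add(object_id)
        self.object_entries[object_id] = classification_id

    def update_attributes(self, classification_id: str, **changes) -> None:
        """Change attributes once; every linked object sees the new values."""
        entry = self.entries[classification_id]
        for name, value in changes.items():
            if name not in ("retention_days", "regulation"):
                raise ValueError(f"unknown classification attribute: {name}")
            setattr(entry, name, value)

    def reclassify(self, object_id: str, classification_id: str,
                   retention_days: int, regulation: str) -> None:
        """Move an object to another classification and update its object entry."""
        old_id = self.object_entries.get(object_id)
        if old_id is not None:
            self.entries[old_id].object_ids.discard(object_id)
        self.classify(object_id, classification_id, retention_days, regulation)

    def attributes_for(self, object_id: str) -> ClassificationEntry:
        """Resolve an object's attributes through its classification ID."""
        return self.entries[self.object_entries[object_id]]


mapping = ClassificationMapping()
mapping.classify("obj-1", "cls-7", 365, "regulation-A")
mapping.classify("obj-2", "cls-7", 365, "regulation-A")
mapping.update_attributes("cls-7", retention_days=730)
print(mapping.attributes_for("obj-2").retention_days)  # 730
```

Because objects reference attributes only through the classification ID, a retention change touches one mapping entry rather than every object entry. An object entry itself changes only when the object moves to a different classification, as in `reclassify`.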

Assignees

Inventors

Classifications

  • G06F3/0641 (Primary)

    De-duplication techniques · CPC title

  • Improving or facilitating administration, e.g. storage management · CPC title

  • Plurality of storage devices · CPC title

  • Classification techniques · CPC title

  • Securing storage systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11775193B2 cover?
A method for managing data includes obtaining data from a host, wherein the data is associated with an object identifier (ID), initiating a classification mapping update to obtain a classification entry, applying an erasure coding procedure to the data to obtain a plurality of data chunks and at least one parity chunk, deduplicating the plurality of data chunks to obtain a plurality of deduplic…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 3, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).