US12475145B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12475145-B2 |
| Application number | US-202318494186-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 25, 2023 |
| Priority date | Apr 2, 2021 |
| Publication date | Nov 18, 2025 |
| Grant date | Nov 18, 2025 |
Official abstract text for this publication.
Systems and methods for providing automated data governance are disclosed. The system may include a plurality of data environments, a metadata repository storing data attributes and classification requirements, a policy repository, one or more processors, and a memory in communication with the one or more processors storing instructions to execute steps of a method. The system may receive a first dataset from a first data environment having a first dataset ID. The system may transmit the dataset ID to the metadata repository and the metadata repository may return an indication that the first dataset includes at least one data attribute and at least one associated classification requirement. The system may transmit the classification requirement to the policy repository and receive classification code associated with the classification requirement. The system may modify the first dataset by transmitting instructions to the first data environment to execute the classification code.
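The flow in the abstract (dataset ID to metadata lookup, classification requirement to policy lookup, then execution in the data environment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `MetadataRepository`, `PolicyRepository`, and `DataEnvironment`, the `govern` function, and the masking rule are all hypothetical, and in-memory dictionaries stand in for the real repositories.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, str]
ClassificationCode = Callable[[str], str]  # hypothetical: code that transforms one value


def mask_all_but_last4(value: str) -> str:
    """Illustrative masking rule (an assumption, not taken from the patent)."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


@dataclass
class MetadataRepository:
    # dataset ID -> {data attribute: classification requirement}
    requirements: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def lookup(self, dataset_id: str) -> Dict[str, str]:
        return self.requirements.get(dataset_id, {})


@dataclass
class PolicyRepository:
    # classification requirement -> classification code
    codes: Dict[str, ClassificationCode] = field(default_factory=dict)

    def code_for(self, requirement: str) -> ClassificationCode:
        return self.codes[requirement]


@dataclass
class DataEnvironment:
    datasets: Dict[str, List[Row]] = field(default_factory=dict)

    def execute(self, dataset_id: str, attribute: str, code: ClassificationCode) -> None:
        # The environment runs the code itself, modifying the dataset in place.
        for row in self.datasets[dataset_id]:
            if attribute in row:
                row[attribute] = code(row[attribute])


def govern(dataset_id: str, env: DataEnvironment,
           metadata: MetadataRepository, policies: PolicyRepository) -> List[str]:
    """Apply the classification code for each attribute that has a requirement."""
    applied = []
    for attribute, requirement in metadata.lookup(dataset_id).items():
        env.execute(dataset_id, attribute, policies.code_for(requirement))
        applied.append(attribute)
    return applied


env = DataEnvironment({"ds-1": [{"name": "Ada", "ssn": "123456789"}]})
metadata = MetadataRepository({"ds-1": {"ssn": "mask"}})
policies = PolicyRepository({"mask": mask_all_but_last4})
applied = govern("ds-1", env, metadata, policies)
```

After the call, only the `ssn` attribute has been modified; `name` is untouched because the metadata repository lists no classification requirement for it.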
Opening claim text (preview).
What is claimed is:

1. A system comprising: a classification management device comprising: one or more processors; and memory in communication with the one or more processors and storing instructions that, when executed by the one or more processors, are configured to cause the system to: identify a first dataset stored in a first data environment, among a plurality of data environments, for which a metadata repository is missing at least one data attribute; responsive to parsing the first dataset stored in the first data environment to identify one or more data attributes associated with the first dataset that are missing from the metadata repository, transmit a first policy identifier and the identified one or more data attributes to a policy repository; store one or more standardized code arguments in the policy repository, wherein each of the one or more standardized code arguments apply a respective policy to a dataset stored in one of one or more data environments; receive a classification code from the policy repository for each of the one or more identified data attributes based on the first policy identifier by receiving a list of approved standardized code arguments that can be automatically applied to a dataset, the classification code comprising a code argument of the one or more standardized code arguments to be automatically applied to a respective data attribute, wherein a first classification code received from the policy repository comprises a standardized code argument for data masking or tokenization; transmit instructions to the first data environment to execute each classification code for each of the one or more identified data attributes to modify the first dataset in the first data environment; and update the metadata repository with missing data attributes.

2. The system of claim 1, wherein identifying one or more data attributes associated with the first dataset comprises identifying data attributes that are missing from the metadata repository by: scanning the first dataset to identify every attribute associated with the first dataset; and comparing every attribute associated with the first dataset to a set of attributes stored in the metadata repository; and identifying attributes that are included in every attribute associated with the first dataset and that are not included in the set of attributes stored in the metadata repository.

3. The system of claim 2, wherein the set of attributes stored in the metadata repository is identified by querying the metadata repository with a dataset identifier of the first dataset.

4. The system of claim 1, wherein the instructions are configured to cause the system to: determine the first policy identifier associated with the first dataset based on the first data environment.

5. The system of claim 1, wherein executing each classification code for each of the one or more identified data attributes causes each of the one or more identified data attributes to automatically have standardized code arguments applied to conform with classification requirements for each specific data attribute based on the first policy identifier.

6. The system of claim 1, wherein entries in the policy repository are used to update the metadata repository with the missing data attributes.

7. A system comprising: a plurality of data environments comprising at least a first data environment storing a first dataset; a metadata repository storing a plurality of data attributes; a policy repository; and a classification management device comprising: one or more processors; and memory in communication with the one or more processors and storing instructions that, when executed by the one or more processors, are configured to cause the system to: identify the first dataset stored in the first data environment for which the metadata repository is missing at least one data attribute; responsive to parsing the first dataset stored in the first data environment to identify one or more data attributes associated with the first dataset, transmit a first policy identifier and the identified one or more data attributes to the policy repository; store one or more standardized code arguments in the policy repository, wherein each of the one or more standardized code arguments apply a respective policy to a dataset stored in one of one or more data environments; receive a classification code from the policy repository for each of the one or more identified data attributes based on the first policy identifier by receiving a list of approved standardized code arguments that can be automatically applied to a dataset, the classification code comprising a code argument of the one or more standardized code arguments to be automatically applied to a respective data attribute, wherein a first classification code received from the policy repository comprises a standardized code argument for data masking or tokenization; query the metadata repository to determine a data steward for the first dataset; monitor a compliance management database for a change of approval by the data steward; receive a data steward approval from the compliance management database; transmit instructions to the first data environment to execute each classification code for each of the one or more identified data attributes to modify the first dataset in the first data environment; and update the metadata repository with missing data attributes.

8. The system of claim 7, wherein identifying one or more data attributes associated with the first dataset comprises identifying data attributes that are missing from the metadata repository by: scanning the first dataset to identify every attribute associated with the first dataset; and comparing every attribute associated with the first dataset to a set of attributes stored in the metadata repository; and identifying attributes that are included in every attribute associated with the first dataset and that are not included in the set of attributes stored in the metadata repository.

9. The system of claim 8, wherein the set of attributes stored in the metadata repository is identified by querying the metadata repository with a dataset identifier of the first dataset.

10. The system of claim 7, wherein the instructions are configured to cause the system to: determine the first policy identifier associated with the first dataset based on the first data environment.

11. The system of claim 7, wherein executing each classification code for each of the one or more identified data attributes causes each of the one or more identified data attributes to automatically have standardized code arguments applied to conform with classification requirements for each specific data attribute based on the first policy identifier.

12. The system of claim 7, wherein the missing data attributes comprise the identified one or more data attributes.

13. The system of claim 12, wherein entries in the policy repository are used to update the metadata repository with the missing data attributes.

14. The system of claim 7, wherein the data steward comprises a permissioned user associated with the first dataset.

15. A method comprising: receiving, by a classification management device, a first dataset stored in a first data environment of a plurality of data environments; responsive to parsing, by the classification management device, the first dataset stored in the first data environment to identify one or more data attributes associated with the first dataset that are missing from a metadata repository, transm
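Dependent claims 2 and 8 describe finding the missing attributes in three steps: scan the dataset for every attribute, compare against the set stored in the metadata repository, and keep those present in the dataset but absent from the repository. That amounts to a set difference, sketched below. The function names and the row-of-dictionaries dataset shape are assumptions made for illustration; the claims do not specify a data model.

```python
from typing import Dict, Iterable, Set


def scan_attributes(rows: Iterable[Dict[str, object]]) -> Set[str]:
    """Step 1: collect every attribute name that appears in any row."""
    found: Set[str] = set()
    for row in rows:
        found.update(row.keys())
    return found


def missing_attributes(rows: Iterable[Dict[str, object]], known: Set[str]) -> Set[str]:
    """Steps 2 and 3: attributes in the dataset that the repository does not list."""
    return scan_attributes(rows) - known


# Hypothetical dataset and repository contents.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "phone": "555-0100"},
]
known_in_repository = {"id"}

missing = missing_attributes(rows, known_in_repository)

# Final step of claims 1 and 7: update the repository with the missing attributes.
updated_repository = known_in_repository | missing
```

Here `missing` contains `email` and `phone`, and `updated_repository` then lists all three attributes. Scanning every row rather than only the first handles datasets whose rows do not all share the same attributes.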
Query processing · CPC title
Clustering or classification · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title