User clustering based on metadata analysis
US-10817542-B2 · Oct 27, 2020 · US
US11841965B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11841965-B2 |
| Application number | US-202117508161-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 22, 2021 |
| Priority date | Aug 12, 2021 |
| Publication date | Dec 12, 2023 |
| Grant date | Dec 12, 2023 |
Official abstract text for this publication.
Embodiments for a system and method of selecting data protection policies for a new system, by collecting user, policy, and asset metadata for a plurality of other users storing data dictated by one or more protection policies. The collected metadata is anonymized with respect to personal identifying information, and is stored in an anonymized analytics database. The system receives specific user, policy and asset metadata for the new system from a specific user, and matches the received specific user metadata to the collected metadata to identify an optimum protection policy of the one or more protection policies based on the assets and protection requirements of the new system. The new system is then configured with the identified optimum protection policy as an initial configuration.
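The flow described in the abstract (anonymize collected metadata, store it, then match a new user's metadata against it to pick an initial policy) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the field names (`user_name`, `email`, `industry`, `region`, `asset_band`), the policy labels, and the exact-match similarity count are all assumptions introduced here, and the patent text does not prescribe a particular matching function.

```python
from collections import Counter

# Hypothetical names of fields treated as personal identifying information.
PII_FIELDS = {"user_name", "email"}

# Hypothetical user-metadata keys used for matching.
MATCH_KEYS = ("industry", "region", "asset_band")


def anonymize(record: dict) -> dict:
    """Return a copy of the record with the assumed PII fields dropped."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def similarity(a: dict, b: dict, keys=MATCH_KEYS) -> int:
    """Count how many of the given metadata keys match exactly."""
    return sum(1 for k in keys if a.get(k) == b.get(k))


def recommend_policy(new_user: dict, analytics_db: list) -> str:
    """Pick the most common policy among the most similar stored users."""
    best = max(similarity(new_user, rec) for rec in analytics_db)
    candidates = [rec["policy"] for rec in analytics_db
                  if similarity(new_user, rec) == best]
    return Counter(candidates).most_common(1)[0][0]


# Metadata collected from other users (made-up sample values).
collected = [
    {"user_name": "alice", "email": "a@example.com", "industry": "finance",
     "region": "EU", "asset_band": "large", "policy": "daily-offsite-7y"},
    {"user_name": "bob", "email": "b@example.com", "industry": "finance",
     "region": "EU", "asset_band": "large", "policy": "daily-offsite-7y"},
    {"user_name": "carol", "email": "c@example.com", "industry": "finance",
     "region": "EU", "asset_band": "large", "policy": "hourly-local-1y"},
    {"user_name": "dan", "email": "d@example.com", "industry": "retail",
     "region": "US", "asset_band": "small", "policy": "weekly-cloud-90d"},
]

# Only anonymized records are stored in the analytics database.
analytics_db = [anonymize(rec) for rec in collected]

new_user = {"industry": "finance", "region": "EU", "asset_band": "large"}
recommended = recommend_policy(new_user, analytics_db)
print(recommended)
```

In this sketch the recommendation would then be offered as the initial configuration of the new system. A real system would likely use a richer similarity measure than counting exact matches.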
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of selecting data protection policies for a new system, comprising: collecting, in a hardware-based asset metadata management component, user, policy, and asset metadata for a plurality of users storing data dictated by one or more protection policies, wherein the collected metadata is anonymized with respect to personal identifying information of the users; storing the collected metadata in an anonymized analytics database; receiving specific user, policy and asset metadata for the new system from a specific user; and matching, in the component, the received specific user, the policy and the asset metadata to the collected metadata to identify an optimum protection policy of the one or more protection policies based on the assets and protection requirements of the new system, wherein the asset metadata is derived using a cluster analysis process, wherein for each asset: defining a set of metrics characterizing each asset in the system; extracting metadata of the set of metrics from an asset to be assigned a protection policy; comparing each metric of the asset with corresponding metadata of a plurality of clusters each containing one or more other assets, wherein a unique protection policy is assigned to each cluster of the plurality of clusters to be applied to each asset within a respective cluster; determining an overall affinity score of the asset relative to each cluster; and automatically grouping the asset into a cluster with the highest overall affinity score; and displaying to the specific user, through a graphical user interface (GUI), information regarding the matching to allow the user to confirm or change identification of the optimum protection policy.
2. The method of claim 1 further comprising configuring the new system with the identified optimum protection policy as an initial configuration of the new system.
3. The method of claim 2 wherein the new system is a newly deployed computer network installed at day zero of a deployment period.
4. The method of claim 1 wherein the collected user metadata comprises at least one of: a company type based on common industry classification, geolocation information of a user of the other users, a number of assets of each user of the other users; and a distribution of managed items of the other users.
5. The method of claim 4 wherein the collected policy metadata comprises at least one of: a policy name, an asset type, a backup method, a backup frequency, a backup location, and a retention period.
6. The method of claim 5 wherein the collected user and policy metrics comprise characteristics that define certain features of each asset relevant to a data backup or restore operation conducted by a respective other user data protection system, and wherein each metric is specified by a corresponding metadata element in each asset.
7. A computer-implemented method of assigning data assets of a new system to corresponding protection policies, comprising: extracting, in a hardware-based asset metadata management component, metadata for a plurality of metrics for each asset for a plurality of users other than the specific user, wherein the extracted metadata is anonymized with respect to personal identifying information of the users; storing the extracted metadata in an anonymized analytics database; comparing the metadata for each asset to corresponding asset metadata for each other asset; calculating an affinity percentage for each metric of the asset with the metrics of each other asset; determining an overall affinity percentage for the asset based on the calculated affinity percentage for each metric; and automatically grouping, in the component, the data assets of the new system with clusters of other assets when the overall affinity percentage exceeds a defined threshold value, wherein the asset metadata is derived using a cluster analysis process, wherein for each asset: defining a set of metrics characterizing each asset in the system; extracting metadata of the set of metrics from an asset to be assigned a protection policy; comparing each metric of the asset with corresponding metadata of a plurality of clusters each containing one or more other assets, wherein a unique protection policy is assigned to each cluster of the plurality of clusters to be applied to each asset within a respective cluster; determining an overall affinity score of the asset relative to each cluster; and automatically grouping the asset into a cluster with the highest overall affinity score; and displaying to the specific user, through a graphical user interface (GUI), information regarding the overall affinity percentage to allow the user to confirm or change identification of the grouping.
8. The method of claim 7 wherein the plurality of metrics each comprise an attribute that defines certain features of each asset relevant to a data storage or movement operation conducted by the data protection system, and wherein each metric is specified by a corresponding metadata element in the asset.
9. The method of claim 8 wherein the grouping determines a protection policy to be applied to the grouped assets, and wherein a different protection policy is applied to each cluster of assets.
10. The method of claim 7 wherein the metadata extracted for the plurality of other users comprises anonymized data having no personally identifying or identifiable information.
11. The method of claim 10 further comprising storing the extracted metadata in an anonymized analytics database.
12. The method of claim 10 wherein the extracted metadata comprises user metadata including at least one of: a company type based on common industry classification, geolocation information of a user of the other users, a number of assets of each user of the other users; and a distribution of managed items of the other users.
13. The method of claim 12 wherein the extracted metadata includes policy metadata comprising at least one of: a policy name, an asset type, a backup method, a backup frequency, a backup location, and a retention period.
14. A computer-implemented method of grouping assets for protection policy assignment based on asset metadata in a data protection system, comprising: grouping, in a hardware-based asset metadata management component, the assets into respective clusters based on a sufficiently high similarity of characteristics defined by metadata elements of the assets; wherein the metadata elements comprise metadata extracted for the plurality of other users comprises anonymized data having no personally identifying or identifiable information; assigning a unique protection policy to each cluster of grouped assets; storing asset metadata signatures for each asset in an anonymized analytics database; using, in the component, the asset metadata signatures to identify one or more policies to apply to a specific user of a new computer system, wherein the asset metadata is derived using a cluster analysis process, wherein for each asset: defining a set of metrics characterizing each asset in the system; extracting metadata of the set of metrics from an asset to be assigned the protection policy; comparing each metric of the asset with corresponding metadata of a plurality of clusters each containing one or more other assets, wherein a unique protection policy is assigned to each cluster of the plurality of clusters to be applied to each asset within a respective cluster; determining an overall affinity score of the asset relative to each cluster; and automatically grouping the asset into a cluster with the highest overall
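The cluster analysis steps recited in the claims (compare each metric of an asset against each cluster, derive an overall affinity, then group the asset into the best cluster) might be sketched as below. This is a hedged illustration and not the claimed method itself: the metric names, the sample clusters and policies, the use of an unweighted mean to combine per-metric percentages, and the 50% threshold are all assumptions made for the example. The claims do not state how the per-metric values are combined or what the threshold is.

```python
# Hypothetical metrics characterizing each asset.
METRICS = ("asset_type", "os", "criticality")


def metric_affinity(asset_value, cluster_values) -> float:
    """Percentage of cluster members sharing the asset's value for one metric."""
    if not cluster_values:
        return 0.0
    matches = sum(1 for v in cluster_values if v == asset_value)
    return 100.0 * matches / len(cluster_values)


def overall_affinity(asset: dict, members: list, metrics=METRICS) -> float:
    """Unweighted mean of the per-metric affinity percentages (an assumption)."""
    scores = [
        metric_affinity(asset.get(m), [member.get(m) for member in members])
        for m in metrics
    ]
    return sum(scores) / len(scores)


def assign_cluster(asset: dict, clusters: dict, threshold: float = 50.0):
    """Return (cluster_name, score) for the highest-affinity cluster.

    The name is None when the best score does not exceed the threshold,
    mirroring the threshold condition recited in claim 7.
    """
    scored = {name: overall_affinity(asset, members)
              for name, members in clusters.items()}
    best = max(scored, key=scored.get)
    if scored[best] > threshold:
        return best, scored[best]
    return None, scored[best]


# Made-up clusters, each with exactly one protection policy.
clusters = {
    "gold": [
        {"asset_type": "database", "os": "linux", "criticality": "high"},
        {"asset_type": "database", "os": "windows", "criticality": "high"},
    ],
    "bronze": [
        {"asset_type": "file_share", "os": "linux", "criticality": "low"},
    ],
}
policies = {"gold": "hourly-backup-7y-retention",
            "bronze": "weekly-backup-90d-retention"}

asset = {"asset_type": "database", "os": "linux", "criticality": "high"}
cluster, score = assign_cluster(asset, clusters)
policy = policies.get(cluster)
print(cluster, round(score, 2), policy)

# An asset unlike any cluster stays unassigned in this sketch.
odd_cluster, odd_score = assign_cluster(
    {"asset_type": "vm", "os": "bsd", "criticality": "medium"}, clusters)
```

For the sample asset, the per-metric affinities against "gold" are 100, 50, and 100, so the mean is about 83.33 and the asset joins that cluster. In the claimed methods the result is then shown to the user through a GUI so the grouping or policy can be confirmed or changed; that step is omitted here.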
to a system of files or objects, e.g. local or distributed file system or database · CPC title
Backup scheduling policy · CPC title
Clustering or classification · CPC title
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Backup or restore · CPC title