Managing objects stored at a remote storage
US-2023079486-A1 · Mar 16, 2023 · US
US12153638B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12153638-B2 |
| Application number | US-202217975035-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 27, 2022 |
| Priority date | Oct 27, 2022 |
| Publication date | Nov 26, 2024 |
| Grant date | Nov 26, 2024 |
Official abstract text for this publication.
Providing content based data protection for data stored in a large-scale data storage system by creating a dataset by grouping metadata for unstructured data objects that are grouped together by one or more filters. The dataset can span multiple storage devices of different types, so that it defines a single data protection unit for the corresponding content data. A user initiated query input through a search engine interface generates the one or more filters, and a protection policy is defined that protects the dataset as the single unit based on data content rather than data location. Datasets are stored in a catalog, and are generated by running queries on the catalog, where a query comprises metadata selectors as tags applied to the catalog, where the tags define at least one of a file type, name, location, creation time, or file characteristic.
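The flow in the abstract (metadata held in a catalog, a query turned into filters, the matching objects grouped into one dataset, and a policy applied to that dataset as a unit) can be sketched roughly as follows. This is only an illustrative sketch: `ObjectMetadata`, `build_filters`, `run_query`, `apply_policy`, the device names, and the policy name are all hypothetical and do not come from the patent, which does not disclose code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ObjectMetadata:
    """Hypothetical catalog record describing one unstructured data object."""
    object_id: str
    name: str
    file_type: str
    device: str      # which storage device holds the content (assumed field)
    location: str    # path or key on that device (assumed field)
    tags: frozenset = frozenset()


Filter = Callable[[ObjectMetadata], bool]


def build_filters(query: Dict[str, str]) -> List[Filter]:
    """Turn a query of metadata selectors into one predicate per selector."""
    filters: List[Filter] = []
    for key, wanted in query.items():
        if key == "tag":
            filters.append(lambda m, w=wanted: w in m.tags)
        else:
            filters.append(lambda m, k=key, w=wanted: getattr(m, k) == w)
    return filters


def run_query(catalog: List[ObjectMetadata], query: Dict[str, str]) -> List[ObjectMetadata]:
    """Group the metadata records that pass every filter into a dataset."""
    filters = build_filters(query)
    return [m for m in catalog if all(f(m) for f in filters)]


def apply_policy(policy_name: str, dataset: List[ObjectMetadata]) -> List[Tuple[str, str, str]]:
    """Plan one protection job per member, regardless of where it is stored."""
    return [(policy_name, m.object_id, m.device) for m in dataset]


# Made-up catalog spanning two different storage devices.
CATALOG = [
    ObjectMetadata("obj-1", "q1.pdf", "pdf", "nas-01", "/finance/q1.pdf", frozenset({"finance"})),
    ObjectMetadata("obj-2", "q2.pdf", "pdf", "object-store-01", "bucket/q2.pdf", frozenset({"finance"})),
    ObjectMetadata("obj-3", "logo.png", "png", "nas-01", "/art/logo.png", frozenset({"marketing"})),
]

dataset = run_query(CATALOG, {"file_type": "pdf", "tag": "finance"})
jobs = apply_policy("daily-backup", dataset)
for job in jobs:
    print(job)
```

In this sketch the dataset is selected purely by metadata content (file type and tag), so its two members come from different devices yet are handed to the policy together, which is the "content rather than location" idea stated in the abstract.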
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of providing content-based data protection for data stored in a large-scale data storage system, comprising: receiving a query through an interface to a computerized search engine accessing data in the data storage system; defining a protection policy to protect selected data stored in different storage devices and network environments through a deduplication backup program executed by a backup server in the data storage system, the selected data having corresponding metadata, wherein the metadata comprises information describing the selected data through one or more characteristics to establish a unique data identifier for corresponding content data; generating a response to the query comprising a dataset produced upon processing the query, wherein the dataset automatically tracks data added, removed or relocated to content data protected by the defined protection policy; grouping, to create the dataset, metadata for unstructured data objects spanning multiple storage devices of different storage types by one or more filters, wherein the protection policy protects the selected data as a single unit based on data content rather than data location; and applying the defined protection policy to the dataset to perform a data protection operation by the backup server on the selected data.

2. The method of claim 1 wherein the dataset represents a subset of data that a user categorizes for specific purposes, wherein actions performed on the dataset will affect only the subset of data.

3. The method of claim 2 wherein the data protection operation comprises one of: backing up data from operating memory to storage memory, restoring data from the storage to the operating memory, moving data among storage devices, and tiering data between different storage devices.

4. The method of claim 1 further comprising tagging the selected data with a defined metadata tag.

5. The method of claim 4 further comprising: generating the one or more filters upon entry of the query.

6. The method of claim 5 wherein the query comprises metadata selectors applied to a catalog.

7. The method of claim 6 wherein the metadata selectors comprise tags consisting of alphanumeric strings applied to respective data objects based on user-defined rules, and wherein the tags define at least one of a file type, name, location, creation time, or characteristic.

8. The method of claim 7 wherein the dataset is one of a static dataset or a dynamic dataset, wherein the static dataset comprises a fixed amount of data set at a time of creation, and the dynamic dataset comprises an amount of data that changes over time, and wherein the dataset is organized into collection information and per file and object information.

9. The method of claim 8 wherein collection information comprises a dataset creation time, the query, role-based access control (RBAC) for the dataset, and first free-form metadata, and wherein the per file and object information comprises location of data of the dataset, unstructured metadata information, and second free-form metadata.

10. The method of claim 1 wherein the dataset spans multiple storage device types and multiple operating environments including edge networks, core networks and public or cloud networks.

11. A computer-implemented method of providing content-based data protection for data stored in a large-scale data storage system, comprising: defining a protection policy to protect selected data stored in different storage devices or network environments through a deduplication backup program executed by a backup server in the data storage system, the selected data having corresponding metadata, wherein the metadata comprises information describing the selected data through one or more characteristics to establish a unique data identifier for corresponding content data; storing the metadata in a catalog; executing, through an interface to a computerized search engine, a user entered query against the catalog to generate a dataset; generating a response to the query that comprises the dataset; grouping, to create the dataset, metadata for unstructured data objects spanning multiple storage devices of different storage types by one or more filters, wherein the protection policy protects the selected data as a single unit based on data content rather than data location; and applying the defined protection policy to the dataset to protect or otherwise operate on the selected data by the backup server.

12. The method of claim 11 further comprising tagging the selected data with a defined metadata tag.

13. The method of claim 12 further comprising: generating the one or more filters upon entry of the query.

14. The method of claim 13 wherein the query comprises metadata selectors applied to the catalog.

15. The method of claim 14 wherein the metadata selectors comprise tags consisting of alphanumeric strings applied to respective data objects based on user-defined rules, and wherein the tags define at least one of a file type, name, location, creation time, or characteristic.

16. The method of claim 15 wherein the dataset is one of a static dataset or a dynamic dataset, wherein the static dataset comprises a fixed amount of data set at a time of creation, and the dynamic dataset comprises an amount of data that changes over time, and wherein the dataset is organized into collection information and per file and object information.

17. The method of claim 16 wherein collection information comprises a dataset creation time, the query, role-based access control (RBAC) for the dataset, and first free-form metadata, and wherein the per file and object information comprises location of data of the dataset, unstructured metadata information, and second free-form metadata.

18. The method of claim 17 wherein the dataset spans multiple storage device types and multiple operating environments including edge networks, core networks and public or cloud networks.

19. The method of claim 11 wherein the defined protection policy comprises at least one of: backing up data from operating memory to storage memory, restoring data from the storage to the operating memory, moving data among memory, and tiering data between different storage memory.

20. A system for providing content-based data protection for data stored in a large-scale data storage system, comprising: a computerized search engine receiving a query through an interface to access data in the data storage system; a backup server in the data storage system executing a deduplication backup program using a defined a protection policy to protect selected data stored in different storage devices or network environments, the selected data having corresponding metadata, wherein the metadata comprises information describing the selected data through one or more characteristics to establish a unique data identifier for corresponding content data; a search engine component generating a response to the query comprising a dataset produced upon processing the query, wherein the dataset automatically tracks data added, removed or relocated to content data protected by the defined protection policy; a component creating the dataset by grouping metadata for unstructured data objects spanning multiple storage devices of different storage types by one or more filters, wherein the protection policy protects the selected data as a single unit based on data content rather than data location; and a backup server component applying the defin
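Claims 8, 9, 16 and 17 distinguish static from dynamic datasets and split a dataset into collection information and per file and object information. The following is a minimal sketch of one way that structure might look. All class and field names are hypothetical, and re-running the stored query to refresh a dynamic dataset is an assumption made for illustration; the claims do not say how the tracking is implemented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class ObjectInfo:
    """Per file and object information (hypothetical layout)."""
    location: str                                          # where the data lives
    metadata: Dict[str, str]                               # unstructured metadata information
    notes: Dict[str, str] = field(default_factory=dict)    # second free-form metadata


@dataclass
class CollectionInfo:
    """Collection-level information (hypothetical layout)."""
    created: datetime                                      # dataset creation time
    query: Dict[str, str]                                  # the query that produced the dataset
    rbac: Dict[str, List[str]]                             # role -> permitted actions
    notes: Dict[str, str] = field(default_factory=dict)    # first free-form metadata


def matches(obj: ObjectInfo, query: Dict[str, str]) -> bool:
    """True when every selector in the query equals the object's metadata value."""
    return all(obj.metadata.get(key) == value for key, value in query.items())


class Dataset:
    """Static datasets freeze membership at creation; dynamic ones re-run the query."""

    def __init__(self, catalog: List[ObjectInfo], query: Dict[str, str],
                 rbac: Dict[str, List[str]], dynamic: bool) -> None:
        self.collection = CollectionInfo(datetime.now(timezone.utc), dict(query), rbac)
        self.dynamic = dynamic
        self._catalog = catalog                            # shared, mutable catalog
        self._snapshot = [o for o in catalog if matches(o, query)]

    def members(self) -> List[ObjectInfo]:
        if self.dynamic:
            # Assumption: a dynamic dataset is refreshed by re-evaluating its query.
            return [o for o in self._catalog if matches(o, self.collection.query)]
        return list(self._snapshot)


catalog = [ObjectInfo("nas-01:/hr/a.docx", {"file_type": "docx", "tag": "hr"})]
rbac = {"backup-admin": ["read", "protect"]}
static_ds = Dataset(catalog, {"tag": "hr"}, rbac, dynamic=False)
dynamic_ds = Dataset(catalog, {"tag": "hr"}, rbac, dynamic=True)

# A new matching object appears on a different device after both datasets exist.
catalog.append(ObjectInfo("cloud-bucket/hr/b.docx", {"file_type": "docx", "tag": "hr"}))

print(len(static_ds.members()), len(dynamic_ds.members()))
```

Here the static dataset still reports only the object that matched at creation time, while the dynamic dataset picks up the later addition, mirroring the "fixed amount of data" versus "amount of data that changes over time" wording of the claims.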
Querying, e.g. by the use of web search engines · CPC title
to a system of files or objects, e.g. local or distributed file system or database · CPC title