Architecture for managing i/o and storage for a virtualization environment using executable containers and virtual machines
US-2016359955-A1 · Dec 8, 2016 · US
US10909136B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10909136-B1 |
| Application number | US-201715428134-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 8, 2017 |
| Priority date | Feb 8, 2017 |
| Publication date | Feb 2, 2021 |
| Grant date | Feb 2, 2021 |
Official abstract text for this publication.
The disclosed computer-implemented method for automatically linking data analytics to storage may include (1) identifying a request to provision storage for a data analytics task, (2) collecting information relating to the data analytics task, the information comprising at least one of a data type of the data being used as input for the data analytics task and a characteristic of the data analytics task, (3) using a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for the data analytics task, and (4) automatically provisioning data storage for the data analytics task based on the suggested type and size. Various other methods, systems, and computer-readable media are also disclosed.
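The four steps in the abstract can be illustrated with a minimal sketch. The abstract does not disclose how the suggestion is computed, so everything concrete here is an assumption made for illustration: the names `TaskInfo`, `Suggestion`, `suggest_storage`, `provision`, and `handle_request`, the `parallel_workers` attribute, the selection rules, and the 1.5x sizing rule. The three storage types are taken from claim 1.

```python
from dataclasses import dataclass
from enum import Enum


class StorageType(Enum):
    # The three storage types named in claim 1.
    OBJECT_STORE = "object store"
    FILE_SYSTEM = "file system"
    CLUSTERED_FILE_SYSTEM = "clustered file system"


@dataclass
class TaskInfo:
    """Step 2: information collected about one data analytics task."""
    file_format: str           # part of the data type, e.g. "csv"
    source: str                # part of the data type, e.g. "warehouse"
    analytics_type: str        # a characteristic of the task
    ingest_gb: float           # a characteristic of the task
    parallel_workers: int = 1  # hypothetical attribute, not from the patent


@dataclass
class Suggestion:
    storage_type: StorageType
    size_gb: float


def suggest_storage(info: TaskInfo) -> Suggestion:
    """Step 3: compute a suggested type and size (invented heuristics)."""
    if info.parallel_workers > 1:
        # Assumption: tasks spread over several servers get shared storage.
        storage_type = StorageType.CLUSTERED_FILE_SYSTEM
    elif info.file_format in {"jpeg", "png", "log"}:
        # Assumption: unstructured blobs go to an object store.
        storage_type = StorageType.OBJECT_STORE
    else:
        storage_type = StorageType.FILE_SYSTEM
    # Assumption: size is the ingested data plus 50% scratch space.
    return Suggestion(storage_type, info.ingest_gb * 1.5)


def provision(suggestion: Suggestion, container_id: str) -> dict:
    """Step 4: stand-in for creating storage and connecting it to a container."""
    return {
        "type": suggestion.storage_type.value,
        "size_gb": suggestion.size_gb,
        "attached_to": container_id,
    }


def handle_request(info: TaskInfo, container_id: str) -> dict:
    """Step 1 is the call itself: a request to provision storage for a task."""
    return provision(suggest_storage(info), container_id)
```

For example, `handle_request(TaskInfo("csv", "warehouse", "batch", 100.0), "c1")` would yield a 150 GB file system attached to the hypothetical container `c1`. A real tool would replace the rule table with whatever logic the provisioning tool actually applies.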
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for automatically linking data analytics to storage, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying one or more requests to provision storage within a container-based environment for a plurality of data analytics tasks comprising a first data analytics task, a second data analytics task, and a third data analytics task, the one or more requests to provision storage comprising one or more requests to provision containers within the container-based environment;
collecting information relating to each of the data analytics tasks, the information comprising: a data type of the data being used as input for the data analytics task, the data type comprising a file format of the data and a source of the data; and a characteristic of the data analytics task;
using a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for each of the data analytics tasks, wherein computing the suggested type and size of data storage comprises: suggesting for the first data analytics task, based on the information collected for the first data analytics task, an object store within a storage architecture that manages data as objects; suggesting for the second data analytics task, based on the information collected for the second data analytics task, a file system that manages data as a file hierarchy; and suggesting for the third data analytics task, based on the information collected for the third data analytics task, a clustered file system configured to be simultaneously mounted on multiple servers but managed as a single system; and
automatically provisioning data storage for each of the data analytics tasks, based on the suggested type and size, by (1) automatically provisioning the object store for the first data analytics task, (2) automatically provisioning the file system for the second data analytics task, and (3) automatically provisioning the clustered file system for the third data analytics task, wherein automatically provisioning the data storage for each of the data analytics tasks comprises creating the data storage of the suggested type and size for each of the data analytics tasks and connecting the data storage of the suggested type and size to one or more containers within the container-based environment to be used for the data analytics task.

2. The computer-implemented method of claim 1, wherein the requests to provision storage for the data analytics tasks comprises, for at least one of the data analytics tasks, a request to provision at least one of: scratch space to hold intermediate analytic results; and storage for copy data.

3. The computer-implemented method of claim 1, wherein identifying the requests to provision storage for the data analytics tasks comprises at least one of: receiving a request from an analyst via user input submitted by the analyst; and receiving a request from a data analytics tool being used to perform a data analytics task.

4. The computer-implemented method of claim 1, wherein identifying the requests to provision storage for the data analytics tasks comprises, for at least one of the data analytics tasks, inferring a request in response to determining that an analyst has digitally initiated a task that requires data storage.

5. The computer-implemented method of claim 1, wherein the steps of the method are performed by at least one of: the self-service provisioning tool; and a data analytics tool.

6. The computer-implemented method of claim 1, wherein collecting information relating to each of the data analytics tasks comprises, for at least one of the data analytics tasks, collecting the information in response to prompting a user to submit the information.

7. The computer-implemented method of claim 1, wherein collecting information relating to each of the data analytics tasks comprises, for at least one of the data analytics tasks, inferring the information based on attributes of at least one of: the data analytics task; and a data analytics tool being used to perform the data analytics task.

8. The computer-implemented method of claim 1, wherein the data type of the data further comprises a structure of the data.

9. The computer-implemented method of claim 1, wherein the characteristic of the data analytics task comprises at least one of: a type of data analytics being performed; a programming language of a data analytics application performing the data analytics task; extract, transform, and load (ETL) functions to be performed as part of the data analytics task; and an amount of data to be ingested for the data analytics task.

10. The computer-implemented method of claim 1, wherein the characteristic of the data analytics task comprises a feature of the code used by a data analytics application performing the data analytics task.

11. The computer-implemented method of claim 1, further comprising, for each of the data analytics tasks, after computing the suggested type and size of data storage, presenting the suggested type and size of data storage to an analyst via a display element of a device associated with the analyst.

12. The computer-implemented method of claim 11, further comprising: after presenting the suggested type and size of data storage, allowing the analyst to adjust one or more settings associated with at least one of the suggested type and the suggested size; wherein automatically provisioning the data storage is further based on the analyst's adjustments.

13. The computer-implemented method of claim 12, further comprising altering one or more specifications of the self-service provisioning tool based on the analyst's adjustments.

14. The computer-implemented method of claim 1, wherein automatically provisioning the data storage for each of the data analytics tasks comprises provisioning the data storage of the suggested type and size without requiring user input.

15. A system for automatically linking data analytics to storage, the system comprising:
a request module, stored in memory, that identifies one or more requests to provision storage within a container-based environment for a plurality of data analytics tasks comprising a first data analytics task, a second data analytics task, and a third data analytics task, the one or more requests to provision storage comprising one or more requests to provision containers within the container-based environment;
a collection module, stored in memory, that collects information relating to each of the data analytics tasks, the information comprising: a data type of the data being used as input for the data analytics task, the data type comprising a file format of the data and a source of the data; and a characteristic of the data analytics task;
a computation module, stored in memory, that uses a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for each of the data analytics tasks, wherein computing the type and size of data storage comprises: suggesting for the first data analytics task, based on the information collected for the first data analytics task, an object store within a storage architecture that manages data as objects; suggesting for the second data analytics task, based on the information collected for the second data analytics task, a file system that manages data as a file hierarchy; and suggesting for the third data analytics task, based on the
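Claims 11 to 13 describe a review loop: the suggestion is shown to the analyst, the analyst may adjust it, provisioning follows the adjustment, and the tool's own specifications are altered as a result. The claims do not say how the specifications change, so the sketch below is only one possible reading. The `ProvisioningTool` class, its `size_factor` and `learning_rate` fields, and the update rule that moves the factor toward what the analyst chose are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProvisioningTool:
    """Hypothetical tool whose only 'specification' is a sizing factor."""
    size_factor: float = 1.5    # assumed: suggested size = ingest size * factor
    learning_rate: float = 0.5  # assumed: how strongly one adjustment counts

    def suggest_size_gb(self, ingest_gb: float) -> float:
        """Compute the suggested size that would be presented (claim 11)."""
        return ingest_gb * self.size_factor

    def record_adjustment(self, ingest_gb: float, adjusted_gb: float) -> None:
        """Alter the tool's specification from an adjustment (claim 13)."""
        if ingest_gb <= 0:
            raise ValueError("ingest_gb must be positive")
        observed_factor = adjusted_gb / ingest_gb
        # Move the factor part of the way toward the analyst's choice.
        self.size_factor += self.learning_rate * (observed_factor - self.size_factor)


def provision_with_review(
    tool: ProvisioningTool,
    ingest_gb: float,
    adjusted_gb: Optional[float] = None,
) -> float:
    """Return the size to provision, honoring an analyst adjustment (claim 12)."""
    suggested = tool.suggest_size_gb(ingest_gb)
    if adjusted_gb is None:
        # No adjustment: provision exactly what was suggested.
        return suggested
    tool.record_adjustment(ingest_gb, adjusted_gb)
    return adjusted_gb
```

Under these assumptions, an analyst who raises a 150 GB suggestion for a 100 GB ingest to 200 GB moves `size_factor` from 1.5 to 1.75, so the next 100 GB task would be offered 175 GB.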
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
the resources being hardware resources other than CPUs, Servers and Terminals · CPC title
the resource being the memory · CPC title