Systems and methods for automatically linking data analytics to storage

US10909136B1 · US · B1

Patent metadata
Publication number: US-10909136-B1
Application number: US-201715428134-A
Country: US
Kind code: B1
Filing date: Feb 8, 2017
Priority date: Feb 8, 2017
Publication date: Feb 2, 2021
Grant date: Feb 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed computer-implemented method for automatically linking data analytics to storage may include (1) identifying a request to provision storage for a data analytics task, (2) collecting information relating to the data analytics task, the information comprising at least one of a data type of the data being used as input for the data analytics task and a characteristic of the data analytics task, (3) using a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for the data analytics task, and (4) automatically provisioning data storage for the data analytics task based on the suggested type and size. Various other methods, systems, and computer-readable media are also disclosed.
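The four steps in the abstract can be read as a small pipeline. The following is a minimal sketch of that flow, not the patented implementation: the class names, the lookup tables, the storage labels, and the headroom multipliers are all hypothetical, since the abstract does not specify how the suggestion is computed.

```python
import math
from dataclasses import dataclass


@dataclass
class TaskInfo:
    data_type: str        # hypothetical label, e.g. "parquet"
    characteristic: str   # hypothetical label, e.g. "batch-etl"
    input_gib: float      # assumed: expected input volume in GiB


@dataclass
class Suggestion:
    storage_type: str
    size_gib: int


# Assumed rules; the abstract names no concrete types or sizing formula.
TYPE_BY_DATA_TYPE = {"parquet": "object-store", "csv": "file-system"}
HEADROOM_BY_CHARACTERISTIC = {"batch-etl": 2.0}
DEFAULT_HEADROOM = 1.5


def collect_info(request: dict) -> TaskInfo:
    """Step 2: gather the data type and a characteristic of the task."""
    return TaskInfo(
        data_type=request["data_type"],
        characteristic=request["characteristic"],
        input_gib=float(request["input_gib"]),
    )


def compute_suggestion(info: TaskInfo) -> Suggestion:
    """Step 3: derive a suggested type and size from the collected information."""
    storage_type = TYPE_BY_DATA_TYPE.get(info.data_type, "file-system")
    headroom = HEADROOM_BY_CHARACTERISTIC.get(info.characteristic, DEFAULT_HEADROOM)
    return Suggestion(storage_type, math.ceil(info.input_gib * headroom))


def provision(suggestion: Suggestion) -> dict:
    """Step 4: stand-in for a real provisioning call; it only returns a record."""
    return {
        "type": suggestion.storage_type,
        "size_gib": suggestion.size_gib,
        "status": "provisioned",
    }


def handle_request(request: dict) -> dict:
    """Step 1 is the arrival of `request`; steps 2 to 4 then run without user input."""
    return provision(compute_suggestion(collect_info(request)))
```

Calling `handle_request({"data_type": "parquet", "characteristic": "batch-etl", "input_gib": 100})` walks one request through all four steps and returns the provisioning record.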

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for automatically linking data analytics to storage, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising: identifying one or more requests to provision storage within a container-based environment for a plurality of data analytics tasks comprising a first data analytics task, a second data analytics task, and a third data analytics task, the one or more requests to provision storage comprising one or more requests to provision containers within the container-based environment; collecting information relating to each of the data analytics tasks, the information comprising: a data type of the data being used as input for the data analytics task, the data type comprising a file format of the data and a source of the data; and a characteristic of the data analytics task; using a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for each of the data analytics tasks, wherein computing the suggested type and size of data storage comprises: suggesting for the first data analytics task, based on the information collected for the first data analytics task, an object store within a storage architecture that manages data as objects; suggesting for the second data analytics task, based on the information collected for the second data analytics task, a file system that manages data as a file hierarchy; and suggesting for the third data analytics task, based on the information collected for the third data analytics task, a clustered file system configured to be simultaneously mounted on multiple servers but managed as a single system; and automatically provisioning data storage for each of the data analytics tasks, based on the suggested type and size, by (1) automatically provisioning the object store for the first data analytics task, (2) automatically provisioning the file system for the second data analytics task, and (3) automatically provisioning the clustered file system for the third data analytics task, wherein automatically provisioning the data storage for each of the data analytics tasks comprises creating the data storage of the suggested type and size for each of the data analytics tasks and connecting the data storage of the suggested type and size to one or more containers within the container-based environment to be used for the data analytics task.

2. The computer-implemented method of claim 1, wherein the requests to provision storage for the data analytics tasks comprises, for at least one of the data analytics tasks, a request to provision at least one of: scratch space to hold intermediate analytic results; and storage for copy data.

3. The computer-implemented method of claim 1, wherein identifying the requests to provision storage for the data analytics tasks comprises at least one of: receiving a request from an analyst via user input submitted by the analyst; and receiving a request from a data analytics tool being used to perform a data analytics task.

4. The computer-implemented method of claim 1, wherein identifying the requests to provision storage for the data analytics tasks comprises, for at least one of the data analytics tasks, inferring a request in response to determining that an analyst has digitally initiated a task that requires data storage.

5. The computer-implemented method of claim 1, wherein the steps of the method are performed by at least one of: the self-service provisioning tool; and a data analytics tool.

6. The computer-implemented method of claim 1, wherein collecting information relating to each of the data analytics tasks comprises, for at least one of the data analytics tasks, collecting the information in response to prompting a user to submit the information.

7. The computer-implemented method of claim 1, wherein collecting information relating to each of the data analytics tasks comprises, for at least one of the data analytics tasks, inferring the information based on attributes of at least one of: the data analytics task; and a data analytics tool being used to perform the data analytics task.

8. The computer-implemented method of claim 1, wherein the data type of the data further comprises a structure of the data.

9. The computer-implemented method of claim 1, wherein the characteristic of the data analytics task comprises at least one of: a type of data analytics being performed; a programming language of a data analytics application performing the data analytics task; extract, transform, and load (ETL) functions to be performed as part of the data analytics task; and an amount of data to be ingested for the data analytics task.

10. The computer-implemented method of claim 1, wherein the characteristic of the data analytics task comprises a feature of the code used by a data analytics application performing the data analytics task.

11. The computer-implemented method of claim 1, further comprising, for each of the data analytics tasks, after computing the suggested type and size of data storage, presenting the suggested type and size of data storage to an analyst via a display element of a device associated with the analyst.

12. The computer-implemented method of claim 11, further comprising: after presenting the suggested type and size of data storage, allowing the analyst to adjust one or more settings associated with at least one of the suggested type and the suggested size; wherein automatically provisioning the data storage is further based on the analyst's adjustments.

13. The computer-implemented method of claim 12, further comprising altering one or more specifications of the self-service provisioning tool based on the analyst's adjustments.

14. The computer-implemented method of claim 1, wherein automatically provisioning the data storage for each of the data analytics tasks comprises provisioning the data storage of the suggested type and size without requiring user input.

15. A system for automatically linking data analytics to storage, the system comprising: a request module, stored in memory, that identifies one or more requests to provision storage within a container-based environment for a plurality of data analytics tasks comprising a first data analytics task, a second data analytics task, and a third data analytics task, the one or more requests to provision storage comprising one or more requests to provision containers within the container-based environment; a collection module, stored in memory, that collects information relating to each of the data analytics tasks, the information comprising: a data type of the data being used as input for the data analytics task, the data type comprising a file format of the data and a source of the data; and a characteristic of the data analytics task; a computation module, stored in memory, that uses a self-service provisioning tool to automatically compute, based on the collected information, a suggested type and size of data storage for each of the data analytics tasks, wherein computing the type and size of data storage comprises: suggesting for the first data analytics task, based on the information collected for the first data analytics task, an object store within a storage architecture that manages data as objects; suggesting for the second data analytics task, based on the information collected for the second data analytics task, a file system that manages data as a file hierarchy; and suggesting for the third data analytics task, based on the
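Claim 1 distinguishes three storage kinds, and claims 11 to 13 describe an analyst adjusting a suggestion and the tool altering its own specifications in response. A rough sketch of both ideas follows. The selection rules, the `parallel_workers` input, the `headroom` parameter, and the averaging update are assumptions made purely for illustration; the claims state only that the choice is based on the collected information and that specifications are altered based on the adjustments.

```python
import math
from enum import Enum


class StorageKind(Enum):
    OBJECT_STORE = "object store"
    FILE_SYSTEM = "file system"
    CLUSTERED_FILE_SYSTEM = "clustered file system"


def suggest_kind(file_format: str, source: str, parallel_workers: int) -> StorageKind:
    """Pick one of the three claimed storage kinds using assumed rules."""
    # Assumed: work spread over several servers needs a shared mount.
    if parallel_workers > 1:
        return StorageKind.CLUSTERED_FILE_SYSTEM
    # Assumed: data that already lives in a bucket stays object-addressed.
    if source == "object-bucket" or file_format == "parquet":
        return StorageKind.OBJECT_STORE
    return StorageKind.FILE_SYSTEM


class ProvisioningTool:
    """Toy stand-in for the self-service provisioning tool."""

    def __init__(self, headroom: float = 1.5):
        # Assumed specification: multiplier applied to the input volume.
        self.headroom = headroom

    def suggest_size(self, input_gib: float) -> int:
        return math.ceil(input_gib * self.headroom)

    def apply_adjustment(self, input_gib: float, adjusted_size_gib: float) -> int:
        """Provision with the analyst's size and update the tool's headroom."""
        implied = adjusted_size_gib / input_gib
        # Assumed update rule: move halfway toward the analyst's implied ratio.
        self.headroom = (self.headroom + implied) / 2
        return math.ceil(adjusted_size_gib)
```

For example, a tool created with the default headroom suggests 150 GiB for 100 GiB of input; if the analyst raises that to 250 GiB, the next suggestion for the same input volume becomes 200 GiB under the assumed update rule.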

Assignees

Veritas Technologies LLC

Classifications

  • G06F16/254 (Primary)

    Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • G06F9/5011 (Primary)

    the resources being hardware resources other than CPUs, Servers and Terminals · CPC title

  • the resource being the memory · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10909136B1 cover?
The disclosed computer-implemented method for automatically linking data analytics to storage may include (1) identifying a request to provision storage for a data analytics task, (2) collecting information relating to the data analytics task, the information comprising at least one of a data type of the data being used as input for the data analytics task and a characteristic of the data analy…
Who is the assignee on this patent?
Veritas Technologies LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 2, 2021 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).