Genomic application data storage

US10324907B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10324907-B2
Application number: US-201514747347-A
Country: US
Kind code: B2
Filing date: Jun 23, 2015
Priority date: Mar 14, 2013
Publication date: Jun 18, 2019
Grant date: Jun 18, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

It is decided whether to increase a total amount of storage in a pool of Hadoop storage and whether to increase a total amount of processing in a pool of Hadoop processing. If it is decided to increase the total amount of storage and not increase the total amount of processing, the total amount of storage is increased without increasing processing. If it is decided to not increase the total amount of storage and increase the total amount of processing, the total amount of processing is increased without increasing storage. In response to receiving a request to perform a process on a set of data, processing is allocated from the pool of processing and storage is allocated from the pool of storage where the allocated processing and storage are used to perform the process on the set of data.
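The decision logic in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical `ResourcePools` class, with terabytes and node counts as units; neither the names nor the units come from the patent, and the sketch does not model any real Hadoop API. It only shows the idea that each pool grows independently and that a request draws from both pools.

```python
from dataclasses import dataclass


@dataclass
class ResourcePools:
    """Hypothetical model of separate storage and processing pools."""
    storage_tb: int           # total storage in the pool (unit is an assumption)
    compute_nodes: int        # total processing in the pool (unit is an assumption)
    storage_used_tb: int = 0
    compute_used: int = 0

    def scale(self, add_storage_tb: int = 0, add_compute_nodes: int = 0) -> None:
        """Grow either pool on its own; a zero leaves that pool untouched."""
        if add_storage_tb < 0 or add_compute_nodes < 0:
            raise ValueError("this sketch only models increases")
        self.storage_tb += add_storage_tb
        self.compute_nodes += add_compute_nodes

    def allocate(self, need_storage_tb: int, need_compute_nodes: int) -> dict:
        """Reserve resources from both pools for one processing request."""
        free_storage = self.storage_tb - self.storage_used_tb
        free_compute = self.compute_nodes - self.compute_used
        if need_storage_tb > free_storage or need_compute_nodes > free_compute:
            raise RuntimeError("insufficient free resources in a pool")
        self.storage_used_tb += need_storage_tb
        self.compute_used += need_compute_nodes
        return {"storage_tb": need_storage_tb, "compute_nodes": need_compute_nodes}


pools = ResourcePools(storage_tb=100, compute_nodes=4)
pools.scale(add_storage_tb=50)      # storage grows, processing unchanged
pools.scale(add_compute_nodes=2)    # processing grows, storage unchanged
grant = pools.allocate(need_storage_tb=120, need_compute_nodes=5)
```

Keeping the two totals as separate fields is the point of the sketch: a storage-heavy workload can add capacity without paying for compute it does not need, and the reverse.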

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a hardware processor; and a memory coupled with the hardware processor, wherein the memory is configured to provide the hardware processor with instructions which when executed cause the processor to: decide whether to increase a total amount of storage in a pool of Hadoop storage; decide whether to increase a total amount of processing in a pool of Hadoop processing; in the event it is decided to: (1) increase the total amount of storage in the pool of Hadoop storage and (2) not increase the total amount of processing in the pool of Hadoop processing, increase the total amount of storage in the pool of Hadoop storage without increasing the total amount of processing in the pool of Hadoop processing; and in the event it is decided to: (1) not increase the total amount of storage in the pool of Hadoop storage and (2) increase the total amount of processing in the pool of Hadoop processing, increase the total amount of processing in the pool of Hadoop processing without increasing the total amount of storage in the pool of Hadoop storage, wherein in response to receiving a request to perform a process on a set of data: one or more storage resources from the pool of Hadoop storage is allocated, including a Hadoop file system; and one or more processing resources from the pool of Hadoop processing is allocated, including a virtual and customized Hadoop compute node wherein an application or toolkit which runs on the virtual and customized Hadoop compute node is decoupled from a specific implementation of an underlying processor below a virtualization, including by: storing a Hadoop compute node as a template; deploying the Hadoop compute node stored as the template in order to obtain a default virtual Hadoop compute node; and customizing the default virtual Hadoop compute node in order to obtain the virtual and customized Hadoop compute node, including by using a custom script to link the virtual and customized Hadoop compute node to the Hadoop file system.

2. The system recited in claim 1, wherein the pool of Hadoop processing includes one or more of the following: virtual processing or Greenplum HD.

3. The system recited in claim 1, wherein the pool of Hadoop storage includes one or more of the following: virtual storage, Isilon, a storage technology that supports a plurality of file system protocols on a single storage platform, or a storage technology that supports petabytes of storage.

4. The system recited in claim 1, wherein the set of data includes genome data and a genome analysis toolkit runs on the allocated processing and the allocated storage.

5. A method, comprising: using a processor to decide whether to increase a total amount of storage in a pool of Hadoop storage; using the processor to decide whether to increase a total amount of processing in a pool of Hadoop processing; in the event it is decided to: (1) increase the total amount of storage in the pool of Hadoop storage and (2) not increase the total amount of processing in the pool of Hadoop processing, using the processor to increase the total amount of storage in the pool of Hadoop storage without increasing the total amount of processing in the pool of Hadoop processing; and in the event it is decided to: (1) not increase the total amount of storage in the pool of Hadoop storage and (2) increase the total amount of processing in the pool of Hadoop processing, using the processor to increase the total amount of processing in the pool of Hadoop processing without increasing the total amount of storage in the pool of Hadoop storage, wherein in response to receiving a request to perform a process on a set of data: one or more storage resources from the pool of Hadoop storage is allocated, including a Hadoop file system; and one or more processing resources from the pool of Hadoop processing is allocated, including a virtual and customized Hadoop compute node wherein an application or toolkit which runs on the virtual and customized Hadoop compute node is decoupled from a specific implementation of an underlying processor below a virtualization, including by: storing a Hadoop compute node as a template; deploying the Hadoop compute node stored as the template in order to obtain a default virtual Hadoop compute node; and customizing the default virtual Hadoop compute node in order to obtain the virtual and customized Hadoop compute node, including by using a custom script to link the virtual and customized Hadoop compute node to the Hadoop file system.

6. The method recited in claim 5, wherein the pool of Hadoop processing includes one or more of the following: virtual processing or Greenplum HD.

7. The method recited in claim 5, wherein the pool of Hadoop storage includes one or more of the following: virtual storage, Isilon, a storage technology that supports a plurality of file system protocols on a single storage platform, or a storage technology that supports petabytes of storage.

8. The method recited in claim 5, wherein the set of data includes genome data and a genome analysis toolkit runs on the allocated processing and the allocated storage.

9. A computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: deciding whether to increase a total amount of storage in a pool of Hadoop storage; deciding whether to increase a total amount of processing in a pool of Hadoop processing; in the event it is decided to: (1) increase the total amount of storage in the pool of Hadoop storage and (2) not increase the total amount of processing in the pool of Hadoop processing, increasing the total amount of storage in the pool of Hadoop storage without increasing the total amount of processing in the pool of Hadoop processing; and in the event it is decided to: (1) not increase the total amount of storage in the pool of Hadoop storage and (2) increase the total amount of processing in the pool of Hadoop processing, increasing the total amount of processing in the pool of Hadoop processing without increasing the total amount of storage in the pool of Hadoop storage, wherein in response to receiving a request to perform a process on a set of data: one or more storage resources from the pool of Hadoop storage is allocated, including a Hadoop file system; and one or more processing resources from the pool of Hadoop processing is allocated, including a virtual and customized Hadoop compute node wherein an application or toolkit which runs on the virtual and customized Hadoop compute node is decoupled from a specific implementation of an underlying processor below a virtualization, including by: storing a Hadoop compute node as a template; deploying the Hadoop compute node stored as the template in order to obtain a default virtual Hadoop compute node; and customizing the default virtual Hadoop compute node in order to obtain the virtual and customized Hadoop compute node, including by using a custom script to link the virtual and customized Hadoop compute node to the Hadoop file system.

10. The computer program product recited in claim 9, wherein the pool of Hadoop processing includes one or more of the following: virtual processing or Greenplum HD.

11. The computer program product recited in claim 9, wherein the pool of Hadoop storage includes one or more of the following: virtual storage, Isilon, a storage technology that supports a plurality of file system protocols on a single storage platform, or a storage technology that supports petabytes of storage.

12. The computer program product recited in claim 9, wherein the set of data i
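
The template, deploy, and customize steps recited in the independent claims could be sketched roughly as follows. Everything here is illustrative: `ComputeNode`, `deploy`, `customize`, `link_to_file_system`, and the example URI are hypothetical names chosen for this sketch, not anything the patent or a Hadoop distribution defines. A real deployment would clone a virtual machine image and run a provisioning script instead of copying a record.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class ComputeNode:
    """Hypothetical stand-in for a virtual Hadoop compute node."""
    name: str
    packages: tuple = ()
    file_system_uri: Optional[str] = None   # unset until a script links it


# Step 1: a compute node stored as a template (contents are assumptions).
TEMPLATE = ComputeNode(name="hadoop-compute-template", packages=("hadoop",))

Script = Callable[[ComputeNode], ComputeNode]


def deploy(template: ComputeNode, name: str) -> ComputeNode:
    """Step 2: copy the template to obtain a default virtual node."""
    return replace(template, name=name)


def link_to_file_system(uri: str) -> Script:
    """Build a custom script that points a node at a file system."""
    def script(node: ComputeNode) -> ComputeNode:
        return replace(node, file_system_uri=uri)
    return script


def customize(node: ComputeNode, scripts: Iterable[Script]) -> ComputeNode:
    """Step 3: apply each custom script in order to the default node."""
    for script in scripts:
        node = script(node)
    return node


default_node = deploy(TEMPLATE, "compute-01")
custom_node = customize(
    default_node,
    [link_to_file_system("hdfs://storage.example:8020")],  # placeholder URI
)
```

Because the node record is immutable, the stored template is never modified; each deployment and each script returns a new copy, which mirrors the separation between the template, the default node, and the customized node.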

Assignees

Inventors

Classifications

  • Grid computing · CPC title

  • ICT programming tools or database systems specially adapted for bioinformatics · CPC title

  • Indexing; Data structures therefor; Storage structures · CPC title

  • Allocation of resources, e.g. of the central processing unit [CPU] · CPC title

  • G06F16/185 · Primary

    Hierarchical storage management [HSM] systems, e.g. file migration or policies thereof (details of archiving G06F16/11) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10324907B2 cover?
It is decided whether to increase a total amount of storage in a pool of Hadoop storage and whether to increase a total amount of processing in a pool of Hadoop processing. If it is decided to increase the total amount of storage and not increase the total amount of processing, the total amount of storage is increased without increasing processing. If it is decided to not increase the total amo…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/185. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 18, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).