Host system and method for managing data consumption rate in a virtual data processing environment

US10860352B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10860352-B2
Application number: US-201414341596-A
Country: US
Kind code: B2
Filing date: Jul 25, 2014
Priority date: Jul 25, 2013
Publication date: Dec 8, 2020
Grant date: Dec 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments disclosed herein provide systems, methods, and computer readable media for managing data consumption rate in a virtual data processing environment. In a particular embodiment, a method provides, in a cache node of a host system, identifying read completions for one or more virtual machines instantiated in the host system, with the one or more virtual machines processing one or more processing jobs. The method further provides allocating the read completions to individual processing jobs of the one or more processing jobs and accumulating the read completions on a per-job basis, with the cache node determining a data consumption rate for each processing job of the one or more processing jobs.
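The per-job accounting described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `ReadCompletion` record, the `ConsumptionTracker` class, the VM-to-job mapping, and the choice to measure the rate in bytes per second since a job's first observed read are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadCompletion:
    """One completed read seen by the cache node (hypothetical record shape)."""
    vm_id: str
    num_bytes: int
    timestamp: float  # seconds on the cache node's clock


class ConsumptionTracker:
    """Accumulates read completions per job and derives a consumption rate."""

    def __init__(self, vm_to_job):
        # Assumed mapping: each VM is allocated to exactly one processing job.
        self.vm_to_job = dict(vm_to_job)
        self.bytes_by_job = defaultdict(int)
        self.first_seen = {}

    def record(self, completion):
        # Allocate the read completion to the job that owns the VM.
        job = self.vm_to_job[completion.vm_id]
        self.bytes_by_job[job] += completion.num_bytes
        self.first_seen.setdefault(job, completion.timestamp)

    def rate(self, job, now):
        # Bytes per second since the job's first observed read completion.
        start = self.first_seen.get(job)
        if start is None or now <= start:
            return 0.0
        return self.bytes_by_job[job] / (now - start)


tracker = ConsumptionTracker(
    {"vm1": "jobA", "vm2": "jobA", "vm3": "jobB", "vm4": "jobB"}
)
tracker.record(ReadCompletion("vm1", 400, 0.0))
tracker.record(ReadCompletion("vm2", 600, 1.0))
tracker.record(ReadCompletion("vm3", 300, 0.0))
print(tracker.rate("jobA", now=2.0))  # 1000 bytes over 2 s -> 500.0
print(tracker.rate("jobB", now=2.0))  # 300 bytes over 2 s -> 150.0
```

Keying the totals by job rather than by virtual machine mirrors the abstract's "per-job basis": reads from every VM allocated to a job feed a single counter.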

First claim

Opening claim text (preview).

What is claimed is:

1. A method of managing data consumption rate in a virtual data processing environment, the virtual data processing environment comprising a plurality of processing jobs executing on a plurality of virtual machines, wherein each processing job in the plurality of processing jobs is allocated two or more virtual machines in the plurality of virtual machines, and wherein each processing job in the plurality of processing jobs executes on virtual machines separate from the other processing jobs in the plurality of processing jobs, the method comprising: in a cache node, identifying read completion data for each virtual machine in the plurality of virtual machines, wherein the read completion data for each virtual machine tracks read completions from one or more data storage systems by the processing job allocated to the virtual machine, and wherein the plurality of processing jobs each comprise a big data processing job, wherein the cache node facilitates data transfers of job data to be processed by the plurality of processing jobs between the one or more data storage systems and the plurality of virtual machines via a cache memory of the cache node by receiving the job data requested by the plurality of processing jobs from the one or more data storage systems and providing the job data to the plurality of processing jobs by writing the job data to the cache memory; in the cache node, tracking job read completions for each processing job in the plurality of processing jobs based on the read completion data identified from the plurality virtual machines; in the cache node, determining a data consumption rate for each processing job of the plurality of processing jobs based on the job read completions associated with each processing job in the plurality of processing jobs; and allocating or de-allocating host system resources to at least one of the plurality of processing jobs based on at least a corresponding data consumption rate.

2. The method of claim 1, wherein allocating or de-allocating host system resources to the at least one of the plurality of processing jobs based on at least a corresponding data consumption rate comprises allocating or de-allocating host system resources to the at least one of the plurality of processing jobs based on at least the data consumption rate for each processing job in the plurality of processing jobs.

3. The method of claim 1, further comprising predicting one or more job completion times of the plurality of processing jobs, with the one or more job completion times based on one or more corresponding data consumption rates.

4. The method of claim 1, further comprising: predicting a job completion time of a particular processing job based on at least the corresponding data consumption rate; and allocating or de-allocating host system resources to the particular processing job based on the predicted job completion time.

5. The method of claim 1, wherein the plurality of processing jobs comprise Hadoop processing jobs.

6. The method of claim 1, wherein the plurality of virtual machines execute on a first host system with the cache node, wherein the virtual data processing environment comprises a second plurality of virtual machines executing on a second host system capable of executing the plurality of processing jobs, and wherein the method further comprises: in the cache node, transferring the job read completions for each processing job in the plurality of processing jobs to a master cache node; and in the master cache node, receiving, from the cache node and a second cache node associated with the second plurality of virtual machines on the second host system, the job read completions and additional job read completions, and determining an overall data consumption rate for the plurality of processing jobs based on the job read completions and the additional job read completions.

7. The method of claim 1, wherein increased data throughput of the job data is achieved by mapping a portion of memory of each of the plurality of virtual machines to a corresponding portion of the cache memory of the cache node.

8. An apparatus comprising: one or more non-transitory computer readable storage media; program instructions stored on the one or more non-transitory computer readable media that, when executed by a processing system of a host system, direct the processing system to perform a method of managing data consumption rate in a virtual data processing environment, the virtual data processing environment comprising a plurality of processing jobs executing on a plurality of virtual machines, wherein each processing job in the plurality of processing jobs is allocated two or more virtual machines in the plurality of virtual machines, and wherein each processing job in the plurality of processing jobs executes on virtual machines separate from the other processing jobs in the plurality of processing jobs, the method comprising: facilitating, by a cache node interposed between one or more data storage systems and the plurality of virtual machines, data transfers of job data to be processed by the plurality of processing jobs between the one or more data storage systems and the plurality of virtual machines, including identifying read completion data for each virtual machine in the plurality of virtual machines via a cache memory of the cache node by receiving the job data requested by the plurality of processing jobs from the one or more data storage systems and providing the job data to the plurality of processing jobs by writing the job data to the cache memory, wherein the read completion data for each virtual machine tracks read completions from the one or more data storage systems by the processing job allocated to the virtual machine; tracking, by the cache node, job read completions for each processing job in the plurality of processing jobs based on the read completion data identified from the plurality of virtual machines; and determining, by the cache node, a data consumption rate for each processing job of the plurality of processing jobs based on the job read completions associated with each processing job in the plurality of processing jobs; and allocating or de-allocating host system resources to at least one of the plurality of processing jobs based on at least a corresponding data consumption rate.

9. The apparatus of claim 8, wherein allocating or de-allocating host system resources to the at least one of the plurality of processing jobs based on at least a corresponding data consumption rate comprises allocating or de-allocating host system resources to the at least one of the plurality of processing jobs based on at least the data consumption rate for each processing job in the plurality of processing jobs.

10. The apparatus of claim 8, wherein the method further comprises predicting one or more job completion times of the plurality of processing jobs being processed by the host system, with the one or more job completion times based on one or more corresponding data consumption rates.

11. The apparatus of claim 8, wherein the method further comprises: predicting a job completion time of a particular processing job based on at least the corresponding data consumption rate; and allocating or de-allocating host system resources to the particular processing job based on the predicted job completion time.

12. The apparatus of claim 8, wherein the virtual machines comprise a Hadoop virtual processing cluster, and wherein the plurality of processing jobs comprise Hadoop processing jobs.

13. The apparatus of claim 8, wherein the method further comprises: transferring the job read co
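Claims 3, 4, and 6 describe predicting job completion times from consumption rates, acting on the prediction, and combining per-node counts at a master cache node. The Python sketch below shows one way these steps could fit together. The claims do not specify a prediction formula or an allocation policy, so the function names, the "remaining bytes divided by rate" estimate, and the deadline-based decision are hypothetical choices for illustration only.

```python
def merge_node_reports(reports):
    """Sum per-job byte counts reported by several cache nodes (master side)."""
    totals = {}
    for report in reports:
        for job, num_bytes in report.items():
            totals[job] = totals.get(job, 0) + num_bytes
    return totals


def overall_rates(totals, elapsed_s):
    """Overall bytes-per-second rate per job across all reporting nodes."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return {job: num_bytes / elapsed_s for job, num_bytes in totals.items()}


def predict_completion_seconds(remaining_bytes, rate_bytes_per_s):
    """Assumed estimate: time left = data left to read / current rate."""
    if rate_bytes_per_s <= 0:
        return float("inf")  # no progress observed, so no finite estimate
    return remaining_bytes / rate_bytes_per_s


def allocation_decision(predicted_s, deadline_s):
    """Hypothetical policy: add resources to jobs predicted to miss a deadline."""
    return "allocate" if predicted_s > deadline_s else "de-allocate"


# Two cache nodes report per-job bytes read over the same 10-second window.
totals = merge_node_reports([{"jobA": 1000, "jobB": 300}, {"jobA": 500}])
rates = overall_rates(totals, elapsed_s=10.0)
eta = predict_completion_seconds(3000, rates["jobA"])
print(totals, rates["jobA"], eta, allocation_decision(eta, deadline_s=15.0))
```

A real scheduler would likely smooth the rate over a sliding window and avoid de-allocating from every job that is merely on time; the two-way decision here only shows where the predicted completion time enters the allocation step.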

Assignees

Inventors

Classifications

  • Hypervisor-specific management and integration aspects · CPC title

  • the resource being the memory · CPC title

  • Guest-host, i.e. hypervisor is an application program itself, e.g. VirtualBox · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10860352B2 cover?
Embodiments disclosed herein provide systems, methods, and computer readable media for managing data consumption rate in a virtual data processing environment. In a particular embodiment, a method provides, in a cache node of a host system, identifying read completions for one or more virtual machines instantiated in the host system, with the one or more virtual machines processing one or more …
Who is the assignee on this patent?
Hewlett Packard Enterprise Development LP (listed in the source data as "Hewlett Packard Entpr Dev Lp")
What technology area does this patent fall under?
Primary CPC classification G06F9/45558. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 8, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).