Performing resynchronization jobs in a distributed storage system based on a parallelism policy

US11494083B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11494083-B2
Application number  US-201916504204-A
Country             US
Kind code           B2
Filing date         Jul 5, 2019
Priority date       Jul 5, 2019
Publication date    Nov 8, 2022
Grant date          Nov 8, 2022

Abstract

Official abstract text for this publication.

The disclosure herein describes performing resynchronization (“resync”) jobs in a distributed storage system based on a parallelism policy. A resync job is obtained from a queue and input/output (I/O) resources that will be used during execution of the resync job are identified. Available bandwidth slots of each I/O resource of the identified I/O resources are determined. The parallelism policy is applied to the identified I/O resources and the available bandwidth slots. Based on the application of the parallelism policy, a bottleneck resource of the I/O resources is determined and a parallel I/O value is calculated based on the available bandwidth slots of the bottleneck resource, wherein the parallel I/O value indicates a quantity of I/O tasks that can be performed in parallel. The resync job is executed using the I/O resources, the execution of the resync job including performance of I/O tasks in parallel based on the parallel I/O value.
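The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `IOResource` class, its fields, and the function names are hypothetical, and the policy shown is the simplest one consistent with the text (the bottleneck is the resource with the fewest available bandwidth slots, and the parallel I/O value is assumed to equal that count).

```python
from dataclasses import dataclass


@dataclass
class IOResource:
    """Hypothetical model of one I/O resource (for example a disk or a network link)."""
    name: str
    total_slots: int
    used_slots: int = 0

    @property
    def available_slots(self) -> int:
        return self.total_slots - self.used_slots


def find_bottleneck(resources):
    """Return the resource with the fewest available bandwidth slots."""
    return min(resources, key=lambda r: r.available_slots)


def parallel_io_value(resources):
    """Assumed policy: one parallel I/O task per free slot on the bottleneck."""
    return find_bottleneck(resources).available_slots


def execute_resync(tasks, resources):
    """Split tasks into batches sized by the parallel I/O value and return the batches."""
    batches = []
    pending = list(tasks)
    while pending:
        # Re-evaluate the policy before every batch, since availability may change.
        width = parallel_io_value(resources)
        if width < 1:
            raise RuntimeError("no bandwidth slots available on the bottleneck resource")
        batch, pending = pending[:width], pending[width:]
        for r in resources:
            r.used_slots += len(batch)  # occupy one slot per task on every resource
        batches.append(batch)           # a real system would issue the I/O in parallel here
        for r in resources:
            r.used_slots -= len(batch)  # release the slots once the batch completes
    return batches


resources = [
    IOResource("source_disk", total_slots=8, used_slots=2),    # 6 slots free
    IOResource("network_link", total_slots=10, used_slots=7),  # 3 slots free
    IOResource("dest_disk", total_slots=8, used_slots=3),      # 5 slots free
]
batches = execute_resync(range(7), resources)
```

In this example the network link has the fewest free slots, so it limits the resync job to three I/O tasks at a time even though both disks could accept more.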

First claim

Opening claim text (preview).

What is claimed is:

1. A computerized method for performing resync jobs in a distributed storage system based on a parallelism policy, the method comprising: obtaining, by a processor, a resync job from a pending resync job queue, wherein the resync job includes information describing a current data location and a destination data location; identifying, based on the information, by the processor, a plurality of input/output (I/O) resources of the distributed storage system that will be used during execution of the obtained resync job; applying, by the processor, the parallelism policy to the identified plurality of I/O resources to determine one bottleneck I/O resource of the identified plurality of I/O resources and a quantity of available bandwidth slots of the determined bottleneck I/O resource, the bottleneck I/O resource being an I/O resource that has a smallest quantity of available bandwidth slots among the identified plurality of I/O resources; assign bandwidth slots to the plurality of I/O resources based on the quantity of available bandwidth slots of the determined bottleneck I/O resource; and causing, by the processor, the resync job to be executed using the plurality of I/O resources, the execution of the resync job including performance of a quantity of I/O tasks in parallel based on the assigned bandwidth slots.

2. The computerized method of claim 1, wherein execution of the resync job further includes synchronizing data at the destination data location with data at the current data location via an I/O path and, upon more than one I/O path being available, selecting a most direct path.

3. The computerized method of claim 1, the method further comprising: grouping, by the processor, I/O tasks of the resync job into task batches, each task batch including one or more I/O tasks; wherein causing the resync job to be executed includes executing one task batch of the task batches of the resync job at a time, wherein applying the parallelism policy to determine the bottleneck I/O resource is performed in preparation for execution of each task batch of the task batches of the resync job.

4. The computerized method of claim 3, wherein the parallelism policy uses data associated with the plurality of input/output (I/O) resources to determine the bottleneck I/O resource.

5. The computerized method of claim 1, wherein the resync job includes a priority value; and wherein determining the bottleneck I/O resource is based on a ratio of the priority value of the resync job to overall priority values of each I/O resource, wherein an overall priority value of an I/O resource is a sum of priority values of all jobs using the I/O resource.

6. The computerized method of claim 1, the method further comprising: collecting, by the processor, performance data of the resync job during execution of the resync job; and based on determining, from the collected performance data, that the resync job has diminishing performance returns based on the quantity of assigned bandwidth slots, donating, by the processor, one or more of the bandwidth slots assigned to the resync job, wherein donated bandwidth slots are returned to a pool of available bandwidth slots and a quantity of the assigned bandwidth slots is reduced based on the donated bandwidth slots.

7. The computerized method of claim 6, the method further comprising: based on a donation time period associated with donation of one or more bandwidth slots ending, reclaiming, by the processor, one or more donated bandwidth slots to the resync job, wherein the quantity of the assigned bandwidth slots is increased based on the reclaimed bandwidth slots.

8. One or more non-transitory computer storage media having computer-executable instructions for performing resync jobs in a distributed storage system based on a parallelism policy that, upon execution by a processor, cause the processor to at least: obtain a resync job from a pending resync job queue, wherein the resync job includes information describing a current data location and a destination data location; identify, based on the information, a plurality of input/output (I/O) resources of the distributed storage system that will be used during execution of the obtained resync job; apply the parallelism policy to the identified plurality of I/O resources to determine one bottleneck I/O resource of the identified plurality of I/O resources and a quantity of available bandwidth slots of the determined bottleneck I/O resource, the bottleneck I/O resource being an I/O resource that has a smallest quantity of available bandwidth slots among the identified plurality of I/O resources; assign bandwidth slots to the plurality of I/O resources based on the quantity of available bandwidth slots of the determined bottleneck I/O resource; and cause the resync job to be executed using the plurality of I/O resources, the execution of the resync job including performance of a quantity of I/O tasks in parallel based on the assigned bandwidth slots.

9. The one or more computer storage media of claim 8, wherein causing the resync job to be executed includes preventing the resync job from occupying bandwidth of other I/O resources of the distributed storage system, by assigning one or more bandwidth slots from the quantity of available bandwidth slots to the resync job.

10. The one or more computer storage media of claim 8, wherein the computer-executable instructions, upon execution by a processor, further cause the processor to at least group I/O tasks of the resync job into task batches, each task batch including one or more I/O tasks; wherein causing the resync job to be executed includes executing one task batch of the task batches of the resync job at a time, wherein applying the parallelism policy to determine the bottleneck I/O resource is performed in preparation for execution of each task batch of the task batches of the resync job.

11. The one or more computer storage media of claim 10, wherein a quantity of I/O tasks grouped into each task batch is based on a calculated parallel I/O value.

12. The one or more computer storage media of claim 8, wherein the resync job includes a priority value; and wherein determining the bottleneck I/O resource is based on a ratio of the priority value of the resync job to overall priority values of each I/O resource, wherein an overall priority value of an I/O resource is a sum of priority values of all jobs using the I/O resource.

13. The one or more computer storage media of claim 8, wherein the computer-executable instructions, upon execution by a processor, further cause the processor to at least: collect performance data of the resync job during execution of the resync job; and based on determining, from the collected performance data, that the resync job has diminishing performance returns based on the quantity of assigned bandwidth slots, donate one or more of the bandwidth slots assigned to the resync job, wherein donated bandwidth slots are returned to a pool of available bandwidth slots and a quantity of the assigned bandwidth slots is reduced based on the donated bandwidth slots.

14. A system for performing resync jobs in a distributed storage system based on a parallelism policy, the system comprising: at least one processor; and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the at least one processor to: obtain a resync job from a pending resync job queue, wherein the resync job includes information describing a current data location and a destination data loca
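The priority-ratio rule (claims 5 and 12) and the slot donation and reclaim steps (claims 6, 7, and 13) can be illustrated with a short Python sketch. Everything here is a hypothetical reading of the claim language: the function names, the `SlotLease` class, the proportional-share formula, and the 5% `min_gain` threshold for "diminishing performance returns" are assumptions, and the end of the donation time period is modeled as an explicit call to `reclaim()` rather than a timer.

```python
def priority_share(job_priority, priorities_on_resource, total_slots):
    """Assumed formula: slots proportional to the job's priority divided by the
    sum of the priorities of all jobs using the resource (the job included)."""
    overall = sum(priorities_on_resource)
    return total_slots * job_priority // overall


def bottleneck_by_priority(job_priority, resources):
    """resources maps name -> (total_slots, priorities of all jobs using it).
    Returns the resource offering this job the fewest slots, and that count."""
    shares = {
        name: priority_share(job_priority, prios, total)
        for name, (total, prios) in resources.items()
    }
    name = min(shares, key=shares.get)
    return name, shares[name]


class SlotLease:
    """Hypothetical tracker for the slots assigned to one resync job."""

    def __init__(self, assigned, pool):
        self.assigned = assigned
        self.donated = 0
        self.pool = pool  # shared dict of the form {"free": int}

    def donate_if_diminishing(self, throughput_by_slots, min_gain=0.05):
        """Donate one slot if the most recent slot raised throughput by less than min_gain."""
        cur = throughput_by_slots.get(self.assigned)
        prev = throughput_by_slots.get(self.assigned - 1)
        if cur is None or prev is None or prev <= 0 or self.assigned <= 1:
            return False  # not enough performance data, or nothing left to donate
        if (cur - prev) / prev < min_gain:
            self.assigned -= 1
            self.donated += 1
            self.pool["free"] += 1  # the donated slot returns to the shared pool
            return True
        return False

    def reclaim(self):
        """Take back donated slots that are still free once the donation period ends."""
        n = min(self.donated, self.pool["free"])
        self.pool["free"] -= n
        self.donated -= n
        self.assigned += n
        return n


resources = {
    "source_disk": (12, [2, 1, 1]),  # this job (priority 2) plus two other jobs
    "network_link": (10, [2, 3]),
    "dest_disk": (8, [2]),
}
name, slots = bottleneck_by_priority(2, resources)

pool = {"free": 0}
lease = SlotLease(assigned=slots, pool=pool)
measured = {3: 300.0, 4: 306.0}  # made-up throughput figures keyed by slot count
donated = lease.donate_if_diminishing(measured)
after_donation = lease.assigned
reclaimed = lease.reclaim()
```

With these made-up numbers, the network link gives the job the smallest share (4 of 10 slots). The fourth slot improves throughput by only 2%, so it is donated to the pool and later reclaimed.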

Assignees

VMware Inc

Classifications

  • Replication mechanisms · CPC title

  • Hypervisors; Virtual machine monitors · CPC title

  • G06F3/061 (Primary)

    Improving I/O performance · CPC title

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11494083B2 cover?
It covers performing resynchronization (“resync”) jobs in a distributed storage system based on a parallelism policy. The policy identifies the bottleneck among the I/O resources a resync job will use, and the bottleneck's available bandwidth slots determine how many I/O tasks the job performs in parallel.
Who is the assignee on this patent?
VMware Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/061. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 8, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).