Elastic request handling technique for optimizing workload performance

US2023359359A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2023359359-A1
Application number  US-202217853123-A
Country             US
Kind code           A1
Filing date         Jun 29, 2022
Priority date       May 4, 2022
Publication date    Nov 9, 2023
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An elastic request handling technique limits a number of threads used to service input/output (I/O) requests of a low-latency I/O workload received by a file system server executing on a cluster having a plurality of nodes deployed in a virtualization environment. The limited number of threads (server threads) is constantly maintained as “active” and running on virtual central processing units (vCPUs) of a node. The file system server spawns and organizes the active server threads as one or more pools of threads. The server prioritizes the low-latency I/O requests by loading them onto the active threads and allowing the requests to run on those active threads to completion, thereby obviating overhead associated with lock contention and vCPU migration after a context switch (i.e., to avoid rescheduling a thread on a different vCPU after execution of the thread was suspended).
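The run-to-completion idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `ActivePool`, `LOW_LATENCY`, `BULK`, and `num_vcpus` are hypothetical names, a priority queue stands in for "prioritizes the low-latency I/O requests", and ordinary Python threads are not pinned to vCPUs.

```python
import queue
import threading

# Hypothetical priority classes; the lower value is served first.
LOW_LATENCY, BULK = 0, 1


class ActivePool:
    """A fixed set of long-lived worker threads (assumed: one per vCPU)."""

    def __init__(self, num_vcpus):
        self._requests = queue.PriorityQueue()
        self._seq = 0                    # tie-breaker: FIFO order within a class
        self._lock = threading.Lock()
        # The number of server threads is limited up front and never grows here.
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_vcpus)
        ]
        for worker in self._workers:
            worker.start()

    def submit(self, priority, handler, *args):
        """Queue a request; low-latency requests overtake queued bulk ones."""
        with self._lock:
            self._seq += 1
            seq = self._seq
        self._requests.put((priority, seq, handler, args))

    def _run(self):
        # Each worker stays alive and handles one request at a time,
        # running it to completion before taking the next.
        while True:
            _, _, handler, args = self._requests.get()
            try:
                handler(*args)
            finally:
                self._requests.task_done()

    def drain(self):
        """Block until every submitted request has finished."""
        self._requests.join()
```

A caller would create the pool once, for example `ActivePool(num_vcpus=4)`, and then call `submit(LOW_LATENCY, handler, request)` for each incoming request. Keeping the worker set fixed is what avoids spawning or parking threads on the request path in this sketch.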

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving input/output (I/O) requests at a node of a storage system coupled to persistent storage media; processing the I/O requests using a pool of threads executing on one or more processors of the node; measuring an I/O latency for storing data of the I/O requests on the persistent storage media; determining whether the measured I/O latency exceeds a predetermined threshold; and in response to the measured I/O latency exceeding the predetermined threshold, increasing a number of threads from the pool deployed to process the I/O requests.
2. The method of claim 1 wherein a minimum number of threads is maintained as deployed for low latency workloads identified from the I/O requests.
3. The method of claim 1 wherein the processors are virtual central processing units (vCPUs) and wherein a minimum number of threads is maintained as active threads to run on the vCPUs.
4. The method of claim 3 further comprising prioritizing low latency I/O requests to run to completion on the active threads to obviate overhead associated with lock contention and vCPU migration after a context switch.
5. The method of claim 3 further comprising, in response to the measured I/O latency not exceeding the predetermined threshold, maintaining a minimum number of the active threads so that each vCPU has a dedicated thread running to accommodate processing of the I/O requests.
6. The method of claim 1 wherein a maximum number of threads supported in the pool is based on memory and processing capacity configuration of the node.
7. The method of claim 1 wherein the number of threads in the pool is based on a hardware architecture of the node.
8. The method of claim 1 wherein the number of threads in the pool is determined dynamically by measuring factors affecting the I/O latency of a workload.
9. The method of claim 8 wherein the factors include processor time such as context switches and queue delays.
10. The method of claim 8 wherein the factors include backend I/O time to storage such as time to read or write to the persistent storage media.
11. A method comprising: receiving input/output (I/O) requests at a node of a storage system coupled to persistent storage media; processing the I/O requests using a pool of threads executing on one or more processors of the node; measuring an I/O latency for storing data of the I/O requests on the persistent storage media; and adjusting a number of threads from the pool deployed to process the I/O requests according to the measured I/O latency.
12. The method of claim 11 further comprising: determining whether the measured I/O latency exceeds a predetermined threshold; and in response to the measured I/O latency exceeding the predetermined threshold, increasing the adjusted number of threads.
13. The method of claim 11 wherein a minimum number of threads is maintained as deployed for low latency workloads identified from the I/O requests.
14. The method of claim 11 wherein the processors are virtual central processing units (vCPUs) and wherein a minimum number of threads is maintained as active threads to run on the vCPUs.
15. The method of claim 14 further comprising prioritizing low latency I/O requests to run to completion on the active threads to obviate overhead associated with lock contention and vCPU migration after a context switch.
16. The method of claim 11 wherein a maximum number of threads supported in the pool is based on memory and processing capacity configuration of the node.
17. The method of claim 11 wherein the number of threads in the pool is based on a hardware architecture of the node.
18. The method of claim 11 wherein the number of threads in the pool is determined dynamically by measuring factors affecting the I/O latency of a workload.
19. The method of claim 18 wherein the factors include processor time and backend I/O time to storage.
20. A non-transitory computer readable medium including program instructions for execution on one or more processors of a node of a storage system, the program instructions configured to: receive input/output (I/O) requests at the node coupled to persistent storage media; process the I/O requests using a pool of threads executing on the one or more processors of the node; measure an I/O latency for storing data of the I/O requests on the persistent storage media; and adjust a number of threads from the pool deployed to process the I/O requests according to the measured I/O latency.
21. The non-transitory computer readable medium of claim 20 wherein the program instructions for execution on the one or more processors are further configured to: determine whether the measured I/O latency exceeds a predetermined threshold; and in response to the measured I/O latency exceeding the predetermined threshold, increase the adjusted number of threads.
22. The non-transitory computer readable medium of claim 20 wherein a minimum number of threads is maintained as deployed for low latency workloads identified from the I/O requests.
23. The non-transitory computer readable medium of claim 20 wherein the processors are virtual central processing units (vCPUs) and wherein a minimum number of threads is maintained as active threads to run on the vCPUs.
24. The non-transitory computer readable medium of claim 23 wherein the program instructions for execution on the one or more processors are further configured to prioritize low latency I/O requests to run to completion on the active threads to obviate overhead associated with lock contention and vCPU migration after a context switch.
25. The non-transitory computer readable medium of claim 23 wherein the program instructions for execution on the one or more processors are further configured to, in response to the measured I/O latency not exceeding the predetermined threshold, maintain a minimum number of the active threads so that each vCPU has a dedicated thread running to accommodate processing of the I/O requests.
26. The non-transitory computer readable medium of claim 20 wherein a maximum number of threads supported in the pool is based on memory and processing capacity configuration of the node.
27. The non-transitory computer readable medium of claim 20 wherein the number of threads in the pool is based on a hardware architecture of the node.
28. The non-transitory computer readable medium of claim 20 wherein the number of threads in the pool is determined dynamically by measuring factors affecting the I/O latency of a workload.
29. The non-transitory computer readable medium of claim 28 wherein the factors include I/O time to read or write to the persistent storage media.
30. The non-transitory computer readable medium of claim 28 wherein the factors include processor execution time of context switches and queue delays.
31. A system comprising: a storage system having a node with one or more processors coupled to persistent storage media, the one or more processors configured to execute program instructions to: receive input/output (I/O) requests at the node; process the I/O requests using a pool of threads executing on the one or more processors; measure an I/O latency for storing data of the I/O re
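The threshold logic of claims 1, 5 and 6 can be read as a small sizing rule. The sketch below is only one possible interpretation: the class name `ElasticSizer`, its fields, the growth increment `step`, and the choice to fall back to the minimum when latency is at or below the threshold are all assumptions, not details stated in the claims.

```python
from dataclasses import dataclass, field


@dataclass
class ElasticSizer:
    """Hypothetical controller for the number of deployed pool threads."""

    min_threads: int     # assumed: one active thread per vCPU (claim 5)
    max_threads: int     # assumed: derived from node memory/CPU capacity (claim 6)
    threshold_ms: float  # the predetermined latency threshold (claim 1)
    step: int = 1        # assumed growth increment; not specified in the claims
    deployed: int = field(init=False)

    def __post_init__(self):
        if not 1 <= self.min_threads <= self.max_threads:
            raise ValueError("need 1 <= min_threads <= max_threads")
        self.deployed = self.min_threads

    def observe(self, measured_latency_ms: float) -> int:
        """Update and return the thread count for one latency measurement."""
        if measured_latency_ms > self.threshold_ms:
            # Latency exceeds the threshold: deploy more threads, up to the cap.
            self.deployed = min(self.deployed + self.step, self.max_threads)
        else:
            # Assumption: return to the minimum once latency is acceptable.
            self.deployed = self.min_threads
        return self.deployed
```

With `ElasticSizer(min_threads=4, max_threads=6, threshold_ms=2.0)`, each measurement above 2.0 ms adds one thread until the cap of 6 is reached, and a measurement at or below 2.0 ms returns the count to 4.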

Assignees

Nutanix Inc

Inventors

Classifications

  • G06F3/0611 · Primary

    in relation to response time · CPC title

  • Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices · CPC title

  • Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP] · CPC title

  • G06F3/0659 · Primary

    Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023359359A1 cover?
An elastic request handling technique limits a number of threads used to service input/output (I/O) requests of a low-latency I/O workload received by a file system server executing on a cluster having a plurality of nodes deployed in a virtualization environment. The limited number of threads (server threads) is constantly maintained as “active” and running on virtual central processing units …
Who is the assignee on this patent?
Nutanix Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0611. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 9, 2023 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).