Method for using local BMC to allocate shared GPU resources inside NVMe over fabrics system

US10394604B2 · US · B2

Patent metadata
Publication number: US-10394604-B2
Application number: US-201715603437-A
Country: US
Kind code: B2
Filing date: May 23, 2017
Priority date: Mar 15, 2017
Publication date: Aug 27, 2019
Grant date: Aug 27, 2019

Abstract

Official abstract text for this publication.

According to one general aspect, a system may include a non-volatile memory (NVM), a resource arbitration circuit, and a shared resource. The non-volatile memory may be configured to store data and manage the execution of a task. The non-volatile memory may include a network interface configured to receive data and the task, a NVM processor configured to determine if the processor will execute that task or if the task will be assigned to a shared resource within the system, and a local communication interface configured to communicate with at least one other device within the system. The resource arbitration circuit may be configured to receive a request to assign the task to the shared resource, and manage the execution of the task by the shared resource. The shared resource may be configured to execute the task.
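The flow in the abstract can be sketched in a few lines of Python. This is an illustrative model only, not the patented implementation: the class names, the `OFFLOAD_THRESHOLD` constant, and the size-based decision rule are assumptions chosen for the example, since the abstract does not say how the NVM processor makes its determination.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

OFFLOAD_THRESHOLD = 4  # hypothetical: tasks with this many items or more are offloaded


@dataclass
class Task:
    name: str
    data: List[float]


def sum_of_squares(data: List[float]) -> float:
    """Placeholder workload; any computation could stand in here."""
    return sum(x * x for x in data)


class SharedResource:
    """Stand-in for the shared resource (for example, a graphics processor)."""

    def execute(self, task: Task) -> float:
        return sum_of_squares(task.data)


class ResourceArbitrationCircuit:
    """Receives assignment requests and manages execution on the shared resource."""

    def __init__(self, resource: SharedResource) -> None:
        self.resource = resource

    def assign(self, task: Task) -> float:
        return self.resource.execute(task)


class NvmDevice:
    """Stores data and decides whether to run a task itself or offload it."""

    def __init__(self, arbiter: ResourceArbitrationCircuit) -> None:
        self.arbiter = arbiter
        self.stored: Dict[str, List[float]] = {}

    def should_offload(self, task: Task) -> bool:
        # Assumed rule: larger tasks go to the shared resource.
        return len(task.data) >= OFFLOAD_THRESHOLD

    def receive(self, task: Task) -> Tuple[str, float]:
        self.stored[task.name] = task.data  # the device is the destination for the data
        if self.should_offload(task):
            return "shared", self.arbiter.assign(task)
        return "local", sum_of_squares(task.data)


device = NvmDevice(ResourceArbitrationCircuit(SharedResource()))
print(device.receive(Task("small", [1.0, 2.0])))            # ('local', 5.0)
print(device.receive(Task("large", [1.0, 1.0, 1.0, 1.0])))  # ('shared', 4.0)
```

The point of the sketch is the division of roles: the storage device makes the local-versus-shared decision, and the arbitration circuit sits between the device and the shared resource.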

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a non-volatile memory (NVM) device stores data and manages execution of a task, and wherein the NVM device comprises: a network interface configured to receive data and the task, a NVM processor configured to determine if the NVM processor will execute that task or if the task will be assigned to a shared resource within the system based on the shared resource more efficiently performing the task than the NVM processor, and a local communication interface configured to communicate with at least one other device within the system; a main board sub-system comprising: a switched fabric in communication with the NVM device, wherein the switched fabric sends the data and task to the NVM device as a destination for the task, and a resource arbitration circuit configured to: receive, a request to assign the task to the shared resource, and manage the execution of the task by the shared resource; and the shared resource configured to execute the task.

2. The system of claim 1, wherein the network interface is configured to receive data and the task via a Non-Volatile Memory Express over Fabric protocol.

3. The system of claim 1, wherein the resource arbitration circuit comprises a baseboard management controller.

4. The system of claim 1, further comprising a plurality of non-volatile memory devices, each configured to request an assignment of a respective task to the shared resource; and wherein the resource arbitration circuit configured to: determine if the shared resource is available for a respective task, arbitrate between a plurality of requests to assign a respective task to the shared resource, determine a selected non-volatile memory device that has won the arbitration, and inform the selected non-volatile memory device that the selected non-volatile memory device's task is assigned to the shared resource.

5. The system of claim 1, wherein the shared resource includes a graphics processor.

6. The system of claim 1 wherein, if the task is assigned to the shared resource, data associated with the task is transferred between the non-volatile memory device and the shared resource via the local communication interface.

7. The system of claim 1, wherein the NVM processor is configured to determine if the NVM processor will execute that task or if the task will be assigned to the shared resource, based, at least in part, upon hint information included with the task.

8. The system of claim 1, wherein the task comprises a neural network task.

9. The system of claim 1, wherein, even if the NVM processor determined that the task will be assigned to the shared resource, but the shared resource is not available, the NVM processor is configured to execute the task.

10. An apparatus comprising: a switched fabric configured to communicate with a plurality of non-volatile memory (NVM) devices, wherein the switched fabric sends tasks to the plurality of NVM devices respectively and each of the NVM devices is a destination for its corresponding task; a resource arbitration circuit configured to: receive a request, from a requesting non-volatile memory device from the plurality of NVM devices, to assign a task to a shared processor, wherein a NVM processor of the requesting NVM device determines the shared processor more efficiently performs the task than the NVM processor of the requesting NVM device, and manage the execution of the task by the shared processor; and the shared processor configured to execute the task.

11. The apparatus of claim 10, wherein the resource arbitration circuit comprises: a requester table associating tasks with the non-volatile memory device that requested the respective task's execution; and an availability table indicating the availability state of the shared processor.

12. The apparatus of claim 10, wherein the shared processor is configured to notify the resource arbitration circuit when the task is completed, and wherein the resource arbitration circuit is configured to notify the requesting non-volatile memory device when the task is completed.

13. The apparatus of claim 10, wherein the resource arbitration circuit and the requesting non-volatile memory device communicate, regarding the task, via a high-speed serial computer expansion bus.

14. The apparatus of claim 13, wherein the high-speed serial computer expansion bus includes a Peripheral Component Interconnect Express bus.

15. The apparatus of claim 13, wherein the shared processor communicates with the resource arbitration circuit and the requesting non-volatile memory device regarding the task, via a high-speed serial computer expansion bus.

16. The apparatus of claim 13, further comprising a local communication bus; wherein the switched fabric is configured to communicate directly with the plurality of non-volatile memory devices via a network interface of each respective non-volatile memory device, and at least one initiator device that is external to the apparatus, wherein the at least one initiator device transfers data to, at least a portion of, the plurality of non-volatile memory devices via the switched fabric; and wherein the local communication bus is configured to communicate directly with the shared processor, the resource arbitration circuit, and the plurality of non-volatile memory devices via a local communications interface of each respective non-volatile memory device.

17. A method comprising: receiving, by a non-volatile memory (NVM) device, data from a switched fabric of a main board sub-system via a Non-Volatile Memory Express over Fabric protocol, wherein the NVM device is a destination for the data; determining, by a NVM processor of the NVM device, whether the data is to be processed by a shared processor that is external to the NVM device, based on the shared processor more efficiently processing the data than the NVM processor; forwarding, via a local expansion bus, the data to a resource arbitration circuit of the main board sub-system that is external to the NVM device; processing the data by the shared processor; and returning, via the local expansion bus, the processed data to the NVM device.

18. The method of claim 17, wherein the shared processor includes a graphical processing unit, and wherein the local expansion bus includes a Peripheral Component Interconnect Express bus.

19. The method of claim 17, wherein determining comprises determining if the data is to be processed employing machine learning.

20. The method of claim 17, wherein determining includes examining hint information included in the data.
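The arbitration behavior described in claims 4, 9, 11 and 12 can be modeled roughly as follows. This is a sketch under stated assumptions, not the claimed circuit: the names `ResourceArbiter` and `placement`, the device and task identifiers, and the first-come-first-served policy are all hypothetical, because the claims do not specify how a winner is chosen. The availability table is reduced to a single flag because the example has only one shared processor.

```python
from typing import Dict, List, Optional, Tuple

Request = Tuple[str, str]  # (device_id, task_id); hypothetical identifiers


class ResourceArbiter:
    """Toy model of a resource arbitration circuit for one shared processor."""

    def __init__(self) -> None:
        self.available = True                      # availability table (one entry)
        self.requester_table: Dict[str, str] = {}  # task_id -> requesting device_id

    def arbitrate(self, requests: List[Request]) -> Optional[str]:
        """Return the winning device, or None if the processor is busy."""
        if not self.available or not requests:
            return None
        device_id, task_id = requests[0]  # assumed policy: first request wins
        self.requester_table[task_id] = device_id
        self.available = False
        return device_id

    def task_completed(self, task_id: str) -> str:
        """Called when the shared processor finishes; returns the device to notify."""
        device_id = self.requester_table.pop(task_id)
        self.available = True
        return device_id


def placement(arbiter: ResourceArbiter, requests: List[Request]) -> Dict[str, str]:
    """Devices that do not win arbitration fall back to local execution."""
    winner = arbiter.arbitrate(requests)
    return {dev: ("shared" if dev == winner else "local") for dev, _ in requests}


arbiter = ResourceArbiter()
print(placement(arbiter, [("nvm0", "t1"), ("nvm1", "t2")]))
# {'nvm0': 'shared', 'nvm1': 'local'}
print(placement(arbiter, [("nvm2", "t3")]))  # processor still busy
# {'nvm2': 'local'}
print(arbiter.task_completed("t1"))  # nvm0
print(arbiter.available)             # True
```

A real arbiter could queue losing requests rather than reject them; immediate local fallback is used here only because it mirrors claim 9 most directly.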

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Offload · CPC title

  • G06F9/5027 · Primary

    the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

  • G06F9/5016 · Primary

    the resource being the memory · CPC title

  • G06F9/544 · Primary

    Buffers; Shared memory; Pipes · CPC title

  • Local · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/5027. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 27, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).