Techniques for configuring a processor to function as multiple, separate processors

US11249905B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11249905-B2
Application number  US-201916562361-A
Country             US
Kind code           B2
Filing date         Sep 5, 2019
Priority date       Sep 5, 2019
Publication date    Feb 15, 2022
Grant date          Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
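
The partitioning idea in the abstract can be illustrated with a small model. The following is a minimal sketch, not the patented implementation: the class names `PPU` and `Partition`, the equal-split policy, and the `access` check are all hypothetical and chosen only to show an admin dividing resources and guests being confined to their own partition.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Partition:
    """Hypothetical model of one partition: a subset of compute and memory."""
    compute_units: range
    memory_bytes: range
    guest: Optional[str] = None


@dataclass
class PPU:
    """Hypothetical model of a PPU that an admin can divide into partitions."""
    num_compute_units: int
    memory_size: int
    partitions: List[Partition] = field(default_factory=list)

    def partition(self, count: int) -> None:
        """Admin step: split resources into `count` equal, disjoint partitions.

        The equal split is an assumption made to keep the sketch short.
        """
        if count <= 0 or self.num_compute_units % count or self.memory_size % count:
            raise ValueError("resources must divide evenly in this sketch")
        cu = self.num_compute_units // count
        mem = self.memory_size // count
        self.partitions = [
            Partition(range(i * cu, (i + 1) * cu), range(i * mem, (i + 1) * mem))
            for i in range(count)
        ]

    def assign(self, guest: str, index: int) -> Partition:
        """Assign a guest user to a partition that has no guest yet."""
        part = self.partitions[index]
        if part.guest is not None:
            raise PermissionError("partition already has a guest")
        part.guest = guest
        return part

    def access(self, guest: str, address: int) -> int:
        """Return the index of the guest's partition that owns `address`.

        Raises PermissionError when the address lies outside every partition
        assigned to this guest, which models isolation between guests.
        """
        for i, part in enumerate(self.partitions):
            if part.guest == guest and address in part.memory_bytes:
                return i
        raise PermissionError("address is outside the guest's partition")
```

With 1024 bytes split four ways, a guest assigned to partition 0 can touch addresses 0 through 255 only; an attempt to reach address 300 raises `PermissionError` even though that address exists on the device.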

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: partitioning a first portion of a set of memory resources included in a processor to generate a plurality of logical memory partitions, wherein each logical memory partition is associated with at least one engine included in a plurality of engines and corresponds to a different address range within a first set of memory addresses, and wherein each address range is for accessing a different subset of cache slices included in a plurality of cache slices; partitioning a second portion of the set of memory resources to generate a memory section that includes a portion of each cache slice in the plurality of cache slices, wherein the second portion is accessed via a second set of memory addresses for accessing any of the plurality of cache slices, and wherein the second portion is inaccessible via the first set of memory addresses; receiving a plurality of memory access requests from the plurality of engines; and transmitting each memory access request included in the plurality of memory access requests to the memory section or to one of the logical memory partitions, wherein each memory access request is serviced by the memory section or the respective logical memory partition.

2. The computer-implemented method of claim 1, wherein partitioning the first portion of the set of memory resources comprises activating a set of logical boundaries associated with the set of memory resources based on a set of bits, wherein each bit included in the set of bits corresponds to a different logical boundary included in the set of logical boundaries.

3. The computer-implemented method of claim 1, wherein a first memory access request included in the plurality of memory access requests is transmitted to first logical memory partition included in the plurality of logical memory partitions based on a set of bits, wherein the first portion of the set of memory resources is partitioned based, at least in part, on the set of bits.

4. The computer-implemented method of claim 1, further comprising translating a first address associated with a first memory access request included in the plurality of memory access requests based on a set of bits to generate a second address, wherein the second address is associated with a memory location residing within a first logical memory partition included in the plurality of logical memory partitions, and the set of memory resources is partitioned based, at least in part, on the set of bits.

5. The computer-implemented method of claim 1, further comprising designating the second portion of the set of memory resources for access by an authorized entity, wherein the second portion of the set of memory resources is not included in the plurality of logical memory partitions, and the authorized entity partitions the first portion of the set of memory resources to generate the plurality of logical memory partitions.

6. The computer-implemented method of claim 1, further comprising: determining a first fault associated with a first memory access request that is associated with a memory location residing within a first logical memory partition included in the plurality of logical memory partitions; translating a virtual fault identifier corresponding to the first fault to a global fault identifier corresponding to the first fault; and resetting a first engine based on the global fault identifier, wherein the first engine is included in the plurality of engines and is configured to access the first logical memory partition.

7. The computer-implemented method of claim 1, further comprising translating a virtual fault identifier included in a local address space associated with a first logical memory partition included in the plurality of logical memory partitions to generate a global fault identifier included in a global address space associated with the set of memory resources, wherein the virtual fault identifier is generated by the first logical memory partition when servicing a first memory access request included in the plurality of memory access requests.

8. The computer-implemented method of claim 1, further comprising: configuring a first logical memory partition included in the plurality of logical memory partitions to include a first region of memory that stores data associated with a first engine included in the plurality of engines; and configuring the first logical memory partition to include a second region of memory that stores data associated with a second engine included in the plurality of engines.

9. The computer-implemented method of claim 8, wherein the first engine executes a first set of processing tasks associated with a first processing context using the first region of memory, and the second engine executes a second set of processing tasks associated with a second processing context using the second region of memory.

10. The computer-implemented method of claim 1, further comprising: configuring a first logical memory partition included in the plurality of logical memory partitions to include a first region of memory that stores data associated with a first processing subcontext that is derived from a first processing context; and configuring the first logical memory partition to include a second region of memory that stores data associated with a second processing subcontext that also is derived from the first processing context.

11. A non-transitory computer-readable medium storing program instructions that, when executed by a processor, cause the processor to perform steps of: partitioning a first portion of a set of memory resources included in a processor to generate a plurality of logical memory partitions, wherein each logical memory partition is associated with at least one engine included in a plurality of engines and corresponds to a different address range within a first set of memory addresses, and wherein each address range is for accessing a different subset of cache slices included in a plurality of cache slices; partitioning a second portion of the set of memory resources to generate a memory section that includes a portion of each cache slice in the plurality of cache slices, wherein the second portion is accessed via a second set of memory addresses for accessing any of the plurality of cache slices, and wherein the second portion is inaccessible via the first set of memory addresses; and transmitting each memory access request included in a plurality of memory access requests received from the plurality of engines to the memory section or to one of the logical memory partitions, wherein each memory access request is serviced by the memory section or the respective logical memory partition.

12. The non-transitory computer-readable medium of claim 11, wherein the step of partitioning the first set of the set of memory resources comprises activating a set of logical boundaries associated with the set of memory resources based on a set of bits, wherein each bit included in the set of bits corresponds to a different logical boundary included in the set of logical boundaries.

13. The non-transitory computer-readable medium of claim 11, wherein a first memory access request included in the plurality of memory access requests is transmitted to a first logical memory partition included in the plurality of logical memory partitions based on a set of bits, wherein the first set of the set of memory resources is partitioned based, at least in part, on the set of bits.

14. The non-transitory computer-readable medium of claim 11, further comprising the step of designating the second portion of the set of memory resources for a
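
To make the routing language of claims 1 and 2 more concrete, here is a minimal sketch under stated assumptions. It is not taken from the patent: the constants (`NUM_SLICES`, `SLICE_PARTITION_BYTES`, `SLICE_SECTION_BYTES`, `SECOND_SET_BASE`), the function names, and the interleaving rule for the memory section are all hypothetical. The sketch shows (a) a set of bits activating boundaries between cache slices to form logical memory partitions, and (b) requests being sent either to a partition via the first address set or to a shared memory section via a disjoint second address set.

```python
from typing import List, Tuple

# All values below are assumptions for illustration only.
NUM_SLICES = 8                 # number of cache slices
SLICE_PARTITION_BYTES = 1024   # bytes per slice in the partitioned (first) portion
SLICE_SECTION_BYTES = 64       # bytes per slice reserved for the memory section
FIRST_SET_SIZE = NUM_SLICES * SLICE_PARTITION_BYTES
SECOND_SET_BASE = 1 << 20      # start of the second address set, disjoint from the first


def partitions_from_bits(boundary_bits: int) -> List[range]:
    """Group cache slices into logical partitions.

    Bit i set means the boundary between slice i and slice i + 1 is active,
    so each bit corresponds to a different logical boundary.
    """
    groups: List[range] = []
    start = 0
    for i in range(NUM_SLICES - 1):
        if (boundary_bits >> i) & 1:
            groups.append(range(start, i + 1))
            start = i + 1
    groups.append(range(start, NUM_SLICES))
    return groups


def route(address: int, boundary_bits: int) -> Tuple[str, int, int]:
    """Return (target, partition_index, slice_id) for a memory access request.

    Addresses in the first set reach exactly one logical partition, and each
    partition covers a different subset of slices. Addresses in the second set
    reach the memory section, which spans every slice. The section cannot be
    reached through the first set.
    """
    if 0 <= address < FIRST_SET_SIZE:
        slice_id = address // SLICE_PARTITION_BYTES
        for index, group in enumerate(partitions_from_bits(boundary_bits)):
            if slice_id in group:
                return ("partition", index, slice_id)
    offset = address - SECOND_SET_BASE
    if 0 <= offset < NUM_SLICES * SLICE_SECTION_BYTES:
        # Assumed policy: interleave section addresses across all slices.
        return ("section", 0, offset % NUM_SLICES)
    raise ValueError("address is outside both address sets")
```

For example, with only bit 3 set, slices 0 to 3 form one partition and slices 4 to 7 form another, so an address that falls in slice 5 is routed to partition 1. An address at `SECOND_SET_BASE + 9` is routed to the memory section regardless of how the boundaries are set.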

Assignees

Nvidia Corp

Inventors

Classifications

  • Resource optimization · CPC title

  • Task decomposition · CPC title

  • G06F9/5077 · Primary

    Logical partitioning of resources; Management or configuration of virtualized resources (specific details on emulation or internal functioning of virtual machines G06F9/455) · CPC title

  • Configuration or reconfiguration · CPC title

  • G06F9/5027 · Primary

    the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11249905B2 cover?
A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tas…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/5077. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 15, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).