Using a vector processor to configure a direct memory access system for feature tracking operations in a system on a chip

US11934829B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11934829-B2
Application number: US-202218064119-A
Country: US
Kind code: B2
Filing date: Dec 9, 2022
Priority date: Aug 2, 2021
Publication date: Mar 19, 2024
Grant date: Mar 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: determining, using a processor and based at least on first data stored in one or more memories using a direct memory access (DMA) system, one or more locations of a tracked feature as identified using sensor data; storing, using the processor and in the one or more memories, second data representative of the one or more locations; determining, using the DMA system and based at least on the second data stored in the one or more memories, one or more descriptors corresponding to the tracked feature; and storing, using the DMA system and in the one or more memories, third data representative of the one or more descriptors corresponding to the tracked feature.

2. The method of claim 1, further comprising storing, using the DMA system and in the one or more memories, the first data representative of one or more second descriptors corresponding to the tracked feature.

3. The method of claim 1, further comprising: determining, using the processor and based at least on the third data representative of the one or more descriptors, one or more second locations of the tracked feature as identified using the sensor data; and storing, using the processor and in the one or more memories, fourth data representative of the one or more second locations.

4. The method of claim 1, wherein: the second data is representative of one or more address/data pairs corresponding to the one or more locations; and the determining the one or more descriptors corresponding to the tracked feature is based at least on the one or more address/data pairs.

5. The method of claim 1, wherein the second data corresponds to a data format including one or more bytes that represent an address of one or more address/data pairs corresponding to the one or more locations and one or more bytes that represent data of the one or more address/data pairs.

6. The method of claim 1, further comprising configuring, using a processing controller, the processor to determine the one or more locations and the DMA system to determine the one or more descriptors.

7. The method of claim 1, further comprising configuring, using a processing controller, the processor to store the second data in the one or more memories and the DMA system to store the third data in the one or more memories.

8. A system comprising: a processor; and a direct memory access (DMA) system, wherein the system executes operations comprising: determining, using the processor and based at least on first data stored in one or more memories using the DMA system, one or more locations of a tracked feature identified using sensor data; storing, using the processor and in the one or more memories, second data representing the one or more locations of the tracked feature; determining, using the DMA system and based at least on the second data stored in the one or more memories, one or more descriptors corresponding to the tracked feature; and storing, using the DMA system and in the one or more memories, third data representative of the one or more descriptors corresponding to the tracked feature.

9. The system of claim 8, wherein the first data is representative of one or more second descriptors corresponding to the tracked feature.

10. The system of claim 8, wherein the system further executes operations comprising: determining, using the processor and based at least on the third data stored in the one or more memories, one or more second locations of the tracked feature; and storing, using the processor and in the one or more memories, fourth data representative of the one or more second locations of the tracked feature.

11. The system of claim 8, wherein: the second data is representative of one or more address/data pairs corresponding to the one or more locations; and the one or more descriptors corresponding to the tracked feature are determined based at least on the one or more address/data pairs.

12. The system of claim 8, wherein the second data corresponds to a data format including one or more bytes that represent an address of one or more address/data pairs corresponding to the one or more locations and one or more bytes that represent data of the one or more address/data pairs.

13. The system of claim 8, wherein the system further executes operations comprising configuring, using a processing controller, the processor to determine the one or more locations and the DMA system to determine the one or more descriptors.

14. The system of claim 8, wherein the system further executes operations comprising configuring, using a processing controller, the processor to store the second data in the one or more memories and the DMA system to store the third data in the one or more memories.

15. The system of claim 8, wherein: the one or more memories is a vector memory (VMEM); the processor is a vector processing unit (VPU) coupled to the VMEM; and the DMA system is coupled to the VMEM.

16. The system of claim 8, wherein the system is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system on chip (SoC); a system including a programmable vision accelerator (PVA); a system including a vision processing unit; a system implemented using an edge device; a system implemented using a robot; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.

17. A processor comprising: one or more processing units to: retrieve, from one or more memories, first data representative of one or more descriptors associated with a tracked feature identified using sensor data, the first data being stored in the one or more memories using a direct memory access (DMA) system; determine, based at least on the first data, one or more locations of the tracked feature; and store, in the one or more memories, second data representative of the one or more locations of the tracked feature.

18. The processor of claim 17, wherein the one or more processing units are further to store, in the one or more memories, third data representative of one or more second locations of the tracked feature, the one or more second locations being determined based at least on fourth data stored in the one or more memories by the DMA system.

19. The processor of claim 17, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system on chip (SoC); a system including a programmable vision accelerator (PVA); a system including a vision processing unit; a system implemented using an edge device; a system implemented using a robot; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.

20. The method of claim 1, wherein the processor comprises a vector processing unit (VPU) and the one or more memories comprise a vector memory (VMEM).
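
The data flow recited in claims 1 through 5 can be pictured as a two-stage hand-off through a shared memory. What follows is a minimal Python sketch under stated assumptions, not the patented implementation. The function names (`processor_step`, `dma_step`, `run_iteration`), the 4-byte address plus 4-byte data layout of each pair, the two-field `(x, y)` descriptor, and the fixed `motion` offset standing in for real feature tracking are all hypothetical. The claims themselves only require "one or more bytes" for each field of an address/data pair.

```python
import struct

# Hypothetical pair layout: 4-byte address then 4-byte data, little-endian.
# Claim 5 only requires "one or more bytes" for each field.
PAIR = struct.Struct("<II")

# Hypothetical descriptor shape: two fields, the (x, y) origin of a region.
FIELDS_PER_DESCRIPTOR = 2


def processor_step(first_data, motion=(1, 2)):
    """Toy stand-in for the processor (e.g. a VPU).

    first_data: list of (x, y) descriptors previously stored by the DMA.
    Returns second data: packed address/data pairs, one per descriptor field.
    The fixed `motion` offset is a placeholder for real tracking logic.
    """
    pairs = []
    for i, (x, y) in enumerate(first_data):
        base = i * FIELDS_PER_DESCRIPTOR
        pairs.append((base, x + motion[0]))      # address of the x field
        pairs.append((base + 1, y + motion[1]))  # address of the y field
    return b"".join(PAIR.pack(addr, data) for addr, data in pairs)


def dma_step(second_data, descriptor_count):
    """Toy stand-in for the DMA system.

    Decodes the address/data pairs and writes each data value into the
    descriptor field selected by its address, yielding updated descriptors.
    """
    table = [0] * (descriptor_count * FIELDS_PER_DESCRIPTOR)
    for addr, data in PAIR.iter_unpack(second_data):
        table[addr] = data
    return [
        tuple(table[i:i + FIELDS_PER_DESCRIPTOR])
        for i in range(0, len(table), FIELDS_PER_DESCRIPTOR)
    ]


def run_iteration(memory):
    """One pass: the processor writes second data, the DMA writes third data."""
    memory["second"] = processor_step(memory["first"])
    memory["third"] = dma_step(memory["second"], len(memory["first"]))
    return memory


if __name__ == "__main__":
    memory = run_iteration({"first": [(10, 20), (30, 40)]})
    print(memory["third"])
```

In this sketch, feeding `memory["third"]` back in as the next `memory["first"]` mirrors the follow-on step of claim 3, where the processor determines second locations from the descriptors the DMA system stored.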

Assignees

Nvidia Corp

Classifications

  • using a mask · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • G06F9/3004 · Primary

    to perform operations on memory · CPC title

  • G06F13/28 · Primary

    using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title

  • Details on data memory access · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11934829B2 cover?
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardwa…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/3004. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 19, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).