Composable neural network kernels

US12190225B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12190225-B2
Application number  US-202016779557-A
Country             US
Kind code           B2
Filing date         Jan 31, 2020
Priority date       Jun 27, 2019
Publication date    Jan 7, 2025
Grant date          Jan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, responsive to the first request, performing the first operation on the generic tensor descriptor, receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor, and responsive to the second request, performing the second operation on the generic tensor raw data.
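
The split the abstract draws between descriptor operations and raw-data operations can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `TensorDescriptor` class, its `strides` field, and the helpers `packed` and `copy_out` are hypothetical names chosen for illustration, and row-major strides are an assumption. Slice and reorder only build a new descriptor, while the copy is the step that actually reads the raw data.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TensorDescriptor:
    """Hypothetical descriptor: maps a multi-index to an offset in raw data."""
    lengths: Tuple[int, ...]   # length of each dimension
    strides: Tuple[int, ...]   # assumed: elements skipped per unit step in each dimension
    base: int = 0              # offset of element (0, ..., 0) in the raw data

    def offset(self, index: Sequence[int]) -> int:
        """Translate index values into a position in the raw data."""
        if len(index) != len(self.lengths):
            raise IndexError("wrong number of indices")
        for i, n in zip(index, self.lengths):
            if not 0 <= i < n:
                raise IndexError("index out of range")
        return self.base + sum(i * s for i, s in zip(index, self.strides))

    def slice(self, dim: int, start: int, stop: int) -> "TensorDescriptor":
        """Descriptor-only operation: limit the indices used along one dimension."""
        lengths = list(self.lengths)
        lengths[dim] = stop - start
        new_base = self.base + start * self.strides[dim]
        return TensorDescriptor(tuple(lengths), self.strides, new_base)

    def reorder(self, order: Sequence[int]) -> "TensorDescriptor":
        """Descriptor-only operation: change the dimension order."""
        return TensorDescriptor(
            tuple(self.lengths[d] for d in order),
            tuple(self.strides[d] for d in order),
            self.base,
        )


def packed(lengths: Sequence[int]) -> TensorDescriptor:
    """Build a descriptor for densely stored, row-major data (an assumption)."""
    strides, step = [], 1
    for n in reversed(lengths):
        strides.append(step)
        step *= n
    return TensorDescriptor(tuple(lengths), tuple(reversed(strides)), 0)


def copy_out(desc: TensorDescriptor, raw: List[int]) -> List[int]:
    """Raw-data operation: read every element addressed by the descriptor."""
    ranges = (range(n) for n in desc.lengths)
    return [raw[desc.offset(idx)] for idx in product(*ranges)]


raw = list(range(12))                             # raw data for a 3 x 4 tensor
view = packed((3, 4)).slice(1, 1, 3).reorder((1, 0))
result = copy_out(view, raw)                      # columns 1..2, transposed
```

In this sketch `slice` and `reorder` never receive `raw`, so they cannot modify it. Only `copy_out` reads the data elements, and it does so through the modified descriptor.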

First claim

Opening claim text (preview).

What is claimed is:

1. A method for manipulating a generic tensor, the method comprising: receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, wherein the generic tensor descriptor indicates how to translate one or more index values into one or more memory addresses for generic tensor raw data associated with the generic tensor, wherein the generic tensor descriptor includes one or more of a tensor type for the generic tensor, a number of dimensions of the generic tensor, lengths of each dimension of the generic tensor, and a base address for the generic tensor raw data; performing the first operation on the generic tensor descriptor, the first operation comprising an operation to modify the generic tensor descriptor by performing one or more of limiting indices to use, modifying dimension order, or modifying dimension number, and the first operation does not include modifying any of the generic tensor raw data; receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor; and performing the second operation on the generic tensor raw data.

2. The method of claim 1, wherein the first operation comprises one of: a slice, a strided slice, a reorder, a fold, a merge, an embed, and a move slicing window operation.

3. The method of claim 1, wherein the second operation comprises one of: a slice, a reorder, a copy, a general matrix multiply or batched general matrix multiply, a reduction, and an algorithm-specific transformation.

4. The method of claim 1, wherein the first operation generates a modified generic tensor descriptor based on the generic tensor descriptor and the first operation without modifying the generic tensor raw data.

5. The method of claim 1, wherein the generic tensor raw data comprises: data elements of the generic tensor.

6. The method of claim 1, wherein: the first request, first operation, second request, and second operation are specified by instructions of a program.

7. The method of claim 1, wherein: the first request is specified by program source; the first operation is performed by a compiler configured to compile the program source to generate a compiled program; the second request is specified by the compiled program; and the second operation is performed by the compiled program.

8. The method of claim 1, wherein: the first request and the second request are specified by a program; and at least one of the first operation and the second operation are performed by a specialized hardware circuit configured to perform at least one operation on generic tensor descriptors or on generic tensor raw data.

9. The method of claim 1, wherein the one or more index values are translated into one or more memory addresses for the generic tensor raw data where the translating includes a reduction of dimensionality for a merged tensor or an increase of dimensionality for an embedded tensor.

10. The method of claim 1, wherein the generic tensor descriptor further indicates any one or a combination of the following for the generic tensor: a type of generic tensor, a number of dimensions of the generic tensor, lengths for each dimension of the generic tensor, or a base address for the generic tensor raw data.

11. A system for manipulating a generic tensor, the system comprising: a memory storing a generic tensor descriptor, wherein the generic tensor descriptor indicates how to translate one or more index values into one or more memory addresses for generic tensor raw data associated with the generic tensor, wherein the generic tensor descriptor includes one or more of a tensor type for the generic tensor, a number of dimensions of the generic tensor, lengths of each dimension of the generic tensor, and a base address for the generic tensor raw data; and a processor, configured to: receive a first request to perform a first operation on the generic tensor descriptor associated with the generic tensor; perform the first operation on the generic tensor descriptor, the first operation comprising an operation to modify the generic tensor descriptor by performing one or more of limiting indices to use, modifying dimension order, or modifying dimension number, and the first operation does not include modifying any of the generic tensor raw data; receive a second request to perform a second operation on generic tensor raw data associated with the generic tensor; and performing the second operation on the generic tensor raw data.

12. The system of claim 11, wherein the first operation comprises one of: a slice, a strided slice, a reorder, a fold, a merge, an embed, and a move slicing window operation.

13. The system of claim 11, wherein the second operation comprises one of: a slice, a reorder, a copy, a general matrix multiply or batched general matrix multiply, a reduction, and an algorithm-specific transformation.

14. The system of claim 11, wherein the first operation generates a modified generic tensor descriptor based on the generic tensor descriptor and the first operation without modifying the generic tensor raw data.

15. The system of claim 11, wherein the generic tensor raw data comprises: data elements of the generic tensor.

16. The system of claim 11, wherein: the first request, first operation, second request, and second operation are specified by instructions of a program.

17. The system of claim 11, wherein: the first request is specified by program source; the first operation is performed by a compiler configured to compile the program source to generate a compiled program; the second request is specified by the compiled program; and the second operation is performed by the compiled program.

18. The system of claim 11, wherein: the first request and the second request are specified by a program; and at least one of the first operation and the second operation are performed by a specialized hardware circuit configured to perform at least one operation on generic tensor descriptors or on generic tensor raw data.

19. The system of claim 11, wherein the one or more index values are translated into one or more memory addresses for the generic tensor raw data where the translating includes a reduction of dimensionality for a merged tensor or an increase of dimensionality for an embedded tensor.

20. The system of claim 11, wherein the generic tensor descriptor further indicates any one or a combination of the following for the generic tensor: a type of generic tensor, a number of dimensions of the generic tensor, lengths for each dimension of the generic tensor, or a base address for the generic tensor raw data.

21. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to manipulate a generic tensor, by: receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, wherein the generic tensor descriptor indicates how to translate one or more index values into one or more memory addresses for generic tensor raw data associated with the generic tensor, wherein the generic tensor descriptor includes one or more of a tensor type for the generic tensor, a number of dimensions of the generic tensor, lengths of each dimension of the generic tensor, and a base address for the generic tensor raw data; performing the first operation on the generic tensor descriptor, the first operation co
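
Claims 9 and 19 mention index translation for merged and embedded tensors. The claim preview does not define those terms, so the Python sketch below is only one plausible reading. A merge is assumed to expose several dimensions as a single index that is split back into per-dimension indices before addressing. An embed is assumed to expose one underlying dimension as several indices combined linearly. The function names and the `coefficients` parameter are hypothetical.

```python
from typing import Sequence, Tuple


def merged_to_multi(merged_index: int, lengths: Sequence[int]) -> Tuple[int, ...]:
    """Split one merged index into per-dimension indices (row-major order assumed)."""
    parts = []
    for n in reversed(lengths):
        parts.append(merged_index % n)
        merged_index //= n
    return tuple(reversed(parts))


def merged_offset(merged_index: int,
                  lengths: Sequence[int],
                  strides: Sequence[int],
                  base: int = 0) -> int:
    """Address an element of a merged view: fewer indices than the stored tensor."""
    multi = merged_to_multi(merged_index, lengths)
    return base + sum(i * s for i, s in zip(multi, strides))


def embedded_offset(index: Sequence[int],
                    coefficients: Sequence[int],
                    base: int = 0) -> int:
    """Address an element of an embedded view: more indices than the stored tensor.

    Each index is scaled by a hypothetical coefficient and the results are summed
    into a single position along the underlying dimension.
    """
    return base + sum(i * c for i, c in zip(index, coefficients))


# Merge: a 2 x 3 tensor stored with strides (8, 2), addressed by one index 0..5.
merge_example = merged_offset(4, lengths=(2, 3), strides=(8, 2))

# Embed: a 1-D signal addressed as (window, tap), with overlapping windows of step 1.
embed_example = embedded_offset((2, 1), coefficients=(1, 1))
```

In both cases the translation is arithmetic on indices and descriptor fields only, which is consistent with claim 1's requirement that the descriptor operation not modify the raw data.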

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • using a mask · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • Divergence aspects · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12190225B2 cover?
A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, responsive to the first request, performing the first operation on the generic tensor descriptor, receiving a second request to perform a second operation on generic tensor raw data associated …
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 7, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).