Data shuffle offload

US12229072B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12229072-B2
Application number  US-202418598382-A
Country             US
Kind code           B2
Filing date         Mar 7, 2024
Priority date       Feb 1, 2022
Publication date    Feb 18, 2025
Grant date          Feb 18, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Devices, methods, and systems are provided. In one example, a device is described to include a device interface that receives data from at least one data source; a data shuffle unit that collects the data received from the at least one data source, receives a descriptor that describes a data shuffle operation to perform on the data received from the at least one data source, performs the data shuffle operation on the collected data to produce shuffled data, and provides the shuffled data to at least one data target.
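
The flow in the abstract (collect data, receive a descriptor, perform the shuffle, deliver the result) can be pictured in software. The Python below is only a sketch under stated assumptions: the patent describes a hardware data shuffle unit, not this code, and the names `ShuffleDescriptor` and `shuffle_unit`, the descriptor fields, and the row-major matrix layout are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ShuffleDescriptor:
    """Hypothetical descriptor: names the operation and the matrix shape."""
    op: str    # only "transpose" is handled in this sketch
    rows: int
    cols: int


def transpose(flat: List[int], rows: int, cols: int) -> List[int]:
    """Transpose a row-major rows x cols matrix stored as a flat list."""
    return [flat[r * cols + c] for c in range(cols) for r in range(rows)]


def shuffle_unit(chunks: Iterable[List[int]],
                 desc: ShuffleDescriptor,
                 targets: List[Callable[[List[int]], None]]) -> None:
    # 1. Collect the data received from the source(s).
    collected: List[int] = []
    for chunk in chunks:
        collected.extend(chunk)
    if len(collected) != desc.rows * desc.cols:
        raise ValueError("collected data does not match descriptor shape")
    # 2. Perform the operation that the descriptor describes.
    if desc.op != "transpose":
        raise ValueError(f"unsupported operation: {desc.op}")
    shuffled = transpose(collected, desc.rows, desc.cols)
    # 3. Provide the shuffled data to every target.
    for deliver in targets:
        deliver(shuffled)


received: List[List[int]] = []
shuffle_unit(chunks=[[1, 2, 3], [4, 5, 6]],
             desc=ShuffleDescriptor(op="transpose", rows=2, cols=3),
             targets=[received.append])
print(received[0])  # [1, 4, 2, 5, 3, 6]
```

Here the two chunks stand in for data arriving in pieces, and the single target simply appends the shuffled result to a list. A matrix transpose was picked because it is one of the shuffle operations the claims list.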

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a device interface to receive data from at least one data source; and circuitry to collect the data received from the at least one data source, receive a descriptor that describes a data shuffle operation to perform on the data received from the at least one data source, perform the data shuffle operation on the data to produce shuffled data, and then provide the shuffled data to at least one data target, wherein the descriptor comprises at least one of a processor instruction description, a work queue element (WQE) posted to a queue pair, a memory region description, a description of a Remote Direct Memory Access (RDMA) request, and a description of an application-level request.

2. The system of claim 1, wherein the data shuffle operation is performed according to a descriptor provided by the at least one data source.

3. The system of claim 1, wherein the data received from the at least one data source is stored until a predetermined amount of data is collected.

4. The system of claim 1, wherein the at least one data source comprises a host memory device.

5. The system of claim 1, wherein the at least one data source comprises an on-network device memory.

6. The system of claim 1, wherein the at least one data source comprises a peer memory device.

7. The system of claim 1, wherein the data is received in a plurality of network packets.

8. The system of claim 1, wherein the at least one data target comprises a plurality of data targets.

9. The system of claim 1, wherein the at least one data target comprises at least one of a host memory device, a peer memory device, and an on-network device memory.

10. The system of claim 1, wherein the at least one data target is located remotely from the circuitry and further comprising: a second device interface that communicates with the at least one data target, wherein the device interface comprises a communication port and wherein the second device interface also comprises a communication port.

11. The system of claim 1, wherein the at least one data source is located remotely from the circuitry.

12. The system of claim 1, wherein the descriptor comprises the WQE posted to the queue pair and represents a single data shuffle operation.

13. The system of claim 1, wherein the descriptor is obtained from a memory device and comprises a memory region description that is usable to perform multiple shuffle operations.

14. The system of claim 1, wherein the descriptor is received in a network packet via the device interface.

15. The system of claim 1, wherein the descriptor is received as part of the RDMA request or the application-level request.

16. The system of claim 1, wherein the device interface and the circuitry are provided as part of a Network Interface Controller (NIC).

17. The system of claim 1, wherein the device interface and the circuitry are provided as part of a network switch.

18. The system of claim 1, wherein the data shuffle operation comprises at least one of a matrix transpose, a non-homogenous transpose, a removal of padding bits, an addition of padding bits, a tensor layout conversion, a bit packing, a component packing, and a bit tiling.

19. The system of claim 1, wherein an application receiving the shuffled data is unaware of the data shuffle operation performed by the circuitry.

20. A system, comprising: circuitry in communication with a device interface that receives data from at least one data source, wherein the circuitry collects the data received from the device interface, receives a descriptor that describes a data shuffle operation to perform on the data received from the at least one data source, performs the data shuffle operation on the data to produce shuffled data, and then provides the shuffled data to at least one data target, wherein the descriptor comprises at least one of a processor instruction description, a work queue element (WQE) posted to a queue pair, a memory region description, a description of a Remote Direct Memory Access (RDMA) request, and a description of an application-level request.
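
Claim 3 stores incoming data until a predetermined amount has been collected, and claim 18 lists removal of padding among the possible shuffle operations. The Python below is a rough software analogue of those two ideas, not the claimed circuitry. It makes several assumptions: padding is handled at byte granularity rather than the bit granularity the claim names, and `CollectingBuffer`, `strip_row_padding`, the threshold, the row length, and the stride are hypothetical names and values chosen for illustration.

```python
from typing import List, Optional


class CollectingBuffer:
    """Hypothetical buffer that holds packet payloads until a threshold is met."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # the predetermined amount, in bytes
        self._data = bytearray()

    def push(self, payload: bytes) -> Optional[bytes]:
        """Store a payload; return exactly `threshold` bytes once available."""
        self._data.extend(payload)
        if len(self._data) < self.threshold:
            return None
        ready = bytes(self._data[:self.threshold])
        del self._data[:self.threshold]
        return ready


def strip_row_padding(block: bytes, row_len: int, stride: int) -> bytes:
    """Drop the (stride - row_len) padding bytes at the end of each row."""
    if not 0 < row_len <= stride:
        raise ValueError("row_len must be between 1 and stride")
    if len(block) % stride != 0:
        raise ValueError("block length must be a multiple of stride")
    return b"".join(block[i:i + row_len] for i in range(0, len(block), stride))


buf = CollectingBuffer(threshold=8)
outputs: List[bytes] = []
# Three payloads of uneven size, standing in for separate network packets.
for packet in (b"abc\x00", b"de", b"f\x00"):
    ready = buf.push(packet)
    if ready is not None:
        outputs.append(strip_row_padding(ready, row_len=3, stride=4))
print(outputs)  # [b'abcdef']
```

Nothing is emitted until the third payload brings the buffer to eight bytes. At that point the two four-byte rows are each trimmed to their three data bytes.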

Assignees

  • Mellanox Technologies Ltd

Inventors

Classifications

  • Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data · CPC title

  • the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

  • G06F13/42 · Primary

    Bus transfer protocol, e.g. handshake; Synchronisation · CPC title

  • G06F9/3004 · Primary

    to perform operations on memory · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12229072B2 cover?
Devices, methods, and systems are provided. In one example, a device is described to include a device interface that receives data from at least one data source; a data shuffle unit that collects the data received from the at least one data source, receives a descriptor that describes a data shuffle operation to perform on the data received from the at least one data source, performs the data s…
Who is the assignee on this patent?
Mellanox Technologies Ltd
What technology area does this patent fall under?
Primary CPC classification G06F13/42. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 18, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).