What technology area does this patent fall under?

Primary CPC classification G06F13/28. Mapped technology areas include Physics.

When was this patent published?

Published Aug 14, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Active memory device gather, scatter, and filter

US10049061B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10049061-B2
Application number	US-201213674520-A
Country	US
Kind code	B2
Filing date	Nov 12, 2012
Priority date	Nov 12, 2012
Publication date	Aug 14, 2018
Grant date	Aug 14, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments relate to loading and storing of data. An aspect includes a method for transferring data in an active memory device that includes memory and a processing element. An instruction is fetched and decoded for execution by the processing element. Based on determining that the instruction is a gather instruction, the processing element determines a plurality of source addresses in the memory from which to gather data elements and a destination address in the memory. One or more gathered data elements are transferred from the source addresses to contiguous locations in the memory starting at the destination address. Based on determining that the instruction is a scatter instruction, a source address in the memory from which to read data elements at contiguous locations and one or more destination addresses in the memory to store the data elements at non-contiguous locations are determined, and the data elements are transferred.
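The gather and scatter transfers described in the abstract can be illustrated with a minimal software sketch. This is not the patented hardware mechanism: the flat `memory` list, the function names `gather` and `scatter`, and the sample addresses are all assumptions made for illustration.

```python
from typing import List, Sequence


def gather(memory: List[int], source_addresses: Sequence[int], destination: int) -> None:
    """Copy elements from non-contiguous source addresses into contiguous
    locations starting at `destination` (hypothetical helper)."""
    for offset, src in enumerate(source_addresses):
        memory[destination + offset] = memory[src]


def scatter(memory: List[int], source: int, destination_addresses: Sequence[int]) -> None:
    """Copy contiguous elements starting at `source` out to non-contiguous
    destination addresses (hypothetical helper)."""
    for offset, dst in enumerate(destination_addresses):
        memory[dst] = memory[source + offset]


# A toy 12-word memory; the last four words serve as a contiguous buffer.
memory = [10, 11, 12, 13, 14, 15, 16, 17, 0, 0, 0, 0]

# Gather three scattered words into the buffer starting at address 8.
gather(memory, [6, 1, 4], destination=8)
gathered = memory[8:11]

# Scatter the same three words back out to addresses 0, 3, and 5.
scatter(memory, source=8, destination_addresses=[0, 3, 5])
```

In this sketch the gather packs the words at addresses 6, 1, and 4 into addresses 8 through 10, and the scatter then writes them to addresses 0, 3, and 5.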

First claim

Opening claim text (preview).

What is claimed is:

1. A method for transferring data in an active memory device comprising a three-dimensional memory cube that includes memory divided into three-dimensional blocked regions as memory vaults, one or more memory controllers, and a processing element, the method comprising:
fetching and decoding an instruction for execution by the processing element of the active memory device; and
based on determining that the instruction is a gather instruction comprising a first source address pointer, a first destination address pointer, a first stride pointer, and a first count, the processing element performing:
determining a plurality of source addresses in the memory from which to gather data elements based on the first source address pointer pointing to a list of the source addresses and the first count indicating a number of the source addresses in the list of the source addresses, the plurality of source addresses identifying non-contiguous locations in one or more of the memory vaults;
determining a destination address in the memory as pointed to by the first destination address pointer;
determining a first stride size vector as pointed to by the first stride pointer, wherein the first stride size vector supports different stride sizes associated with each of the source addresses;
accessing the non-contiguous locations in one or more of the memory vaults through the one or more memory controllers in the active memory device as gathered data elements by realizing multiple vector data element accesses simultaneously, wherein the memory vaults each comprise at least one data element from each of a plurality of memory layers; and
transferring the gathered data elements from the plurality of source addresses to contiguous locations in the memory starting at the destination address and incrementing the destination address based on the first stride size vector as each of the gathered data elements is transferred; and
based on determining that the instruction is a scatter instruction comprising a second source address pointer, a second destination address pointer, a second stride pointer, and a second count:
determining a source address in the memory from which to read a plurality of data elements at contiguous locations as pointed to by the second source address pointer;
determining a plurality of destination addresses in the memory to store the data elements at non-contiguous locations based on the second destination address pointer pointing to a list of the destination addresses and the second count indicating a number of the destination addresses in the list of the destination addresses;
determining a second stride size vector as pointed to by the second stride pointer, wherein the second stride size vector supports different stride sizes associated with the source address;
identifying filter criteria associated with the instruction; and
transferring one or more of the data elements from the source address to the destination addresses while applying the filter criteria to limit transferring between the source and destination addresses according to the filter criteria based on a data value of the one or more of data elements to be transferred, wherein the filter criteria prevent one or more excluded data values from being stored at the destination addresses while continuing to store one or more included data values at the destination addresses and incrementing the source address based on the second stride size vector regardless of the filter criteria;
wherein the processing element provides virtual address computation functionality that supports an execution of the gather instruction or the scatter instruction.

2. The method of claim 1, wherein the instruction, the plurality of source addresses, and the destination address are provided by a main processor in communication with the processing element.

3. The method of claim 2, wherein the plurality of source addresses and the destination address are received from the main processor in an effective address format and are translated by the processing element to a real address format when performing load and store operations to the memory.

4. The method of claim 2, wherein determining the plurality of source addresses in the memory from which to gather data elements further comprises receiving the first source address pointer from the main processor that identifies a location in the memory containing the plurality of source addresses.

5. The method of claim 1, wherein the active memory device further comprises multiple instances of the processing element coupled to an interconnect network, the multiple instances of the processing element operable to access any of the memory vaults across the interconnect network.

6. A processing element of an active memory device comprising a three-dimensional memory cube that includes memory divided into three-dimensional blocked regions as memory vaults, one or more memory controllers, and the processing element, comprising:
a load store queue that interfaces with one or more of the memory vaults in the active memory device;
an instruction buffer coupled to the load store queue; and
a decoder coupled to the instruction buffer, the decoder decodes an instruction received at the instruction buffer and based on determining that the instruction is a gather instruction comprising a first source address pointer, a first destination address pointer, a first stride pointer, and a first count, the processing element performs:
determining a plurality of source addresses in the memory from which to gather data elements based on the first source address pointer pointing to a list of the source addresses and the first count indicating a number of the source addresses in the list of the source addresses, the plurality of source addresses identifying non-contiguous locations in one or more of the memory vaults;
determining a destination address in the memory as pointed to by the first destination address pointer;
determining a first stride size vector as pointed to by the first stride pointer, wherein the first stride size vector supports different stride sizes associated with each of the source addresses;
accessing the non-contiguous locations in one or more of the memory vaults through the one or more memory controllers in the active memory device as gathered data elements by realizing multiple vector data element accesses simultaneously, wherein the memory vaults each comprise at least one data element from each of a plurality of memory layers; and
transferring the gathered data elements from the plurality of source addresses to contiguous locations in the memory starting at the destination address and incrementing the destination address based on the first stride size vector as each of the gathered data elements is transferred; and
based on determining that the instruction is a scatter instruction comprising a second source address pointer, a second destination address pointer, a second stride pointer, and a second count:
determining a source address in the memory from which to read a plurality of data elements at contiguous locations as pointed to by the second source address pointer;
determining a plurality of destination addresses in the memory to store the data elements at non-contiguous locations based on the second destination address pointer pointing to a list of the destination addresses and the second count indicating a number of the destination addresses in the list of the destination addresses;
determining a second stride size vector as pointed to by the second stride pointer, wherein the second stride size vector supports different stride sizes associated with the source address;
identifying filter criteria associated with the instruction; and
transferring one or more of the data ele…
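One possible reading of the claimed gather and filtered scatter operands can be sketched in software. This is only an illustration and not a legal interpretation of the claims: the flat `memory` list, the function names, the `keep` predicate standing in for the filter criteria, and the sample layout are all assumptions. The sketch shows address lists and stride vectors being read through pointers, and the scatter source address advancing by its stride whether or not the filter lets a value through.

```python
from typing import Callable, List


def gather_with_strides(memory: List[int], src_list_ptr: int, dst_ptr: int,
                        stride_ptr: int, count: int) -> None:
    """Gather `count` elements whose addresses are listed at `src_list_ptr`."""
    sources = memory[src_list_ptr:src_list_ptr + count]  # list of source addresses
    strides = memory[stride_ptr:stride_ptr + count]      # stride size vector
    dst = dst_ptr
    for src, stride in zip(sources, strides):
        memory[dst] = memory[src]
        dst += stride  # destination advances by the per-element stride


def scatter_with_filter(memory: List[int], src_ptr: int, dst_list_ptr: int,
                        stride_ptr: int, count: int,
                        keep: Callable[[int], bool]) -> None:
    """Scatter up to `count` elements, skipping values rejected by `keep`."""
    destinations = memory[dst_list_ptr:dst_list_ptr + count]
    strides = memory[stride_ptr:stride_ptr + count]
    src = src_ptr
    for dst, stride in zip(destinations, strides):
        value = memory[src]
        if keep(value):    # assumed form of the filter criteria
            memory[dst] = value
        src += stride      # source advances regardless of the filter


def not_sentinel(value: int) -> bool:
    """Hypothetical filter: exclude the sentinel value -1."""
    return value != -1


memory = [0] * 32
memory[0:4] = [5, -1, 7, -1]      # contiguous source data
memory[8:12] = [20, 22, 25, 27]   # list of non-contiguous addresses
memory[12:16] = [1, 1, 1, 1]      # stride size vector (unit strides)

# Scatter the data to addresses 20, 22, 25, 27, dropping the -1 values.
scatter_with_filter(memory, src_ptr=0, dst_list_ptr=8, stride_ptr=12,
                    count=4, keep=not_sentinel)

# Gather the same four addresses back into contiguous words 28..31.
gather_with_strides(memory, src_list_ptr=8, dst_ptr=28, stride_ptr=12, count=4)
```

With unit strides the gathered words land in consecutive locations. Addresses 22 and 27 keep their prior contents because the filter rejected the values destined for them.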

Assignees

IBM

Inventors

Classifications

G06F15/7821
Tightly coupled to memory, e.g. computational memory, smart memory, processor in memory · CPC title
G06F9/30043
LOAD or STORE instructions; Clear instruction · CPC title
G06F9/3455
using stride · CPC title
G06F13/28 · Primary
using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title
Y02D10/14
Cross-Sectional Technologies · mapped topic

Patent family

Related publications grouped by family.

View patent family 50682878

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10049061B2 cover?: Embodiments relate to loading and storing of data. An aspect includes a method for transferring data in an active memory device that includes memory and a processing element. An instruction is fetched and decoded for execution by the processing element. Based on determining that the instruction is a gather instruction, the processing element determines a plurality of source addresses in the mem…
Who is the assignee on this patent?: IBM