In-memory popcount support for real time analytics

US9836277B2 · US · B2

Patent metadata
Publication number: US-9836277-B2
Application number: US-201514687676-A
Country: US
Kind code: B2
Filing date: Apr 15, 2015
Priority date: Oct 1, 2014
Publication date: Dec 5, 2017
Grant date: Dec 5, 2017

Abstract

Official abstract text for this publication.

A Processing-In-Memory (PIM) model in which computations related to the POPCOUNT and logical bitwise operations are implemented within a memory module and not within a host Central Processing Unit (CPU). The in-memory executions thus eliminate the need to shift data from large bit vectors throughout the entire system. By off-loading the processing of these operations to the memory, the redundant data transfers over the memory-CPU interface are greatly reduced, thereby improving system performance and energy efficiency. A controller and a dedicated register in the logic die of the memory module operate to interface with the host and provide in-memory executions of popcounting and logical bitwise operations requested by the host. The PIM model of the present disclosure thus frees up the CPU for other tasks because many real-time analytics tasks can now be executed within a PIM-enabled memory itself. The memory module may be a Three Dimensional Stack (3DS) memory or any other semiconductor memory.
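The data-movement argument in the abstract can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the patented design: the class `PimMemoryModule`, its method names, the byte counter, and the 8-byte result size are all hypothetical and introduced only for illustration. The sketch compares the bytes that cross the memory-CPU interface when the host counts the bits itself with the bytes that cross when the module counts them and returns only the result.

```python
class PimMemoryModule:
    """Toy model of a PIM-enabled memory module (hypothetical, for illustration)."""

    RESULT_BYTES = 8  # assumed size of a returned count; not taken from the patent

    def __init__(self):
        self._store = {}          # address -> bytes; stands in for the memory chip
        self.result_register = 0  # stands in for the dedicated register in the logic die
        self.bytes_to_host = 0    # traffic over the memory-CPU interface

    def write(self, address, data):
        """Store a bit vector at the given address."""
        self._store[address] = bytes(data)

    def read(self, address):
        """Conventional path: the whole bit vector crosses the interface."""
        data = self._store[address]
        self.bytes_to_host += len(data)
        return data

    def popcount(self, address):
        """PIM path: count inside the module and return only the result."""
        self.result_register = sum(bin(b).count("1") for b in self._store[address])
        self.bytes_to_host += self.RESULT_BYTES
        return self.result_register


module = PimMemoryModule()
module.write(0x1000, bytes([0xFF]) * 1024)  # 1 KiB bit vector, all bits set

# Host-side counting: the full vector is transferred first.
host_count = sum(bin(b).count("1") for b in module.read(0x1000))
host_traffic = module.bytes_to_host

# In-memory counting: only the result is transferred.
module.bytes_to_host = 0
pim_count = module.popcount(0x1000)
pim_traffic = module.bytes_to_host

print(host_count, host_traffic)  # 8192 1024
print(pim_count, pim_traffic)    # 8192 8
```

In this toy model the saving grows with vector length, because the result size stays fixed while the vector that would otherwise be transferred does not.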

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving at a memory module an instruction from a host to perform a POPCOUNT operation on a bit vector stored in the memory module; and executing the POPCOUNT operation within the memory module, without transferring the bit vector to the host for the execution, wherein executing the POPCOUNT operation includes: dividing bits in the bit vector into a plurality of non-overlapping segments, calculating a segment-specific bitcount for each of the plurality of non-overlapping segments, and adding all segment-specific bitcounts to generate a result.

2. The method of claim 1, further comprising: storing the result of the execution of the POPCOUNT operation within the memory module; and providing the result from the memory module to the host.

3. The method of claim 1, wherein each segment includes 8 bits.

4. The method of claim 1, wherein calculating the segment-specific bitcount includes one of the following: using a Look-Up Table (LUT) stored in the memory module to obtain the segment-specific bitcount; and performing a sequence of shifts and logical bitwise operations on each of the plurality of non-overlapping segments to generate the segment-specific bitcount.

5. The method of claim 1, wherein adding all segment-specific bitcounts includes: using each segment-specific bitcount as an input to a corresponding one of a plurality of adders within the memory module; and accumulating outputs of all adders in the plurality of adders to generate the result.

6. The method of claim 1, wherein the memory module is one of the following: a Three Dimensional Stack (3DS) memory module; a High Bandwidth Memory (HBM) module; a Hybrid Memory Cube (HMC) memory module; a Solid State Drive (SSD); a Dynamic Random Access Memory (DRAM) module; a Static Random Access Memory (SRAM); a Phase-Change Random Access Memory (PRAM); a Resistive Random Access Memory (ReRAM); a Conductive-Bridging RAM (CBRAM); a Magnetic RAM (MRAM); and a Spin-Transfer Torque MRAM (STT-MRAM).

7. The method of claim 1, wherein the bit vector is generated by an encryption algorithm.

8. The method of claim 7, further comprising: determining encryption quality of the encryption algorithm based on a result of the execution of the POPCOUNT operation.

9. A method comprising: receiving at a memory module an instruction from a host to perform a POPCOUNT operation on a bit vector stored in the memory module; and executing the POPCOUNT operation within the memory module, without transferring the bit vector to the host for the execution, wherein executing the POPCOUNT operation includes: receiving from the host a physical address of a memory location in the memory module where a respective portion of the bit vector is stored, for each received physical address, retrieving the respective portion of the bit vector from the memory location, performing a partial bitcount on the retrieved portion of the bit vector, and combining results of all partial bitcounts to effectuate the execution of the POPCOUNT operation on the bit vector.

10. The method of claim 9, further comprising: storing each received physical address in a pre-defined storage location within the memory module; accessing the pre-defined storage location to obtain each received physical address for retrieving the respective portion of the bit vector; and storing a combined result of all partial bitcounts in the pre-defined storage location for submission to the host as a final outcome of the execution of the POPCOUNT operation.

11. A method comprising: receiving at a memory module an instruction from a host to perform a logical bitwise operation on two or more bit vectors stored in the memory module; and executing the logical bitwise operation within the memory module, without transferring the bit vectors to the host for the execution, wherein executing the logical bitwise operation includes: dividing each bit vector into a plurality of bit vector-specific non-overlapping segments, aligning corresponding bit vector-specific segments from all bit vectors into a plurality of groups of aligned segments, performing the logical bitwise operation on each group of aligned segments to thereby generate a plurality of partial results, and combining all partial results to effectuate the execution of the logical bitwise operation.

12. The method of claim 11, further comprising: storing a result of the execution of the logical bitwise operation within the memory module; and providing the result from the memory module to the host.

13. The method of claim 11, wherein the logical bitwise operation is one of the following: an OR operation; an AND operation; a NOT operation; a NAND operation; a NOR operation; and an XOR operation.

14. The method of claim 11, wherein each bit vector-specific segment includes 8 bits.

15. The method of claim 11, further comprising performing the following prior to dividing each bit vector into the plurality of bit vector-specific segments: receiving from the host physical addresses of memory locations in the memory module where respective bit vectors are stored; and retrieving the bit vectors from the corresponding memory locations.

16. The method of claim 15, further comprising: storing each received physical address in a pre-defined storage location within the memory module; accessing the pre-defined storage location to obtain each received physical address for retrieving the respective bit vector; and storing in the pre-defined storage location a final outcome of combining all partial results for future submission to the host.

17. The method of claim 11, wherein the memory module is one of the following: a Three Dimensional Stack (3DS) memory module; a High Bandwidth Memory (HBM) module; a Hybrid Memory Cube (HMC) memory module; a Solid State Drive (SSD); a Dynamic Random Access Memory (DRAM) module; a Static Random Access Memory (SRAM); a Phase-Change Random Access Memory (PRAM); a Resistive Random Access Memory (ReRAM); a Conductive-Bridging RAM (CBRAM); a Magnetic RAM (MRAM); and a Spin-Transfer Torque MRAM (STT-MRAM).

18. A memory module, comprising: a memory chip; and a logic die connected to the memory chip and operative to control data transfer between the memory chip and an external host, wherein the logic die includes a controller that is operative to: receive an instruction from the host to perform at least one of the following: a POPCOUNT operation on a first bit vector stored in the memory chip, and a logical bitwise operation on two or more second bit vectors stored in the memory chip; and perform at least one of the following: execute the POPCOUNT operation, without transferring the first bit vector to the host for the execution of the POPCOUNT operation, and execute the logical bitwise operation, without transferring the second bit vectors to the host for the execution of the logical bitwise operation, wherein the controller includes a processing logic that comprises a plurality of adders, wherein the processing logic is operative to perform the following as part of executing the POPCOUNT operation: retrieve the first bit vector from the memory chip; divide bits in the first bit vector into a plurality of non-overlapping segments; calculate a segment-specific bitcount for each of the plurality of non-overlapping segments; use each segment-specific bitcount as an input to a corresponding one of the plurality of adders; accu
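The segment-based steps recited in claims 1, 3, 4, and 11 can be sketched in software. The Python below is a minimal illustration under stated assumptions, not the hardware the patent claims: the function names are hypothetical, a Python `bytes` object stands in for a bit vector already split into 8-bit segments, and a plain `sum` stands in for the plurality of adders in claim 5. It shows both ways of obtaining a segment-specific bitcount named in claim 4 (a look-up table, and a sequence of shifts and bitwise operations), followed by a bitwise operation applied to groups of aligned segments.

```python
import functools
import operator

# 256-entry look-up table: LUT[b] is the number of 1 bits in the byte value b.
LUT = [bin(b).count("1") for b in range(256)]


def bitcount_shift_mask(segment):
    """Count the 1 bits in one 8-bit segment using shifts and bitwise ANDs."""
    segment = (segment & 0x55) + ((segment >> 1) & 0x55)  # sums of adjacent bit pairs
    segment = (segment & 0x33) + ((segment >> 2) & 0x33)  # sums per 4-bit nibble
    return (segment & 0x0F) + ((segment >> 4) & 0x0F)     # sum for the whole byte


def popcount(bit_vector, use_lut=True):
    """Count each 8-bit segment of the vector, then add the segment counts."""
    counts = [LUT[s] if use_lut else bitcount_shift_mask(s) for s in bit_vector]
    return sum(counts)  # software stand-in for the adders of claim 5


def bitwise_op(op, *bit_vectors):
    """Apply a binary op (AND, OR, XOR) to aligned 8-bit segments of equal-length vectors."""
    if len({len(v) for v in bit_vectors}) != 1:
        raise ValueError("bit vectors must have the same length")
    # zip(...) forms the groups of aligned segments; each reduce yields one partial result.
    return bytes(functools.reduce(op, group) for group in zip(*bit_vectors))


vector = bytes([0xFF, 0x0F, 0x01])
print(popcount(vector))                 # 13
print(popcount(vector, use_lut=False))  # 13
print(bitwise_op(operator.and_, b"\xf0\xff", b"\x3c\x0f").hex())  # 300f
```

The look-up table trades 256 stored entries for a single lookup per segment, while the shift-and-mask variant needs no table but performs several operations per segment. Which trade-off suits a given logic die is not something this sketch can settle.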

Assignees

Samsung Electronics Co Ltd

Inventors

Guz Zvika, Yin Liang

Classifications

  • using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title

  • Architectures of general purpose stored program computers (with program plugboard G06F15/08; multicomputers G06F15/16) · CPC title

  • G06F7/00 · Primary

    Methods or arrangements for processing data by operating upon the order or content of the data handled (logic circuits H03K19/00) · CPC title

  • G06F9/3001 · Primary

    Arithmetic instructions · CPC title

  • Tightly coupled to memory, e.g. computational memory, smart memory, processor in memory · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9836277B2 cover?
It covers a Processing-In-Memory (PIM) model in which POPCOUNT and logical bitwise operations are executed within a memory module rather than in the host Central Processing Unit (CPU), so that large bit vectors do not have to be moved across the memory-CPU interface.
Who is the assignee on this patent?
Samsung Electronics Co Ltd. The listed inventors are Guz Zvika and Yin Liang.
What technology area does this patent fall under?
Primary CPC classification G06F7/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 5, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).