Methods, Systems, and Circuits for Coordinated Optimization in In-Memory Sorting

US2023401034A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2023401034-A1
Application numberUS-202318330666-A
CountryUS
Kind codeA1
Filing dateJun 7, 2023
Priority dateJun 9, 2022
Publication dateDec 14, 2023
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are systems, methods, and computer-readable media for sorting datasets within a Processing in Memory (PIM)-based system. A request to sort a dataset stored in a 3D-stacked memory can be received. The request can identify a specific dataset and sorting criteria, which includes a plurality of keys. The dataset can be partitioned into several subarrays across various memory banks within the 3D-stacked memory. Each piece of data within these subarrays can be separated into buckets based on the keys. Local histograms for each subarray and bank histograms based on the local histograms can be generated. A prefix-sum operation on the bank histograms can determine individual positions for the sorted dataset. Aggregation of the subarrays from all memory banks can form the sorted dataset, which can be subsequently returned.

First claim

Opening claim text (preview).

What is claimed is: 1 . A method for sorting a dataset in a processing in memory (PIM)-based system, the method comprising: receiving a request to sort a dataset stored in a 3D-stacked memory of the PIM-based system, wherein the request identifies a specific dataset and sorting criteria, wherein the sorting criteria includes a plurality of keys used to sort pieces of data in the dataset; partitioning the dataset into a plurality of subarrays across a plurality of memory banks within the 3D-stacked memory, wherein each subarray includes at least one piece of data from the dataset; separating each piece of data in each subarray into a plurality of buckets based on the plurality of keys, wherein each bucket corresponds to a particular key of the plurality of keys, wherein a same plurality of buckets is used for each subarray; generating a plurality of local histogram for each subarray, wherein each local histogram of a particular subarray corresponds to a particular bucket of the plurality of buckets, and wherein each local histogram indicates a frequency of a particular key in the corresponding bucket; generating a plurality of bank histograms based on the plurality of local histograms, wherein each bank histogram corresponds to a particular bucket of the plurality of buckets, wherein the generating comprises combining key counts from all local histograms corresponding to a same bucket of the plurality of buckets; performing a prefix-sum operation on the plurality of bank histograms to determine individual positions of the pieces of data for a sorted dataset; aggregating the subarrays from all memory banks to form the sorted dataset based on the individual positions; and returning the sorted dataset. 2 . The method of claim 1 , wherein said partitioning the dataset comprises partitioning the dataset in a manner that substantially evenly distributes the data across the memory banks. 3 . The method of claim 1 , wherein separating each piece of data in each subarray into a plurality of buckets based on the plurality of keys comprises performing local binary radix sorting within each subarray to sort the pieces of data according to their corresponding keys. 4 . The method of claim 3 , wherein the local binary radix sorting within each subarray is performed concurrently across all subarrays, thereby concurrently sorting the pieces of data based on their corresponding keys. 5 . The method of claim 1 , wherein said separating is based on a radix sort technique. 6 . The method of claim 1 , wherein said generating the plurality of local histograms for each subarray is performed using subarray-level processing units (SPUs) that cooperate to create intermediate arrays. 7 . The method of claim 1 , wherein said generating the plurality of bank histograms comprises combining key counts from local histograms corresponding to a same bucket across all subarrays in a memory bank. 8 . The method of claim 1 , further comprising determining a radix for sorting the dataset based on key size and distribution, wherein the radix includes at least one of a binary radix or a decimal radix. 9 . The method of claim 1 , wherein said generating a plurality of local histogram comprises iteratively generating local histograms until covering a full radix range. 10 . The method of claim 1 , wherein the 3D-stacked memory of the PIM-based system comprises multiple layers, the multiple layers including the plurality of subarrays, each layer being interconnected through vertical interconnects to enable efficient communication between the layers. 11 . The method of claim 10 , wherein the vertical interconnects include through-silicon vias (TSVs) to facilitate high-speed data transmission and reduce latency. 12 . The method of claim 1 , wherein the 3D-stacked memory of the PIM-based system utilizes a Dragon topography with 8 memory layers, and each memory layer comprises 64 memory banks. 13 . The method of claim 1 , wherein said performing the prefix-sum operation on the plurality of bank histograms comprises using parallel prefix-sum techniques. 14 . A computer-readable medium storing instructions, that when executed by a processing in memory (PIM) based system, cause the system to perform a method for sorting a dataset, the method comprising: receiving a request to sort a dataset stored in a 3D-stacked memory of the PIM-based system, wherein the request identifies a specific dataset and sorting criteria, wherein the sorting criteria includes a plurality of keys used to sort pieces of data in the dataset; partitioning the dataset into a plurality of subarrays across a plurality of memory banks within the 3D-stacked memory, wherein each subarray includes at least one piece of data from the dataset; separating each piece of data in each subarray into a plurality of buckets based on the plurality of keys, wherein each bucket corresponds to a particular key of the plurality of keys, wherein a same plurality of buckets is used for each subarray; generating a plurality of local histogram for each subarray, wherein each local histogram of a particular subarray corresponds to a particular bucket of the plurality of buckets, and wherein each local histogram indicates a frequency of a particular key in the corresponding bucket; generating a plurality of bank histograms based on the plurality of local histograms, wherein each bank histogram corresponds to a particular bucket of the plurality of buckets, wherein the generating comprises combining key counts from all local histograms corresponding to a same bucket of the plurality of buckets; performing a prefix-sum operation on the plurality of bank histograms to determine individual positions of the pieces of data for a sorted dataset; aggregating the subarrays from all memory banks to form the sorted dataset based on the individual positions; and returning the sorted dataset. 15 . The computer-readable medium of claim 14 , wherein said partitioning the dataset comprises partitioning the dataset in a manner that substantially evenly distributes the data across the memory banks. 16 . The computer-readable medium of claim 14 , wherein separating each piece of data in each subarray into a plurality of buckets based on the plurality of keys comprises performing local binary radix sorting within each subarray to sort the pieces of data according to their corresponding keys. 17 . The computer-readable medium of claim 14 , wherein the local binary radix sorting within each subarray is performed concurrently across all subarrays, thereby concurrently sorting the pieces of data based on their corresponding keys. 18 . The computer-readable medium of claim 14 , wherein said separating is based on a radix sort technique. 19 . The computer-readable medium of claim 14 , wherein the method further comprises determining a radix for sorting the dataset based on key size and distribution, wherein the radix includes at least one of a binary radix or a decimal radix. 20 . A processing in memory (PIM) based system for sorting a dataset, the system comprising: a 3D-stacked memory configured to store a dataset, the 3D-stacked memory including a plurality of memory banks and partitionable into a plurality of subarrays across the memory banks, wherein each subarray includes at least one piece of data from the dataset; a processing unit configured to receive a request to sort the dataset, the request identifying a specific dataset and sorting criteria, wherein the sorting criteria includes a plura

Assignees

Inventors

Classifications

  • G06F7/26Primary

    the sorted data being recorded on the original record carrier within the same space in which the data had been recorded prior to their sorting, without using intermediate storage · CPC title

  • G06F7/49Primary

    Computations with a radix, other than binary, 8, 16 or decimal, e.g. ternary, negative or imaginary radices, mixed radix {non-linear PCM (G06F7/4824 takes precedence)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023401034A1 cover?
Disclosed herein are systems, methods, and computer-readable media for sorting datasets within a Processing in Memory (PIM)-based system. A request to sort a dataset stored in a 3D-stacked memory can be received. The request can identify a specific dataset and sorting criteria, which includes a plurality of keys. The dataset can be partitioned into several subarrays across various memory banks …
Who is the assignee on this patent?
Univ Virginia Patent Foundation
What technology area does this patent fall under?
Primary CPC classification G06F7/26. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Dec 14 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).