Radix sort acceleration using custom ASIC

US9928261B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9928261-B2
Application numberUS-201414582314-A
CountryUS
Kind codeB2
Filing dateDec 24, 2014
Priority dateJan 29, 2014
Publication dateMar 27, 2018
Grant dateMar 27, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a head processor, data elements starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array. The system radix sort processing, with a tail processor, data elements starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array.

First claim

Opening claim text (preview).

What is claimed is: 1. An information processing system comprising: at least one host processor; main memory, communicatively coupled with the at least one host processor; non-volatile memory, communicatively coupled with the at least one host processor; a radix sort memory manager stored in the non-volatile memory and communicatively coupled with the at least one host processor for radix sort processing data elements in an array in main memory; and an application specific integrated circuit radix sort accelerator (Accelerator), communicatively coupled with the at least one host processor and the main memory, the Accelerator and the host processor contemporaneously share access to the data elements in the array in the main memory in a data streaming architecture in which the Accelerator radix sort processes the data elements in the array in main memory, the Accelerator comprising: buffer memory for at least storing data elements from the array in main memory in a plurality of radix sort buckets in the buffer memory, the buffer memory capacity to store data elements from the array in the main memory being a small portion of the array of data elements in the main memory; a plurality of radix sort processors for radix sort processing the data elements in each radix sort bucket in the plurality of radix sort buckets in the buffer memory in the Accelerator; and a pre-fetching engine for copying a small portion of the data elements from the array in the main memory to the buffer memory in the Accelerator to be radix sort processed, and for transferring data elements, in the data streaming architecture, between the array in main memory and a radix sort bucket selected by the pre-fetching engine from the plurality of radix sort buckets in the buffer memory in the Accelerator, the pre-fetching engine predicting which data elements will be needed in a near future radix sort process in the Accelerator, transferring the needed data elements, and hiding memory latency when accessing the array of data elements in the main memory, wherein the pre-fetching engine transfers at least one data element from the array in main memory to the selected radix sort bucket, based on determining that a total number of data elements in the selected radix sort bucket reaches a low threshold of data elements remaining to be radix sort processed by the plurality of radix sort processors, and wherein the pre-fetching engine transfers at least one data element from the selected radix sort bucket to the array in main memory, based on determining that a total number of data elements in the selected radix sort bucket reaches a high threshold of data elements remaining to be radix sort processed by the plurality of radix sort processors. 2. The information processing system of claim 1 , wherein the pre-fetching engine transfers data elements between the array in main memory and the selected radix sort bucket contemporaneously with the plurality of radix sort processors radix sort processing data elements in the selected radix sort bucket. 3. The information processing system of claim 2 , wherein the pre-fetching engine transfers data elements between the array in main memory and the selected radix sort bucket before a total number of data elements to be radix sort processed in the selected radix sort bucket by the plurality of radix sort processors reaches zero. 4. The information processing system of claim 1 , wherein the radix sort memory manager, interoperating with the at least one host processor, communicates with a processor in the Accelerator to structurally decompose a large radix sorting problem of radix sorting data elements in the array into a set of multiple independent smaller radix sub-sorting problems of radix sorting data elements in the array, the Accelerator creating and informing the at least one host processor of the creation of such set of multiple independent smaller radix sub-sorting problems from the larger radix sorting problem, the radix sort memory manager assigning to the Accelerator a first independent smaller radix sub-sorting problem in the set of multiple independent smaller radix sub-sorting problems to be sorted by the Accelerator, and the radix sort memory manager assigning to the at least one host processor a second independent smaller radix sub-sorting problem in the set of multiple independent smaller radix sub-sorting problems to be sorted by the at least one host processor. 5. A computer readable storage medium, comprising computer instructions which, responsive to being executed by a processor of a plurality of processors of an information processing system, cause the processor to perform operations for accelerated radix sort processing of an array of data elements in main memory, the operations comprising: transferring a portion of data elements from an array of data elements being radix sort processed in main memory to an array of data elements in a radix sort bucket in a first memory in an application specific integrated circuit radix sort accelerator (Accelerator) to be radix sort processed in the Accelerator, the first memory capacity to store data elements from the array in the main memory being a small portion of the array of data elements in the main memory; storing the transferred data elements in the array of data elements in the radix sort bucket in the first memory, the array having a head end and a tail end; radix sort processing, with a head processor, data elements in the array of data elements in the radix sort bucket starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array; and radix sort processing, with a tail processor, data elements in the array of data elements in the radix sort bucket starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array. 6. The computer readable storage medium of claim 5 , wherein the array of data elements in the radix sort bucket in the first memory comprises a first radix sort bucket in a plurality of radix sort buckets in the first memory, and the computer readable storage medium further comprising computer instructions which, responsive to being executed by a processor of a plurality of processors of the information processing system, cause the processor to perform operations comprising: radix sort processing, with at least one of the head processor and the tail processor, data elements in the first radix sort bucket by the at least one of the head processor and the tail processor using a respective one of a head pointer and tail pointer to point to each data element in the first radix sort bucket; applying a radix sort mask to a data element in the first radix sort bucket pointed to by a respective one of the head pointer and tail pointer, thereby identifying a significant radix sort symbol in the data element; determining, based on the identified significant radix sort symbol in the data element, whether the data element belongs in the first radix sort bucket according to a radix sort algorithm; and progressively advancing a value in the respective one of the head pointer and tail pointer to point to a next data element in the first radix sort bucket, based on determining that the data element belongs in the first radix sort bucket according to the radix sort algorithm. 7. The computer readable storage medium of claim 6 , wherein progressively advancing a value in the head pointer comprises updating a value in the head pointer to point with the value in the head pointer to a next data element in the first radi

Assignees

Inventors

Classifications

  • Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory · CPC title

  • Temporary buffering, e.g. using volatile buffer or dedicated buffer blocks · CPC title

  • Physics · mapped topic

  • Latency reduction · CPC title

  • Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers {sorting methods in general}(G06F7/36 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9928261B2 cover?
An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a he…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F12/0238. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Mar 27 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).