Radix sort acceleration using custom ASIC

US9953044B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9953044-B2
Application numberUS-201514750072-A
CountryUS
Kind codeB2
Filing dateJun 25, 2015
Priority dateJan 29, 2014
Publication dateApr 24, 2018
Grant dateApr 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a head processor, data elements starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array. The system radix sort processing, with a tail processor, data elements starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array.

First claim

Opening claim text (preview).

What is claimed is: 1. A method with an information processing system for accelerated radix sort processing of an array of data elements, the method comprising: transferring, with a pre-fetching engine, data elements from an array of data elements being radix sort processed in main memory to an array of data elements in a radix sort bucket in a first memory in an application specific integrated circuit radix sort accelerator (Accelerator) to be radix sort processed in the Accelerator, the Accelerator and a host processor that is separate from the Accelerator contemporaneously sharing access to the data elements in the array in the main memory in a data streaming architecture in which the Accelerator radix sort processes the data elements in the array in main memory, the first memory capacity to store data elements from the array in the main memory being a small portion of the array of data elements in the main memory; storing the transferred data elements in the array of data elements in the radix sort bucket in the first memory, the array having a head end and a tail end; radix sort processing, with a head processor, data elements in the array of data elements in the radix sort bucket starting at the head end of the array and progressively advancing radix sort processing data elements toward the tail end of the array; and radix sort processing, with a tail processor, data elements in the array of data elements in the radix sort bucket starting at the tail end of the array and progressively advancing radix sort processing data elements toward the head end of the array, the tail processor radix sort processing data elements in the array contemporaneously with the head processor radix sort processing data elements in the array. 2. The method of claim 1 , further comprising: terminating radix sort processing by at least one of the head processor and the tail processor, in response to determining that there are no more data elements remaining in the array of data elements in the radix sort bucket in the first memory that have not been radix sort processed with at least one of the head processor and the tail processor. 3. The method of claim 1 , wherein the array of data elements in the radix sort bucket in the first memory comprises a first radix sort bucket in a plurality of radix sort buckets in the first memory, the method further comprising: radix sort processing, with at least one of the head processor and the tail processor, data elements in the first radix sort bucket by the at least one of the head processor and the tail processor using a respective one of a head pointer and tail pointer to point to each data element in the first radix sort bucket; applying a radix sort mask to a data element in the first radix sort bucket pointed to by a respective one of the head pointer and tail pointer, thereby identifying a significant radix sort symbol in the data element; determining, based on the identified significant radix sort symbol in the data element, whether the data element belongs in the first radix sort bucket according to a radix sort algorithm; and progressively advancing a value in the respective one of the head pointer and tail pointer to point to a next data element in the first radix sort bucket, based on determining that the data element belongs in the first radix sort bucket according to the radix sort algorithm. 4. The method of claim 3 , wherein progressively advancing a value in the head pointer comprises updating a value in the head pointer to point with the value in the head pointer to a next data element in the first radix sort bucket starting at the head end of the first radix sort bucket and progressively advancing toward the tail end of the first radix sort bucket. 5. The method of claim 3 , wherein progressively advancing a value in the tail pointer comprises updating a value in the tail pointer to point with the value in the tail pointer to a next data element in the first radix sort bucket starting at the tail end of the first radix sort bucket and progressively advancing toward the head end of the first radix sort bucket. 6. The method of claim 3 , further comprising: swapping out the data element pointed to by the respective one of the head pointer and tail pointer from the first radix sort bucket into a second radix sort bucket in the plurality of radix sort buckets in the first memory, based on determining that the data element belongs in the second radix sort bucket according to the radix sort algorithm. 7. The method of claim 1 , wherein the array of data elements in the radix sort bucket in the first memory comprises a first radix sort bucket in a plurality of radix sort buckets in the first memory, and wherein the transferring comprises: transferring data elements, with a pre-fetching engine in the Accelerator, between the array in main memory and a radix sort bucket selected from the plurality of radix sort buckets in the first memory in the Accelerator, the pre-fetching engine in a data streaming architecture in which the Accelerator radix sort processes the data elements in the array in main memory, the pre-fetching engine predicting which data elements will be needed in a near future radix sort process in the Accelerator, transferring the needed data elements, and hiding memory latency when accessing the array of data elements in the main memory. 8. The method of claim 7 , wherein the transferring comprises transferring at least one data element from the array in main memory to the selected radix sort bucket, based on determining that a total number of data elements in the selected radix sort bucket reaches a low threshold of data elements remaining to be radix sort processed by the plurality of radix sort processors, and wherein the transferring comprises transferring at least one data element from the selected radix sort bucket to the array in main memory, based on determining that a total number of data elements in the selected radix sort bucket reaches a high threshold of data elements remaining to be radix sort processed by the plurality of radix sort processors.

Assignees

Inventors

Classifications

  • Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory · CPC title

  • Temporary buffering, e.g. using volatile buffer or dedicated buffer blocks · CPC title

  • Combined merging and sorting · CPC title

  • Physics · mapped topic

  • Latency reduction · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9953044B2 cover?
An information processing system, computer readable storage medium, and method for accelerated radix sort processing of data elements in an array in memory. The information processing system stores an array of data elements in a buffer memory in an application specific integrated circuit radix sort accelerator. The array has a head end and a tail end. The system radix sort processing, with a he…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F12/0238. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Apr 24 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).