What technology area does this patent fall under?

Primary CPC classification G06F9/383. Mapped technology areas include Physics.

When was this patent published?

Publication date May 5, 2016 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method for performing random read access to a block of data using parallel LUT read instruction in vector processors

US2016124651A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016124651-A1
Application number	US-201514920365-A
Country	US
Kind code	A1
Filing date	Oct 22, 2015
Priority date	Nov 3, 2014
Publication date	May 5, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This invention deals with the problem of paralleling random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each plural parallel look up table and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part of a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store more used data. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up table memory is stored in the directly addressable memory.
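The data flow in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware instruction: the function names (`setup_tables`, `lut_read`, `simd_add`) are hypothetical, each table is assumed to hold a full copy of the same block so that every lane can reach any element, and plain Python lists stand in for vector registers.

```python
from typing import List, Sequence


def setup_tables(block: Sequence[int], num_tables: int) -> List[List[int]]:
    """Assumption: each parallel table receives its own copy of the block."""
    return [list(block) for _ in range(num_tables)]


def lut_read(tables: Sequence[Sequence[int]], indexes: Sequence[int]) -> List[int]:
    """Model one parallel LUT read: lane i receives tables[i][indexes[i]]."""
    if len(tables) != len(indexes):
        raise ValueError("one index is required per table")
    return [table[index] for table, index in zip(tables, indexes)]


def simd_add(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Stand-in for a vector SIMD operation: element-wise addition."""
    return [x + y for x, y in zip(a, b)]


# A block of data moved from "main memory" into four parallel tables.
block = [10, 20, 30, 40, 50, 60, 70, 80]
tables = setup_tables(block, num_tables=4)

# A random access pattern: each lane picks an arbitrary element of the block.
gathered = lut_read(tables, [7, 0, 3, 3])

# The gathered vector can now be processed by a SIMD-style operation.
result = simd_add(gathered, [1, 1, 1, 1])
```

In this model the four scattered elements arrive in the destination vector with a single `lut_read` call, which is the step that would otherwise take one scalar load per element.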

First claim

Opening claim text (preview).

What is claimed is:

1. A method of data processing according to a predetermined algorithm having at least one data access pattern comprising the steps of: determining whether overhead of defining look up tables, moving data from memory to the look up tables and moving data to vector registers of each data access pattern is less than overhead of moving data to vector registers by plural scalar loads; and if the overhead of defining look up tables, moving data from memory to the look up tables and moving data to vector registers for a data access pattern is less than overhead of moving data to vector registers by plural scalar loads setting up plural parallel look up tables, moving data required by the algorithm from main memory to each of said plural parallel look up tables, simultaneously moving data from each of said parallel look up tables to corresponding locations of a vector destination register, and performing at least one vector single instruction multiple data (SIMD) operation upon data in said vector destination register.

2. The method of data processing of claim 1, wherein: said step of setting up plural look up tables includes selecting an element size corresponding to a data size of said data access pattern.

3. The method of data processing of claim 2, wherein: said step of selecting an element size corresponding to a data size of said data access pattern selects an element size greater than or equal to said data size of said data access pattern.

4. The method of data processing of claim 1, wherein: said step of setting up plural look up tables includes selecting a number of parallel tables corresponding to said selected element size relative to a data width of vector registers.

5. The method of data processing of claim 1, wherein: said step of setting up plural look up tables includes selecting a table size corresponding to a density of data elements accessed to maximize a number of data elements accessible in a single look up table read instruction.

6. The method of data processing of claim 5, further comprising the steps of: partitioning a level one memory as part data cache and part directly addressable memory available as look up table memory; wherein said step of selecting a table size enabling said partitioning of the level one memory to include an amount of data cache greater than a minimum data cache required by the algorithm.

7. The method of data processing of claim 1, further comprising the steps of: following performing the at least one vector single instruction multiple data (SIMD) operation, determining whether the algorithm may operate upon more data currently stored in the look up tables; if the algorithm may operate upon more data currently stored in the look up tables simultaneously moving further data from each of said parallel look up tables to corresponding locations of said vector destination register, and performing at least one further vector single instruction multiple data (SIMD) operation upon data in said vector destination register.

8. The method of data processing of claim 7, further comprising the steps of: if the algorithm cannot operate upon more data currently stored in the look up tables, determining if the algorithm may operate on more data of the currently set up look up tables; if the algorithm may operate on more data of the currently set up look up tables moving further data required by the algorithm from main memory to each of said plural parallel look up tables, simultaneously moving further data from each of said parallel look up tables to corresponding locations of said vector destination register, and performing at least one further vector single instruction multiple data (SIMD) operation upon data in said vector destination register.

9. The method of data processing of claim 1, wherein: said step of simultaneously moving data from each of said parallel look up tables to corresponding locations of a vector destination register includes receiving a plurality of table indexes equal in number to said number of tables, said table indexes from corresponding locations of a vector source register, recalling from each table an element corresponding to a corresponding table index, and storing each recalled element in said vector destination register at a location corresponding to a location of said corresponding table index in said vector source register.

10. The method of data processing of claim 9, wherein: said vector destination register includes sixteen data slots; and upon selecting a number of tables equal to one, said step of storing each recalled element in said vector destination register at a location stores said recalled element in a first data slot.

11. The method of data processing of claim 9, wherein: said vector destination register includes sixteen data slots; and upon selecting a number of tables equal to two, said step of storing each recalled element in said vector destination register at a location stores a first recalled element in a first data slot and a second recalled element in a ninth data slot.

12. The method of data processing of claim 9, wherein: said vector destination register includes sixteen data slots; and upon selecting a number of tables equal to four, said step of storing each recalled element in said vector destination register at a location stores a first recalled element in a first data slot, a second recalled element in a fifth data slot, a third recalled element in a ninth data slot and a fourth recalled element in a thirteenth data slot.

13. The method of data processing of claim 9, wherein: said vector destination register includes sixteen data slots; and upon selecting a number of tables equal to eight, said step of storing each recalled element in said vector destination register at a location stores a first recalled element in a first data slot, a second recalled element in a third data slot, a third recalled element in a fifth data slot and a fourth recalled element in a seventh data slot, a fifth recalled element in a ninth data slot, a sixth recalled element in an eleventh data slot, a seventh recalled element in a thirteenth data slot and an eighth recalled element in a fifteenth data slot.

14. The method of data processing of claim 9, wherein: said vector destination register includes sixteen data slots; and upon selecting a number of tables equal to sixteen, said step of storing each recalled element in said vector destination register at a location stores a first recalled element in a first data slot, a second recalled element in a second data slot, a third recalled element in a third data slot and a fourth recalled element in a fourth data slot, a fifth recalled element in a fifth data slot, a sixth recalled element in a sixth data slot, a seventh recalled element in a seventh data slot, an eighth recalled element in an eighth data slot, a ninth recalled element in a ninth data slot, a tenth recalled element in a tenth data slot, an eleventh recalled element in an eleventh data slot, a twelfth recalled element in a twelfth data slot, a thirteenth recalled element in a thirteenth data slot, a fourteenth recalled element in a fourteenth data slot, a fifteenth recalled element in a fifteenth data slot and a sixteenth recalled element in a sixteenth data slot.

15. The method of data processing of claim 9, wherein: said table indexes are not related to said corresponding elements as a function argument to a function value.
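The slot placements listed in claims 10 to 14 and the overhead comparison in claim 1 can be sketched in a few lines. This is an illustrative model only. The helper names (`slot_positions`, `place_elements`, `prefer_lut`) are hypothetical, the stride formula (sixteen slots divided by the number of tables) is an inference from the enumerated cases rather than language from the claims, the zero fill for unused slots is an assumption because the claims do not say what those slots hold, and the cost values are abstract units chosen for the example. Slots are 0-based here, so the "ninth data slot" of claim 11 is index 8.

```python
from typing import List, Sequence


def slot_positions(num_tables: int, num_slots: int = 16) -> List[int]:
    """Return 0-based destination slots, one per table, evenly spaced."""
    if num_tables <= 0 or num_slots % num_tables != 0:
        raise ValueError("num_tables must be a positive divisor of num_slots")
    stride = num_slots // num_tables  # inferred from claims 10 to 14
    return [k * stride for k in range(num_tables)]


def place_elements(elements: Sequence[int], num_slots: int = 16, fill: int = 0) -> List[int]:
    """Store one recalled element per table into its destination slot.

    Assumption: slots that receive no element are set to `fill`.
    """
    dest = [fill] * num_slots
    for slot, element in zip(slot_positions(len(elements), num_slots), elements):
        dest[slot] = element
    return dest


def prefer_lut(setup_cost: float, fill_cost: float, read_cost: float, scalar_cost: float) -> bool:
    """Claim 1 style test: use the tables only if their total overhead is lower."""
    return setup_cost + fill_cost + read_cost < scalar_cost


two_tables = slot_positions(2)    # first and ninth slots
four_tables = slot_positions(4)   # first, fifth, ninth and thirteenth slots
dest = place_elements([5, 6])     # two recalled elements in a 16-slot register
```

With these definitions, `slot_positions(8)` yields every second slot and `slot_positions(16)` fills the register densely, matching the placements enumerated in claims 13 and 14.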

Assignees

Texas Instruments Inc

Inventors

Classifications

CPC code	CPC title
G06F9/383 (Primary)	Operand prefetching (cache prefetching G06F12/0862)
G06F9/30043	LOAD or STORE instructions; Clear instruction
G06F9/3004	to perform operations on memory
G06F3/0647	Migration mechanisms
G06F3/0673	Single storage device

Patent family

Related publications grouped by family.

View patent family 55852690

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016124651A1 cover?: This invention deals with the problem of paralleling random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each plural parallel look up table and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part a…
Who is the assignee on this patent?: Texas Instruments Inc