What technology area does this patent fall under?

Primary CPC classification G06F15/8007. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 12, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Efficient hardware instructions for single instruction multiple data processors

US10229089B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10229089-B2
Application number	US-201715639110-A
Country	US
Kind code	B2
Filing date	Jun 30, 2017
Priority date	Dec 8, 2011
Publication date	Mar 12, 2019
Grant date	Mar 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.
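To make two of the operations named in the abstract concrete, here is a minimal scalar sketch in Python. It is not the patented SIMD implementation: it only shows what "unpack fixed-width bit values to a byte stream" and "return the offset of each bit set to one" compute. The function names are hypothetical, and the sketch assumes values are packed back to back with the least significant bit first, which the abstract does not specify.

```python
def unpack_fixed_width(packed: bytes, width: int, count: int) -> bytes:
    """Unpack `count` values of `width` bits each into one byte per value.

    Assumption (not from the patent text): values are packed back to back,
    least significant bit first, in a little-endian byte stream.
    """
    if not 1 <= width <= 8:
        raise ValueError("width must be between 1 and 8 bits")
    if count * width > len(packed) * 8:
        raise ValueError("bit stream is too short for the requested count")
    stream = int.from_bytes(packed, "little")
    mask = (1 << width) - 1
    # Each output byte holds one value, zero-extended to 8 bits.
    return bytes((stream >> (i * width)) & mask for i in range(count))


def set_bit_offsets(bit_vector: int, length: int) -> list[int]:
    """Return the offset of every bit set to one among the first `length` bits."""
    return [i for i in range(length) if (bit_vector >> i) & 1]
```

A SIMD version would process many values per instruction; the scalar loop here is only meant to pin down the input and output formats under the stated bit-order assumption.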

First claim

Opening claim text (preview).

What is claimed is:

1. A processor configured to fetch values of bits stored in a bit-vector that are indexed by an index-vector, comprising: a first SIMD register; and a second SIMD register; wherein the index-vector comprises a contiguous series of codes; wherein each code in the index-vector represents an index value of a bit in the bit-vector; the processor is configured to respond to a set of instructions to fetch values of bits stored in the bit-vector that are indexed by the index-vector by: establishing a first plurality of partitions within the first SIMD register; establishing a second plurality of partitions within the second SIMD register; wherein each partition in the first plurality of partitions has a corresponding partition in the second plurality of partitions; loading a copy of the bit-vector into each partition of the first plurality of partitions; loading the second plurality of partitions with contiguous codes from the index-vector; wherein loading the second plurality of partitions with contiguous codes comprises loading each partition of the second plurality of partitions with a single code from the index-vector; performing a variable shift on the copy of the bit-vector that is loaded in each partition of the first plurality of partitions; wherein the amount of the variable shift on each copy of the bit-vector is based on the code stored in the partition, of the second plurality of partitions, that corresponds to the partition, of the first plurality of partitions, in which the copy is stored; for each partition in the first plurality of partitions, causing a bit at a particular position within the partition to be moved to an output bit-vector.

2. The processor of claim 1, wherein the processor is further configured such that performing the variable shift on the copy of the bit-vector that is loaded in each partition of the first plurality of partitions places a targeted bit of each copy of the bit-vector in the little endian position of each partition of the first plurality of partitions.

3. The processor of claim 1, wherein the processor is further configured such that causing the bit at the particular position within the partition to be moved to the output bit-vector involves performing a final move mask operation to gather the targeted bits in each partition.

4. A processor configured to fetch values of bits stored in a bit-vector that are indexed by an index-vector, comprising: a first SIMD register; and a second SIMD register; wherein the index-vector comprises a contiguous series of codes; wherein each code in the index-vector represents an index value of a bit in the bit-vector; wherein the processor is configured to respond to a set of instructions to fetch values of bits stored in the bit-vector that are indexed by the index-vector by: establishing a first plurality of partitions within the first SIMD register; loading the first plurality of partitions with contiguous codes from the index-vector; wherein loading the first plurality of partitions with contiguous codes comprises loading each partition of the first plurality of partitions with a single code from the index-vector; determining a plurality of byte offsets by dividing the codes in each partition of the first plurality of partitions by 8; based on the plurality of byte offsets, loading a corresponding plurality of bytes from the bit-vector into a second plurality of partitions of the second SIMD register; wherein each partition in the first plurality of partitions has a corresponding partition in the second plurality of partitions; based on the contiguous codes loaded in the first plurality of partitions, determining a target bit position for each byte in the plurality of bytes; performing a variable shift on each byte, of the plurality of bytes, that is loaded in each partition of the second plurality of partitions; wherein the amount of the variable shift on each byte of the plurality of bytes is based on the target bit position determined for each byte in the plurality of bytes; for each partition in the second plurality of partitions, causing a bit at a particular position within the partition to be moved to an output bit-vector.

5. The processor of claim 4, wherein the processor is further configured to respond to a set of instructions to load the corresponding plurality of bytes from the bit-vector into a second plurality of partitions of the second SIMD register by performing a gather operation on the bit vector.

6. The processor of claim 4, wherein the processor is further configured to respond to a set of instructions to determine a target bit position for each byte in the plurality of bytes by performing a modulo by eight operation on the codes in each partition of the first plurality of partitions to obtain the target bit position for each byte in the plurality of bytes.

7. The processor of claim 4, wherein the processor is further configured such that performing the variable shift on each byte, of the plurality of bytes, that is loaded in each partition of the second plurality of partitions, further comprises shifting byte to place a targeted bit in the low-end position of each partition of the second plurality of partitions.

8. The processor of claim 4, wherein the processor is further configured such that causing a bit to at a particular position within the partition to be moved to an output bit-vector involves performing a final move mask operation to gather the targeted bits in each partition.

9. A processor configured to fetch values of bits stored in a bit-vector that are indexed by an index-vector, comprising: a first SIMD register; and a second SIMD register; wherein the index-vector comprises a contiguous series of codes; wherein each code in the index-vector represents an index value of a bit in the bit-vector; wherein the processor implements multiple techniques for fetching values of bits stored in a bit-vector that are indexed by an index-vector; wherein the multiple techniques include a first technique and a second technique; wherein width of the SIMD registers is N bits; wherein the number of bits in each code is K; wherein the processor is configured to respond to a set of instructions to fetch values of bits stored in the bit-vector that are indexed by the index-vector by: determining whether 2^K<=N; responsive to determining that 2^K<=N, performing the first technique to fetching values of bits stored in the bit-vector that are indexed by the index-vector; and responsive to determining that 2^K>N, performing the second technique to fetch values of bits stored in the bit-vector that are indexed by the index-vector.

10. The processor of claim 9, wherein the processor is configured to respond to a set of instructions to perform the first technique to fetch values of bits stored in the bit-vector that are indexed by the index-vector by: loading a copy of the bit-vector into each partition of a first plurality of partitions of the first SIMD register; loading contiguous codes from the index-vector into each partition of a second plurality of partitions of the second SIMD register; performing a variable shift on the copy of the bit-vector that is loaded in each partition of the first plurality of partitions based on the code stored in the partition, of the second plurality of partitions, that corresponds to the partition, of the first plurality of partitions, in which the copy is stored; and for each partition in the first plurality of partitions, causing a bit at a particular position within the partition to be moved to an output bit-vector.

11. The processor of claim 9, wherein the processor is configured to re
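The two bit-fetch techniques and the 2^K<=N selection rule recited in the claims can be sketched as a scalar emulation in Python, where each list element stands in for one SIMD partition. This is an illustrative reading of the claim language, not the patented hardware or any vendor's instruction set. All function names are hypothetical, and the sketch assumes bit index 0 is the least significant bit of byte 0.

```python
def _move_mask(lanes: list[int]) -> int:
    """Gather the low bit of every lane into one output bit-vector."""
    out = 0
    for i, lane in enumerate(lanes):
        out |= (lane & 1) << i
    return out


def fetch_bits_replicated(bit_vector: int, codes: list[int]) -> int:
    """First technique: every partition holds a full copy of the bit-vector."""
    lanes = [bit_vector] * len(codes)                        # one copy per partition
    shifted = [lane >> code for lane, code in zip(lanes, codes)]  # per-lane variable shift
    return _move_mask(shifted)


def fetch_bits_gathered(bit_vector: bytes, codes: list[int]) -> int:
    """Second technique: gather only the byte that holds each indexed bit."""
    byte_offsets = [code // 8 for code in codes]             # divide each code by 8
    lanes = [bit_vector[off] for off in byte_offsets]        # gather one byte per partition
    bit_positions = [code % 8 for code in codes]             # modulo 8 gives the target bit
    shifted = [b >> p for b, p in zip(lanes, bit_positions)]
    return _move_mask(shifted)


def fetch_bits(bit_vector: bytes, codes: list[int], k: int, n: int) -> int:
    """Pick a technique from the code width K and the register width N."""
    if 2 ** k <= n:
        return fetch_bits_replicated(int.from_bytes(bit_vector, "little"), codes)
    return fetch_bits_gathered(bit_vector, codes)
```

For example, with `bit_vector = bytes([0b10110010, 0b00000101])` and `codes = [1, 0, 8, 9, 10, 7]`, both techniques return the same output bit-vector, since they differ only in how much of the bit-vector each partition has to hold.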

Assignees

Oracle Int Corp

Inventors

Classifications

G06F17/30315
Physics · mapped topic
G06F9/30036
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
G06F15/8007 · Primary
single instruction multiple data [SIMD] multiprocessors · CPC title
G06F9/30038
using a mask · CPC title
G06F16/221 · Primary
Column-oriented storage; Management thereof · CPC title

Patent family

Related publications grouped by family.

View patent family 49879427

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10229089B2 cover?: A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A…
Who is the assignee on this patent?: Oracle Int Corp

Related patents

Efficient hardware instructions for single instruction multiple data processors: fast fixed-length value compression

Use Of Dynamic Dictionary Encoding With An Associated Hash Table To Support Many-To-Many Joins And Aggregations

Techniques for evaluating query predicates during in-memory table scans

Aggregation framework system architecture and method

Query execution plan revision for error recovery
