No-locality hint vector memory access processors, methods, systems, and instructions

US2016019184A1 · US · A1

Patent metadata

Publication number: US-2016019184-A1
Application number: US-201414335006-A
Country: US
Kind code: A1
Filing date: Jul 18, 2014
Priority date: Jul 18, 2014
Publication date: Jan 21, 2016
Grant date: (none listed)

Abstract

Official abstract text for this publication.

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.
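The access pattern in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: `ToyCache`, `gather`, `LINE_SIZE`, the one-byte element size, and the `base` parameter are all hypothetical names and choices introduced here. The no-allocate behavior on a miss follows one of the variants described in the claims (claims 2 and 3); other variants are possible.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # bytes per cache line; an assumed value for this toy model


@dataclass
class ToyCache:
    """Toy cache that only records which line numbers have been allocated."""
    lines: set = field(default_factory=set)

    def load(self, memory: bytes, addr: int, no_locality: bool) -> int:
        line = addr // LINE_SIZE
        # Ordinary loads allocate the line on a miss. With the hint set,
        # a miss does not allocate space for the line (assumed behavior).
        if line not in self.lines and not no_locality:
            self.lines.add(line)
        return memory[addr]


def gather(memory, cache, base, indices, no_locality):
    """Load one byte per memory index from memory[base + index]."""
    return [cache.load(memory, base + i, no_locality) for i in indices]


memory = bytes(range(256))
indices = [0, 70, 140, 210]  # stands in for the source packed memory indices

hinted = ToyCache()
result = gather(memory, hinted, 0, indices, no_locality=True)

plain = ToyCache()
gather(memory, plain, 0, indices, no_locality=False)
```

In this model both calls return the same data elements; the only difference is that the hinted gather leaves `hinted.lines` empty, while the ordinary gather allocates one cache line per element touched.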

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a plurality of packed data registers; a decode unit to decode a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices, the source packed memory indices to have a plurality of memory indices, wherein the no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices; and an execution unit coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the no-locality hint vector memory access instruction, to access the data elements at memory locations that are based on the memory indices.

2. The processor of claim 1, further comprising a cache hierarchy, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, wherein the execution unit, in response to the no-locality hint vector load instruction, is to load the data elements from the memory locations, and wherein the cache hierarchy, in response to the no-locality hint vector load instruction, is not to cache the data elements loaded from the memory locations.

3. The processor of claim 1, further comprising a cache hierarchy, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, wherein the execution unit, in response to the no-locality hint vector load instruction, is to load the data elements from the memory locations, and wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon a cache miss for a data element, is not to allocate space in the cache hierarchy for the data element that is to be loaded from memory.

4. The processor of claim 1, further comprising a cache hierarchy, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, wherein the execution unit, in response to the no-locality hint vector load instruction, is to load the data elements from the memory locations, and wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon a cache hit for a data element, is to output no more than half a cache line from the cache hierarchy.

5. The processor of claim 4, wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon the cache hit for the data element, is to output no more than a single data element from the cache hierarchy.

6. The processor of claim 1, further comprising a memory controller, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, and wherein the memory controller, in response to the no-locality hint vector load instruction, is to load no more than half a cache line amount of data, for each of the data elements loaded from memory.

7. The processor of claim 6, wherein the memory controller, in response to the no-locality hint vector load instruction, is to load no more than 128-bits for each of the data elements loaded from memory.

8. The processor of claim 1, wherein the no-locality hint vector memory access instruction comprises a no-locality hint gather instruction, wherein the no-locality hint gather instruction is to indicate a destination packed data register of the plurality of packed data registers, wherein the execution unit, in response to the no-locality hint gather instruction, is to store a packed data result in the destination packed data register, and wherein the packed data result is to include the data elements gathered from the memory locations.

9. The processor of claim 1, further comprising a memory controller, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector write instruction, wherein the execution unit, in response to the no-locality hint vector write instruction, is to write data elements of a source packed data indicated by the instruction over the data elements at the memory locations, and wherein the memory controller, in response to the no-locality hint vector write instruction, is to write no more than half a cache line amount of data, for each of the data elements of the source packed data that is written to memory.

10. The processor of claim 1, further comprising a cache hierarchy, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector write instruction, wherein the execution unit, in response to the no-locality hint vector write instruction, is to write data elements of a source packed data indicated by the instruction over the data elements at the memory locations, and wherein the cache hierarchy, in response to the no-locality hint vector write instruction, upon a cache hit for a data element in a lower level cache, is not to bring a cache line associated with the cache hit into a higher level cache.

11. The processor of claim 1, wherein the no-locality hint vector memory access instruction comprises a no-locality hint scatter instruction, wherein the no-locality hint scatter instruction is to indicate a second packed data register of the plurality of packed data registers that is to have a source packed data that is to include a plurality of data elements, wherein the execution unit, in response to the no-locality hint scatter instruction, is to write the data elements of the source packed data over the data elements at the memory locations.

12. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to have at least one bit that is to have a first value to indicate the no-locality hint, and is to have a second value to indicate lack of the no-locality hint.

13. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to have a plurality of bits that are to have a first value to indicate that the no-locality hint is a no-temporal locality hint, a second value to indicate that the no-locality hint is a no-spatial locality hint, and a third value to indicate that the no-locality hint is a no-temporal and no-spatial locality hint.

14. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to indicate a source packed data operation mask.

15. A method in a processor comprising: receiving a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction indicating a source packed memory indices having a plurality of memory indices, wherein the no-locality hint vector memory access instruction provides a no-locality hint to the processor for data elements that are to be accessed with the memory indices; and accessing the data elements at memory locations that are based on the memory indices in response to the no-locality hint vector memory access instruction.

16. The method of claim 15, wherein receiving the no-locality hint vector memory access instruction comprises receiving a no-locality hint vector load instruction, wherein accessing comprises loading the data elements from the memory locations, and further comprising omitting caching data elements that are loaded from memory in a cache hierarchy.

17. The method of claim 15, wherein receiving the no-locality hint vector memory access i…
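As a rough illustration of claims 11 through 14, the sketch below models a multi-bit hint field and a masked scatter in software. Everything specific here is an assumption made for illustration: the claims only require distinct values for the hint kinds, so the numeric encodings, the 2-bit field width, the `shift` parameter, and the names `LocalityHint`, `decode_hint`, and `masked_scatter` are hypothetical. The claims also do not spell out what the operation mask does; the sketch assumes a clear mask bit simply skips that element.

```python
from enum import Enum


class LocalityHint(Enum):
    # Hypothetical encodings; the claims only require that the values differ.
    NONE = 0
    NO_TEMPORAL = 1
    NO_SPATIAL = 2
    NO_TEMPORAL_NO_SPATIAL = 3


def decode_hint(instruction_word: int, shift: int = 0) -> LocalityHint:
    """Extract an assumed 2-bit hint field from an instruction word."""
    return LocalityHint((instruction_word >> shift) & 0b11)


def masked_scatter(memory, indices, source, mask):
    """Write source[i] to memory[indices[i]] where mask bit i is set."""
    for i, (idx, value) in enumerate(zip(indices, source)):
        if (mask >> i) & 1:  # assumed semantics: clear bit skips the element
            memory[idx] = value
    return memory


memory = [0] * 8
masked_scatter(memory, indices=[6, 1, 4, 3], source=[10, 20, 30, 40], mask=0b1011)
hint = decode_hint(0b0111)
```

With the mask `0b1011`, the third element (value 30, index 4) is skipped, so `memory[4]` keeps its old value. A real decoder would take the hint field position from the instruction format, which this preview of the claims does not specify.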

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016019184A1 cover?
A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory in…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/0811. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 21, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).