No-locality hint vector memory access processors, methods, systems, and instructions

US9600442B2 · US · B2

Patent metadata
Publication number: US-9600442-B2
Application number: US-201414335006-A
Country: US
Kind code: B2
Filing date: Jul 18, 2014
Priority date: Jul 18, 2014
Publication date: Mar 21, 2017
Grant date: Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory indices. The no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices. The processor also includes an execution unit coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the no-locality hint vector memory access instruction, is to access the data elements at memory locations that are based on the memory indices.
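The architectural effect described in the abstract can be pictured with a small functional model. This is only a sketch under stated assumptions, not the patented implementation: memory is modeled as a Python list, each memory index is used directly as a list position (a real instruction would typically combine indices with a base address and scale), and the names `gather`, `scatter`, and `no_locality_hint` are hypothetical. The optional mask corresponds to the source packed data operation mask mentioned in claim 8; leaving masked-off lanes unchanged is an assumption.

```python
def gather(memory, indices, mask=None, dest=None, no_locality_hint=False):
    """Load one data element per memory index into a packed result.

    The hint is deliberately unused here: it is advice to the memory
    subsystem and does not change which values are loaded.
    """
    n = len(indices)
    mask = [True] * n if mask is None else mask
    dest = [0] * n if dest is None else list(dest)
    for lane, (idx, active) in enumerate(zip(indices, mask)):
        if active:                      # assumed: masked-off lanes keep dest
            dest[lane] = memory[idx]
    return dest


def scatter(memory, indices, source, mask=None, no_locality_hint=False):
    """Write each element of the source packed data to memory[index]."""
    mask = [True] * len(indices) if mask is None else mask
    for idx, value, active in zip(indices, source, mask):
        if active:
            memory[idx] = value


# Usage: the hinted and unhinted forms return the same data.
memory = list(range(100, 116))
assert gather(memory, [3, 0, 7, 12]) == gather(
    memory, [3, 0, 7, 12], no_locality_hint=True
)
```

The point of the sketch is that the hint affects only how the hardware may treat the accessed data (caching, transfer size), never the values that end up in the destination register or in memory.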

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a plurality of packed data registers; a cache hierarchy; a decode unit to decode a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices, the source packed memory indices to have a plurality of memory indices, wherein the no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices; and an execution unit coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the no-locality hint vector memory access instruction, to access the data elements at memory locations that are based on the memory indices, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, wherein the execution unit, in response to the no-locality hint vector load instruction, is to load the data elements from the memory locations, and wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon a cache hit for a data element, is to output no more than half a cache line from the cache hierarchy.

2. The processor of claim 1, wherein the cache hierarchy, in response to the no-locality hint vector load instruction, is not to cache data elements loaded from the memory locations which do not hit in the cache hierarchy.

3. The processor of claim 1, wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon a cache miss for a data element, is not to allocate space in the cache hierarchy for the data element that is to be loaded from memory.

4. The processor of claim 1, wherein the cache hierarchy, in response to the no-locality hint vector load instruction, upon the cache hit for the data element, is to output no more than a single data element from the cache hierarchy.

5. The processor of claim 1, wherein the no-locality hint vector memory access instruction comprises a no-locality hint gather instruction, wherein the no-locality hint gather instruction is to indicate a destination packed data register of the plurality of packed data registers, wherein the execution unit, in response to the no-locality hint gather instruction, is to store a packed data result in the destination packed data register, and wherein the packed data result is to include the data elements gathered from the memory locations.

6. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to have at least one bit that is to have a first value to indicate the no-locality hint, and is to have a second value to indicate lack of the no-locality hint.

7. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to have a plurality of bits that are to have a first value to indicate that the no-locality hint is a no-temporal locality hint, a second value to indicate that the no-locality hint is a no-spatial locality hint, and a third value to indicate that the no-locality hint is a no-temporal and no-spatial locality hint.

8. The processor of claim 1, wherein the decode unit is to decode the no-locality hint vector memory access instruction that is to indicate a source packed data operation mask.

9. A processor comprising: a plurality of packed data registers; a decode unit to decode a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices, the source packed memory indices to have a plurality of memory indices, wherein the no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices; an execution unit coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the no-locality hint vector memory access instruction, to access the data elements at memory locations that are based on the memory indices; and a memory controller, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector load instruction, and wherein the memory controller, in response to the no-locality hint vector load instruction, is to load no more than half a cache line amount of data, for each of the data elements loaded from memory.

10. The processor of claim 9, wherein the memory controller, in response to the no-locality hint vector load instruction, is to load no more than 128-bits for each of the data elements loaded from memory.

11. A processor comprising: a plurality of packed data registers; a decode unit to decode a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices, the source packed memory indices to have a plurality of memory indices, wherein the no-locality hint vector memory access instruction is to provide a no-locality hint to the processor for data elements that are to be accessed with the memory indices; an execution unit coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the no-locality hint vector memory access instruction, to access the data elements at memory locations that are based on the memory indices; and a memory controller, wherein the no-locality hint vector memory access instruction comprises a no-locality hint vector write instruction, wherein the execution unit, in response to the no-locality hint vector write instruction, is to write data elements of a source packed data indicated by the instruction over the data elements at the memory locations, and wherein the memory controller, in response to the no-locality hint vector write instruction, is to write no more than half a cache line amount of data, for each of the data elements of the source packed data that is written to memory.

12. A method in a processor comprising: receiving a no-locality hint vector memory access instruction, the no-locality hint vector memory access instruction indicating a source packed memory indices having a plurality of memory indices, wherein the no-locality hint vector memory access instruction provides a no-locality hint to the processor for data elements that are to be accessed with the memory indices, wherein receiving the no-locality hint vector memory access instruction comprises receiving a no-locality hint vector load instruction; and accessing the data elements at memory locations that are based on the memory indices in response to the no-locality hint vector memory access instruction, wherein accessing comprises loading the data elements from the memory locations, including loading no more than half a cache line amount of data, for each data element loaded from memory.

13. The method of claim 12, further comprising omitting caching data elements that are loaded from memory in a cache hierarchy.

14. The method of claim 12, further comprising, upon a cache hit for a data element in a cache hierarchy, outputting no more than half a cache line from the cache hierarchy.

15. An article of manufacture comprising a non-transitory machine-readable storage medium, the non-transitory machine-…
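The cache and memory-controller behavior recited in claims 1 to 4, 9, and 10 can be illustrated with a toy model. This is a hedged sketch, not the claimed hardware: the 64-byte line size, the 4-byte element size, the class `ToyCache`, and the function `gather_load` are all assumptions made for illustration, and elements are assumed not to straddle cache lines. With the hint set, a miss transfers only 16 bytes (128 bits, as in claim 10) and allocates nothing; a hit outputs only the element itself (as in claim 4).

```python
LINE_BYTES = 64       # assumed cache line size; the claims do not fix one
SUBLINE_BYTES = 16    # 128 bits (claim 10), at most half of LINE_BYTES


class ToyCache:
    """Counts bytes moved; it stores no data, only which lines are cached."""

    def __init__(self):
        self.lines = set()           # line-aligned addresses currently cached
        self.bytes_from_cache = 0    # bytes output by the cache on hits
        self.bytes_from_memory = 0   # bytes moved by the memory controller

    def load(self, addr, elem_bytes=4, no_locality=False):
        line = addr - addr % LINE_BYTES
        if line in self.lines:
            # Hit: with the hint, output just the element, not the whole line.
            self.bytes_from_cache += elem_bytes if no_locality else LINE_BYTES
            return "hit"
        if no_locality:
            # Miss with hint: sub-line transfer and no allocation.
            self.bytes_from_memory += SUBLINE_BYTES
        else:
            # Ordinary miss: fetch and allocate the full line.
            self.bytes_from_memory += LINE_BYTES
            self.lines.add(line)
        return "miss"


def gather_load(cache, addresses, elem_bytes=4, no_locality=False):
    """Issue one load per gathered element and report hit or miss for each."""
    return [cache.load(a, elem_bytes, no_locality) for a in addresses]


# Usage: four widely scattered elements, loaded with and without the hint.
addresses = [0, 4096, 8192, 12288]
hinted, plain = ToyCache(), ToyCache()
gather_load(hinted, addresses, no_locality=True)
gather_load(plain, addresses)
print(hinted.bytes_from_memory, plain.bytes_from_memory)
```

In this model the hinted gather moves a quarter of the memory traffic of the unhinted one and leaves the cache contents untouched, which is the kind of saving the claims aim at for data with no expected reuse.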

Assignees

Intel Corp

Classifications

  • using a cache

  • Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22)

  • with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811)

  • Prefetching based on hints or prefetch instructions

  • with prefetch

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9600442B2 cover?
A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode a no-locality hint vector memory access instruction. The no-locality hint vector memory access instruction to indicate a packed data register of the plurality of packed data registers that is to have a source packed memory indices. The source packed memory indices to have a plurality of memory in…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F15/8069. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).