US10228956B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10228956-B2 |
| Application number | US-201615282266-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 30, 2016 |
| Priority date | Sep 30, 2016 |
| Publication date | Mar 12, 2019 |
| Grant date | Mar 12, 2019 |
Official abstract text for this publication.
In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load is to access the memory location subsequent to an execution of the first load. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled protection field.
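The translator-side step described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the `Instr` record, the symbolic `addr` strings, the `protected` flag (standing in for the protection field), and the function `eliminate_second_loads` are all hypothetical names chosen for illustration. The sketch assumes that two loads with the same address expression read the same location and that address registers are not modified in between.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instr:
    op: str                  # "load" or "store" (hypothetical encoding)
    addr: str                # symbolic address expression, e.g. "[rax]"
    reg: str                 # destination (load) or source (store) register
    protected: bool = False  # stands in for the protection field


def eliminate_second_loads(seq: List[Instr]) -> Tuple[List[Instr], Dict[str, str]]:
    """Drop repeated loads of one address and mark the first load protected.

    Returns the new sequence plus a rename map from the eliminated load's
    register to the register that already holds the value.
    """
    out: List[Instr] = []
    first_load_at: Dict[str, int] = {}  # address expression -> index in `out`
    renames: Dict[str, str] = {}
    for ins in seq:
        if ins.op == "load" and ins.addr in first_load_at:
            i = first_load_at[ins.addr]
            # Enable protection so a later store that may alias is checked.
            out[i] = replace(out[i], protected=True)
            renames[ins.reg] = out[i].reg
            continue  # the second load is eliminated
        if ins.op == "load":
            first_load_at[ins.addr] = len(out)
        elif ins.op == "store":
            # A store through the same expression is a known overwrite, so a
            # later load of that address is not redundant with the earlier one.
            first_load_at.pop(ins.addr, None)
        out.append(ins)
    return out, renames
```

For a sequence such as load `[rax]`, store `[rbx]`, load `[rax]`, the sketch keeps the first load with `protected=True`, keeps the store, and drops the second load. Whether `[rbx]` actually aliases `[rax]` is unknown at translation time, which is the case the protection field is meant to cover.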
Opening claim text (preview).
What is claimed is:

1. A processing device comprising: a memory to store a plurality of instructions; a processor core, operatively coupled to the memory, to execute the instructions, the processor core to: receive a sequence of instructions reordered by a binary translator for execution by the processor core; identify a first load instruction of the sequence of instructions, the first load instruction to reference a memory location that stores a data item to be loaded; detect an occurrence of a second load instruction of the sequence of instructions, the second load instruction to access the memory location subsequent to an execution of the first load instruction; enable a protection field in the first load instruction based on the detected occurrence of the second load instruction, the enabled protection field to indicate that the first load instruction is to be checked for an aliasing associated with the memory location with respect to an execution of a subsequent store instruction; and eliminate the second load instruction from the sequence of instructions that are reordered based on the protection field in the first load instruction.

2. The processing device of claim 1, wherein to check for the aliasing associated with the memory location, the processor core is further to determine whether the store instruction is to execute an intermediate store operation between an execution of the first load instruction and the second load instruction.

3. The processing device of claim 1, wherein the processor core is further to, responsive to enabling the protection field in the first load instruction, determine an alias set identifier for the first load instruction, the alias set identifier to identify a grouping of speculated memory accesses associated with the memory location.

4. The processing device of claim 3, wherein the processor core is further to: identify a store instruction to be executed subsequent to the first load instruction; and enable a C-bit field of the store instruction based on the enabling of the protection field in the first load instruction.

5. The processing device of claim 4, wherein the processor core is further to incorporate the alias set identifier of the first load instruction into the store instruction based on the enabled C-bit field.

6. The processing device of claim 5, wherein the processor core is further to, responsive to detecting the enabled C-bit field of the store instruction: identify a plurality of load instructions having an enabled protection field, the plurality of load instructions being in the grouping of speculated memory accesses identified by the alias set identifier; and check the plurality of load instructions for the aliasing of the memory location.

7. The processing device of claim 6, wherein responsive to detecting the enabled C-bit field of the store instruction, the processor core is further to disable the protection field in each of the plurality of load instructions based on the check.

8. A method, comprising: detecting, by a processing device, a load instruction associated with a memory location, the load instruction is at least one of a sequence of instructions reordered by a binary translator for execution by the processing device; detecting, by the processing device, a store instruction of the sequence of instructions, the store instruction to access the memory location subsequent to an execution of the load instruction; responsive to detecting the store instruction, determining whether a protection field of the load instruction is enabled; responsive to detecting the protection field is enabled, checking, by the processing device, the load instruction for aliasing information associated with the memory location with respect to an execution of the store instruction; and determining, by the processing device, whether to eliminate the store instruction from the sequence of instructions that are reordered based on the aliasing information.

9. The method of claim 8, further comprising responsive to detecting the aliasing information associated with the memory location, generating a fault condition.

10. The method of claim 8, further comprising identifying an alias set identifier in the store instruction, the alias set identifier identifying a grouping of speculated memory accesses associated with the memory location.

11. The method of claim 10, further comprising determining whether the load instruction is in the grouping of speculated memory accesses based on the alias set identifier.

12. The method of claim 11, further comprising responsive to determining that the load instruction is in the grouping of speculated memory accesses, determining whether a C-bit field of the store instruction is enabled.

13. The method of claim 12, further comprising responsive to detecting that the C-bit field of the store instruction is enabled, disabling the protection field of the load instruction subsequent to an execution of the store instruction.

14. A non-transitory computer-readable medium comprising instructions that, when executed by a processing device, cause the processing device to: receive, by the processing device, a sequence of instructions reordered by a binary translator for execution by the processing device; identify a first load instruction of the sequence of instructions, the first load instruction to reference a memory location that stores a data item to be loaded; detect an occurrence of a second load instruction of the sequence of instructions, the second load instruction to access the memory location subsequent to an execution of the first load instruction; enable a protection field in the first load instruction based on the detected occurrence of the second load instruction, the enabled protection field to indicate that the first load instruction is to be checked for an aliasing associated with the memory location with respect to an execution of a subsequent store instruction; and eliminate the second load instruction from the sequence of instructions that are reordered based on the protection field in the first load instruction.

15. The non-transitory computer-readable medium of claim 14, wherein to enable the protection field, the binary translator is further to set a bit of the protection field to a value.

16. The non-transitory computer-readable medium of claim 14, wherein the processing device is further to, responsive to enabling a P-bit field in the first load instruction, determine an alias set identifier for the first load instruction, the alias set identifier to identify a grouping of speculated memory accesses associated with the memory location.

17. The non-transitory computer-readable medium of claim 16, wherein the processing device is further to: identify a store instruction to be executed subsequent to the first load instruction; and enable a C-bit field of the store instruction based on the enabling of the protection field in the first load instruction.

18. The non-transitory computer-readable medium of claim 17, wherein the processing device is further to incorporate the alias set identifier of the first load instruction into the store instruction based on the enabled C-bit field.

19. The non-transitory computer-readable medium of claim 18, wherein the processing device is further to, responsive to detecting the enabled C-bit field of the store instruction: identify a plurality of load instructions having an enabled protection field, the plurality of load instructions being in the grouping of spec
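The run-time side of the claims (a store with an enabled C-bit checking protected loads in the same alias set, raising a fault on aliasing, then clearing protection) can be sketched as follows. This is one loose reading of claims 6, 7, 9, and 13, not the hardware design: `AliasTracker`, `AliasFault`, `on_load`, and `on_store` are hypothetical names, and the per-alias-set list of concrete addresses is an assumed bookkeeping structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class AliasFault(Exception):
    """Stands in for the fault condition raised when aliasing is detected."""


@dataclass
class AliasTracker:
    # alias set identifier -> run-time addresses of protected loads (assumed layout)
    protected: Dict[int, List[int]] = field(default_factory=dict)

    def on_load(self, address: int, p_bit: bool, alias_set: int) -> None:
        # Only loads with the protection field enabled are tracked.
        if p_bit:
            self.protected.setdefault(alias_set, []).append(address)

    def on_store(self, address: int, c_bit: bool, alias_set: int) -> None:
        # In this sketch, stores without the C-bit perform no check.
        if not c_bit:
            return
        loads = self.protected.get(alias_set, [])
        if address in loads:
            # The store hits a location whose value an eliminated load relied on.
            raise AliasFault(f"store to {address:#x} aliases a protected load")
        # No alias found: disable protection for every load in this group.
        self.protected.pop(alias_set, None)
```

A caller would invoke `on_load` for each executed load and `on_store` for each executed store. Catching `AliasFault` corresponds to discarding the speculatively optimized sequence, though how recovery proceeds is outside what this preview of the claims states.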
Maintaining memory consistency · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Binary to binary · CPC title
Reducing the memory space required by the program code · CPC title
Runtime instruction translation, e.g. macros · CPC title