System and method for adapting executable object to a processing unit

US11550600B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11550600-B2
Application number  US-202017090295-A
Country             US
Kind code           B2
Filing date         Nov 5, 2020
Priority date       Nov 7, 2019
Publication date    Jan 10, 2023
Grant date          Jan 10, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are generally directed to a system and method for adapting executable object to a processing unit. An embodiment of a method to adapt an executable object from a first processing unit to a second processing unit, comprises: adapting the executable object optimized for the first processing unit of a first architecture, to the second processing unit of a second architecture, wherein the second architecture is different from the first architecture, wherein the executable object is adapted to perform on the second processing unit based on a plurality of performance metrics collected while the executable object is performed on the first processing unit and the second processing unit.
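As a rough illustration of adapting "based on a plurality of performance metrics collected while the executable object is performed on the first processing unit and the second processing unit", the sketch below compares two metric sets and reports the metric that degraded most on the second unit. This record does not disclose the actual procedure (the claims mention a machine learning based algorithm or a decision tree flow), so the largest-regression heuristic, the metric names, the sample values, and the 10% threshold are all assumptions made for illustration.

```python
# Hypothetical metric names; True means a larger value is worse.
HIGHER_IS_WORSE = {
    "instruction_cache_latency": True,
    "data_cache_miss_ratio": True,
    "calculation_throughput": False,
}


def regression(name, first, second):
    """Fractional degradation of one metric from the first unit to the second.

    Assumes the baseline value measured on the first unit is nonzero.
    """
    delta = (second - first) / first
    return delta if HIGHER_IS_WORSE[name] else -delta


def identify_aspect(first_run, second_run, threshold=0.10):
    """Return the metric that regressed most, or None if nothing exceeds the threshold."""
    scores = {
        name: regression(name, first_run[name], second_run[name])
        for name in HIGHER_IS_WORSE
    }
    worst = max(scores, key=scores.get)
    return worst if scores[worst] > threshold else None


# Made-up sample measurements for the same executable object on both units.
first_run = {
    "instruction_cache_latency": 40.0,
    "data_cache_miss_ratio": 0.05,
    "calculation_throughput": 200.0,
}
second_run = {
    "instruction_cache_latency": 44.0,
    "data_cache_miss_ratio": 0.08,
    "calculation_throughput": 190.0,
}

print(identify_aspect(first_run, second_run))
```

With these sample values the data cache miss ratio shows the largest relative degradation, so it is the aspect reported.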

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: adapting an executable object optimized for a first processing unit of a first architecture, to a second processing unit of a second architecture, wherein the second architecture is different from the first architecture, wherein the executable object is adapted to perform on the second processing unit based on a plurality of performance metrics collected while the executable object is performed on the first processing unit and the second processing unit; identifying a performance aspect of the executable object; and determining whether an identified performance aspect is present in a database that defines a correspondence between the performance aspect and an adaptation operation.

2. The method of claim 1, further comprising: wherein identifying the performance aspect of the executable object is based on a first plurality of performance metrics of the executable object while the executable object is performed on the first processing unit and a second plurality of performance metrics of the executable object while the executable object is performed on the second processing unit, wherein the plurality of performance metrics include the first and second plurality of performance metrics; and applying an adaptation operation in the database that corresponds to the identified performance aspect to the executable object in response to a determination that the identified performance aspect is present in the database.

3. The method of claim 2, wherein the database further includes architectural changes corresponding to the performance aspect, and said applying the adaptation operation to the executable object comprises: determining architectural changes of the second architecture with respect to the first architecture based on the identified performance aspect; and applying the adaptation operation to the executable object that corresponds to the determined architectural changes, wherein the identified performance aspect includes an instruction cache utilization, a constant cache utilization, a data cache utilization, and a data processing efficiency.

4. The method of claim 3, wherein the instruction cache utilization includes an instruction cache latency, and the adaptation operation corresponding to the instruction cache latency includes disabling loop unrolling, wherein the constant cache utilization includes a constant cache latency coverage, and the adaptation operation corresponding to the constant cache latency coverage includes constant folding, wherein the data cache utilization includes a data cache miss ratio, and the adaptation operation corresponding to the data cache miss ratio includes decreasing a working set or changing a data access pattern, and wherein the data processing efficiency includes a calculation throughput, and the adaptation operation corresponding to the calculation throughput includes reducing an instruction count, wherein the performance aspect is identified using a machine learning based algorithm or a decision tree flow.

5. The method of claim 3, further comprising presenting the determined architectural changes in response to a determination that the identified performance aspect is not present in the database, wherein the first processing unit is a graphics processing unit supporting SIMD architecture and the second processing unit is a graphics processing unit supporting SIMT architecture.

6. The method of claim 2, further comprising presenting the identified performance aspect in response to a determination that the identified performance aspect is not present in the database.

7. An apparatus comprising: a processor to: adapt an executable object optimized for a first processing unit of a first architecture, to a second processing unit of a second architecture, wherein the second architecture is different from the first architecture, wherein the executable object is adapted to perform on the second processing unit based on a plurality of performance metrics collected while the executable object is performed on the first processing unit and the second processing unit; identify a performance aspect of the executable object; and determine whether an identified performance aspect is present in a database that defines a correspondence between the performance aspect and an adaptation operation.

8. The apparatus of claim 7, wherein to identify the performance aspect of the executable object is based on a first plurality of performance metrics of the executable object while the executable object is performed on the first processing unit and a second plurality of performance metrics of the executable object while the executable object is performed on the second processing unit, wherein the plurality of performance metrics include the first and second plurality of performance metrics, wherein the processor is further to; apply an adaptation operation in the database that corresponds to the identified performance aspect to the executable object in response to a determination that the identified performance aspect is present in the database.

9. The apparatus of claim 8, wherein the database further includes architectural changes corresponding to the performance aspect, and when applying the adaptation operation to the executable object, the processor is further to: determine architectural changes of the second architecture with respect to the first architecture based on the identified performance aspect; and apply the adaptation operation to the executable object that corresponds to the determined architectural changes, wherein the identified performance aspect includes an instruction cache utilization, a constant cache utilization, a data cache utilization, and a data processing efficiency.

10. The apparatus of claim 9, wherein the instruction cache utilization includes an instruction cache latency, and the adaptation operation corresponding to the instruction cache latency includes disabling loop unrolling, wherein the constant cache utilization includes a constant cache latency coverage, and the adaptation operation corresponding to the constant cache latency coverage includes constant folding, wherein the data cache utilization includes a data cache miss ratio, and the adaptation operation corresponding to the data cache miss ratio includes decreasing a working set or changing a data access pattern, and wherein the data processing efficiency includes a calculation throughput, and the adaptation operation corresponding to the calculation throughput includes reducing an instruction count, wherein the performance aspect is identified using a machine learning based algorithm or a decision tree flow, wherein the performance aspect is identified using a machine learning based algorithm or a decision tree flow.

11. The apparatus of claim 10, wherein the processor is further to present the identified performance aspect in response to a determination that the identified performance aspect is not present in the database.

12. The apparatus of claim 10, wherein the processor is further to present the determined architectural changes in response to a determination that the identified performance aspect is not present in the database, wherein the first processing unit is a graphics processing unit supporting SIMD architecture and the second processing unit is a graphics processing unit supporting SIMT architecture.

13. At least one non-transitory computer-readable medium comprising a plurality of instructions which, when executed, cause a computing device to perform operations comprising: adapting an executable object optimized for a first processing unit of a first architecture, to a
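The database lookup recited in claims 1, 2, 4 and 6 can be pictured with a small sketch. The four aspect-to-operation pairs are taken from claim 4. The dictionary keys, the function name `handle_aspect`, the returned tuples, and the unlisted aspect `register_pressure` are hypothetical, and the adaptation operations are represented only as descriptive strings, not as real compiler transformations.

```python
# Correspondence between performance aspects and adaptation operations,
# following the pairs listed in claim 4. Key names are illustrative only.
ADAPTATION_DB = {
    "instruction_cache_latency": "disable loop unrolling",
    "constant_cache_latency_coverage": "constant folding",
    "data_cache_miss_ratio": "decrease working set or change data access pattern",
    "calculation_throughput": "reduce instruction count",
}


def handle_aspect(aspect, db=ADAPTATION_DB):
    """Decide what to do with an identified performance aspect.

    If the aspect is present in the database, return the corresponding
    adaptation operation to apply (claim 2). Otherwise return the aspect
    itself so that it can be presented instead (claim 6).
    """
    operation = db.get(aspect)
    if operation is not None:
        return ("apply", operation)
    return ("present", aspect)


print(handle_aspect("data_cache_miss_ratio"))
print(handle_aspect("register_pressure"))  # hypothetical aspect not in the database
```

A plain dictionary is used here only because the claims describe the database as defining a correspondence between aspects and operations; the record does not say how the database is actually stored.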

Assignees

Intel Corp

Inventors

Classifications

  • Training; Learning · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • Activation functions · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11550600B2 cover?
Embodiments are generally directed to a system and method for adapting executable object to a processing unit. An embodiment of a method to adapt an executable object from a first processing unit to a second processing unit, comprises: adapting the executable object optimized for the first processing unit of a first architecture, to the second processing unit of a second architecture, wherein t…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/4552. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 10, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).