Hybrid inference system for cogs reduction
US-2024385814-A1 · Nov 21, 2024 · US
US12578940B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12578940-B2 |
| Application number | US-202318512215-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 17, 2023 |
| Priority date | Sep 14, 2023 |
| Publication date | Mar 17, 2026 |
| Grant date | Mar 17, 2026 |
Official abstract text for this publication.
A computer-implemented, machine learning method for preprocessing code for performance portability includes extracting performance critical code segments from an application and obtaining input data. Ground truth data is generated based on the input data and the application. Original code of the application is transpiled using a large language model (LLM) into a tensor computation language (TCL) candidate. Correctness of an implementation of the TCL candidate is verified using the ground truth data. The method has applications including, but not limited to, use cases in medical/healthcare, and other artificial intelligence applications for preprocessing and optimizing code for performance portability. The method can also support decision making and could be implemented with machine learning.
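The ground-truth and verification steps in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and does not come from the patent: `original_saxpy` plays the role of an extracted performance critical segment, `candidate_saxpy` plays the role of an implementation of the TCL candidate, and the synthetic inputs and comparison tolerances are assumptions chosen only for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Case = Tuple[float, Vector, Vector]


def original_saxpy(a: float, x: Sequence[float], y: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a segment of the original application."""
    out = []
    for i in range(len(x)):
        out.append(a * x[i] + y[i])
    return out


def candidate_saxpy(a: float, x: Sequence[float], y: Sequence[float]) -> Vector:
    """Hypothetical stand-in for an implementation of the transpiled candidate."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def make_inputs(n_cases: int, size: int, seed: int = 0) -> List[Case]:
    """Synthetically generate input data (assumed strategy: seeded random values)."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n_cases):
        a = rng.uniform(-10.0, 10.0)
        x = [rng.uniform(-1.0, 1.0) for _ in range(size)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(size)]
        cases.append((a, x, y))
    return cases


def generate_ground_truth(reference: Callable[..., Vector],
                          inputs: Sequence[Case]) -> List[Vector]:
    """Run the original code on the input data and record its outputs."""
    return [reference(*args) for args in inputs]


def verify(candidate: Callable[..., Vector], inputs: Sequence[Case],
           ground_truth: Sequence[Vector],
           rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Return True only if the candidate matches the ground truth on every case."""
    for args, expected in zip(inputs, ground_truth):
        actual = candidate(*args)
        if len(actual) != len(expected):
            return False
        if not all(math.isclose(p, q, rel_tol=rel_tol, abs_tol=abs_tol)
                   for p, q in zip(actual, expected)):
            return False
    return True


if __name__ == "__main__":
    inputs = make_inputs(n_cases=5, size=8)
    truth = generate_ground_truth(original_saxpy, inputs)
    print("candidate verified:", verify(candidate_saxpy, inputs, truth))
```

Comparing with a tolerance rather than exact equality is a design choice of this sketch: a transpiled implementation may reorder floating-point operations, so bitwise-identical results are not guaranteed.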
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for preprocessing code for performance portability, the computer-implemented method comprising: extracting performance critical code segments from an application, wherein the performance critical code segments represent a portion of an original code of the application; obtaining input data; generating ground truth data based on the input data and the application; transpiling the performance critical code segments extracted from the original code of the application using a large language model (LLM) into a tensor computation language (TCL) candidate; and verifying correctness of an implementation of the TCL candidate using the ground truth data.
2. The computer-implemented method according to claim 1, further comprising optimizing the TCL candidate using the LLM or another LLM.
3. The computer-implemented method according to claim 2, wherein optimizing the TCL candidate is performed using performance data from optimizing compiler systems or runtime systems.
4. The computer-implemented method according to claim 3, further comprising passing the TCL candidate through the optimizing compiler systems or runtime systems to generate heuristics and the performance data.
5. The computer-implemented method according to claim 4, wherein the heuristics and the performance data are used as feedback to update the LLM that is used for optimizing the TCL candidate.
6. The computer-implemented method according to claim 2, further comprising evaluating correctness of an implementation of the optimized TCL candidate, and accepting the implementation of the optimized TCL candidate based on a result of the implementation being correct and one or more user-provided constraints being met.
7. The computer-implemented method according to claim 2, further comprising repeating the steps of transpiling, verifying and optimizing based on a determination that an implementation of the optimized TCL candidate is not correct.
8. The computer-implemented method according to claim 1, wherein the performance critical code segments are identified using a trained LLM that weights performance criticality of code segments from the application.
9. The computer-implemented method according to claim 8, wherein the trained LLM is trained based on code examples and a target function that increases a score based on a mathematical transformation being used, or data in a vectorizable data structure being accessed.
10. The computer-implemented method according to claim 1, wherein obtaining the input data includes obtaining user provided input data and/or generating inferred or synthetically generated input data using automated checkpointing or static code analysis.
11. The computer-implemented method according to claim 1, wherein extracting the performance critical code segments from the application includes automatically inferring data dtype, shape, and potential value ranges from code of the performance critical code segments.
12. The computer-implemented method according to claim 11, wherein automatically inferring the potential value ranges includes analyzing loop ranges, memory access patterns, static or templated data types, and/or mathematical operations.
13. The computer-implemented method according to claim 1, further comprising using user provided input data and/or inferred or synthetically generated input data to automatically reject invalid or corrupted code segments.
14. A computer system for preprocessing code for performance portability, the computer system comprising one or more hardware processors which, alone or in combination, are configured to provide for execution of the method according to claim 1.
15. A tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, provide for preprocessing code for performance portability by execution of the method according to claim 1.
16. A computer-implemented method for preprocessing code for performance portability, the computer-implemented method comprising: extracting performance critical code segments from an application; obtaining input data; generating ground truth data based on the input data and the application; transpiling original code of the application using a large language model (LLM) into a tensor computation language (TCL) candidate; verifying correctness of an implementation of the TCL candidate using the ground truth data; and optimizing the TCL candidate using the LLM or another LLM, wherein optimizing the TCL candidate is performed using performance data from optimizing compiler systems or runtime systems.
17. The computer-implemented method according to claim 16, further comprising passing the TCL candidate through the optimizing compiler systems or runtime systems to generate heuristics and the performance data.
18. The computer-implemented method according to claim 17, wherein the heuristics and the performance data are used as feedback to update the LLM that is used for optimizing the TCL candidate.
19. A computer-implemented method for preprocessing code for performance portability, the computer-implemented method comprising: extracting performance critical code segments from an application, wherein the performance critical code segments are identified using a trained LLM that weights performance criticality of code segments from the application; obtaining input data; generating ground truth data based on the input data and the application; transpiling original code of the application using a large language model (LLM) into a tensor computation language (TCL) candidate; and verifying correctness of an implementation of the TCL candidate using the ground truth data.
20. The computer-implemented method according to claim 19, wherein the trained LLM is trained based on code examples and a target function that increases a score based on a mathematical transformation being used, or data in a vectorizable data structure being accessed.
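The accept-or-repeat behavior described in claims 6 and 7 can be sketched as a small control loop. This is an illustrative reading under stated assumptions, not the patented implementation: `refine`, `transpile`, `optimize`, `is_correct`, and `max_rounds` are hypothetical names, the LLM calls are replaced by trivial stubs, and the cap on the number of rounds and the retry on an incorrect initial candidate are additions of this sketch that the claims do not state.

```python
from typing import Callable, Iterable, Optional

Candidate = str  # hypothetical: a candidate is represented by its source text


def refine(source: str,
           transpile: Callable[[str, int], Candidate],
           optimize: Callable[[Candidate], Candidate],
           is_correct: Callable[[Candidate], bool],
           constraints: Iterable[Callable[[Candidate], bool]],
           max_rounds: int = 3) -> Optional[Candidate]:
    """Repeat transpile -> verify -> optimize until a candidate is accepted.

    A candidate is accepted only if its optimized form is correct and every
    user-provided constraint holds. Returns None if no round succeeds.
    """
    checks = list(constraints)
    for attempt in range(max_rounds):
        candidate = transpile(source, attempt)
        if not is_correct(candidate):
            continue  # assumption of this sketch: retry on a bad transpilation
        optimized = optimize(candidate)
        if is_correct(optimized) and all(check(optimized) for check in checks):
            return optimized
    return None


# Stubs standing in for the LLM and the ground-truth check (all hypothetical).
def stub_transpile(source: str, attempt: int) -> Candidate:
    return "bad" if attempt == 0 else "good"


def stub_optimize(candidate: Candidate) -> Candidate:
    return candidate + "+opt"


def stub_is_correct(candidate: Candidate) -> bool:
    return candidate.startswith("good")


def short_enough(candidate: Candidate) -> bool:
    return len(candidate) < 20


if __name__ == "__main__":
    result = refine("for-loop source", stub_transpile, stub_optimize,
                    stub_is_correct, [short_enough])
    print("accepted:", result)
```

In a fuller system, `is_correct` would compare the candidate's outputs against the ground truth data, and the constraints could encode limits such as a runtime or memory budget.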
Checking; Contextual analysis · CPC title