Validating machine learning models for deployment to cloud infrastructure
US-2024420019-A1 · Dec 19, 2024 · US
US2025147811A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025147811-A1 |
| Application number | US-202318500110-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 2, 2023 |
| Priority date | Nov 2, 2023 |
| Publication date | May 8, 2025 |
| Grant date | — |
Official abstract text for this publication.
An information handling system includes resource detection circuitry that collects data associated with resources being utilized in the information handling system. The system determines resources for execution of an inference model, and receives the data associated with the resources from the resource detection circuitry. Based on the resources for the execution of the inference model, the system determines one performance of an application when the inference model is executed in the information handling system. The system determines another performance level of the application when the inference model is not executed in the information handling system. Based on the two performance levels, the system determines whether the application has a performance gain by the inference model not being executed in the information handling system. In response to the performance gain, the system migrates the inference model to an edge server for execution.
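The decision described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the names `PerformanceEstimate`, `has_performance_gain`, and `choose_placement` are hypothetical, the performance values are treated as scores where higher is better, and the `min_gain` margin is not something the publication specifies.

```python
from dataclasses import dataclass


@dataclass
class PerformanceEstimate:
    """Hypothetical application performance scores (higher is better)."""
    with_local_inference: float     # inference model executes on this system
    without_local_inference: float  # inference model does not execute on this system


def has_performance_gain(est: PerformanceEstimate, min_gain: float = 0.0) -> bool:
    """Return True when the application performs better without the local model.

    min_gain is an assumed margin for illustration; the abstract names no margin.
    """
    return est.without_local_inference - est.with_local_inference > min_gain


def choose_placement(est: PerformanceEstimate) -> str:
    """Return "edge" to migrate the model, otherwise keep it "local"."""
    return "edge" if has_performance_gain(est) else "local"
```

For example, `choose_placement(PerformanceEstimate(50.0, 62.0))` selects the edge server, while equal scores leave the model on the information handling system.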
Opening claim text (preview).
What is claimed is:
1. An information handling system comprising: resource detection circuitry to collect data associated with resources being utilized in the information handling system; and a processor to communicate with the resource detection circuitry, the processor to: determine resources for execution of an inference model; receive the data associated with the resources from the resource detection circuitry; based on the resources for the execution of the inference model, determine a first performance of an application when the inference model is executed in the information handling system; determine a second performance level of the application when the inference model is not executed in the information handling system; based on the first and second performance levels, determine whether the application has a performance gain by the inference model not being executed in the information handling system; and in response to the application having the performance gain, migrate the inference model to an edge server for execution.
2. The information handling system of claim 1, further comprising workload detection circuitry to communicate with the processor, the workload detection circuitry to: determine a current workload level within the information handling system; and provide the workload level to the processor.
3. The information handling system of claim 1, wherein prior to the migration of the inference model, the processor further to: determine a latency associated with the migration of the inference model to the edge server; compare the latency to a threshold quality of service latency; and in response to the latency being less than the threshold quality of service latency, determine that the inference model is to be migrated to the edge server.
4. The information handling system of claim 1, wherein prior to the migration of the inference model, the processor further to: determine a power associated with the migration of the inference model to the edge server; compare the power to a threshold power; and in response to the power being less than the threshold power, determine that the inference model is to be migrated to the edge server.
5. The information handling system of claim 1, wherein prior to the migration of the inference model, the processor further to: determine a power associated with an execution of the inference model in the information handling system; compare the power to a threshold power; and in response to the power being greater than the threshold power, determine that the inference model is to be migrated to the edge server.
6. The information handling system of claim 1, wherein the processor further to in response to the application not having the performance gain, determine that the inference model is to be executed in the information handling system.
7. The information handling system of claim 1, wherein a default state is for the inference model to be executed in the information handling system.
8. The information handling system of claim 1, wherein the resources include a graphics processing unit, a memory, and a power capability.
9. A method comprising: determining, by a processor of an information handling system, resources for execution of an inference model; based on the resources for the execution of the inference model, determining a first performance of an application when the inference model is executed in the information handling system; determining a second performance level of the application when the inference model is not executed in the information handling system; based on the first and second performance levels, determining whether the application has a performance gain by the inference model not being executed in the information handling system; and in response to the application having the performance gain, migrating, by the processor, the inference model to an edge server for execution.
10. The method of claim 9, further comprising: determining a current workload level within the information handling system.
11. The method of claim 9, wherein prior to the migrating of the inference model, the method further comprising: determining a latency associated with the migration of the inference model to the edge server; comparing the latency to a threshold quality of service latency; and in response to the latency being less than the threshold quality of service latency, determining that the inference model is to be migrated to the edge server.
12. The method of claim 9, wherein prior to the migrating of the inference model, the method further comprising: determining a power associated with the migration of the inference model to the edge server; comparing the power to a threshold power; and in response to the power being less than the threshold power, determining that the inference model is to be migrated to the edge server.
13. The method of claim 9, wherein prior to the migrating of the inference model, the method further comprising: determining a power associated with an execution of the inference model in the information handling system; comparing the power to a threshold power; and in response to the power being greater than the threshold power, determining that the inference model is to be migrated to the edge server.
14. The method of claim 9, wherein in response to the application not having the performance gain, the method further comprises: determining that the inference model is to be executed in the information handling system.
15. The method of claim 9, wherein a default state is for the inference model to be executed in the information handling system.
16. The method of claim 9, wherein the resources include a graphics processing unit, a memory, and a power capability.
17. A system comprising: an edge server configured to execute an inference model; and an information handling system to: determine resources for execution of an inference model; based on the resources for the execution of the inference model, determine a first performance of an application when the inference model is executed in the information handling system; determine a second performance level of the application when the inference model is not executed in the information handling system; based on the first and second performance levels, determine whether the application has a performance gain by the inference model not being executed in the information handling system; and in response to the application having the performance gain, migrate the inference model to an edge server for execution.
18. The system of claim 17, wherein prior to the migration of the inference model, the information handling system to: determine a latency associated with the migration of the inference model to the edge server; compare the latency to a threshold quality of service latency; and in response to the latency being less than the threshold quality of service latency, determine that the inference model is to be migrated to the edge server.
19. The system of claim 17, wherein prior to the migration of the inference model, the information handling system to: determine a power associated with the migration of the inference model to the edge server; compare the power to a threshold power; and in response to the power being less than the threshold power, determine that the inference model is to be migrated to the edge server.
20. The system of claim 17, wherein pri
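The latency and power conditions recited in the dependent claims can be read as checks applied before migration. The sketch below is one possible reading, not the claimed implementation: the dependent claims state each condition separately, so combining them with a logical AND is an assumption made here, and every name, unit, and field (`MigrationChecks`, `should_migrate`, milliseconds, watts) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationChecks:
    """Hypothetical measurements and thresholds; a None value skips that check."""
    migration_latency_ms: Optional[float] = None
    qos_latency_threshold_ms: Optional[float] = None
    migration_power_w: Optional[float] = None
    migration_power_threshold_w: Optional[float] = None
    local_power_w: Optional[float] = None
    local_power_threshold_w: Optional[float] = None


def should_migrate(performance_gain: bool, checks: MigrationChecks) -> bool:
    """Decide whether to move the inference model to the edge server.

    Assumption: all supplied checks must pass (logical AND).
    """
    c = checks
    # Without a performance gain the model stays local (the default state).
    if not performance_gain:
        return False
    # Migration latency must be below the quality-of-service latency threshold.
    if c.migration_latency_ms is not None and c.qos_latency_threshold_ms is not None:
        if not c.migration_latency_ms < c.qos_latency_threshold_ms:
            return False
    # Power associated with the migration must be below its threshold.
    if c.migration_power_w is not None and c.migration_power_threshold_w is not None:
        if not c.migration_power_w < c.migration_power_threshold_w:
            return False
    # Power of local execution must exceed its threshold to justify offloading.
    if c.local_power_w is not None and c.local_power_threshold_w is not None:
        if not c.local_power_w > c.local_power_threshold_w:
            return False
    return True
```

With a performance gain, a migration latency of 20 ms against a 50 ms threshold passes the latency check, while 80 ms against the same threshold keeps the model on the information handling system.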
Workload prediction · CPC title
Offload · CPC title
resumption being on a different machine, e.g. task migration, virtual machine migration (G06F9/5088 takes precedence) · CPC title
involving task migration · CPC title
Performance criteria · CPC title