Automating efficient deployment of artificial intelligence models

US12360753B2 · US · B2

Patent metadata
Publication number: US-12360753-B2
Application number: US-202418888151-A
Country: US
Kind code: B2
Filing date: Sep 17, 2024
Priority date: Dec 20, 2023
Publication date: Jul 15, 2025
Grant date: Jul 15, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system facilitates a process for automatically deploying artificial intelligence (AI) models. The system receives, for a first artificial intelligence (AI) model used by an entity, a first request to deploy the first AI model to make the first AI model available for use in a production environment to process input data and generate corresponding outputs. A first model deployment location for the first model is selected based on a model deployment engine. The system generates scripts to deploy the first AI model to the selected location, then monitors operations parameters associated with the deployment of the first AI model. Based on the values of the operations parameters, the system updates the model deployment engine. In response to a second request to deploy a second AI model, the system uses the updated model deployment engine to select a second model deployment location for the second model.
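
The feedback loop described in the abstract (select a location, generate scripts, monitor, update the engine, select again) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `DeploymentEngine` class, the `generate_script` helper, the location names, the millisecond values, and the exponential-smoothing update are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentEngine:
    """Toy stand-in for a model deployment engine (hypothetical design)."""

    # Expected response time in ms per candidate location (made-up values).
    expected_ms: dict = field(default_factory=dict)
    # Weight given to a newly monitored value when updating an estimate.
    smoothing: float = 0.5

    def select_location(self) -> str:
        # Choose the location with the lowest expected response time.
        return min(self.expected_ms, key=self.expected_ms.get)

    def update(self, location: str, observed_ms: float) -> None:
        # Blend the monitored value into the stored estimate.
        old = self.expected_ms[location]
        self.expected_ms[location] = (
            (1 - self.smoothing) * old + self.smoothing * observed_ms
        )


def generate_script(model_name: str, location: str) -> str:
    # Placeholder for script generation; the command shown is hypothetical.
    return f"deploy --model {model_name} --target {location}"


engine = DeploymentEngine(
    expected_ms={"cloud-a": 100.0, "cloud-b": 120.0, "on-prem": 150.0}
)

# First request: select a location and generate a deployment script.
first = engine.select_location()
script = generate_script("model-1", first)

# Monitoring reports a slower response time than the engine expected.
engine.update(first, observed_ms=200.0)

# Second request: the updated engine may now choose a different location.
second = engine.select_location()
```

In this sketch the first request goes to the location with the best prior estimate; once monitoring shows it underperforming, the updated estimate steers the second request elsewhere. A real engine could weigh cost, capacity, privacy policy, and accuracy as well, as the claims describe.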

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method comprising: receiving, for a first artificial intelligence (AI) model used by an entity, a first request to deploy the first AI model to make the first AI model available for use in a production environment to process input data and generate corresponding outputs; selecting a first model deployment location for the first AI model based on a model deployment engine; wherein the model deployment engine is configured to select the first model deployment location from among a set of one or more cloud provider environments or an on-premise environment operated by the entity; generating scripts to deploy the first AI model to the first model deployment location, wherein the model deployment engine is configured to select the first model deployment location for deploying the first AI model; monitoring operations parameters associated with the deployment of the first AI model at the selected model deployment location as the first AI model processes the input data and generates the corresponding outputs, wherein the operations parameters include dynamic data comprising at least one of (1) computing resources usage while processing the input data or (2) a response time; using values of the monitored operations parameters, updating one or more parameters of the model deployment engine for deployment of a second AI model to a second model deployment location; and in response to a second request to deploy the second AI model, selecting the second model deployment location for the second AI model based on the updated model deployment engine.

2. The computer-implemented method of claim 1, wherein the model deployment engine comprises a decision tree, a knowledge graph, or a rules engine, and wherein selecting the first model deployment location based on the model deployment engine comprises selecting the first model deployment location based on one or more of: a computing cost for deploying the first AI model at the selected model deployment location, a computing capacity of the selected model deployment location, a response time from the selected model deployment location, or a measurement of accuracy of the first AI model when deployed at the selected model deployment location.

3. The computer-implemented method of claim 1, wherein selecting the first model deployment location based on the model deployment engine comprises selecting the first model deployment location based on a privacy policy associated with at least one of: the first AI model, the input data, or the corresponding outputs.

4. The computer-implemented method of claim 1, wherein the operations parameters include one or more of: computing cost used by the first AI model at the first model deployment location, response time from the first AI model when deployed at the first model deployment location, computing capacity available in an environment that includes the first model deployment location, a privacy policy of the environment that includes the first model deployment location, or a measurement of downtime or service interruptions of the environment that includes the first model deployment location.

5. The computer-implemented method of claim 1, wherein the model deployment engine comprises a trained decision model, and wherein updating the model deployment engine comprises retraining the trained decision model based on a difference between the values of the monitored operations parameters and values of a set of operations parameters on which the model deployment engine was trained.

6. The computer-implemented method of claim 1, wherein the second AI model is a second instance of the first AI model, and wherein the second model deployment location for the second AI model is a different location than the first model deployment location for the first AI model.

7. The computer-implemented method of claim 1, further comprising: selecting a third model deployment location for the first AI model, different from the first model deployment location, based on the updated model deployment engine; and generating scripts to deploy the first AI model to the third model deployment location.

8. The computer-implemented method of claim 1, further comprising: selecting a third model deployment location for the first AI model, different from the first model deployment location, based on the model deployment engine and the operations parameters; and generating scripts to deploy the first AI model to the third model deployment location.

9. The computer-implemented method of claim 8, wherein the operations parameters include a computing cost associated with the deployment of the first AI model at the first model deployment location, and wherein selecting the third model deployment location comprises: determining to move the first AI model to the third model deployment location when the computing cost associated with the deployment of the first AI model at the first model deployment location is greater than a predicted computing cost associated with deploying the first AI model at the third model deployment location.

10. A system, comprising: one or more processors; and one or more non-transitory computer-readable storage media storing executable instructions, the instructions when executed by the one or more processors causing the system to: receive, for a first artificial intelligence (AI) model used by an entity, a first request to deploy the first AI model to make the first AI model available for use in a production environment to process input data and generate corresponding outputs; select a first model deployment location for the first AI model based on a model deployment engine; wherein the model deployment engine is configured to select the first model deployment location from among a set of one or more cloud provider environments or an on-premise environment operated by the entity; generate scripts to deploy the first AI model to the first model deployment location, wherein the model deployment engine is configured to select the first model deployment location for deploying the first AI model; monitor operations parameters associated with the deployment of the first AI model at the selected model deployment location as the first AI model processes the input data and generates the corresponding outputs, wherein the operations parameters include dynamic data comprising at least one of (1) computing resources usage while processing the input data or (2) a response time; using values of the monitored operations parameters, update one or more parameters of the model deployment engine for deployment of a second AI model to a second model deployment location; and in response to a second request to deploy the second AI model, select the second model deployment location for the second AI model based on the updated model deployment engine.

11. The system of claim 10, wherein the model deployment engine comprises a decision tree, a knowledge graph, or a rules engine, and wherein selecting the first model deployment location based on the model deployment engine comprises selecting the first model deployment location based on one or more of: a computing cost for deploying the first AI model at the selected model deployment location, a computing capacity of the selected model deployment location, a response time from the selected model deployment location, or a measurement of accuracy of the first AI model when deployed at the selected model deployment location.

12. The system of claim 10, wherein selecting the first model deployment location based on the model deployment engine comprises selecting the first model deployment location base
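
To make the dependent claims more concrete, here is a minimal sketch of a rules-style selection over computing capacity, privacy policy, and computing cost (in the spirit of claims 2 and 3), plus a cost comparison for relocating a deployed model (in the spirit of claim 9). The `Location` type, its field names, the cost units, and the specific rule ordering are assumptions made for illustration; the claims do not prescribe any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical description of a candidate deployment location."""

    name: str
    cost_per_hour: float       # predicted computing cost (illustrative units)
    capacity: int              # available compute units
    allows_private_data: bool  # whether the privacy policy permits private data


def select_location(candidates, required_capacity, needs_privacy):
    # Rules 1 and 2: keep only locations with enough capacity that also
    # satisfy the privacy requirement.
    eligible = [
        loc for loc in candidates
        if loc.capacity >= required_capacity
        and (loc.allows_private_data or not needs_privacy)
    ]
    if not eligible:
        raise ValueError("no location satisfies the rules")
    # Rule 3: among eligible locations, pick the lowest predicted cost.
    return min(eligible, key=lambda loc: loc.cost_per_hour)


def should_move(observed_cost, candidate):
    # Move when the monitored cost at the current location exceeds the
    # predicted cost at the candidate location.
    return observed_cost > candidate.cost_per_hour


candidates = [
    Location("cloud-a", cost_per_hour=2.0, capacity=8, allows_private_data=False),
    Location("cloud-b", cost_per_hour=3.0, capacity=16, allows_private_data=True),
    Location("on-prem", cost_per_hour=5.0, capacity=8, allows_private_data=True),
]

chosen = select_location(candidates, required_capacity=8, needs_privacy=True)
on_prem = candidates[2]
move_when_expensive = should_move(observed_cost=6.0, candidate=on_prem)
move_when_cheap = should_move(observed_cost=4.0, candidate=on_prem)
```

Here the cheapest location is excluded by the privacy rule, so the selection falls to the next cheapest eligible one. The relocation check only compares a monitored cost against a predicted one; how the prediction is produced is left open.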

Assignees

Citibank NA

Inventors

Classifications

  • Knowledge-based neural networks; Logical representations of neural networks · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Knowledge engineering; Knowledge acquisition · CPC title

  • G06F8/60 · Primary

    Software deployment · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12360753B2 cover?
A system facilitates a process for automatically deploying artificial intelligence (AI) models. The system receives, for a first artificial intelligence (AI) model used by an entity, a first request to deploy the first AI model to make the first AI model available for use in a production environment to process input data and generate corresponding outputs. A first model deployment location for …
Who is the assignee on this patent?
Citibank NA
What technology area does this patent fall under?
Primary CPC classification G06F8/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).