Packaging and deploying algorithms for flexible machine learning

US12045693B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12045693-B2
Application number  US-201816001548-A
Country             US
Kind code           B2
Filing date         Jun 6, 2018
Priority date       Nov 22, 2017
Publication date    Jul 23, 2024
Grant date          Jul 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for using scoring algorithms utilizing containers for flexible machine learning inference are described. In some embodiments, a request to host a machine learning (ML) model within a service provider network on behalf of a user is received, the request identifying an endpoint to perform scoring using the ML model. An endpoint is initialized as a container running on a virtual machine based on a container image and used to score data and return a result of said scoring to a user device.
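
The hosting flow summarized in the abstract can be sketched in a few lines. This is a minimal in-memory illustration, not the patented implementation: `HostingService`, `ScoringContainer`, the image name `linear-scorer`, the model name `demo-model`, and the `endpoint-` naming scheme are all hypothetical, and a "container image" is modelled here as a plain Python function rather than a real container running on a virtual machine.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a "scoring container image" is modelled as a function
# taking (model, input_data) and returning a score.
ScoringCode = Callable[[dict, List[float]], float]


@dataclass
class ScoringContainer:
    """Stand-in for a container initialized from a scoring image."""
    scoring_code: ScoringCode
    model: dict

    def score(self, input_data: List[float]) -> float:
        return self.scoring_code(self.model, input_data)


@dataclass
class HostingService:
    """Hypothetical front end: deploys models and routes scoring requests."""
    images: Dict[str, ScoringCode]
    models: Dict[str, dict]
    endpoints: Dict[str, ScoringContainer] = field(default_factory=dict)

    def deploy(self, image_name: str, model_name: str) -> str:
        # Initialize a container from the named image and load the model.
        container = ScoringContainer(self.images[image_name], self.models[model_name])
        endpoint_name = f"endpoint-{uuid.uuid4().hex[:8]}"
        self.endpoints[endpoint_name] = container
        return endpoint_name

    def score(self, endpoint_name: str, input_data: List[float]) -> float:
        # Look up the endpoint by name and return the scoring result.
        return self.endpoints[endpoint_name].score(input_data)


def linear_scoring(model: dict, x: List[float]) -> float:
    """Illustrative scoring algorithm: weighted sum plus bias."""
    return sum(w * v for w, v in zip(model["weights"], x)) + model["bias"]


service = HostingService(
    images={"linear-scorer": linear_scoring},
    models={"demo-model": {"weights": [2.0, -1.0], "bias": 0.5}},
)
endpoint = service.deploy("linear-scorer", "demo-model")
result = service.score(endpoint, [3.0, 4.0])
print(endpoint, result)
```

Keeping the scoring code (the image) separate from the model artifact is what lets the same model be served by different scoring algorithms, which is the flexibility the abstract describes.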

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving, at a service provider network, a first request to train a machine learning (ML) model; wherein the first request to train identifies a first ML training container image; wherein the first request to train identifies a set of training data; wherein the service provider network is implemented by one or more electronic devices; in response to receiving the first request to train: retrieving the first ML training container image from a container data store, using the first ML training container image to initialize a first ML training container on a virtual machine instance, the first ML training container image comprising a first training algorithm code, and executing the first training algorithm code and using the set of training data to train the ML model in the first ML training container to yield a first trained ML model; evaluating the first trained ML model to obtain a first set of output data; determining a first quality metric based on comparing the first set of output data to a set of evaluation data; receiving, at the service provider network, a second request to train the machine learning (ML) model; wherein the second request to train identifies a second ML training container image; wherein the second request to train identifies the set of training data; in response to receiving the second request to train: retrieving the second ML training container image from a container data store, using the second ML training container image to initialize a second ML training container on the virtual machine instance, the second ML training container image comprising a second training algorithm, executing the second training algorithm code and using the set of training data to train the ML model in the second ML training container to yield a second trained ML model, and storing the second trained ML model in a training model data store; evaluating the second trained ML model to obtain a second set of output data; determining a second quality metric based on comparing the second set of output data to the set of evaluation data; receiving, at the service provider network, a request to deploy the second trained ML model, wherein the request to deploy identifies a ML scoring container image, wherein the request to deploy identifies the second trained ML model; in response to receiving the request to deploy: retrieving the ML scoring container image from a container data store, using the ML scoring container image to initialize an ML scoring container, the ML scoring container image comprising a scoring algorithm code, retrieving the second trained ML model from the training model data store, storing the second trained ML model in the ML scoring container, and returning an endpoint name for the ML scoring container; receiving, at the service provider network, a request to perform scoring, the request to perform scoring comprising the endpoint name, the request to perform scoring identifying input data; and in response to receiving the request to perform scoring: executing the scoring algorithm code and using the second trained ML model on the input data in the ML scoring container to yield a result, and returning the result.

2. The computer-implemented method of claim 1, wherein the set of training data is provided to the first ML training container as one or more files in a first local directory in the first ML training container or as one or more input streams accessible within the first ML training container, and wherein the method further comprises: storing a set of one or more model artifacts at a storage location, wherein the storing comprises: obtaining the set of one or more model artifacts from a second local directory in the first ML training container; and sending the set of one or more model artifacts or an archived version of the set of one or more model artifacts to the storage location.

3. The computer-implemented method of claim 1, wherein a front end of the service provider network is to receive the request to perform scoring and return the result using HyperText Transfer Protocol (HTTP) messages.

4. A computer-implemented method comprising: receiving, at a service provider network, a request to train a machine learning (ML) model; wherein the request to train identifies a set of training data; wherein the request to train identifies a ML training container image; wherein the service provider network is implemented by one or more electronic devices; in response to receiving the request to train: retrieving the ML training container image from a container data store, using the ML training container image to initialize a ML training container, executing training algorithm code and using the set of training data to train the ML model in the ML training container to yield a trained ML model, and storing the trained ML model in a training model data store; receiving, at the service provider network, a request to deploy the trained ML model wherein the request to deploy identifies a ML scoring container image, wherein the request to deploy identifies the trained ML model; in response to receiving the request to deploy: retrieving the ML scoring container image from a container data store, using the ML scoring container image to initialize an ML scoring container, the ML scoring container image comprises a scoring algorithm code, retrieving the trained ML model from the training model data store, storing the trained ML model in the ML scoring container, and returning an endpoint name for the ML scoring container; receiving, at the service provider network, a request to perform scoring, the request to perform scoring comprising the endpoint name, the request to perform scoring identifying input data; and in response to receiving the request to perform scoring: executing scoring algorithm code and using the trained ML model on the input data in the ML scoring container to yield a result, and returning the result.

5. The computer-implemented method of claim 4, wherein the request to deploy identifies a location of the ML scoring container image.

6. The computer-implemented method of claim 4, wherein the ML scoring container further includes a runtime.

7. The computer-implemented method of claim 4, wherein training the ML model based on the set of training data is based on obtaining the set of training data from a storage service in the service provider network.

8. The computer-implemented method of claim 4, wherein training the ML model further comprises: providing the set of training data to the ML training container as one or more files in a local directory in the ML training container or as one or more input streams accessible within the ML training container.

9. The computer-implemented method of claim 4, wherein the request to train the ML model identifies one or more hyperparameters to be used for training the ML model, and wherein the training the ML model further comprises providing the one or more hyperparameters to the ML training container as one or more files in a local directory in the ML training container.

10. The computer-implemented method of claim 4, further comprising: performing scoring using a graphical processing unit.

11. The computer-implemented method of claim 4, wherein a front end of the service provider network is to receive the request to perform scoring and provide the result.

12. The computer-implemented method of claim 4, wherein the request to perform scoring and result are transmitted using HyperText Transfer Protocol (HTTP) messages.

13. The computer-implemented
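
The training side of the claims (two training runs on the same data, each compared against evaluation data, with inputs and hyperparameters passed as files in local directories and model artifacts read back from another local directory) can be illustrated roughly as follows. This is a sketch under stated assumptions: the directory names `input/data`, `input/config`, and `model`, the file names, the two toy algorithms, the `round_digits` hyperparameter, and the mean-absolute-error metric are all hypothetical choices, since the claims name only "a local directory" and "a quality metric". A temporary directory stands in for each training container's file system.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layout; the claims only refer to "a local directory".
DATA, CONFIG, MODEL = "input/data", "input/config", "model"


def stage_inputs(root: Path, rows, hyperparameters) -> None:
    """Service side: write training data and hyperparameters as files."""
    for sub in (DATA, CONFIG, MODEL):
        (root / sub).mkdir(parents=True, exist_ok=True)
    (root / DATA / "train.json").write_text(json.dumps(rows))
    (root / CONFIG / "hyperparameters.json").write_text(json.dumps(hyperparameters))


def train_mean(root: Path) -> None:
    """First illustrative training algorithm: predict the mean target."""
    rows = json.loads((root / DATA / "train.json").read_text())
    model = {"kind": "mean", "value": sum(y for _, y in rows) / len(rows)}
    (root / MODEL / "model.json").write_text(json.dumps(model))


def train_slope(root: Path) -> None:
    """Second illustrative algorithm: least-squares slope through the origin."""
    rows = json.loads((root / DATA / "train.json").read_text())
    hp = json.loads((root / CONFIG / "hyperparameters.json").read_text())
    slope = sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)
    model = {"kind": "slope", "value": round(slope, hp["round_digits"])}
    (root / MODEL / "model.json").write_text(json.dumps(model))


def run_training(algorithm, rows, hyperparameters) -> dict:
    """Stage inputs, run the algorithm, then collect the model artifact."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        stage_inputs(root, rows, hyperparameters)
        algorithm(root)
        return json.loads((root / MODEL / "model.json").read_text())


def predict(model: dict, x: float) -> float:
    return model["value"] if model["kind"] == "mean" else model["value"] * x


def quality_metric(model: dict, evaluation_rows) -> float:
    """Mean absolute error between model outputs and evaluation data."""
    errors = [abs(predict(model, x) - y) for x, y in evaluation_rows]
    return sum(errors) / len(errors)


training_rows = [[1, 2], [2, 4], [3, 6]]
evaluation_rows = [[4, 8]]
hyperparameters = {"round_digits": 3}

first_model = run_training(train_mean, training_rows, hyperparameters)
second_model = run_training(train_slope, training_rows, hyperparameters)
first_metric = quality_metric(first_model, evaluation_rows)
second_metric = quality_metric(second_model, evaluation_rows)
print(first_metric, second_metric)
```

Because each algorithm only reads from and writes to agreed file locations, the service can swap one training image for another without changing how data is delivered or how artifacts are collected.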

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Network integration; Enabling network access in virtual machine instances · CPC title

  • Hypervisor-specific management and integration aspects · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • Logical partitioning of resources; Management or configuration of virtualized resources (specific details on emulation or internal functioning of virtual machines G06F9/455) · CPC title

  • G06F9/5072 · Primary

    Grid computing · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12045693B2 cover?
Techniques for using scoring algorithms utilizing containers for flexible machine learning inference are described. In some embodiments, a request to host a machine learning (ML) model within a service provider network on behalf of a user is received, the request identifying an endpoint to perform scoring using the ML model. An endpoint is initialized as a container running on a virtual machine…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).