Providing application programming interface endpoints for machine learning models

US11669377B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11669377-B2
Application number  US-202217680859-A
Country             US
Kind code           B2
Filing date         Feb 25, 2022
Priority date       Aug 21, 2019
Publication date    Jun 6, 2023
Grant date          Jun 6, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One or more virtual machines are launched at an application platform. At each of the one or more virtual machines, a machine learning model execution environment is instantiated for an instance of a machine learning model. A respective instance of the machine learning model is loaded to each machine learning model execution environment. Each loaded instance of the machine learning model is associated with an application programming interface (API) endpoint which can receive input data for the loaded instance of the machine learning model from a client device and return output data produced by the loaded instance of the machine learning model based on the input data.
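
The flow in the abstract can be sketched in a few lines of Python. This is a minimal in-process illustration, not the patented implementation: the names `ExecutionEnvironment`, `ApiEndpoint`, `launch`, and `load_model`, the toy model, the `/models/toy/predict` locator, and the round-robin routing are all assumptions made for the example, and virtual machines are simulated by plain objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained model: any callable from features to a score.
Model = Callable[[List[float]], float]


def load_model() -> Model:
    """Return a toy model (sum of inputs). A real platform would load trained weights."""
    return lambda features: float(sum(features))


@dataclass
class ExecutionEnvironment:
    """One model execution environment on one simulated virtual machine."""
    vm_id: str
    model: Model


@dataclass
class ApiEndpoint:
    """Associates a resource locator with the loaded model instances."""
    url: str
    environments: List[ExecutionEnvironment] = field(default_factory=list)
    _next: int = 0

    def handle(self, input_data: List[float]) -> Dict[str, object]:
        # Round-robin routing is an assumption; the abstract does not name a policy.
        env = self.environments[self._next % len(self.environments)]
        self._next += 1
        return {"vm": env.vm_id, "output": env.model(input_data)}


def launch(url: str, vm_count: int) -> ApiEndpoint:
    """Simulate launching VMs, instantiating environments, and loading one model instance each."""
    envs = [ExecutionEnvironment(f"vm-{i}", load_model()) for i in range(vm_count)]
    return ApiEndpoint(url=url, environments=envs)


endpoint = launch("/models/toy/predict", vm_count=2)
first = endpoint.handle([1.0, 2.0, 3.0])   # served by vm-0
second = endpoint.handle([1.0, 2.0, 3.0])  # served by vm-1
```

Each call to `handle` plays the role of a client request: input data goes in, and the output of one loaded model instance comes back. A deployed endpoint would sit behind an HTTP server, as claim 7 describes.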

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: instantiating, at each virtual machine of one or more virtual machines, a machine learning model execution environment for an instance of a machine learning model; loading, by a processing device, a respective instance of the machine learning model to each machine learning model execution environment; associating each loaded instance of the machine learning model with an application programming interface (API) endpoint, the API endpoint configured to receive input data for the loaded instance of the machine learning model from a client device and to return output data produced by the loaded instance of the machine learning model based on the input data; receiving a request by the client device to configure the API endpoint; and identifying configuration information specified by the request, wherein an identifier of the machine learning model and a resource locator of the API endpoint are specified by the configuration information.

2. The method of claim 1, wherein the API endpoint is further configured to: receive a first request of the client device, the first request comprising first input data, provide the first input data as input for the loaded instance of the machine learning model, obtain first output data of the loaded instance of the machine learning model, and cause a first response comprising an indication of the first output data of the machine learning model to be sent to the client device.

3. The method of claim 2, further comprising: identifying an audit record that is associated with the API endpoint; and recording audit information at the audit record, wherein the audit information comprises one or more of the first input data of the first request, the first output data of the first response, or contextual information with respect to the first request or first response.

4. The method of claim 3, further comprising: performing one or more operations using the audit information of the audit record, the one or more operations comprising a validation operation to validate the first output data obtained from the loaded instance of the machine learning model at the respective virtual machine of the one or more virtual machines against second output data obtained from another loaded instance of the machine learning model at another respective virtual machine, the second output data obtained by applying the first input data as input to the other loaded instance of the machine learning model.

5. The method of claim 4, wherein performing the one or more operations using the audit information of the audit record further comprises: performing a data processing operation on the audit information to generate an audit data output; and providing a graphical user interface (GUI) to the client device that presents a graphical representation of the audit data output.

6. The method of claim 1, further comprising: receiving, from the client device, an authentication request comprising authentication credentials corresponding to an account; authenticating the account based on the authentication credentials; and generating an access token based on the authentication, wherein the access token to allow the client device to access the API endpoint.

7. The method of claim 1, wherein the API endpoint is further configured to receive the input data for the loaded instance of the machine learning model from the client device via an HTTP request, wherein the API endpoint is further configured to return output data produced by the loaded instance of the machine learning model based on the input data via an HTTP response.

8. The method of claim 1, wherein the configuration information further specifies quality of service parameters, the method further comprising: monitoring quality metrics indicative of the quality of service parameters specified by the configuration information subsequent to configuring the API endpoint; determining that one or more of the quality metrics satisfy a threshold; and responsive to determining that the one or more of the quality metrics satisfy the threshold, adjusting a number of the one or more virtual machines executing at an application platform and associated with the API endpoint.

9. A method, comprising: accessing a machine learning model execution environment for a machine learning model at a virtual machine; determining whether the machine learning model is associated with a dataset that is to be preloaded for use by the machine learning model execution environment during run-time; in response to determining that the machine learning model is associated with the dataset that is to be preloaded, preloading the dataset that is associated with the machine learning model that is accessible by the virtual machine; and associating the machine learning model with an application programming interface (API) endpoint, wherein the API endpoint is configured to receive input data provided by a client device for the machine learning model, the received input data configured to be aggregated with data of the preloaded dataset and provided as aggregated input data for the machine learning model to obtain output data of the machine learning model.

10. The method of claim 9, further comprising: instantiating the virtual machine at an application platform.

11. The method of claim 9, further comprising: receiving a request by the client device to configure the API endpoint; and identifying configuration information specified by the request and stored at an application platform, wherein the configuration information comprises one or more of an identifier of the machine learning model, an address of the API endpoint, an identifier of the preloaded dataset, or instructions to preload the preloaded dataset into the memory.

12. The method of claim 9, wherein the API endpoint is to: receive, from the client device, a first request comprising first input data that is to be combined with the data of the preloaded dataset to generate the aggregated input data and be applied as input to the machine learning model, obtain from the machine learning model first output data based on the aggregated input data, and cause a first response comprising an indication of the output data of the machine learning model to be sent to the client device.

13. The method of claim 12, wherein the first request comprises a data identifier associated with the data of the preloaded dataset.

14. The method of claim 13, wherein the data identifier comprises a user identifier of a user of the client device, and wherein the data of the preloaded dataset that is associated with the data identifier comprises user information associated with the user of the client device.

15. The method of claim 12, further comprising: generating the preloaded dataset based on a threshold number of recent requests to the API endpoint by the client device.

16. A system comprising: a memory; and a processing device, coupled to the memory to: instantiate, at each virtual machine of one or more virtual machines, a machine learning model execution environment for an instance of a machine learning model; load a respective instance of the machine learning model to each machine learning model execution environment; associate each loaded instance of the machine learning model with an application programming interface (API) endpoint, the API endpoint to receive input data for the loaded instance of the machine learning model from a client device and to return output data produced by the loaded instance of the machine learning model based on the input data; rec
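
Claims 9 and 12 to 14 describe an endpoint that combines each request's input with data from a dataset preloaded into the execution environment, looked up by a data identifier such as a user identifier. The Python sketch below illustrates that idea only. `PreloadedEndpoint`, `preload_dataset`, `toy_model`, the request keys `data_id` and `input`, and the choice to aggregate by list concatenation are all hypothetical; the claims do not prescribe how aggregation is performed.

```python
from typing import Callable, Dict, List, Optional

Model = Callable[[List[float]], float]


def preload_dataset() -> Dict[str, List[float]]:
    """Hypothetical preloaded dataset: stored features keyed by user identifier."""
    return {"user-1": [0.5, 1.5], "user-2": [2.0, 2.0]}


def toy_model(features: List[float]) -> float:
    """Toy model (mean of the features), standing in for a trained model."""
    return sum(features) / len(features)


class PreloadedEndpoint:
    """Endpoint that aggregates request input with preloaded data before inference."""

    def __init__(self, model: Model, dataset: Optional[Dict[str, List[float]]] = None):
        self.model = model
        # Preload only if the model is associated with a dataset.
        self.dataset = dataset or {}

    def handle(self, request: dict) -> dict:
        # The data identifier in the request selects the matching preloaded data.
        stored = self.dataset.get(request["data_id"], [])
        # Aggregation by concatenation is an assumption made for this sketch.
        aggregated = stored + list(request["input"])
        return {"output": self.model(aggregated)}


endpoint = PreloadedEndpoint(toy_model, preload_dataset())
response = endpoint.handle({"data_id": "user-1", "input": [4.0]})
```

Here the client sends only `[4.0]`, while the model sees `[0.5, 1.5, 4.0]`. Keeping the per-user data resident in the environment means the client does not have to resend it with every request.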

Assignees

  • Palantir Technologies Inc

Inventors

Classifications

  • Machine learning · CPC title

  • G06F9/547 · Primary

    Remote procedure calls [RPC]; Web services · CPC title

  • Protocols for remote procedure calls [RPC] · CPC title

  • Starting, stopping, suspending or resuming virtual machine instances · CPC title

  • Monitoring or debugging support · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11669377B2 cover?
One or more virtual machines are launched at an application platform. At each of the one or more virtual machines, a machine learning model execution environment is instantiated for an instance of a machine learning model. A respective instance of the machine learning model is loaded to each machine learning model execution environment. Each loaded instance of the machine learning model is asso…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/547. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 6, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).