Separate deployment of machine learning model and associated embedding

US2021158165A1 · US · A1

Patent metadata
Publication number: US-2021158165-A1
Application number: US-202117165509-A
Country: US
Kind code: A1
Filing date: Feb 2, 2021
Priority date: Dec 13, 2018
Publication date: May 27, 2021
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations of the present specification provide a model-based prediction method and apparatus. The method includes: a model running environment receives an input tensor of a machine learning model; the model running environment sends a table query request to an embedding running environment, the table query request including the input tensor, to request low-dimensional conversion of the input tensor; the model running environment receives a table query result returned by the embedding running environment, the table query result being obtained by the embedding running environment by performing embedding query and processing based on the input tensor; and the model running environment inputs the table query result into the machine learning model, and runs the machine learning model to complete model-based prediction.
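
The four steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the class names `EmbeddingEnvironment` and `ModelEnvironment`, the dictionary-backed embedding table, the sum pooling, and the linear "model" are all assumptions chosen to keep the example small. In a real deployment the two environments would be separate processes or nodes, and the method call below would be a request sent between them.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class EmbeddingEnvironment:
    """Hypothetical embedding running environment holding one embedding table."""

    table: Dict[int, List[float]]  # feature ID -> low-dimensional vector

    def handle_table_query(self, input_tensor: Sequence[int]) -> List[List[float]]:
        # Embedding query: convert each sparse feature ID into its dense vector.
        return [self.table[feature_id] for feature_id in input_tensor]


@dataclass
class ModelEnvironment:
    """Hypothetical model running environment; it holds no embedding table."""

    embedding_env: EmbeddingEnvironment
    weights: List[float]  # stand-in for a trained machine learning model

    def predict(self, input_tensor: Sequence[int]) -> float:
        # Steps 1-2: receive the input tensor and send it in a table query request.
        # Step 3: receive the table query result (the low-dimensional vectors).
        vectors = self.embedding_env.handle_table_query(input_tensor)
        # Step 4: feed the result into the model. Sum pooling followed by a dot
        # product is an assumption made purely for illustration.
        pooled = [sum(column) for column in zip(*vectors)]
        return sum(w * x for w, x in zip(self.weights, pooled))


embedding_env = EmbeddingEnvironment(table={1: [1.0, 0.0], 2: [0.0, 2.0]})
model_env = ModelEnvironment(embedding_env=embedding_env, weights=[0.5, 0.25])
prediction = model_env.predict([1, 2])
```

Because the model environment only sees the query result, the embedding table can be stored, scaled, and updated independently of the model, which is the separation the title refers to.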

First claim

Opening claim text (preview).

What is claimed is:

1. A model-based prediction method performed by a machine learning system, the system including a machine learning model and an embedding model for converting an input tensor of the machine learning model, the embedding model being deployed separately from the machine learning model, the embedding model being deployed in an embedding running environment, and the machine learning model being deployed in a model running environment, the method comprising: receiving, by the model running environment, an input tensor of the machine learning model; sending, by the model running environment, a table query request to the embedding running environment, the table query request including the input tensor, to request low-dimensional conversion of the input tensor; receiving, by the model running environment, the table query result returned by the embedding running environment, the table query result being obtained by the embedding running environment by querying an embedding table for the low-dimensional conversion that is associated with the machine learning model based on the input tensor; and inputting, by the model running environment, the table query result into the machine learning model, and executing the machine learning model to complete model-based prediction.

2. The method according to claim 1, wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit.

3. The method according to claim 1, wherein the machine learning system includes at least one embedding running environment and at least one model running environment, each embedding running environment implementing a single embedding model, and each model running environment implementing a single machine learning model.

4. A machine learning method performed by a machine learning system executing on a plurality of running environments including a model running environment and an embedding running environment, the method comprising: receiving, by the model running environment, an input to a machine learning model implemented on the model running environment; sending, from the model running environment to the embedding running environment, a request for converting the input; receiving, by the model running environment, a result returned from the embedding running environment, the result including a low-dimensional representation of the input; and feeding, by the model running environment, the low-dimensional representation into the machine learning model to perform model-based prediction.

5. The method according to claim 4, wherein the sending, from the model running environment to the embedding running environment, a request for converting the input includes: sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node; or sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes.

6. The method according to claim 4, wherein the plurality of running environments includes at least two model running environments and different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments.

7. The method according to claim 6, wherein the hardware resource includes at least one of a central processing unit or a hardware accelerator.

8. The method of claim 7, wherein the hardware accelerator includes at least one of a graphics processing unit, a field-programmable gate array, or an application-specific integrated circuit chip designed for a specific purpose.

9. The method according to claim 4, wherein the machine learning model includes at least one of a deep neural network model, a Wide & Deep model, or a DeepFM model.

10. A machine learning system comprising an embedding running environment and a model running environment, an embedding model being deployed in the embedding running environment, and a machine learning model being deployed in the model running environment; wherein the model running environment is configured to receive an input for the machine learning model, send a table query request including the input to the embedding running environment, receive a response including a low-dimensional converted value of the input from the embedding running environment, and feed the low-dimensional converted value into the machine learning model to execute model-based prediction; and wherein the embedding running environment is configured to perform embedding query based on the input to obtain the low-dimensional converted value, and send a response including the low-dimensional converted value back to the model running environment.

11. The system according to claim 10, wherein the embedding running environment and the model running environment each are a physical execution unit or a virtual execution unit.

12. The system according to claim 10, comprising at least one embedding running environment and at least one model running environment, each embedding running environment implementing at least one embedding model, and each model running environment implementing at least one machine learning model.

13. The system according to claim 12, wherein different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models in the model running environments.

14. A non-transitory storage medium storing contents that, when executed by one or more processors, cause the one or more processors to perform actions comprising: receiving, by a model running environment of a machine learning system, an input to a machine learning model implemented on the model running environment; sending, from the model running environment to an embedding running environment of the machine learning system, a request for converting the input; receiving, by the model running environment, a result returned from the embedding running environment, the result including a low-dimensional representation of the input; and feeding, by the model running environment, the low-dimensional representation into the machine learning model to perform model-based prediction.

15. The storage medium according to claim 14, wherein the model running environment is a physical execution unit or a virtual execution unit, and the embedding running environment is a physical execution unit or a virtual execution unit.

16. The storage medium according to claim 14, wherein the sending, from the model running environment to the embedding running environment, a request for converting the input includes: sending a local request for converting the input, wherein the embedding running environment and the model running environment are located on a same physical node; or sending a remote request for converting the input, wherein the embedding running environment and the model running environment are located on different physical nodes.

17. The storage medium according to claim 14, wherein the machine learning system includes a plurality of model running environments and different hardware resources are configured for different model running environments, the hardware resources being adapted to running requirements of machine learning models i…
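
Claims 5 and 16 distinguish a local request (both environments on the same physical node) from a remote request (different physical nodes). One way to picture that choice is the sketch below. It is an assumption-laden illustration, not the claimed mechanism: `RunningEnvironment`, `request_kind`, `send_conversion_request`, and the `transports` mapping are hypothetical names, and the claims do not specify any particular transport.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence


@dataclass(frozen=True)
class RunningEnvironment:
    """Hypothetical descriptor: an environment and the physical node hosting it."""

    name: str
    node: str


def request_kind(model_env: RunningEnvironment, embedding_env: RunningEnvironment) -> str:
    """Return "local" when both environments share a physical node, else "remote"."""
    return "local" if model_env.node == embedding_env.node else "remote"


# A transport takes the target environment and the input, and returns the result.
Transport = Callable[[RunningEnvironment, Sequence[int]], Any]


def send_conversion_request(
    model_env: RunningEnvironment,
    embedding_env: RunningEnvironment,
    model_input: Sequence[int],
    transports: Dict[str, Transport],
) -> Any:
    """Dispatch the conversion request over the transport matching the deployment."""
    kind = request_kind(model_env, embedding_env)
    return transports[kind](embedding_env, model_input)
```

A caller would supply two transports, for example an in-process or inter-process call for `"local"` and a network call for `"remote"`; the model running environment itself does not need to know which one is used.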

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021158165A1 cover?
Implementations of the present specification provide a model-based prediction method and apparatus. The method includes: a model running environment receives an input tensor of a machine learning model; the model running environment sends a table query request to an embedding running environment, the table query request including the input tensor, to request low-dimensional conversion of the in…
Who is the assignee on this patent?
Advanced New Technologies Co Ltd
What technology area does this patent fall under?
Primary CPC classification G05B13/027. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 27, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).