Model selection for split inference

US12505362B2 · US · B2

Patent metadata
Publication number: US-12505362-B2
Application number: US-202418659158-A
Country: US
Kind code: B2
Filing date: May 9, 2024
Priority date: May 9, 2024
Publication date: Dec 23, 2025
Grant date: Dec 23, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and devices for split inference for a task network split between two communication devices. Model parameters for each of the two task network portions may be determined based on a number of factors including the encoding/decoding configuration used for communicating the intermediated representations across the network, and/or the model performance based on network conditions or the encoding/decoding configuration. In some embodiments the transmitting device determines performance changes and/or model parameters and indicates to the receiving device. In some embodiments, the receiving device determines performance changes and/or model parameters and indicates to the transmitting device.
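The flow the abstract describes can be sketched as a toy pipeline: a first task network produces an intermediate representation, an encoder compresses it, and the receiving side picks parameters for the second task network from an indication tied to the encoder configuration. This is a minimal illustration, not the patented implementation. Every name here (`EncoderConfig`, `head_network`, `TAIL_PARAMS`, and so on), the uniform quantizer, and the parameter values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    bits: int  # hypothetical quantization bit depth


def head_network(x):
    """Toy first task network: maps the input to an intermediate representation."""
    return [max(0.0, 0.5 * v + 0.1) for v in x]


def encode(features, cfg):
    """Clamp features to [0, 1] and quantize them uniformly to integer codes."""
    levels = (1 << cfg.bits) - 1
    return [round(min(max(f, 0.0), 1.0) * levels) for f in features]


def decode(codes, cfg):
    """Map integer codes back to approximate feature values."""
    levels = (1 << cfg.bits) - 1
    return [c / levels for c in codes]


# Hypothetical parameter sets for the second task network, keyed by bit depth.
TAIL_PARAMS = {
    8: {"scale": 1.0, "bias": 0.0},
    2: {"scale": 1.1, "bias": -0.02},
}


def tail_network(features, params):
    """Toy second task network: a weighted sum of the received features."""
    return sum(params["scale"] * f + params["bias"] for f in features)


def transmit(x, cfg):
    """Sender side: compress the intermediate representation and attach an indication."""
    codes = encode(head_network(x), cfg)
    indication = {"param_set": cfg.bits}  # derived from the encoder configuration
    return codes, indication


def receive(codes, indication, cfg):
    """Receiver side: select tail parameters from the indication, then run the tail."""
    params = TAIL_PARAMS[indication["param_set"]]
    return tail_network(decode(codes, cfg), params)


cfg = EncoderConfig(bits=8)
codes, indication = transmit([0.2, 0.8, 1.4], cfg)
output = receive(codes, indication, cfg)
```

In this sketch the indication travels alongside the compressed codes, so the receiver never has to infer which parameter set matches the encoder setting.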

First claim

Opening claim text (preview).

What is claimed is:

1. A method of communication performed by a first communication device, the method comprising: generating, via a first task network, an intermediate representation of an input; generating, via an encoder configured with an encoder configuration, a compressed representation of the intermediate representation; transmitting, to a second communication device, the compressed representation; and transmitting, to the second communication device based on the encoder configuration, an indication for selecting a set of parameters for a second task network for generating an output based on the compressed representation.

2. The method of claim 1, wherein the indication includes a distortion level based on a difference between the intermediate representation and the compressed representation.

3. The method of claim 1, further comprising: generating, via a decoder, an uncompressed representation based on the compressed representation; generating, via the second task network configured with one or more sets of parameters, one or more outputs based on the uncompressed representation; and generating, via the second task network, a baseline output based on the intermediate representation, wherein the indication includes an indication of a change in task performance based on a comparison of the baseline output and the one or more outputs.

4. The method of claim 1, wherein the indication includes an indication of a specific set of parameters, and further comprising: selecting the specific set of parameters based on at least one of: a computed distortion level of the compressed representation, a computed change in task performance, a cost of swapping the second task network on the second communication device, a channel capacity between the first communication device and the second communication device, or a delay.

5. The method of claim 1, wherein the compressed representation and the indication are transmitted in-band within a same bitstream.

6. The method of claim 1, wherein the indication is transmitted out-of-band from the compressed representation.

7. The method of claim 1, wherein the generating the intermediate representation is performed using a depth of the first task network based on the encoder configuration, and wherein the indication is further for selecting a depth of the second task network.

8. The method of claim 1, wherein the encoder configuration includes at least one of: a feature reduction configuration, a quantization configuration, or a compression configuration.

9. A method of communication performed by a first communication device, the method comprising: receiving, from a second communication device, an indication for parameter selection based on a predetermined testing representation; generating, via a first task network with a set of parameters selected based on the indication, an intermediate representation of an input; generating, via an encoder configured with an encoder configuration, a compressed representation of the intermediate representation; and transmitting, to the second communication device, the compressed representation.

10. The method of claim 9, wherein the indication includes an indication of a distortion level.

11. The method of claim 9, wherein the indication includes an indication of a change in task performance.

12. The method of claim 9, wherein the indication includes an indication of a specific set of parameters.

13. The method of claim 9, further comprising: configuring the encoder with a second encoder configuration based on the indication.

14. The method of claim 13, further comprising: transmitting, to the second communication device, an indication of the second encoder configuration.

15. The method of claim 13, wherein the encoder configuration includes at least one of: a feature reduction configuration, a quantization configuration, or a compression configuration.

16. The method of claim 9, wherein generating, the intermediate representation is performed using a depth of the first task network based on the indication.

17. A method of communication performed by a first communication device, the method comprising: configuring an encoder with an encoder configuration; selecting a set of parameters for a first task network based on at least one of: the encoder configuration, or a decoder configuration associated with a second communication device; generating, via the first task network with the selected set of parameters, an intermediate representation of an input; generating, via the encoder, a compressed representation of the intermediate representation; and transmitting, to the second communication device, the compressed representation.

18. The method of claim 17, further comprising: receiving, from the second communication device, an indication of the decoder configuration; and transmitting, to the second communication device, an indication of the encoder configuration.

19. The method of claim 17, wherein the encoder configuration is based on an information including at least one of: a distortion level; a task performance level; a computing complexity level; a network condition; or a performance requirement.

20. The method of claim 19, further comprising: receiving the information from at least one of: the second communication device; or a network entity different from the second communication device.
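Claims 2 through 4 describe three quantities the sending device may compute before choosing what to indicate: a distortion level, a change in task performance against a baseline, and a selected parameter set. The sketch below shows one way these could fit together. It is illustrative only: mean squared error as the distortion measure, the threshold value, the toy `tail` network, and the `CANDIDATES` table are all assumptions, and the claims do not prescribe any of them.

```python
def mse(a, b):
    """Distortion level: mean squared error between two representations."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def quantize_roundtrip(features, bits):
    """Encode then decode with a uniform quantizer (assumed encoder/decoder pair)."""
    levels = (1 << bits) - 1
    return [round(min(max(f, 0.0), 1.0) * levels) / levels for f in features]


def tail(features, scale):
    """Toy second task network with a single hypothetical parameter."""
    return sum(scale * f for f in features)


# Hypothetical candidate parameter sets for the second task network.
CANDIDATES = {"default": 1.0, "low_rate": 1.05}


def select_parameter_set(intermediate, decoded, max_distortion=1e-3):
    """Return (parameter set name, distortion) for the indication."""
    distortion = mse(intermediate, decoded)
    if distortion <= max_distortion:
        # Compression is nearly lossless, so keep the default parameters.
        return "default", distortion
    # Baseline output: the tail run on the uncompressed intermediate representation.
    baseline = tail(intermediate, CANDIDATES["default"])
    # Pick the candidate whose output on the decoded features is closest to the baseline.
    best = min(
        CANDIDATES,
        key=lambda name: abs(tail(decoded, CANDIDATES[name]) - baseline),
    )
    return best, distortion


features = [0.2, 0.4, 0.9]
fine_choice, fine_distortion = select_parameter_set(
    features, quantize_roundtrip(features, bits=8)
)
coarse_choice, coarse_distortion = select_parameter_set(
    features, quantize_roundtrip(features, bits=1)
)
```

With 8-bit quantization the distortion stays under the assumed threshold and the default set is kept. With 1-bit quantization the distortion is large, so the candidate that best restores the baseline output is chosen instead. A fuller selection rule could also weigh the other factors listed in claim 4, such as swap cost, channel capacity, or delay.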

Assignees

  • Qualcomm Inc

Inventors

Classifications

  • Optimizing {the usage of the radio link}, e.g. header compression, information sizing {, discarding information (system modifying transmission characteristic according to link quality by modifying frame length H04L1/0007; dynamic adaptation of the packet size for flow control or congestion control H04L47/365)} · CPC title

  • H04L67/10 · Primary

    in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title

  • Task transfer initiation or dispatching · CPC title

  • Remote procedure calls [RPC]; Web services · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12505362B2 cover?
Systems, methods, and devices for split inference for a task network split between two communication devices. Model parameters for each of the two task network portions may be determined based on a number of factors including the encoding/decoding configuration used for communicating the intermediated representations across the network, and/or the model performance based on network conditions o…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/10. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Dec 23, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).