Parameter-Efficient Adapter for an Artificial Intelligence System

US2025156684A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025156684-A1
Application number: US-202418587052-A
Country: US
Kind code: A1
Filing date: Feb 26, 2024
Priority date: Nov 10, 2023
Publication date: May 15, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An adapter to a base model of an artificial intelligence (AI) system is disclosed. The adapter includes a connector to connect the adapter to the base model such that during an operation of the AI system at least some portion of data transformed by the base model is propagated from the base model to the adapter and back from the adapter to the base model. The adapter includes a non-linear modifier to modify the data received from the base model non-linearly before returning the modified portion of the data back to the base model, and an AI trainer to tune the non-linear modifier of the adapter by propagating training data through the base model and the adapter and updating weights of the non-linear modifier of the adapter for given weights of the base model to optimize a loss function. Further, weight matrices for the base model and the adapter are jointly constructed by an additional module, which efficiently uses a pool of parameters to allocate to save memory requirement for adaptation of the AI system.
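The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the single-layer base model, the tanh bottleneck, the residual addition used to return data to the base path, the toy targets, and the finite-difference gradient are all assumptions chosen to keep the example dependency-free. Names such as `forward`, `tune`, `down`, and `up` are hypothetical.

```python
import copy
import math
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def forward(x, base_w, down, up):
    h = matvec(base_w, x)                        # data transformed by the base model
    z = [math.tanh(a) for a in matvec(down, h)]  # adapter: project down, apply tanh
    delta = matvec(up, z)                        # adapter: project back up
    return [a + b for a, b in zip(h, delta)]     # assumed residual return to the base path

def loss(data, base_w, down, up):
    """Mean squared error over the training pairs."""
    total = 0.0
    for x, target in data:
        y = forward(x, base_w, down, up)
        total += sum((a - t) ** 2 for a, t in zip(y, target))
    return total / len(data)

def tune(data, base_w, down, up, steps=200, lr=0.05, eps=1e-5):
    """Gradient descent on the adapter matrices only; base_w is never written."""
    for _ in range(steps):
        updates = []
        for mat in (down, up):
            for i in range(len(mat)):
                for j in range(len(mat[0])):
                    old = mat[i][j]
                    mat[i][j] = old + eps
                    hi = loss(data, base_w, down, up)
                    mat[i][j] = old - eps
                    lo = loss(data, base_w, down, up)
                    mat[i][j] = old
                    # Central finite difference stands in for backpropagation.
                    updates.append((mat, i, j, (hi - lo) / (2 * eps)))
        for mat, i, j, grad in updates:
            mat[i][j] -= lr * grad

rng = random.Random(0)
DIM, RANK = 4, 2
base_w = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
down = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(RANK)]
up = [[0.0] * RANK for _ in range(DIM)]  # zero init: the adapter starts as a no-op

# Toy targets: base output plus a non-linear shift the adapter has to learn.
data = []
for _ in range(8):
    x = [rng.uniform(-1, 1) for _ in range(DIM)]
    h = matvec(base_w, x)
    data.append((x, [a + 0.3 * math.tanh(h[0] + h[1]) for a in h]))

frozen_snapshot = copy.deepcopy(base_w)
loss_before = loss(data, base_w, down, up)
tune(data, base_w, down, up)
loss_after = loss(data, base_w, down, up)
```

In this sketch the base matrix has 16 entries and the adapter also has 16, so it shows the data flow and the frozen-base training loop rather than any parameter saving; the saving only appears when the adapter is small relative to the base model.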

First claim

Opening claim text (preview).

What is claimed is:

1. An adapter to a base model of an artificial intelligence (AI) system, the adapter comprising: a connector configured to connect the adapter to the base model such that during an operation of the AI system at least some portion of data transformed by the base model is propagated from the base model to the adapter and back from the adapter to the base model; a non-linear modifier configured to modify the data received from the base model non-linearly before returning the modified portion of the data back to the base model; and an AI trainer configured to tune the non-linear modifier of the adapter by propagating training data through the base model and the adapter and updating weights of the non-linear modifier of the adapter for given weights of the base model to optimize a loss function.

2. The adapter of claim 1, wherein the non-linear modifier includes multiple paths formed by multiple AI architectures of data transformation, each of the paths is either a linear path configured to modify the received data linearly or a non-linear path configured to modify the received data non-linearly, wherein an AI architecture of the linear path modifies the received data linearly using one or multiple weight matrices, wherein an AI architecture of the non-linear path modifies the received data linearly using one or multiple weight matrices and modifies the received data non-linearly using one or multiple non-linear functions, and wherein the non-linear modifier includes at least one non-linear path.

3. The adapter of claim 2, wherein the non-linear modifier includes multiple non-linear paths using different non-linear functions, different arrangements of the same non-linear functions with respect to the weight matrices, or both.

4. The adapter of claim 3, wherein the non-linear modifier includes at least one linear path.

5. The adapter of claim 3, wherein the multiple non-linear paths include the same weight matrices.

6. The adapter of claim 3, wherein the multiple non-linear paths share at least some weights.

7. The adapter of claim 3, wherein weights in the weight matrices of the multiple non-linear paths come from a common pool of parameters, such that to tune the non-linear modifier, the AI trainer updates the common pool of parameters.

8. The adapter of claim 2, wherein the non-linear modifier comprises: a path splitter configured to direct the received data to each of the paths; and a path combiner configured to combine outputs of each of the paths to submit a combined output back to the base model.

9. The adapter of claim 8, wherein the path combiner combines the outputs using an operation including one or a combination of: an identity, a duplication, a permutation, a polynomial basis expansion, a Fourier basis expansion, an addition, a multiplication, a division, a subtraction, a modulo-addition, a modulo-product, a Kronecker product, a Kronecker sum, a Hadamard product, a concatenation, a log-sum-exp, an affine transform, a convolution, randomization, a normalization, a nonlinear activation operation, and variants thereof.

10. The adapter of claim 9, wherein the operation of the path combiner includes a parameter learned during the tuning of the AI trainer.

11. The adapter of claim 2, wherein the AI architecture of the non-linear path includes a bottleneck configuration of multiple layers.

12. The adapter of claim 1, wherein the AI trainer is further configured to approximate the base model and train the adapter and to achieve a common objective.

13. The adapter of claim 2, wherein the AI trainer further comprises a weight constructor comprising a pool of parameters and a set of hyperparameters forming rules of propagation of the parameters from the pool of parameters into the weight matrices of the multiple paths of the non-linear modifier, and wherein the weight constructor is configured to: update the pool of parameters and the set of hyperparameters for given weights of the base model; and propagate the parameters from the pool of parameters to different weight matrices of different paths according to the trained hyperparameters.

14. The adapter of claim 1, wherein the AI trainer updates weights of the adapter for frozen weights of the base model.

15. The adapter of claim 1, wherein weight matrices of the adapter have lower dimensions than weight matrices of the base model.

16. The adapter of claim 1, wherein weight matrices of the adapter are coming from a pool of parameters updated by the AI trainer during the tuning, and wherein a number of parameters in the pool of parameters is more than 1000 times less than a number of parameters of the base model.

17. A method for adapting a base model of an artificial intelligence (AI) system using an adapter, the method comprising: connecting, using a connector of the adapter, the adapter to the base model such that during an operation of the AI system at least some portion of data transformed by the base model is propagated from the base model to the adapter and back from the adapter to the base model; modifying, using a non-linear modifier of the adapter, the data received from the base model non-linearly before returning the modified portion of the data back to the base model; and tuning, using an AI trainer of the adapter, the non-linear modifier of the adapter by propagating training data through the base model and the adapter and updating weights of the non-linear modifier of the adapter for given weights of the base model to optimize a loss function.

18. The method of claim 17, wherein the non-linear modifier includes multiple paths formed by multiple AI architectures of data transformation, each of the paths is either a linear path configured to modify the received data linearly or a non-linear path configured to modify the received data non-linearly, wherein an AI architecture of the linear path modifies the received data linearly using one or multiple weight matrices, wherein an AI architecture of the non-linear path modifies the received data non-linearly using one or multiple weight matrices and modifies the received data non-linearly using one or multiple non-linear functions, and wherein the non-linear modifier includes at least one non-linear path.

19. The method of claim 18, wherein the AI trainer further comprises a weight constructor comprising a pool of parameters and a set of hyperparameters forming rules of propagation of the parameters from the pool of parameters into the weight matrices of the multiple paths of the non-linear modifier, and wherein the method further comprises: updating, using the weight constructor, the pool of parameters and the set of hyperparameters for given weights of the base model; and propagating, using the weight constructor, the parameters from the pool of parameters to different weight matrices of different paths according to the trained hyperparameters.

20. A non-transitory computer readable storage medium embodied thereon a program executable by a processor for performing a method, the method comprising: connecting an adapter to a base model of an artificial intelligence (AI) system such that during an operation of the AI system at least some portion of data transformed by the base model is propagated from the base model to the adapter and back from the adapter to the base model; modifying the data received from the base model non-linearly before returning the modified portion of the data back to the base model; and t
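The multi-path structure in the dependent claims (a path splitter, linear and non-linear paths, a path combiner, and weight matrices drawn from a common pool of parameters) might look roughly like the sketch below. Everything concrete here is an assumption for illustration: the `WeightConstructor` class, its `(start, stride)` indexing rule standing in for the claimed hyperparameters (fixed here, not trained), the choice of tanh and ReLU, and the scaled addition used as the combiner.

```python
import math
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def tanh(v):
    return [math.tanh(a) for a in v]

def relu(v):
    return [max(0.0, a) for a in v]

class WeightConstructor:
    """Hypothetical constructor that fills every matrix from one shared pool."""
    def __init__(self, pool_size, seed=0):
        rng = random.Random(seed)
        self.pool = [rng.uniform(-0.5, 0.5) for _ in range(pool_size)]

    def build(self, rows, cols, start, stride):
        # Assumed propagation rule: walk the pool cyclically from `start`
        # in steps of `stride`, one pool entry per matrix entry.
        n = len(self.pool)
        return [[self.pool[(start + stride * (r * cols + c)) % n]
                 for c in range(cols)] for r in range(rows)]

DIM, RANK, POOL_SIZE = 6, 2, 10
wc = WeightConstructor(POOL_SIZE)
lin_down, lin_up = wc.build(RANK, DIM, 0, 1), wc.build(DIM, RANK, 3, 1)
a_down, a_up = wc.build(RANK, DIM, 1, 3), wc.build(DIM, RANK, 2, 3)
b_down, b_up = wc.build(RANK, DIM, 4, 7), wc.build(DIM, RANK, 5, 7)
matrices = [lin_down, lin_up, a_down, a_up, b_down, b_up]

paths = [
    lambda h: matvec(lin_up, matvec(lin_down, h)),    # linear path
    lambda h: matvec(a_up, tanh(matvec(a_down, h))),  # non-linearity inside the bottleneck
    lambda h: relu(matvec(b_up, matvec(b_down, h))),  # non-linearity after the up-projection
]
scales = [1.0, 1.0, 1.0]  # combiner parameters; these could be learned during tuning

def modifier(h):
    outputs = [path(h) for path in paths]  # path splitter: every path sees the same h
    # Path combiner: scaled element-wise addition of the path outputs.
    return [sum(s * out[k] for s, out in zip(scales, outputs))
            for k in range(len(h))]

h = [0.1 * (k + 1) for k in range(DIM)]
modified = modifier(h)
total_entries = sum(len(m) * len(m[0]) for m in matrices)
```

Here 72 matrix entries are filled from a pool of 10 parameters, which is the memory saving the claims describe at a much larger ratio. Note that `build` copies values out of the pool, so in this sketch the matrices would need to be rebuilt after each update to the pool.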

Assignees

Inventors

Classifications

  • using neural networks · CPC title

  • Transfer learning · CPC title

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • Semantic analysis · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025156684A1 cover?
An adapter to a base model of an artificial intelligence (AI) system is disclosed. The adapter includes a connector to connect the adapter to the base model such that during an operation of the AI system at least some portion of data transformed by the base model is propagated from the base model to the adapter and back from the adapter to the base model. The adapter includes a non-linear modif…
Who is the assignee on this patent?
Mitsubishi Electric Res Laboratories Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/0455. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 15, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).