Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06V10/776. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 20, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Model interpretation method, image processing method, electronic device, and storage medium

US12530879B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12530879-B2
Application number	US-202318099551-A
Country	US
Kind code	B2
Filing date	Jan 20, 2023
Priority date	Sep 15, 2022
Publication date	Jan 20, 2026
Grant date	Jan 20, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a model interpretation method, an image processing method, an electronic device and a storage medium, relating to the field of artificial intelligence, in particular to the field of deep learning. The model interpretation method includes: obtaining a token vector corresponding to an image feature input to a first model; obtaining a model prediction result output by the first model; and determining, according to a combination of an attention weight and a gradient, an association relation between the token vector input to the first model and the model prediction result output by the first model, where the association relation is used to characterize interpretability of the first model.
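
The combination of attention weight and gradient described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes a flat list of attention weights, uses a toy linear score in place of a real model, takes the raw attention weights as the attention-based result, and approximates the integral gradient with a Riemann sum and finite differences from an all-zero baseline. All names (`integrated_gradients`, `combine`, `toy_score`, `HYPOTHETICAL_WEIGHTS`) are hypothetical.

```python
def integrated_gradients(score_fn, attention, steps=20, eps=1e-6):
    """Approximate integrated gradients of score_fn w.r.t. each attention weight.

    Assumptions: all-zero baseline, a Riemann sum over `steps` points on the
    straight path from the baseline to `attention`, central finite differences.
    """
    totals = [0.0] * len(attention)
    for k in range(1, steps + 1):
        # Point on the straight line between the zero baseline and the input.
        point = [a * k / steps for a in attention]
        for i in range(len(attention)):
            up = point[:i] + [point[i] + eps] + point[i + 1:]
            down = point[:i] + [point[i] - eps] + point[i + 1:]
            totals[i] += (score_fn(up) - score_fn(down)) / (2 * eps)
    # (input - baseline) times the average gradient along the path.
    return [a * t / steps for a, t in zip(attention, totals)]


def combine(first, second):
    """Element-wise ("point") multiplication of two interpretation results."""
    return [f * s for f, s in zip(first, second)]


# Toy stand-in for a model: a linear score over three attention weights.
HYPOTHETICAL_WEIGHTS = [1.0, -2.0, 4.0]


def toy_score(attention):
    return sum(w * a for w, a in zip(HYPOTHETICAL_WEIGHTS, attention))


attention = [0.5, 0.3, 0.2]                              # attention-based result
attn_grad = integrated_gradients(toy_score, attention)   # gradient-based result
relevance = combine(attention, attn_grad)                # per-token association
```

For a linear score the integrated gradients sum to the score itself, which gives a quick sanity check on the approximation before the two results are multiplied.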

First claim

Opening claim text (preview).

What is claimed is:

1. A model interpretation method, comprising: obtaining a token vector corresponding to an image feature input to a first model, wherein the token vector corresponding to the image feature is a token-level vector, in which an image is divided into fixed-size patches without overlapping, each patch of the fixed-size patches is pulled into a one-dimensional vector, and all one-dimensional vectors of the fixed-size patches are recorded as a Classification Token (CLS) sequence; obtaining a model prediction result output by the first model; performing, according to an attention weight, perception of model interpretation, to obtain a first interpretation result, comprising one of: adopting an estimation method based on a feature token, to obtain the first interpretation result, and adopting an estimation method based on an attention head, to obtain the first interpretation result; solving an integral gradient from the attention weight, to obtain a gradient of the attention weight; performing, according to the gradient of the attention weight, decision-making of the model interpretation, to obtain a second interpretation result; and performing point-multiplication on the first interpretation result and the second interpretation result, to obtain an association relation between the token vector input to the first model and the model prediction result output by the first model, wherein the association relation is used to characterize interpretability of the first model.

2. The method of claim 1, wherein in a case where the estimation method based on the feature token is adopted, performing, according to the attention weight, the perception of the model interpretation, to obtain the first interpretation result, comprises: weighting, for a self-attention module in the first model, the token vector with a first attention weight, to obtain an association relation based on the token vector, wherein the first attention weight is weights for different token vectors; and performing, according to the association relation based on the token vector, the perception of the model interpretation, to obtain the first interpretation result.

3. The method of claim 1, wherein in a case where the estimation method based on the attention head is adopted, performing, according to the attention weight, the perception of the model interpretation, to obtain the first interpretation result, comprises: weighting, for a self-attention module in the first model, the token vector with a second attention weight, to obtain an association relation based on the attention head, wherein the second attention weight is weights for different attention heads; and performing, according to the association relation based on the attention head, the perception of the model interpretation, to obtain the first interpretation result.

4. The method of claim 1, wherein the first model is a trained model, or a model to be trained.

5. An image processing method, comprising: inputting a token vector corresponding to an image feature to be processed to a first model, to execute an image processing including at least one of image classification, image recognition, or image segmentation, wherein the first model obtains an association relation between the token vector input to the first model and a model prediction result output by the first model, according to the model interpretation method of claim 1, and the association relation is used to characterize interpretability of the first model; and executing at least one of following processing by adopting the association relation: performing, according to the association relation, compensatory processing on the model prediction result output by the first model; performing, according to the association relation, reliability assessment processing on the first model; or performing, according to the association relation, traceability processing on the first model.

6. An electronic device, comprising: at least one processor; and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising: inputting a token vector corresponding to an image feature to be processed to a first model, to execute an image processing including at least one of image classification, image recognition, or image segmentation, wherein the first model obtains an association relation between the token vector input to the first model and a model prediction result output by the first model, according to the model interpretation method of claim 1, and the association relation is used to characterize interpretability of the first model; and executing at least one of following processing by adopting the association relation: performing, according to the association relation, compensatory processing on the model prediction result output by the first model; performing, according to the association relation, reliability assessment processing on the first model; or performing, according to the association relation, traceability processing on the first model.

7. A non-transitory computer-readable storage medium storing a computer instruction thereon, wherein the computer instruction is used to cause a computer to execute operations, comprising: inputting a token vector corresponding to an image feature to be processed to a first model, to execute an image processing including at least one of image classification, image recognition, or image segmentation, wherein the first model obtains an association relation between the token vector input to the first model and a model prediction result output by the first model, according to the model interpretation method of claim 1, and the association relation is used to characterize interpretability of the first model; and executing at least one of following processing by adopting the association relation: performing, according to the association relation, compensatory processing on the model prediction result output by the first model; performing, according to the association relation, reliability assessment processing on the first model; or performing, according to the association relation, traceability processing on the first model.

8. An electronic device, comprising: at least one processor; and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising: obtaining a token vector corresponding to an image feature input to a first model, wherein the token vector corresponding to the image feature is a token-level vector, in which an image is divided into fixed-size patches without overlapping, each patch of the fixed-size patches is pulled into a one-dimensional vector, and all one-dimensional vectors of the fixed-size patches are recorded as a Classification Token (CLS) sequence; obtaining a model prediction result output by the first model; performing, according to an attention weight, perception of model interpretation, to obtain a first interpretation result, by one of: adopting an estimation method based on a feature token, to obtain the first interpretation result, and adopting an estimation method based on an attention head, to obtain the first interpretation result; solving an integral gradient from the attention weight, to obtain a gradient of the attention weight; performing, according to the gradient of the attent
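
The tokenization step recited in claim 1 (an image divided into fixed-size, non-overlapping patches, each pulled into a one-dimensional vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the image is a single-channel nested list, its height and width are assumed to be exact multiples of the patch size, and the function name `image_to_patch_tokens` is hypothetical.

```python
def image_to_patch_tokens(image, patch):
    """Split a 2-D image into non-overlapping patch x patch blocks.

    Each block is flattened row by row into a one-dimensional list; the
    blocks are returned in row-major order (left to right, top to bottom).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return tokens


# A 4 x 4 toy image with pixel values 0..15, split into four 2 x 2 patches.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
tokens = image_to_patch_tokens(image, 2)
print(tokens)  # [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
```

In a real vision model each flattened patch would typically pass through a learned projection before reaching the attention layers; that step is outside what this page shows and is omitted here.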

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Classifications

G06V10/776 (Primary): Validation; Performance evaluation
G06N3/063: using electronic means
G06N3/08: Learning methods
G06V10/82 (Primary): using neural networks

Patent family

Related publications grouped by family.

View patent family 84302509
