Method, apparatus, and storage medium for encoding/decoding feature map for machine vision
US-2023336780-A1 · Oct 19, 2023 · US
US12556743B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12556743-B2 |
| Application number | US-202318302719-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 18, 2023 |
| Priority date | Apr 18, 2022 |
| Publication date | Feb 17, 2026 |
| Grant date | Feb 17, 2026 |
Official abstract text for this publication.
Disclosed herein are a method, an apparatus, and a storage medium for encoding/decoding a feature map for machine vision. In an embodiment, an encoding method includes generating a multi-channel feature map by aligning multiple feature maps, generating a converted feature map by performing conversion on the multi-channel feature map, and performing encoding on the converted feature map.
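The alignment step in the abstract can be illustrated with a minimal sketch. It assumes that "aligning" means scaling every input feature map to one common resolution (as claim 2 describes) and stacking all channels together. Nearest-neighbour interpolation, the function names `resize_nearest` and `align_feature_maps`, and the toy inputs `p2` and `p3` are hypothetical choices made for illustration; the publication does not specify them.

```python
from typing import List

Channel = List[List[float]]   # one H x W plane
FeatureMap = List[Channel]    # C planes of identical size


def resize_nearest(plane: Channel, out_h: int, out_w: int) -> Channel:
    """Nearest-neighbour resize of a single H x W plane (assumed interpolation)."""
    in_h, in_w = len(plane), len(plane[0])
    return [
        [plane[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def align_feature_maps(maps: List[FeatureMap], out_h: int, out_w: int) -> FeatureMap:
    """Scale every map to out_h x out_w and stack all channels into one map."""
    aligned: FeatureMap = []
    for fmap in maps:
        for plane in fmap:
            aligned.append(resize_nearest(plane, out_h, out_w))
    return aligned


# Two hypothetical pyramid levels: 1 channel at 4x4 and 2 channels at 2x2.
p2 = [[[float(4 * y + x) for x in range(4)] for y in range(4)]]
p3 = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]

# The result has 1 + 2 = 3 channels, each 2x2.
multi_channel = align_feature_maps([p2, p3], out_h=2, out_w=2)
```

The conversion and encoding steps would then operate on `multi_channel`; they are not modelled here.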
Opening claim text (preview).
What is claimed is:

1. An encoding method, comprising: generating a multi-channel feature map by aligning multiple feature maps; generating a converted feature map by performing conversion on the multi-channel feature map; and performing encoding on the converted feature map, wherein performing the encoding comprises: rearranging channels of the converted feature map based on a correlation between the channels of the converted feature map.
2. The encoding method of claim 1, wherein generating the multi-channel feature map comprises: scaling feature maps having different resolutions to an identical resolution.
3. The encoding method of claim 1, wherein generating the converted feature map comprises: extracting representative values for respective channels of the multi-channel feature map.
4. The encoding method of claim 3, wherein the representative values for respective channels are adjusted through a fully-connected layer.
5. The encoding method of claim 3, wherein the multi-channel feature map is adjusted based on the representative values for respective channels.
6. The encoding method of claim 1, wherein generating the converted feature map is performed based on a conversion network including one or more of a pooling layer, a fully-connected layer, a convolution layer, and an activation function layer.
7. An encoding method, comprising: generating a multi-channel feature map by aligning multiple feature maps; generating a converted feature map by performing conversion on the multi-channel feature map; and performing encoding on the converted feature map, wherein: the converted feature map includes a number of channels different from that of the multi-channel feature map, and generating the converted feature map comprises adjusting a quantization parameter based on a change in the number of channels in the converted feature map.
8. The encoding method of claim 1, wherein generating the converted feature map comprises: calculating importance levels for respective channels of the multi-channel feature map and converting a number of channels based on the importance levels for the respective channels.
9. The encoding method of claim 1, wherein the correlation between the channels is calculated based on an average value for each channel, a median value for each channel, or a Mean Squared Error (MSE) between the channels.
10. The encoding method of claim 1, wherein performing the encoding comprises: converting the converted feature map into a feature map of one frame.
11. The encoding method of claim 10, wherein performing the encoding further comprises: performing tile-based encoding on the converted feature map of one frame.
12. A decoding method, comprising: performing size reconstruction on a converted feature map; and generating a reconstructed feature map by reconstructing a number of channels in the converted feature map, wherein the reconstructed feature map includes feature maps having different resolutions or different numbers of channels, and wherein the channels of the converted feature map are rearranged based on a correlation between the channels of the converted feature map.
13. The decoding method of claim 12, wherein performing the size reconstruction comprises: differently configuring layers for reconstruction depending on a resolution of a feature map desired to be reconstructed.
14. The decoding method of claim 12, wherein the reconstructed feature map is generated based on a reconstruction network including one or more of a pooling layer, a fully-connected layer, a convolution layer, and an activation function layer.
15. The decoding method of claim 12, wherein performing the size reconstruction comprises: performing size reconstruction using size information of an original feature map.
16. The decoding method of claim 12, wherein the size reconstruction or the reconstruction of the number of channels is performed using residual-based reconstruction.
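The channel rearrangement of claims 1 and 9 and the single-frame conversion of claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: it uses the per-channel average (one of the measures listed in claim 9) as a proxy for correlation and simply sorts channels by it, then tiles the channels into one frame in raster order. The sort rule, the zero padding of unused tiles, and the names `rearrange_by_mean`, `pack_into_frame`, and `restore_order` are all hypothetical.

```python
from typing import List, Tuple

Channel = List[List[float]]   # one H x W plane


def channel_mean(plane: Channel) -> float:
    """Average value of one channel."""
    total = sum(sum(row) for row in plane)
    return total / (len(plane) * len(plane[0]))


def rearrange_by_mean(channels: List[Channel]) -> Tuple[List[Channel], List[int]]:
    """Sort channels by mean (assumed rule); also return the permutation used."""
    order = sorted(range(len(channels)), key=lambda i: channel_mean(channels[i]))
    return [channels[i] for i in order], order


def pack_into_frame(channels: List[Channel], cols: int) -> Channel:
    """Tile C planes of size H x W into one frame, `cols` planes per tile row."""
    h, w = len(channels[0]), len(channels[0][0])
    rows = (len(channels) + cols - 1) // cols
    frame = [[0.0] * (cols * w) for _ in range(rows * h)]  # unused tiles stay 0
    for idx, plane in enumerate(channels):
        top, left = (idx // cols) * h, (idx % cols) * w
        for y in range(h):
            frame[top + y][left:left + w] = plane[y]
    return frame


def restore_order(channels: List[Channel], order: List[int]) -> List[Channel]:
    """Decoder-side inverse of rearrange_by_mean, given the same permutation."""
    restored: List[Channel] = [[] for _ in order]
    for pos, src in enumerate(order):
        restored[src] = channels[pos]
    return restored


# Three hypothetical 1x2 channels with means 9, 1, and 5.
original = [[[9.0, 9.0]], [[1.0, 1.0]], [[5.0, 5.0]]]
sorted_channels, order = rearrange_by_mean(original)
frame = pack_into_frame(sorted_channels, cols=2)
recovered = restore_order(sorted_channels, order)
```

Placing channels with similar statistics next to each other is intended to make the packed frame smoother for a conventional video encoder. In this sketch the decoder needs the permutation `order` to undo the rearrangement; how such information would be conveyed is outside what the example models.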
using hierarchical techniques, e.g. scalability (H04N19/63 takes precedence) · CPC title
the unit being a colour or a chrominance component · CPC title
Selection of coding mode or of prediction mode · CPC title
the region being a slice, e.g. a line of blocks or a group of blocks · CPC title
the region being a picture, frame or field · CPC title