AI-based image region recognition method and apparatus and AI-based model training method and apparatus
US-2021366123-A1 · Nov 25, 2021 · US
US12506891B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12506891-B2 |
| Application number | US-202318339783-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 22, 2023 |
| Priority date | Dec 24, 2020 |
| Publication date | Dec 23, 2025 |
| Grant date | Dec 23, 2025 |
Official abstract text for this publication.
The present disclosure relates to methods and apparatuses for decoding data for picture or video processing from a bitstream. Two or more sets of segmentation information elements are obtained from the bitstream. Then, each of the two or more sets of segmentation information elements is input into a respective one of two or more segmentation information processing layers out of a plurality of cascaded layers. In each of the two or more segmentation information processing layers, the respective set of segmentation information is processed. The decoded data for picture or video processing are obtained based on the segmentation information processed by the plurality of cascaded layers. Accordingly, the data may be decoded from the bitstream efficiently within the layered structure.
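The routing described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented decoder: the names `decode`, `merge_segmentation`, `passthrough`, and `sets_by_layer` are hypothetical, and the "processing" done by a segmentation layer (appending its own elements to the running state) is a placeholder chosen only to show which layer receives which set.

```python
from typing import Callable, Dict, List, Optional, Sequence

Elements = List[int]
# A layer receives the running state plus, optionally, a set of elements
# that was obtained from the bitstream specifically for that layer.
Layer = Callable[[Elements, Optional[Elements]], Elements]


def passthrough(state: Elements, extra: Optional[Elements]) -> Elements:
    """Hypothetical layer that takes no segmentation input of its own."""
    return list(state)


def merge_segmentation(state: Elements, extra: Optional[Elements]) -> Elements:
    """Hypothetical segmentation layer: placeholder processing that appends
    the layer's own set of elements to the running state."""
    return list(state) + list(extra or [])


def decode(layers: Sequence[Layer], sets_by_layer: Dict[int, Elements]) -> Elements:
    """Run the cascade in order, handing each layer the set addressed to it."""
    state: Elements = []
    for index, layer in enumerate(layers):
        state = layer(state, sets_by_layer.get(index))
    return state
```

For example, with three cascaded layers of which the first and third are segmentation layers, `decode([merge_segmentation, passthrough, merge_segmentation], {0: [1, 0], 2: [1, 1, 0]})` feeds one set to layer 0 and another to layer 2, while layer 1 only forwards the state.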
Opening claim text (preview).
What is claimed is:
1. A method for decoding data for picture or video processing from a bitstream, the method comprising: obtaining, from the bitstream, two or more sets of segmentation information elements, wherein at least one set of segmentation information elements of the two or more sets of segmentation information elements are represented by a set of binary flags; inputting each of the two or more sets of segmentation information elements respectively into two or more segmentation information processing layers out of a plurality of cascaded layers; processing, in each of the two or more segmentation information processing layers, the respective sets of segmentation information, wherein the segmentation information processed respectively in the two or more segmentation information processing layers differ in resolution, wherein the processing of the segmentation information in the two or more segmentation information processing layers includes upsampling, wherein obtaining the decoded data for picture or video processing is based on the segmentation information processed by the plurality of cascaded layers for motion estimation, and wherein for each segmentation information processing layer j of the plurality of N segmentation information processing layers out of the plurality of cascaded layers: the inputting comprises inputting initial segmentation information from the bitstream if j=1, and otherwise inputting segmentation information processed by the (j−1)-th segmentation information processing layer; and outputting the processed segmentation information, wherein the processed segmentation information comprises an upsampled set of binary segmentation flags corresponding to the at least one set of segmentation information elements.
2. The method according to claim 1, wherein the obtaining of the sets of segmentation information elements is based on segmentation information processed by at least one segmentation information processing layer out of the plurality of cascaded layers.
3. The method according to claim 1, wherein the inputting of the sets of segmentation information elements is based on the processed segmentation information outputted by at least one of the plurality of cascaded layers.
4. The method according to claim 1, wherein said upsampling of the segmentation information comprises a nearest neighbor upsampling.
5. The method according to claim 1, wherein said upsampling of the segmentation information comprises a transposed convolution.
6. The method according to claim 1, wherein the processing of the inputted segmentation information by each layer j<N of the plurality of N segmentation information processing layers further comprises: parsing, from the bitstream, a segmentation information element and associating the parsed segmentation information element with the segmentation information outputted by a preceding layer, wherein the position of the parsed segmentation information element in the associated segmentation information is determined based on the segmentation information outputted by the preceding layer.
7. The method according to claim 6, wherein the amount of segmentation information elements parsed from the bitstream is determined based on segmentation information outputted by the preceding layer.
8. The method according to claim 1, wherein obtaining decoded data for picture or video processing comprises determining of at least one of: intra- or inter-picture prediction mode; picture reference index; single-reference or multiple-reference prediction (including bi-prediction); presence or absence of prediction residual information; quantization step size; motion information prediction type; length of the motion vector; motion vector resolution; motion vector prediction index; motion vector difference size; motion vector difference resolution; motion interpolation filter; in-loop filter parameters; post-filter parameters; based on segmentation information.
9. The method according to claim 1, further comprising: obtaining, from the bitstream, sets of feature map elements and inputting the sets of feature map elements respectively into a feature map processing layer out of the plurality of layers based on the segmentation information processed by a segmentation information processing layer; and obtaining the decoded data for picture or video processing based on a feature map processed by the plurality of cascaded layers.
10. The method according to claim 9, wherein at least one out of the plurality of cascaded layers is a segmentation information processing layer and a feature map processing layer.
11. The method according to claim 9, wherein each layer out of the plurality of layers is either a segmentation information processing layer or a feature map processing layer.
12. A computer program product stored on a non-transitory medium, which when executed on one or more processors performs the method according to claim 1.
13. A device for decoding an image or video including a processing circuitry which is configured to perform the method according to claim 1.
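One way to read claims 1, 4, 6, and 7 together is sketched below in Python. This is a hedged illustration, not the claimed method itself: the function names are hypothetical, the upsampling factor of 2 is an assumption, and so is the interpretation that a flag value of 1 marks a position to be refined by a newly parsed flag while positions marked 0 stay 0. Under that assumption, both the positions of the parsed flags and the number of flags read from the bitstream follow from the preceding layer's output.

```python
from typing import Iterator, List, Tuple

Grid = List[List[int]]


def upsample_nearest(flags: Grid, factor: int = 2) -> Grid:
    """Nearest-neighbor upsampling: repeat each flag `factor` times on both axes."""
    out: Grid = []
    for row in flags:
        wide = [flag for flag in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def refine_layer(prev: Grid, bits: Iterator[int]) -> Tuple[Grid, int]:
    """One segmentation layer (assumed behavior): upsample the preceding
    layer's flags, then read one new flag from `bits` for every position
    whose upsampled flag is 1. Returns the new grid and the number parsed."""
    up = upsample_nearest(prev)
    parsed = 0
    out: Grid = []
    for row in up:
        new_row = []
        for flag in row:
            if flag == 1:
                new_row.append(next(bits))  # position chosen by preceding output
                parsed += 1
            else:
                new_row.append(0)  # nothing is parsed for this position
        out.append(new_row)
    return out, parsed


def decode_flags(initial: Grid, bits: Iterator[int], num_layers: int) -> List[Grid]:
    """Layer 1 takes the initial flags; layer j > 1 takes the output of layer j-1."""
    outputs: List[Grid] = []
    current = initial
    for _ in range(num_layers):
        current, _count = refine_layer(current, bits)
        outputs.append(current)
    return outputs
```

Each layer's output has twice the width and height of its input, so the grids returned by `decode_flags` differ in resolution from layer to layer. Passing the bitstream as an iterator keeps the read position shared across layers, which is a design choice of this sketch rather than something the claims specify.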
characterised by syntax aspects related to video coding, e.g. related to compression standards · CPC title
involving filtering within a prediction loop · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Combinations of networks · CPC title
Filters, e.g. for pre-processing or post-processing (sub-band filter banks H04N19/635) · CPC title