Labeling objects in image scenes
US-9396546-B2 · Jul 19, 2016 · US
US9740956B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9740956-B2 |
| Application number | US-201615084405-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 29, 2016 |
| Priority date | Jun 29, 2015 |
| Publication date | Aug 22, 2017 |
| Grant date | Aug 22, 2017 |
Official abstract text for this publication.
The present invention provides a method for object segmentation in videos tagged with semantic labels, including: detecting each frame of a video sequence with an object bounding box detector from a given semantic category and an object contour detector, and obtaining a candidate object bounding box set and a candidate object contour set for each frame of the input video; building a joint assignment model for the candidate object bounding box set and the candidate object contour set and solving the model to obtain the initial object segment sequence; processing the initial object segment sequence to estimate a probability distribution of the object shapes; and optimizing the initial object segment sequence with a variant of the graph cut algorithm that integrates the shape probability distribution, to obtain an optimal segment sequence.
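As a rough illustration of the shape-probability step, the sketch below estimates a per-pixel foreground probability by averaging the binary masks of an initial segment sequence, then converts it into negative log-likelihood costs of the kind a graph-cut unary term could consume. This is an assumption made for illustration only: the abstract does not say how the distribution is estimated, real masks would need to be aligned across frames first, and the names `shape_probability` and `unary_costs` are hypothetical.

```python
import math


def shape_probability(masks):
    """Per-pixel foreground frequency over equally sized binary masks.

    Assumption: the masks are already aligned to a common frame of reference.
    """
    if not masks:
        raise ValueError("need at least one mask")
    height, width = len(masks[0]), len(masks[0][0])
    counts = [[0] * width for _ in range(height)]
    for mask in masks:
        for y in range(height):
            for x in range(width):
                counts[y][x] += mask[y][x]
    n = len(masks)
    return [[c / n for c in row] for row in counts]


def unary_costs(prob, eps=1e-6):
    """Map each probability p to (foreground_cost, background_cost).

    Costs are negative log-likelihoods; p is clamped to avoid log(0).
    """
    costs = []
    for row in prob:
        cost_row = []
        for p in row:
            p = min(max(p, eps), 1.0 - eps)
            cost_row.append((-math.log(p), -math.log(1.0 - p)))
        costs.append(cost_row)
    return costs


# Four tiny 2x2 masks standing in for an initial segment sequence.
masks = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 1]],
]
prob = shape_probability(masks)
costs = unary_costs(prob)
```

A pixel that is foreground in every mask ends up cheap to label as foreground and expensive to label as background, which is how a shape prior could bias the final cut.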
Opening claim text (preview).
What is claimed is:

1. A method for object segmentation in videos tagged with semantic labels, comprising:
detecting each frame of a video sequence with an object bounding box detector from a given semantic category and an object contour detector according to a semantic category label to which an object belongs, and obtaining a candidate object bounding box set and a candidate object contour set for each frame of the input video;
building a joint assignment model for the candidate object bounding box set and the candidate object contour set, and solving the model to obtain the initial object segment sequence, wherein the initial segmenting sequence is at least one sequence containing the object;
processing the initial object segment sequence, to estimate a probability distribution of the object shapes in the input video; and
optimizing the initial object segment sequence with a variant of graph cut algorithm that integrates the shape probability distribution of the object, to obtain an optimal segment sequence.

2. The method for object segmentation in videos tagged with semantic labels according to claim 1, wherein the detecting each frame of the video sequence with the object bounding box detector from a given semantic category and the object contour detector, and obtaining the candidate object bounding box set and the candidate object contour set for each frame of the input video, particularly comprises:
detecting each frame of the input video with the object bounding box detector on at least two thresholds according to the semantic category label to which the object belongs, to calculate a comprehensive performance value of the detection result corresponding to the at least two thresholds, and take a threshold corresponding to a maximum comprehensive performance value selected from the comprehensive performance values as an optimal threshold of the object bounding box detector;
detecting each frame of the input video with the object bounding box detector on the optimal threshold according to the semantic category label to which the object belongs, to obtain an object bounding box set for each frame of the input video, wherein a final object bounding box set is a union of the object bounding box set for each frame of the input video and a dummy bounding box;
detecting each frame of the input video with an object contour detector based on constrained parametric min-cuts (CPMC) after obtaining the candidate object bounding box set for each frame of the input video, to obtain a candidate object contour set for each frame of the input video.

3. The method for object segmentation in videos tagged with semantic labels according to claim 1, wherein the building the joint assignment model for the candidate object bounding box set and the candidate object contour set, and solving the model to obtain the initial object segment sequence, particularly comprises:
building a first optimization objective function which targets at obtaining the initial object segment sequence corresponding to the object as the optimization object, by setting an allocated 0-1 variable set indicating the candidate object bounding box set and the candidate object contour set respectively;
converting the problem of solving the initial segmenting sequence corresponding to the object into a problem of solving minimal cost-maximal flow of a network flow, by indicating a combination of the candidate object bounding box set and the candidate object contour set with a network flow node;
solving $K_{\max}$ initial sequences satisfying the problem of minimal cost-maximal flow;
re-selecting an object contour of the first $K$ initial sequences in the $K_{\max}$ initial sequences respectively with a K shortest path algorithm, to obtain $K$ candidate sequence sets;
optimizing a selecting status of each candidate sequence with 0-1 variable for each candidate sequence in the $K$ candidate sequence sets through 0-1 quadratic programming;
solving the problem of 0-1 quadratic programming with an optimizer, to obtain the initial segmenting sequence corresponding to the object.

4. The method for object segmentation in videos tagged with semantic labels according to claim 3, wherein the building the first optimization objective function which targets at obtaining the initial object segment sequence corresponding to the object as the optimization object, by setting the allocated 0-1 variable set indicating the candidate object bounding box set and the candidate object contour set respectively, particularly comprises:
setting a set $A=\{a_D^k \mid \forall k, t, D \in \mathcal{D}_t\}$ for the candidate object bounding box set, wherein $\mathcal{D}_t$ indicates a candidate object bounding box set of the t-th frame of the input video, $a_D^k \in \{0,1\}$, when $a_D^k$ takes a value of 1, it means the bounding box $D$ is assigned to the k-th sequence, and when $a_D^k$ takes a value of 0, it means the bounding box $D$ is not assigned to the k-th sequence;
setting a set $B=\{b_S^k \mid \forall k, t, S \in \mathcal{S}_t\}$ for the candidate object contour set, wherein $\mathcal{S}_t$ indicates a candidate object contour set of the t-th frame of the input video, $b_S^k \in \{0,1\}$, when $b_S^k$ takes a value of 1, it means the contour $S$ is assigned to the k-th sequence, and when $b_S^k$ takes a value of 0, it means the contour $S$ is not assigned to the k-th sequence;
building a first optimization objective function taking the initial segment sequence corresponding to the object as the optimization object, $\min_{A,B} \{ L(A,B) + \lambda_1 \Omega_1(A,B) + \lambda_2 \Omega_2(B) \}$, by taking the set $A$ and the set $B$ as variables, constraint conditions are: { $a_D^k$,
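The network-flow step of claim 3 and the 0-1 indicator sets of claim 4 can be illustrated with a much-reduced sketch. It assumes a single sequence (k = 1), for which pushing one unit of flow through a frame-by-frame layered graph amounts to a minimum-cost path, found here by dynamic programming. Each node is one (bounding box, contour) combination. The cost functions, candidate tuples, and the name `best_sequence` are hypothetical stand-ins, not the patent's objective terms, and the dummy bounding box, the K shortest path re-selection, and the 0-1 quadratic program are left out.

```python
def best_sequence(frames, unary, pairwise):
    """Pick one candidate per frame so the summed cost is minimal.

    frames:   list (one entry per frame) of candidate lists
    unary:    cost of selecting a candidate on its own
    pairwise: cost of linking a candidate to the one chosen in the next frame
    Returns (total_cost, list of chosen candidate indices).
    """
    if not frames or any(not frame for frame in frames):
        raise ValueError("every frame needs at least one candidate")
    cost = [unary(c) for c in frames[0]]
    back = []  # back[t-1][j] = best predecessor of candidate j in frame t
    for t in range(1, len(frames)):
        new_cost, pointers = [], []
        for cand in frames[t]:
            step = [cost[i] + pairwise(prev, cand)
                    for i, prev in enumerate(frames[t - 1])]
            best_prev = min(range(len(step)), key=step.__getitem__)
            new_cost.append(step[best_prev] + unary(cand))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)
    last = min(range(len(cost)), key=cost.__getitem__)
    total = cost[last]
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return total, path


# Hypothetical candidates: (box_id, contour_id, detector_score, box_center_x)
frames = [
    [("D1", "S1", 0.90, 10.0), ("D2", "S2", 0.60, 50.0)],
    [("D3", "S3", 0.50, 12.0), ("D4", "S4", 0.95, 52.0)],
    [("D5", "S5", 0.80, 11.0), ("D6", "S6", 0.70, 49.0)],
]


def unary(cand):
    # Assumed cost: low detector score means high cost.
    return 1.0 - cand[2]


def pairwise(prev, cand):
    # Assumed cost: penalize large jumps of the box center between frames.
    return 0.1 * abs(prev[3] - cand[3])


total, path = best_sequence(frames, unary, pairwise)

# Express the result as 0-1 indicators in the style of the sets A and B.
k = 1
A = {(k, t, cand[0]): int(i == path[t])
     for t, frame in enumerate(frames) for i, cand in enumerate(frame)}
B = {(k, t, cand[1]): int(i == path[t])
     for t, frame in enumerate(frames) for i, cand in enumerate(frame)}
```

In this toy input the temporally smooth track through D1, D3, D5 wins even though D4 has the highest single-frame score, because switching tracks would incur a large linking cost.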
Edge-based segmentation · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features (colour feature extraction G06V10/56) · CPC title
Physics · mapped topic