Method for object segmentation in videos tagged with semantic labels

US9740956B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9740956-B2
Application number: US-201615084405-A
Country: US
Kind code: B2
Filing date: Mar 29, 2016
Priority date: Jun 29, 2015
Publication date: Aug 22, 2017
Grant date: Aug 22, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention provides a method for object segmentation in videos tagged with semantic labels, including: detecting each frame of a video sequence with an object bounding box detector from a given semantic category and an object contour detector, and obtaining a candidate object bounding box set and a candidate object contour set for each frame of the input video; building a joint assignment model for the candidate object bounding box set and the candidate object contour set and solving the model to obtain the initial object segment sequence; processing the initial object segment, to estimate a probability distribution of the object shapes; and optimizing the initial object segment sequence with a variant of graph cut algorithm that integrates the shape probability distribution, to obtain an optimal segment sequence.
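The last two steps of the abstract (estimating a shape probability distribution, then feeding it into a graph-cut variant) can be illustrated with a minimal sketch. The abstract does not say how the distribution is estimated, so the per-pixel foreground frequency used here is an assumption, as are the function names `shape_probability` and `unary_costs` and the premise that the masks are already aligned to a common size. Only the unary (data) term that a graph cut would consume is shown; the cut itself is not implemented.

```python
import math


def shape_probability(masks):
    """Per-pixel foreground frequency over equally sized binary masks.

    Assumption: masks come from the initial segment sequence and have
    already been aligned and resized to a common grid.
    """
    if not masks:
        raise ValueError("need at least one mask")
    height, width = len(masks[0]), len(masks[0][0])
    count = len(masks)
    return [
        [sum(mask[y][x] for mask in masks) / count for x in range(width)]
        for y in range(height)
    ]


def unary_costs(prob, eps=1e-6):
    """Turn shape probabilities into (foreground, background) costs.

    Each cost is a negative log-likelihood, the usual form of a data
    term in graph-cut segmentation. Probabilities are clamped to
    [eps, 1 - eps] so the logarithm stays finite.
    """
    costs = []
    for row in prob:
        cost_row = []
        for p in row:
            p = min(max(p, eps), 1.0 - eps)
            cost_row.append((-math.log(p), -math.log(1.0 - p)))
        costs.append(cost_row)
    return costs


# Two toy 2x3 masks standing in for an initial segment sequence.
masks = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
]
prob = shape_probability(masks)
costs = unary_costs(prob)
```

A pixel that is foreground in every mask gets a near-zero foreground cost and a large background cost, so a cut that labels it background is penalized.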

First claim

Opening claim text (preview).

What is claimed is:

1. A method for object segmentation in videos tagged with semantic labels, comprising: detecting each frame of a video sequence with an object bounding box detector from a given semantic category and an object contour detector according to a semantic category label to which an object belongs, and obtaining a candidate object bounding box set and a candidate object contour set for each frame of the input video; building a joint assignment model for the candidate object bounding box set and the candidate object contour set, and solving the model to obtain the initial object segment sequence, wherein the initial segmenting sequence is at least one sequence containing the object; processing the initial object segment sequence, to estimate a probability distribution of the object shapes in the input video; and optimizing the initial object segment sequence with a variant of graph cut algorithm that integrates the shape probability distribution of the object, to obtain an optimal segment sequence.

2. The method for object segmentation in videos tagged with semantic labels according to claim 1, wherein the detecting each frame of the video sequence with the object bounding box detector from a given semantic category and the object contour detector, and obtaining the candidate object bounding box set and the candidate object contour set for each frame of the input video, particularly comprises: detecting each frame of the input video with the object bounding box detector on at least two thresholds according to the semantic category label to which the object belongs, to calculate a comprehensive performance value of the detection result corresponding to the at least two thresholds, and take a threshold corresponding to a maximum comprehensive performance value selected from the comprehensive performance values as an optimal threshold of the object bounding box detector; detecting each frame of the input video with the object bounding box detector on the optimal threshold according to the semantic category label to which the object belongs, to obtain an object bounding box set for each frame of the input video, wherein, a final object bounding box set is a union of the object bounding box set for each frame of the input video and a dummy bounding box; detecting each frame of the input video with an object contour detector based on constrained parametric min-cuts (CPMC) after obtaining the candidate object bounding box set for each frame of the input video, to obtain a candidate object contour set for each frame of the input video.

3. The method for object segmentation in videos tagged with semantic labels according to claim 1, wherein the building the joint assignment model for the candidate object bounding box set and the candidate object contour set, and solving the model to obtain the initial object segment sequence, particularly comprises: building a first optimization objective function which targets at obtaining the initial object segment sequence corresponding to the object as the optimization object, by setting an allocated 0-1 variable set indicating the candidate object bounding box set and the candidate object contour set respectively; converting the problem of solving the initial segmenting sequence corresponding to the object into a problem of solving minimal cost-maximal flow of a network flow, by indicating a combination of the candidate object bounding box set and the candidate object contour set with a network flow node; solving $K_{\max}$ initial sequences satisfying the problem of minimal cost-maximal flow; re-selecting an object contour of the first K initial sequences in the $K_{\max}$ initial sequences respectively with a K shortest path algorithm, to obtain K candidate sequence sets; optimizing a selecting status of each candidate sequence with 0-1 variable for each candidate sequence in the K candidate sequence sets through 0-1 quadratic programming; solving the problem of 0-1 quadratic programming with an optimizer, to obtain the initial segmenting sequence corresponding to the object.

4. The method for object segmentation in videos tagged with semantic labels according to claim 3, wherein the building the first optimization objective function which targets at obtaining the initial object segment sequence corresponding to the object as the optimization object, by setting the allocated 0-1 variable set indicating the candidate object bounding box set and the candidate object contour set respectively, particularly comprises: setting a set $A=\{a_D^k \mid \forall k,t,\ D\in\mathcal{D}_t\}$ for the candidate object bounding box set, wherein $\mathcal{D}_t$ indicates a candidate object bounding box set of the t-th frame of the input video, $a_D^k\in\{0,1\}$, when $a_D^k$ takes a value of 1, it means the bounding box D is assigned to the k-th sequence, and when $a_D^k$ takes a value of 0, it means the bounding box D is not assigned to the k-th sequence; setting a set $B=\{b_S^k \mid \forall k,t,\ S\in\mathcal{S}_t\}$ for the candidate object contour set, wherein, $\mathcal{S}_t$ indicates a candidate object contour set of the t-th frame of the input video, $b_S^k\in\{0,1\}$, when $b_S^k$ takes a value of 1, it means the contour S is assigned to the k-th sequence, and when $b_S^k$ takes a value of 0, it means the contour S is not assigned to the k-th sequence; building a first optimization objective function taking the initial segment sequence corresponding to the object as the optimization object,

$$\min_{A,B}\ \{\, L(A,B) + \lambda_1 \Omega_1(A,B) + \lambda_2 \Omega_2(B) \,\},$$

by taking the set A and the set B as variables, constraint conditions are: { $a_D^k$, …
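The threshold selection in claim 2 (run the detector at two or more thresholds, score each result, keep the best-scoring threshold) can be sketched as follows. The claim preview does not define the "comprehensive performance value", so the F-measure used here is an assumption, and `evaluate` is a hypothetical callback that returns a (precision, recall) pair for one threshold; how those numbers would be obtained is not stated on this page.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    Assumption: stands in for the claim's undefined
    "comprehensive performance value".
    """
    if precision + recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def select_optimal_threshold(thresholds, evaluate):
    """Return (best_threshold, best_score) over the candidate thresholds.

    evaluate(threshold) is a hypothetical callback that runs the bounding
    box detector at that threshold and returns (precision, recall).
    """
    if len(thresholds) < 2:
        raise ValueError("the claim requires at least two thresholds")
    scored = [(f_measure(*evaluate(t)), t) for t in thresholds]
    best_score, best_threshold = max(scored)  # ties go to the higher threshold
    return best_threshold, best_score


# Toy detection results keyed by threshold (made-up numbers for illustration).
toy_results = {0.2: (0.5, 0.9), 0.5: (0.8, 0.8), 0.8: (0.95, 0.4)}
best_threshold, best_score = select_optimal_threshold(
    sorted(toy_results), lambda t: toy_results[t]
)
```

With the toy numbers above, the middle threshold wins because it balances precision and recall; the detector would then be rerun at that threshold to produce the per-frame bounding box sets.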

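Claim 3 casts sequence selection as a minimal cost-maximal flow problem whose nodes are combinations of a candidate bounding box and a candidate contour. As a simplified illustration, when only one sequence is extracted and every frame contributes exactly one node, a unit of flow through the layered graph is a shortest path, which dynamic programming finds directly. The sketch below covers only that special case; the multi-sequence model, the dummy bounding box, the K shortest path step, and the 0-1 quadratic programming step are not modeled. `node_costs` and `transition_cost` are hypothetical inputs.

```python
def best_sequence(node_costs, transition_cost):
    """Pick one candidate per frame with minimum total cost.

    node_costs[t][i]: cost of candidate i (a box/contour combination)
        in frame t. Hypothetical values; the claim preview does not
        give the cost terms.
    transition_cost(t, i, j): cost of linking candidate i in frame t
        to candidate j in frame t + 1.
    Returns (path, total_cost), where path[t] is the chosen index.
    """
    if not node_costs or any(not frame for frame in node_costs):
        raise ValueError("every frame needs at least one candidate")

    total = list(node_costs[0])  # best cost of a path ending at each node
    back_pointers = []
    for t in range(1, len(node_costs)):
        new_total, pointers = [], []
        for j, cost_j in enumerate(node_costs[t]):
            # Best predecessor in the previous frame for candidate j.
            i_best = min(
                range(len(total)),
                key=lambda i: total[i] + transition_cost(t - 1, i, j),
            )
            new_total.append(
                total[i_best] + transition_cost(t - 1, i_best, j) + cost_j
            )
            pointers.append(i_best)
        total = new_total
        back_pointers.append(pointers)

    # Trace the cheapest path backwards from the last frame.
    j = min(range(len(total)), key=total.__getitem__)
    best_cost = total[j]
    path = [j]
    for pointers in reversed(back_pointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, best_cost


# Three frames, two candidates each; switching candidates costs 1.0.
toy_costs = [[1.0, 3.0], [2.0, 0.5], [1.0, 1.0]]
path, cost = best_sequence(toy_costs, lambda t, i, j: 0.0 if i == j else 1.0)
```

In the toy run the path starts on the cheap first candidate, pays one switch, and stays on the second candidate. Extracting several disjoint sequences at once is where a true min-cost flow solver becomes necessary.
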
Assignees

Univ Beihang

Inventors

Classifications

  • G06T7/12 · Primary

    Edge-based segmentation · CPC title

  • Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title

  • Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features (colour feature extraction G06V10/56) · CPC title

  • Physics · mapped topic

  • G06K9/4604 · Primary

    Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9740956B2 cover?
The present invention provides a method for object segmentation in videos tagged with semantic labels, including: detecting each frame of a video sequence with an object bounding box detector from a given semantic category and an object contour detector, and obtaining a candidate object bounding box set and a candidate object contour set for each frame of the input video; building a joint assig…
Who is the assignee on this patent?
Univ Beihang
What technology area does this patent fall under?
Primary CPC classification G06T7/12. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 22, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).