Tree structured CRF with unary potential function using action unit features of other segments as context feature

US10445582B2 · US · B2

Patent metadata
Publication number: US-10445582-B2
Application number: US-201615385291-A
Country: US
Kind code: B2
Filing date: Dec 20, 2016
Priority date: Dec 20, 2016
Publication date: Oct 15, 2019
Grant date: Oct 15, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of determining a composite action from a video clip, using a conditional random field (CRF), the method includes determining a plurality of features from the video clip, each of the features having a corresponding temporal segment from the video clip. The method may continue by determining, for each of the temporal segments corresponding to one of the features, an initial estimate of an action unit label from a corresponding unary potential function, the corresponding unary potential function having as ordered input the plurality of features from a current temporal segment and at least one other of the temporal segments. The method may further include determining the composite action by jointly optimizing the initial estimate of the action unit labels.
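The unary potential described in the abstract takes features from the current temporal segment together with features from at least one other segment. A minimal sketch of that idea follows, assuming a log-linear potential, a fixed window of neighbouring segments, and zero padding at the clip boundaries. None of these choices is stated on this page, and the names `context_features`, `unary_scores`, `initial_estimates`, and `offsets` are hypothetical.

```python
import numpy as np

def context_features(segment_features, offsets=(-1, 0, 1)):
    """Concatenate each segment's feature with its neighbours' features.

    segment_features: array of shape (T, D), one feature vector per segment.
    offsets: relative segment positions to include, in a fixed order.
    Neighbours outside the clip are zero-padded (an assumption).
    """
    T, D = segment_features.shape
    out = np.zeros((T, D * len(offsets)))
    for t in range(T):
        for k, off in enumerate(offsets):
            j = t + off
            if 0 <= j < T:
                out[t, k * D:(k + 1) * D] = segment_features[j]
    return out

def unary_scores(segment_features, weights, offsets=(-1, 0, 1)):
    """Log-linear unary potentials: one score per (segment, action unit label).

    weights: array of shape (num_labels, D * len(offsets)), assumed learned elsewhere.
    """
    return context_features(segment_features, offsets) @ weights.T

def initial_estimates(scores):
    """Initial action unit label per segment: the highest-scoring label."""
    return scores.argmax(axis=1)

# Toy input: 3 segments with 2-dimensional features, 2 action unit labels.
features = np.arange(6, dtype=float).reshape(3, 2)
weights = np.zeros((2, 6))
weights[1, 2] = 1.0  # label 1 responds to the first feature of the current segment
labels = initial_estimates(unary_scores(features, weights))
```

The ordered concatenation keeps the preceding, current, and following segments in separate weight blocks, so the potential can weight context differently from the current segment.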

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising: extracting a plurality of features from the video clip; determining a corresponding feature in the plurality of features for each of temporal segments of the video clip; determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features; aggregating the potential functions into a probability distribution; and determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.

2. The method according to claim 1, wherein the plurality of features are semantic features.

3. The method according to claim 1, wherein the plurality of features are low level features.

4. The method according to claim 1, further comprising: classifying at least one contextual object in at least one other of the temporal segments preceding the current segment, the at least one contextual object being independent of any action units of interest in the at least one other of the temporal segments preceding the current segment; and determining an action unit of interest in the current segment of the video clip, the action unit of interest being performed with the classified at least one contextual object and the determination of the action unit of interest in the current segment being based on the classification of the at least one contextual object wherein the current segment and the other segment preceding the current segment are disjoint.

5. The method according to claim 1, wherein the probability distribution is a conditional random field (CRF) probability distribution.

6. The method according to claim 5, wherein the CRF has a tree structure.

7. The method according to claim 5, wherein the CRF is in log-linear form.

8. A non-transitory computer readable medium having a computer program recorded on the computer readable medium, the computer program being executable by a computer system to perform a method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising: extracting a plurality of features from the video clip; determining a corresponding feature in the plurality of features for each of temporal segments of the video clip; determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features; aggregating the potential functions into a probability distribution; and determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.

9. A computer system, comprising: a processor; a memory having a computer program recorded thereon, the memory being in communication with the processor; the processor executing the computer program to perform a method of determining a classification of a composite action including a plurality of action units in a video clip, the method comprising: extracting a plurality of features from the video clip; determining a corresponding feature in the plurality of features for each of temporal segments of the video clip; determining an initial estimate of an action unit for each of the temporal segments using a potential function for each segment modeling dependency between a concatenation of features and a classification of a corresponding action unit by inputting a feature from a current temporal segment and a feature from at least one of preceding temporal segments or subsequent temporal segments as the concatenation of features; aggregating the potential functions into a probability distribution; and determining the classification of the composite action using the probability distribution by jointly inferring the classification of the composite action and classifications of the action units of the temporal segments based on the initial estimate of each action unit for each of the temporal segments.
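The joint inference step in the claims (composite action and action unit labels inferred together from a tree-structured, log-linear CRF) can be illustrated with a small sketch. It assumes a star-shaped tree in which one composite action node connects to every action unit node, with per-segment unary log-potentials and a composite-to-unit compatibility table. The actual tree topology is not given on this page, and the names `joint_map`, `unary`, `pairwise`, and `root` are hypothetical.

```python
import numpy as np

def joint_map(unary, pairwise, root=None):
    """Exact MAP inference in a star-shaped, log-linear CRF (assumed topology).

    unary:    (T, A) log-potentials, one row per temporal segment.
    pairwise: (C, A) log-compatibility of composite label c with unit label a.
    root:     (C,) optional log-potential on the composite action label.

    The log-score of a joint labelling (c, a_1..a_T) is
        root[c] + sum_t (unary[t, a_t] + pairwise[c, a_t]),
    and p(c, a | x) is proportional to exp(score).
    """
    C = pairwise.shape[0]
    root = np.zeros(C) if root is None else np.asarray(root, dtype=float)
    # combined[c, t, a] = unary[t, a] + pairwise[c, a]
    combined = unary[None, :, :] + pairwise[:, None, :]
    # Given c, the leaves are independent, so each is maximised separately.
    best_units = combined.argmax(axis=2)              # shape (C, T)
    totals = root + combined.max(axis=2).sum(axis=1)  # shape (C,)
    c_star = int(totals.argmax())
    return c_star, best_units[c_star], float(totals[c_star])

# Toy problem: 2 segments, 2 action unit labels, 2 composite action labels.
unary = np.array([[2.0, 0.0],
                  [0.0, 1.0]])
pairwise = np.array([[0.0, 0.0],
                     [-3.0, 2.0]])
composite, units, score = joint_map(unary, pairwise)
```

In this toy case the first segment's initial estimate (label 0, from the unary scores alone) is overridden by the joint optimum, which is the kind of correction joint inference is meant to provide. Because the graph is a tree, max-product inference of this kind is exact.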

Assignees

Inventors

Classifications

  • Markov-related models; Markov random fields · CPC title

  • Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title

  • using classification, e.g. of video objects · CPC title

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

  • G06V20/49 · Primary

    Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10445582B2 cover?
A method of determining a composite action from a video clip, using a conditional random field (CRF), the method includes determining a plurality of features from the video clip, each of the features having a corresponding temporal segment from the video clip. The method may continue by determining, for each of the temporal segments corresponding to one of the features, an initial estimate of a…
Who is the assignee on this patent?
Canon Kk
What technology area does this patent fall under?
Primary CPC classification G06V20/49. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 15, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).