Action recognition in a video sequence

US10691949B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10691949-B2
Application number: US-201715812685-A
Country: US
Kind code: B2
Filing date: Nov 14, 2017
Priority date: Nov 14, 2016
Publication date: Jun 23, 2020
Grant date: Jun 23, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for action recognition in a video sequence is disclosed. The system comprises a camera configured to capture the video sequence and a server configured to perform action recognition. The camera comprises an object identifier that identifies an object of interest in an object image frame of the video sequence; an action candidate recognizer configured to apply a first action recognition algorithm to the object image frame to detect presence of an action candidate; a video extractor configured to produce action image frames of an action video sequence by extracting video data pertaining to a plurality of image frames from the video sequence; and a network interface configured to transfer the action video sequence to the server. The server comprises an action verifier configured to apply a second action recognition algorithm to the action video sequence to verify or reject that the action candidate is an action.
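The two-stage split described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the frame representation (small grids of pixel values), the names `camera_stage` and `server_stage`, the `window` parameter, and the toy detector and verifier functions. The camera side looks at single frames only, while the server side uses how the object moves across the frames of the transferred clip.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Frame = List[List[int]]  # hypothetical frame: rows of pixel intensities


@dataclass
class ActionClip:
    frames: List[Frame]
    candidate_label: str


def camera_stage(
    video: Sequence[Frame],
    identify_object: Callable[[Frame], bool],
    detect_candidate: Callable[[Frame], Optional[str]],
    window: int = 2,
) -> Optional[ActionClip]:
    """Camera side: single-frame candidate detection, then clip extraction."""
    for i, frame in enumerate(video):
        if not identify_object(frame):
            continue
        label = detect_candidate(frame)  # uses only this one frame
        if label is not None:
            start = max(0, i - window)
            clip_frames = list(video[start:i + window + 1])
            return ActionClip(frames=clip_frames, candidate_label=label)
    return None


def server_stage(
    clip: ActionClip,
    verify: Callable[[List[Frame], str], bool],
) -> bool:
    """Server side: verify or reject the candidate using the whole clip."""
    return verify(clip.frames, clip.candidate_label)


# --- Toy stand-ins (assumptions, not real recognition algorithms) ---

def make_frame(row: Optional[int], size: int = 4) -> Frame:
    """Build a blank frame with one bright pixel at the given row, if any."""
    frame = [[0] * size for _ in range(size)]
    if row is not None:
        frame[row][1] = 255
    return frame


def toy_identify(frame: Frame) -> bool:
    return any(p > 0 for row in frame for p in row)


def toy_candidate(frame: Frame) -> Optional[str]:
    # Spatial cue only: a bright pixel in the top row suggests a raised hand.
    return "hand_raised" if any(p > 0 for p in frame[0]) else None


def toy_verify(frames: List[Frame], label: str) -> bool:
    # Temporal cue: the bright pixel must have moved upward over the clip.
    rows = [r for f in frames for r, row in enumerate(f) if any(p > 0 for p in row)]
    steadily_up = all(a >= b for a, b in zip(rows, rows[1:]))
    return len(rows) >= 2 and steadily_up and rows[0] > rows[-1]


moving = [make_frame(r) for r in (None, 3, 2, 1, 0, 0)]
static = [make_frame(0) for _ in range(5)]

moving_clip = camera_stage(moving, toy_identify, toy_candidate)
static_clip = camera_stage(static, toy_identify, toy_candidate)
```

In this sketch both videos produce a candidate on the camera side, but only the clip in which the pixel actually moves upward would pass `server_stage(clip, toy_verify)`. That mirrors the idea of a cheap single-frame check followed by a verification step that relies on temporal information.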

First claim

Opening claim text (preview).

What is claimed is:

1. A method for action recognition in a video sequence captured by a camera, the method comprising: by circuitry of the camera: identifying an object of interest in an image frame of the video sequence; applying a first action recognition algorithm to the image frame to detect an action candidate, wherein the image frame is a single image comprising the object of interest, wherein the first action recognition algorithm uses contextual and/or spatial recognition information of the single image frame to detect the action candidate within the image frame; producing image frames of an action video sequence by extracting video data pertaining to a plurality of image frames from the video sequence, wherein one or more of the plurality of image frames from which the video data is extracted comprises the object of interest; and transferring the action video sequence to a server configured to perform action recognition; and by circuitry of the server: applying a second action recognition algorithm to the action video sequence to verify or reject that the action candidate is an action of a predefined type, wherein the second action recognition algorithm uses temporal information of a plurality of image frames of the action video sequence.

2. The method according to claim 1, wherein the act of producing the image frames of the action video sequence comprises cropping the plurality of image frames of the video sequence such that the image frames comprising the object of interest comprises at least a portion of the object of interest.

3. The method according to claim 2, wherein the image frames of the action video sequence comprising the object of interest comprises a portion of background at least partly surrounding the object of interest.

4. The method according to claim 1, wherein the act of transferring the action video sequence comprises transferring coordinates within the action video sequence to the object of interest.

5. The method according to claim 1, wherein the method further comprises, by the circuitry of the camera: detecting an object of interest in the video sequence, wherein the act of producing the image frames of the action video sequence comprises extracting video data pertaining to a first predetermined number of image frames of the video sequence related to a point of time before detection of the object of interest.

6. The method according to claim 1, wherein the method further comprises, by the circuitry of the camera: detecting an object of interest in the video sequence, wherein the act of producing the image frames of the action video sequence comprises extracting video data pertaining to a second predetermined number of image frames of the video sequence related to a point of time after detection of the object of interest.

7. The method according to claim 1, wherein the camera and the server are separate physical entities positioned at a distance from each other and are configured to communicate with each other via a digital network.

8. A system for action recognition in a video sequence, the system comprising: a camera configured to capture the video sequence and a server configured to perform action recognition, the camera comprising: an object identifier configured to identify an object of interest in an image frame of the video sequence; an action candidate recognizer configured to apply a first action recognition algorithm to the image frame to detect an action candidate, wherein the image frame is a single image comprising the object of interest, wherein the first action recognition algorithm uses contextual and/or spatial recognition information of the single image frame to detect the action candidate within the image frame; a video extractor configured to produce image frames of an action video sequence by extracting video data pertaining to a plurality of image frames from the video sequence, wherein one or more of the plurality of image frames from which the video data is extracted comprises the object of interest; and a network interface configured to transfer the action video sequence to the server, the server comprising: an action verifier configured to apply a second action recognition algorithm to the action video sequence to verify or reject that the action candidate is an action of a predefined type, wherein the second action recognition algorithm uses temporal information of a plurality of image frames of the action video sequence.

9. The system according to claim 8, wherein the video extractor is further configured to crop the plurality of images frames of the video sequence such that the image frames of the video sequence comprising the object of interest comprises at least a portion of the object of interest.

10. The system according to claim 8, wherein the video extractor is further configured to crop the plurality of images frames of the video sequence such that the image frames of the video sequence comprising the object of interest comprises a portion of background at least partly surrounding the object of interest.

11. The system according to claim 8, wherein the object identifier is further configured to detect an object of interest in the video sequence, wherein the video extractor is further configured to extract video data pertaining to a first predetermined number of image frames of the video sequence related to a point of time before detection of the object of interest.

12. The system according to claim 8, wherein object identifier is further configured to detect an object of interest in the video sequence, wherein the video extractor is further configured to extract video data pertaining to a second predetermined number of image frames of the video sequence related to a point of time after detection of the object of interest.
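The dependent claims on cropping (with surrounding background) and on keeping a predetermined number of frames before and after detection can be pictured with a small sketch. This is one possible reading under stated assumptions, not the patented implementation: the function names `crop_with_margin` and `extract_around_detection`, the box layout `(top, left, bottom, right)`, and the use of a bounded `deque` as a pre-detection buffer are all hypothetical choices made for illustration.

```python
from collections import deque
from itertools import islice
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
Box = Tuple[int, int, int, int]  # hypothetical layout: (top, left, bottom, right), end-exclusive


def crop_with_margin(frame: List[List[int]], box: Box, margin: int) -> List[List[int]]:
    """Crop a frame to the object's box plus a margin of surrounding background.

    The crop is clamped to the frame borders.
    """
    top, left, bottom, right = box
    height, width = len(frame), len(frame[0])
    t = max(0, top - margin)
    l = max(0, left - margin)
    b = min(height, bottom + margin)
    r = min(width, right + margin)
    return [row[l:r] for row in frame[t:b]]


def extract_around_detection(
    frames: Iterable[T],
    is_detected: Callable[[T], bool],
    n_before: int,
    n_after: int,
) -> List[T]:
    """Return up to n_before frames before the first detection, the detection
    frame itself, and up to n_after frames after it."""
    pre_buffer: deque = deque(maxlen=n_before)  # keeps only the latest n_before frames
    stream = iter(frames)
    for frame in stream:
        if is_detected(frame):
            clip = list(pre_buffer) + [frame]
            clip.extend(islice(stream, n_after))  # take the following frames, if any
            return clip
        pre_buffer.append(frame)
    return []  # no detection in the stream


# Usage with stand-in data: integers play the role of frames.
clip = extract_around_detection(range(10), lambda f: f == 5, n_before=2, n_after=3)

# A 5x5 frame whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(5)] for r in range(5)]
patch = crop_with_margin(frame, box=(2, 2, 3, 3), margin=1)
```

A bounded buffer is a natural fit on a camera with limited memory, since frames older than the chosen pre-detection count are discarded automatically. Cropping before transfer would also reduce the amount of video data sent over the network to the server.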

Assignees

Axis Ab

Inventors

Classifications

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

  • G06V40/20 · Primary

    Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title

  • Control of cameras or camera modules · CPC title

  • G06V20/52 · Primary

    Surveillance or monitoring of activities, e.g. for recognising suspicious objects (recognising microscopic objects G06V20/69) · CPC title

  • G06V20/42 · Primary

    of sport video content · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10691949B2 cover?
A method and system for action recognition in a video sequence is disclosed. The system comprises a camera configured to capture the video sequence and a server configured to perform action recognition. The camera comprises an object identifier that identifies an object of interest in an object image frame of the video sequence; an action candidate recognizer configured to apply a first action …
Who is the assignee on this patent?
Axis Ab
What technology area does this patent fall under?
Primary CPC classification G06V40/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 23, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).