Systems and methods for identifying activities and/or events in media contents based on object data and scene data

US9940522B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9940522-B2
Application number  US-201615211403-A
Country             US
Kind code           B2
Filing date         Jul 15, 2016
Priority date       Apr 26, 2016
Publication date    Apr 10, 2018
Grant date          Apr 10, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There is provided a system including a non-transitory memory storing an executable code and a hardware processor executing the executable code to receive a plurality of training contents depicting a plurality of activities, extract training object data from the plurality of training contents including a first training object data corresponding to a first activity, extract training scene data from the plurality of training contents including a first training scene data corresponding to the first activity, determine that a probability of the first activity is maximized when the first training object data and the first training scene data both exist in a sample media content.
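
The training step described in the abstract can be illustrated with a minimal sketch. It assumes each training content arrives already annotated as an (activity, objects, scenes) tuple with non-empty annotations, and it uses a simple frequency count plus a toy score; the names `build_activity_database` and `activity_score` are hypothetical and are not taken from the patent, which does not specify a scoring function.

```python
from collections import Counter, defaultdict


def build_activity_database(training_contents):
    """Associate each activity with its most frequent object and scene label.

    training_contents: iterable of (activity, objects, scenes) tuples.
    Assumption: annotations are plain string labels and are never empty.
    """
    counts = defaultdict(lambda: {"objects": Counter(), "scenes": Counter()})
    for activity, objects, scenes in training_contents:
        # Count each label at most once per training content.
        counts[activity]["objects"].update(set(objects))
        counts[activity]["scenes"].update(set(scenes))

    database = {}
    for activity, entry in counts.items():
        database[activity] = {
            "object": entry["objects"].most_common(1)[0][0],
            "scene": entry["scenes"].most_common(1)[0][0],
        }
    return database


def activity_score(database, activity, objects, scenes):
    """Toy score in [0, 1]: highest only when both object and scene are present."""
    entry = database[activity]
    has_object = entry["object"] in objects
    has_scene = entry["scene"] in scenes
    return (has_object + has_scene) / 2
```

With this scoring, a sample that contains both the stored object and the stored scene scores 1.0, a sample with only one of them scores 0.5, and a sample with neither scores 0.0, which mirrors the idea that the probability of the activity is maximized when both are present.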

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a non-transitory memory storing an executable code; and a hardware processor executing the executable code to: receive a plurality of training contents depicting a plurality of activities; extract training object data from the plurality of training contents including a first training object data corresponding to a first activity; extract training scene data from the plurality of training contents including a first training scene data corresponding to the first activity; and determine that a probability of the first activity is maximized when the first training object data and the first training scene data both exist in a sample media content.

2. The system of claim 1, wherein the plurality of training contents include at least one of object annotations, scene annotations, and activity annotations.

3. The system of claim 1, wherein the hardware processor further executes the executable code to: store the first training object data and the first training scene data in an activity database, the first training object data and the first training scene data being associated with the first activity in the activity database.

4. The system of claim 3, wherein the hardware processor further executes the executable code to: receive a media content; extract first object data from the media content; extract first scene data from the media content; compare the first object data and the first scene data with the training object data and the training scene data of the activity database, respectively; and determine that the media content probably shows the first activity when the comparing finds a match for both the first object data and the first scene data in the activity database.

5. The system of claim 4, wherein the media content includes at least one of object annotations, scene annotations, and activity annotations.

6. The system of claim 4, wherein the first object data includes at least one of a color and a shape.

7. The system of claim 4, wherein the first scene data includes a location.

8. The system of claim 3, wherein the hardware processor further executes the executable code to: receive a media content; extract second object data from the media content; extract second scene data from the media content; compare the second object data and the second scene data with the training object data and the training scene data of the activity database, respectively; and determine that the media content probably shows a new activity when the comparing finds a first similarity between the second object data and the training object data of the activity database, and a second similarity between the scene data and the training scene data of the activity database.

9. The system of claim 8, wherein, prior to the determining, the hardware processor executes the executable code to: receive a new activity instruction describing the new activity; and include the new activity instruction when determining the media content probably shows the new activity.

10. A method for use with a system including a non-transitory memory and a hardware processor, the method comprising: receiving, using the hardware processor, a plurality of training contents depicting a plurality of activities; extracting, using the hardware processor, training object data from the plurality of training contents including a first training object data corresponding to a first activity; extracting, using the hardware processor, training scene data from the plurality of training contents including a first training scene data corresponding to the first activity; and determining, using the hardware processor, a probability is maximized that the first activity is shown when the first training object data and the first training scene data are both included in a sample media content.

11. The method of claim 10, wherein the plurality of training contents include at least one of object annotations, scene annotations, and activity annotations.

12. The method of claim 10, further comprising: storing, in the non-transitory memory, the first training object data and the first training scene data in an activity database, the first training object data and the first training scene data being associated with the first activity in the activity database.

13. The method of claim 10, further comprising: receiving, using the hardware processor, a media content; extracting, using the hardware processor, object data from the media content; extracting, using the hardware processor, scene data from the media content; comparing, using the hardware processor, the first object data and the first scene data with the training object data and the training scene data of the activity database, respectively; and determining, using the hardware processor, that the media content probably shows the first activity when the comparing finds a match for both the first object data and the first scene data in the activity database.

14. The method of claim 13, wherein the media content includes at least one of object annotations, scene annotations, and activity annotations.

15. The method of claim 13, wherein determining the media content shows the first activity includes identifying the first activity as the most probable activity indicated by the first activity data.

16. The method of claim 13, wherein the first object data includes at least one of a color and a shape.

17. The method of claim 13, wherein the first scene data includes a location.

18. The method of claim 12, further comprising: receiving, using the hardware processor, a media content; extracting, using the hardware processor, second object data from the media content; extracting, using the hardware processor, second scene data from the media content; comparing, using the hardware processor, the second object data and the second scene data with the training object data and the training scene data of the activity database, respectively; and determining, using the hardware processor, that the media content probably shows a new activity when the comparing finds a first similarity between the second object data and the training object data of the activity database, and a second similarity between the scene data and the training scene data of the activity database.

19. The method of claim 18, wherein, prior to the determining, the method further comprises: receiving, using the hardware processor, a new activity instruction describing the new activity; and including, using the hardware processor, the new activity instruction when determining the media content probably shows the new activity.

20. A system for determining whether a media content includes a first activity using an activity database, the activity database including activity object data and activity scene data for a plurality of activities including the first activity, the system comprising: a non-transitory memory storing an activity identification software; a hardware processor executing the activity identification software to: receive a first media content including a first object data and a first scene data; compare the first object data and the first scene data with the activity object data and the activity scene data of the activity database, respectively; and determine that the first media content probably includes the first activity when the comparing finds a match for both the first object data and the first scene data in the activity database.
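
The comparison steps in claims 4 and 8 can be sketched as a lookup against an activity database. This is only an illustration under stated assumptions: the database layout, the sample entries in `ACTIVITY_DB`, and the function `identify_activity` are hypothetical, and "similarity" is reduced here to exact label membership, which the claims do not require.

```python
# Hypothetical activity database: one object label and one scene label per activity.
ACTIVITY_DB = {
    "surfing": {"object": "surfboard", "scene": "beach"},
    "skiing": {"object": "skis", "scene": "mountain"},
}


def identify_activity(database, objects, scenes):
    """Classify a media content from its extracted object and scene labels.

    Returns ("known", activity) when one activity matches on both object and
    scene, ("new", (object_activity, scene_activity)) when the object and the
    scene each resemble different stored activities, and ("unknown", None)
    otherwise.
    """
    objects, scenes = set(objects), set(scenes)
    object_hits = [name for name, entry in database.items()
                   if entry["object"] in objects]
    scene_hits = [name for name, entry in database.items()
                  if entry["scene"] in scenes]

    # Claim 4 style: a match for both object data and scene data.
    both = [name for name in object_hits if name in scene_hits]
    if both:
        return ("known", both[0])

    # Claim 8 style: object and scene are each similar to stored data,
    # but not for the same activity, suggesting a new activity.
    if object_hits and scene_hits:
        return ("new", (object_hits[0], scene_hits[0]))

    return ("unknown", None)
```

For example, skis detected on a beach match the stored object of one activity and the stored scene of another, so the sketch reports a probable new activity, whereas a surfboard on a beach is reported as the known activity.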

Assignees

Disney Entpr Inc

Inventors

Classifications

  • Detecting features for summarising video content · CPC title

  • G06V40/20 · Primary

    Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title

  • Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9940522B2 cover?
There is provided a system including a non-transitory memory storing an executable code and a hardware processor executing the executable code to receive a plurality of training contents depicting a plurality of activities, extract training object data from the plurality of training contents including a first training object data corresponding to a first activity, extract training scene data fr…
Who is the assignee on this patent?
Disney Entpr Inc
What technology area does this patent fall under?
Primary CPC classification G06V40/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 10, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).