Multi-source multi-modal activity recognition in aerial video surveillance

US9934453B2 · US · B2

Patent metadata
Publication number: US-9934453-B2
Application number: US-201514727074-A
Country: US
Kind code: B2
Filing date: Jun 1, 2015
Priority date: Jun 19, 2014
Publication date: Apr 3, 2018
Grant date: Apr 3, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Multi-source multi-modal activity recognition for conducting aerial video surveillance comprising detecting and tracking multiple dynamic targets from a moving platform, representing FMV target tracks and chat-messages as graphs of attributes, associating FMV tracks and chat-messages using a probabilistic graph based mapping approach; and detecting spatial-temporal activity boundaries.
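
The association step named in the abstract can be illustrated with a minimal sketch. This is not the patented method: it flattens a track segment and a parsed chat-message into attribute dictionaries (a graph with nodes only, no edges) and scores each candidate track by the weighted fraction of chat attributes it shares, then normalizes the scores into a probability-like distribution. The attribute names, the weights, and the functions `match_score` and `associate` are all hypothetical.

```python
from typing import Dict

# Hypothetical attribute weights; the source only says weights are user-defined.
WEIGHTS: Dict[str, float] = {"actor": 0.4, "color": 0.3, "direction": 0.3}


def match_score(chat: Dict[str, str], track: Dict[str, str],
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted fraction of chat attributes that the track also carries.

    Only attributes present in the chat message are compared, so a chat
    that mentions few attributes is matched partially rather than penalized.
    """
    total = sum(weights.get(k, 0.0) for k in chat)
    if total == 0.0:
        return 0.0
    hit = sum(weights.get(k, 0.0) for k, v in chat.items() if track.get(k) == v)
    return hit / total


def associate(chat: Dict[str, str],
              tracks: Dict[str, Dict[str, str]]) -> Dict[str, float]:
    """Normalize per-track scores so they sum to 1 (uniform if nothing matches)."""
    scores = {tid: match_score(chat, t) for tid, t in tracks.items()}
    norm = sum(scores.values())
    if norm == 0.0:
        return {tid: 1.0 / len(tracks) for tid in tracks}
    return {tid: s / norm for tid, s in scores.items()}


# Illustrative data only: one chat message and two tracks in the same time interval.
chat_msg = {"actor": "vehicle", "color": "white", "direction": "north"}
candidate_tracks = {
    "track_1": {"actor": "vehicle", "color": "white", "direction": "south"},
    "track_2": {"actor": "human", "color": "white", "direction": "north"},
}
probs = associate(chat_msg, candidate_tracks)
best = max(probs, key=probs.get)  # track with the highest association score
```

Here `track_1` matches on actor and color while `track_2` matches on color and direction, so with the assumed weights `track_1` receives the higher score. A fuller treatment would also compare edges between attribute nodes, which this sketch omits.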

First claim

Opening claim text (preview).

What is claimed is:

1. A system for multi-source multi-modal activity recognition in conducting aerial video surveillance comprising: from a moving platform, detecting and tracking, with a video imager, multiple dynamic targets, wherein the detecting and tracking comprises differencing registered frames; using high pixel difference point features to establish correspondences between other points in a previous frame; and clustering point-velocity pairs into motion regions assumed to be individual targets; recording analyst call outs or chats, and appending said analyst call outs or chats to a file; representing full motion video (FMV) target tracks and chat-messages as graphs of attributes; associating said FMV tracks and said chat-messages using a probabilistic graph based mapping approach; detecting spatial-temporal activity boundaries; categorizing activity of said detected multiple dynamic targets; and on a display, presenting said activity.

2. The system of claim 1, wherein said step of representing FMV target tracks and chat-message as graphs of attributes comprises: dividing tracks into segments; representing attributes of targets as nodes; characterizing relationships between said nodes as edges; and chat parsing.

3. The system of claim 1, wherein said step of associating FMV tracks and chat-messages comprise probabilistic matching comprising: extracting a chat-message and all video tracks in a given time interval from data sets; generating graph representations of video-tracks and chat messages; and performing partial graph matching using a probabilistic distance measure.

4. The system of claim 1, wherein said step of detecting spatial-temporal activity boundaries comprises: extracting features from a labeled track segment; clustering features in each activity space; representing each track by a sequence of clusters; and computing a histogram of human motion flow and neighboring intensity variance.

5. The system of claim 1 wherein overhead imagery review excludes eye tracking and/or touch screen input to determine screen locations of targets-of-interest (TOIs) corresponding to analyst call-outs (ACOs).

6. The system of claim 1 wherein learning comprises generating a similarity score.

7. The method of claim 6 wherein a similarity score between a new track and an index is defined by a similarity metric which considers only common clusters and a similarity score of temporal gradient for a cluster sequence.

8. The system of claim 1 further comprising an output report and querying module.

9. The system of claim 1 wherein unlabeled tracks are matched to a learned activity pattern.

10. The system of claim 1 further comprising, a querying module, wherein the querying module further comprises, an activities-of-interest index (AOI) where activities are grouped together by movement type and geo-location; and an adaptive data play-back summary of associated text (ACO) and activity video segments of targets-of-interest (TOIs) in both pixel and geo-coordinates that allow for user-selected filtering by geographic location.

11. A method for multi-source multi-modal activity recognition in conducting aerial video surveillance comprising: tracking a target using a video device on an airborne platform; mapping tracks to graphs comprising multi-graph representation of a single full motion video (FMV) track; parsing and graph representation of chats; associating multi-source graphs and assigning activity classes; learning activity patterns from multi-source associated data; and visualizing event/activity reports on a display and querying by activity type and geo-location, wherein the reports comprise a video summary of activities-of-interest/targets-of-interest AOIs/TOIs allowing non-linear browsing of video content, annotated text-over-video media where only TOIs are highlighted with bounding boxes and synchronized with chat-messages; grouping activities of a same type into an activities index, and allowing adaptive data play-back for user-selected filtering by geographic location.

12. The method of claim 11 wherein probabilities are assigned according to user-defined weights of attributes for actor, shape, time, color, direction, spatial location, tracking confidence, and target mobility.

13. The method of claim 11 comprising using Cluster Objects Using Recognized Sequence of Estimates (COURSE).

14. The method of claim 11 wherein outlier rejection comprises RANdom SAmple Consensus (RANSAC) to remove bad guesses.

15. The method of claim 11 comprising a Multi-INT Activity Pattern Learning and Exploitation (MAPLE) tool.

16. The method of claim 11 comprising a Hyper-Elliptical Learning and Matching (HELM) unsupervised clustering algorithm to learn activity patterns.

17. The method of claim 11 comprising a Multi-media INdexing and explorER (MINER) showing an automatically generated description.

18. The method of claim 11 comprising video-indexed by voice annotations (VIVA) stabilization when two frames being registered have greater than about 35% overlap.

19. A system for a multi-source multi-modal probabilistic graph-based association framework for aerial video surveillance comprising: reviewing by a reviewer at least some of the aerial video surveillance providing reviewed FMV data with a resulting set of non-reviewed FMV data; identifying targets-of-interest corresponding to chat-messages, wherein said chat-messages are the only source to describe a true activity of a target of interest (TOI); extracting objects from a full motion video (FMV) of the aerial video surveillance; detecting activity boundaries comprising segmenting full motion video (FMV) tracks from said aerial video surveillance into semantic sub-tracks/segments; learning activity patterns in low-level feature spaces using the reviewed FMV data; indexing non-reviewed FMV data; and providing to FMV analysts a user interface display for querying and non-linear browsing of multi-source data.
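
The detection step in claim 1 (differencing registered frames, then grouping the changed points into motion regions) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed procedure: frames are tiny grayscale grids assumed to be already registered, and changed pixels are grouped by 4-connectivity as a stand-in for the claim's clustering of point-velocity pairs. The threshold value and the function names `high_difference_points` and `motion_regions` are hypothetical.

```python
from typing import List, Set, Tuple

Frame = List[List[int]]  # grayscale intensities, row-major
Point = Tuple[int, int]  # (row, col)


def high_difference_points(prev: Frame, curr: Frame, thresh: int) -> Set[Point]:
    """Return positions whose absolute intensity change exceeds thresh.

    Assumes both frames are already registered (aligned) and equally sized.
    """
    return {
        (r, c)
        for r, row in enumerate(curr)
        for c, val in enumerate(row)
        if abs(val - prev[r][c]) > thresh
    }


def motion_regions(points: Set[Point]) -> List[Set[Point]]:
    """Group points into 4-connected regions with an iterative flood fill.

    Simplification: the claim clusters point-velocity pairs; here only
    spatial adjacency is used because no velocity estimate is computed.
    """
    remaining = set(points)
    regions: List[Set[Point]] = []
    while remaining:
        stack = [remaining.pop()]
        region = set(stack)
        while stack:
            r, c = stack.pop()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    region.add(nb)
                    stack.append(nb)
        regions.append(region)
    return regions


# Illustrative frames: two bright patches appear against a static background.
prev_frame = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
curr_frame = [
    [90, 90, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 80, 80],
]
changed = high_difference_points(prev_frame, curr_frame, thresh=30)
regions = motion_regions(changed)  # each region is a candidate target
```

In this toy input the two changed patches are not adjacent, so they form two separate regions, each of which would be treated as a candidate target.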

Assignees

Inventors

Classifications

  • G06T7/254 (Primary)

    involving subtraction of images · CPC title

  • Proximity, similarity or dissimilarity measures · CPC title

  • Matching criteria, e.g. proximity measures · CPC title

  • based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

  • Multiple classes · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9934453B2 cover?
Multi-source multi-modal activity recognition for conducting aerial video surveillance comprising detecting and tracking multiple dynamic targets from a moving platform, representing FMV target tracks and chat-messages as graphs of attributes, associating FMV tracks and chat-messages using a probabilistic graph based mapping approach; and detecting spatial-temporal activity boundaries.
Who is the assignee on this patent?
BAE Systems Information & Electronic Systems Integration Inc (also listed in abbreviated form as Bae Sys Inf & Elect Sys Integ)
What technology area does this patent fall under?
Primary CPC classification G06T7/254. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 3, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).