Systems and methods for generating improved content based on matching mappings
US-2021263964-A1 · Aug 26, 2021 · US
US12100244B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12100244-B2 |
| Application number | US-202117303365-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 27, 2021 |
| Priority date | Jun 2, 2020 |
| Publication date | Sep 24, 2024 |
| Grant date | Sep 24, 2024 |
Official abstract text for this publication.
A method and system of generating agent and actions prediction based on multi-agent tracking data are disclosed herein. A computing system retrieves tracking data from a data store. The computing system generates a trained neural network by generating a plurality of training data sets based on the tracking data by converting each frame of data into a matrix representation of the data contained in the frame and learning, by the neural network, a start frame and end frame of each action contained in the frame and its associated actor. The computing system receives target tracking data associated with an event. The target tracking data includes a plurality of actors and a plurality of actions. The computing system generates, via the trained neural network, a target start frame and a target end frame of each action identified in the tracking data and a corresponding actor.
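As a rough illustration of the two data transformations the abstract mentions, the sketch below converts tracking frames into a matrix and then collapses per-frame predictions into start and end frames per action and actor. The frame layout (agent id mapped to an x, y position), the zero-filling of missing agents, the `"none"` background label, and the function names `frames_to_matrix` and `decode_segments` are all assumptions made for this example; the publication does not specify them here.

```python
from typing import Dict, List, Tuple

# Hypothetical frame layout: agent id -> (x, y) position in that frame.
Frame = Dict[str, Tuple[float, float]]


def frames_to_matrix(frames: List[Frame], agents: List[str]) -> List[List[float]]:
    """Build a (num_frames x 2*num_agents) matrix.

    Each row is one frame (a period of time); each agent contributes two
    columns (x, y) in a fixed agent order.
    """
    matrix = []
    for frame in frames:
        row: List[float] = []
        for agent in agents:
            # Assumption: agents missing from a frame are zero-filled.
            x, y = frame.get(agent, (0.0, 0.0))
            row.extend([x, y])
        matrix.append(row)
    return matrix


def decode_segments(
    labels: List[Tuple[str, str]], background: str = "none"
) -> List[Tuple[str, str, int, int]]:
    """Collapse per-frame (action, actor) labels into segments.

    Returns (action, actor, start_frame, end_frame) tuples with an
    inclusive end frame. Runs labelled with the background action are skipped.
    """
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            action, actor = labels[start]
            if action != background:
                segments.append((action, actor, start, i - 1))
            start = i
    return segments
```

In this sketch a trained model would supply the per-frame `(action, actor)` labels; `decode_segments` only shows how frame-level output can be turned into the start frame, end frame, and actor triple that the abstract describes.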
Opening claim text (preview).
The invention claimed is:
1. A method of generating agent and actions prediction based on multi-agent tracking data, comprising: retrieving, by a computing system, tracking data from a data store, the tracking data comprising a plurality of frames of data for a plurality of events across a plurality of seasons; generating, by the computing system, a trained neural network, by: generating a plurality of training data sets based on the tracking data by converting each frame of data into a matrix representation of the data contained in the frame; and learning, by the neural network, based on the plurality of training data sets, a start frame and end frame of each action contained in the frame and its associated actor, wherein each action comprises a plurality of sub-actions of a specified duration, and wherein the learning includes initializing a sub-action to action-actor mapping function for use as a ground truth, and a frame level label for the neural network, and wherein the learning includes optimizing the neural network by reducing weighted cross entropy loss between a predicted sub-action, a predicted actor, and the ground truth; receiving, by the computing system, target tracking data associated with an event, the target tracking data comprising a plurality of actors and a plurality of actions; converting, by the computing system, the tracking data into a matrix representation of the tracking data, wherein the matrix representation includes one or more relationships between one or more agents and one or more periods of time; predicting, by the computing system via the trained neural network, a target start frame and a target end frame of each action identified in the tracking data and an associated actor; and presenting, by the computing system, the predicted target start frame, predicted target end frame, and predicted associated actor to one or more end users.
2. The method of claim 1, wherein the neural network comprises: a spatial attention sub-network; a per-agent convolution network comprising parallel convolutional streams; and a mutual attention sub-network comprising a first multilayer perceptron and a second multilayer perceptron.
3. The method of claim 2, wherein generating, by the computing system via the trained neural network, the target start frame and the target end frame of each action identified in the tracking data and a corresponding actor; comprises: inputting the matrix representation of the tracking data into the spatial attention sub-network to generate spatial attention coefficients.
4. The method of claim 3, further comprising: generating a weighted spatiotemporal matrix by multiplying the matrix representation of the tracking data by the spatial attention coefficients; and inputting the weighted spatiotemporal matrix into a first convolutional stream of the parallel convolutional streams and a second convolutional stream of the parallel convolutional streams.
5. The method of claim 4, further comprising: combining, by the mutual attention sub-network, a first output from the first convolutional stream and a second output from the second convolutional stream to generate a mutual attention layer; and passing the mutual attention layer through the first multilayer perceptron to generate an action prediction; and passing the mutual attention layer through the second multilayer perceptron to generate an actor prediction.
6. The method of claim 1, wherein the target tracking data comprises raw positional data for the plurality of actors.
7. The method of claim 6, further comprising: annotating, by the computing system, the raw positional data for the plurality of actors with player position information.
8. The method of claim 6, further comprising: fusing, by the computing system, the raw positional data for the plurality of actors with manually annotated tracking data.
9. The method of claim 1, further comprising: inputting, by the computing system, the predicted target start frame, the predicted target end frame, and the predicted associated actor into a refinement module; and generating, by the computing system via the refinement module, a refined predicted target start frame, a refined predicted target end frame, and a refined predicted associated actor.
10. A system for generating agent and actions prediction based on multi-agent tracking data, comprising: a processor; and a memory having programming instructions stored thereon, which, when executed by the processor, performs one or more operations, comprising: retrieving tracking data from a data store, the tracking data comprising a plurality of frames of data for a plurality of events across a plurality of seasons; generating a trained neural network, by: generating a plurality of training data sets based on the tracking data by converting each frame of data into a matrix representation of the data contained in the frame; and learning, by the neural network, based on the plurality of training data sets, a start frame and end frame of each action contained in the frame and its associated actor, wherein each action comprises a plurality of sub-actions of a specified duration, and wherein the learning includes initializing a sub-action to action-actor mapping function for use as a ground truth, and a frame level label for the neural network, and wherein the learning includes optimizing the neural network by reducing weighted cross entropy loss between a predicted sub-action, a predicted actor, and the ground truth; receiving target tracking data associated with an event, the target tracking data comprising a plurality of actors and a plurality of actions; converting the tracking data into a matrix representation of the tracking data, wherein the matrix representation includes one or more relationships between one or more agents and one or more periods of time; predicting, via the trained neural network, a target start frame and a target end frame of each action identified in the tracking data and an associated actor; and presenting the predicted target start frame, the predicted target end frame, and the predicted associated actor to one or more end users.
11. The system of claim 10, wherein the neural network comprises: a spatial attention sub-network; a per-agent convolution network comprising parallel convolutional streams; and a mutual attention sub-network comprising a first multilayer perceptron and a second multilayer perceptron.
12. The system of claim 11, wherein generating, via the trained neural network, the target start frame and the target end frame of each action identified in the tracking data and the associated actor, comprises: inputting the matrix representation of the tracking data into the spatial attention sub-network to generate spatial attention coefficients.
13. The system of claim 12, further comprising: generating a weighted spatiotemporal matrix by multiplying the matrix representation of the tracking data by the spatial attention coefficients; and inputting the weighted spatiotemporal matrix into a first convolutional stream of the parallel convolutional stream and a second convolutional stream of the parallel convolutional stream.
14. The system of claim 13, further comprising: combining, by the mutual attention sub-network, a first output from the first convolutional stream and a second output from the second convolutional stream to generate a mutual attention layer; passing the mutual attention layer through the first multilayer perceptron to generate an action prediction; and passing the mutual attention layer through the second multilayer perceptron
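Claims 1 and 10 recite optimizing the network by reducing a weighted cross entropy loss between a predicted sub-action, a predicted actor, and the ground truth. The sketch below shows one plausible reading of that objective. It is not taken from the patent: summing a sub-action term and an actor term, normalising each term by the sum of the applied class weights, and the names `weighted_cross_entropy` and `joint_loss` are assumptions for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Weighted cross entropy over frames.

    Each frame contributes -w[y] * log p(y), where y is the ground-truth
    class index. Assumption: the sum is normalised by the total weight.
    """
    total, weight_sum = 0.0, 0.0
    for frame_logits, y in zip(logits, targets):
        p = softmax(frame_logits)
        w = class_weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


def joint_loss(
    sub_action_logits, sub_action_targets, sub_action_weights,
    actor_logits, actor_targets, actor_weights,
) -> float:
    """Assumed combination: sub-action loss plus actor loss, equally weighted."""
    return (
        weighted_cross_entropy(sub_action_logits, sub_action_targets, sub_action_weights)
        + weighted_cross_entropy(actor_logits, actor_targets, actor_weights)
    )
```

Class weights of this kind are commonly used to keep rare classes from being swamped by frequent ones, such as frames where no action occurs; whether the patent uses them for that purpose is not stated in the claim text shown here.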
Convolutional networks [CNN, ConvNet] · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Surveillance or monitoring of activities, e.g. for recognising suspicious objects (recognising microscopic objects G06V20/69) · CPC title
using neural networks · CPC title
Combinations of networks · CPC title