Analysing objects in a set of frames

US12073567B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12073567-B2
Application number	US-202117187831-A
Country	US
Kind code	B2
Filing date	Feb 28, 2021
Priority date	Feb 27, 2020
Publication date	Aug 27, 2024
Grant date	Aug 27, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of analysing objects in a first frame and a second frame is disclosed. The method includes segmenting the frames, and matching at least one object in the first frame with a corresponding object in the second frame. The method optionally includes estimating the motion of the at least one matched object between the frames. Also disclosed is a method of generating a training dataset suitable for training machine learning algorithms to estimate the motion of objects. Also provided are processing systems configured to carry out these methods.
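The matching step in the abstract can be illustrated with a minimal sketch. It assumes each detected object-instance has already been reduced to a feature vector (a plain list of floats) and pairs instances across the two frames greedily by cosine similarity. This is a simplified stand-in for illustration only: the claims describe clustering that also uses the masks, which this sketch does not attempt. The function names and the `min_similarity` threshold are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_instances(first_features, second_features, min_similarity=0.8):
    """Greedy one-to-one matching of instances between two frames.

    Returns a dict mapping an index in the first frame to the index of its
    match in the second frame. Pairs below min_similarity (a hypothetical
    threshold) are left unmatched.
    """
    candidates = []
    for i, fa in enumerate(first_features):
        for j, fb in enumerate(second_features):
            score = cosine_similarity(fa, fb)
            if score >= min_similarity:
                candidates.append((score, i, j))

    # Take the most similar pairs first; each instance is used at most once.
    candidates.sort(reverse=True)
    used_first, used_second = set(), set()
    matches = {}
    for score, i, j in candidates:
        if i in used_first or j in used_second:
            continue
        matches[i] = j
        used_first.add(i)
        used_second.add(j)
    return matches


# Toy feature vectors: two instances in the first frame, three in the second.
first = [[1.0, 0.0], [0.0, 1.0]]
second = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches = match_instances(first, second)
print(matches)
```

In this toy input the third instance in the second frame has no counterpart and stays unmatched, which mirrors the claim language of matching "at least one" instance rather than all of them.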

First claim

Opening claim text (preview).

What is claimed is:
1. A method of analyzing one or more objects in a set of frames comprising at least a first frame and a second frame, the method comprising: segmenting the first frame to produce a plurality of first masks, each first mask being a pixel map identifying pixels belonging to a potential object-instance detected in the first frame; for each potential object-instance detected in the first frame, extracting from the first frame a first feature vector characterising the potential object-instance; segmenting the second frame to produce a plurality of second masks, each second mask being a pixel map identifying pixels belonging to a potential object-instance detected in the second frame; for each potential object-instance detected in the second frame, extracting from the second frame a second feature vector characterising the potential object-instance; and matching at least one of the potential object-instances in the first frame with one of the potential object-instances in the second frame, based at least in part on the first feature vectors, the first masks, the second feature vectors and the second masks, wherein the matching comprises clustering the potential object-instances detected in the first and second frames, based at least in part on the first feature vectors and the second feature vectors, to generate clusters of potential object-instances.
2. The method of claim 1, wherein the matching further comprises, for each cluster in each frame: evaluating a distance between the potential object-instances in the cluster in that frame; and splitting the cluster into multiple clusters based on a result of the evaluating.
3. The method of claim 1, wherein the matching comprises selecting a single object-instance from among the potential object-instances in each cluster in each frame.
4. The method of claim 3, wherein the matching comprises matching at least one of the single object-instances in the first frame with a single object-instance in the second frame.
5. The method of claim 1, wherein the matching comprises rejecting potential object-instances based on any one or any combination of two or more of the following: an object confidence score, which estimates whether a potential object-instance is more likely to be an object or part of the background; a mask confidence score, which estimates a likelihood that a mask represents an object; and a mask area.
6. The method of claim 5, wherein the mask confidence score is generated by a machine learning algorithm trained to predict a degree of correspondence between the mask and a ground truth mask.
7. The method of claim 1, wherein the masks and feature vectors are generated by a first machine learning algorithm.
8. The method of claim 1, further comprising for at least one matched object in the first frame and the second frame, estimating a motion of the object between the first frame and the second frame.
9. The method of claim 8, wherein estimating the motion of the object comprises, for each of a plurality of pixels of the object: estimating a translational motion vector; estimating a non-translational motion vector; and calculating a motion vector of the pixel as the sum of the translational motion vector and the non-translational motion vector.
10. The method of claim 8, wherein estimating the motion of the object comprises: generating a coarse estimate of the motion based at least in part on the mask in the first frame and the corresponding matched mask in the second frame; and refining the coarse estimate using a second machine learning algorithm, wherein the second machine learning algorithm takes as input the first frame, the second frame, and the coarse estimate, and the second machine learning algorithm is trained to predict a motion difference between the coarse motion vector and a ground truth motion vector.
11. The method of claim 10, wherein the machine learning algorithm is trained to predict the motion difference at a plurality of resolutions, starting with the lowest resolution and predicting the motion difference at successively higher resolutions based on up-sampling the motion difference from the preceding resolution.
12. An image processing system, comprising: a memory, configured to store a set of frames comprising at least a first frame and a second frame; and a first segmentation block, configured to segment the first frame, to produce a plurality of first masks, each first mask being a pixel map identifying pixels belonging to a potential object-instance detected in the first frame; a first feature extraction block, configured to, for each potential object-instance detected in the first frame, extract from the first frame a first feature vector characterising the potential object-instance; a second segmentation block, configured to segment the second frame, to produce a plurality of second masks, each second mask being a pixel map identifying pixels belonging to a potential object-instance detected in the second frame; a second feature extraction block, configured to, for each potential object-instance detected in the second frame, extract from the second frame a second feature vector characterising the potential object-instance; and a matching block, configured to match at least one of the potential object-instances in the first frame with one of the potential object-instances in the second frame, based at least in part on the first feature vectors, the first masks, the second feature vectors and the second masks, wherein the matching block is configured to cluster the potential object-instances detected in the first and second frames, based at least in part on the first feature vectors and the second feature vectors, to generate clusters of potential object-instances.
13. The image processing system of claim 12, wherein the first and second segmentation blocks are the same segmentation block, and/or the first and second feature extraction blocks are the same feature extraction block.
14. The image processing system of claim 12, further comprising a motion estimation block, configured to estimate the motion of objects matched by the matching block.
15. The image processing system of claim 12, wherein the matching block is further configured to, for each cluster in each frame: evaluate a distance between the potential object-instances in the cluster in that frame; and split the cluster into multiple clusters based on a result of the evaluating.
16. The image processing system of claim 12, wherein the matching block is configured to, for each cluster in each frame, select a single object-instance from among the potential object-instances of that cluster, and to match one of the single object-instances in the first frame with a single object-instance in the second frame.
17. The image processing system of claim 12 wherein the masks and feature vectors are generated by a first machine learning algorithm.
18. A non-transitory computer readable storage medium having stored thereon computer readable code configured to cause to be performed, when the code is run, a method of analyzing one or more objects in a set of frames comprising at least a first frame and a second frame, the method comprising: segmenting the first frame, to produce a plurality of first masks, each first mask being a pixel map identifying pixels belonging to a potential object-instance detected in the first frame; for each potential object-instance detected in the first frame, extracting from the first frame a first feature vector characterising the pote
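The motion-estimation claims (8 to 10) can be made concrete with a small sketch. It assumes a mask is represented as a list of (x, y) pixel coordinates and takes the displacement between mask centroids as the coarse, translational estimate. The centroid approach is an assumption for illustration; the claim text only says the coarse estimate is based at least in part on the matched masks. The per-pixel motion vector is then the sum of a translational and a non-translational component, as in claim 9. The residual value below is a made-up number standing in for the output of the second machine learning algorithm, and all function names are hypothetical.

```python
def centroid(mask):
    """Mean (x, y) position of the pixels in a mask given as (x, y) tuples."""
    if not mask:
        raise ValueError("mask must contain at least one pixel")
    n = len(mask)
    return (sum(x for x, _ in mask) / n, sum(y for _, y in mask) / n)


def coarse_translation(first_mask, second_mask):
    """Coarse motion estimate: displacement of the mask centroid between frames."""
    cx1, cy1 = centroid(first_mask)
    cx2, cy2 = centroid(second_mask)
    return (cx2 - cx1, cy2 - cy1)


def pixel_motion(translational, non_translational):
    """Motion vector of one pixel as the sum of its two components (claim 9)."""
    return (
        translational[0] + non_translational[0],
        translational[1] + non_translational[1],
    )


# A 2x2 object that moves 3 pixels right and 2 pixels down between frames.
first_mask = [(0, 0), (1, 0), (0, 1), (1, 1)]
second_mask = [(3, 2), (4, 2), (3, 3), (4, 3)]

translation = coarse_translation(first_mask, second_mask)

# Hypothetical non-translational residual for one pixel (e.g. from rotation
# or deformation); in the patent this refinement comes from a trained model.
residual = (0.5, -0.25)
motion = pixel_motion(translation, residual)
print(translation, motion)
```

Splitting the motion this way means the learned component only has to predict a small correction on top of the coarse estimate, rather than the full displacement.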

Assignees

Imagination Tech Ltd

Inventors

Classifications

G06F18/23
Clustering techniques · CPC title
G06F18/22
Matching criteria, e.g. proximity measures · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T7/248
involving reference images or patches · CPC title
G06T7/207
for motion estimation over a hierarchy of resolutions (multi-resolution motion estimation or hierarchical motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/53) · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 70278525

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12073567B2 cover?: A method of analysing objects in a first frame and a second frame is disclosed. The method includes segmenting the frames, and matching at least one object in the first frame with a corresponding object in the second frame. The method optionally includes estimating the motion of the at least one matched object between the frames. Also disclosed is a method of generating a training dataset suita…
Who is the assignee on this patent?: Imagination Tech Ltd
What technology area does this patent fall under?: Primary CPC classification G06T7/20. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 27, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Efficient CNN-based solution for video frame interpolation

Methods and apparatus for tracking objects using saliency

Still and slow object tracking in a hybrid video analytics system
