Tracking multiple objects in a video stream using occlusion-aware single-object tracking

US11880985B2 · US · B2

Patent metadata
Publication number: US-11880985-B2
Application number: US-202217827695-A
Country: US
Kind code: B2
Filing date: May 28, 2022
Priority date: May 28, 2020
Publication date: Jan 23, 2024
Grant date: Jan 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure herein enables tracking of multiple objects in a real-time video stream. For each individual frame received from the video stream, a frame type of the frame is determined. Based on the individual frame being an object detection frame type, a set of object proposals is detected in the individual frame, associations between the set of object proposals and a set of object tracks are assigned, and statuses of the set of object tracks are updated based on the assigned associations. Based on the individual frame being an object tracking frame type, single-object tracking is performed on the frame based on each object track of the set of object tracks and the set of object tracks is updated based on the performed single-object tracking. For each frame received, a real-time object location data stream is provided based on the set of object tracks.
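
The per-frame flow described in the abstract can be sketched as a small dispatcher. This is a minimal illustration, not the patented implementation: the names `FrameType`, `frame_type`, `run`, `on_detection`, and `on_tracking` are hypothetical, and the fixed-interval rule for choosing the frame type is an assumption borrowed from dependent claims 4 and 5 (the first frame of each interval is a detection frame, every other frame is a tracking frame).

```python
from enum import Enum
from typing import Callable, Dict, Iterable, Iterator, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tracks = Dict[int, Box]                  # track id -> current box


class FrameType(Enum):
    DETECTION = "detection"
    TRACKING = "tracking"


def frame_type(index: int, interval: int) -> FrameType:
    """Assumed rule: the first frame of each interval is a detection frame."""
    return FrameType.DETECTION if index % interval == 0 else FrameType.TRACKING


def run(
    frames: Iterable[object],
    on_detection: Callable[[object, Tracks], Tracks],
    on_tracking: Callable[[object, Tracks], Tracks],
    interval: int = 5,
) -> Iterator[Tuple[int, Tracks]]:
    """Yield (frame index, object locations) for every frame received.

    on_detection stands in for: detect proposals, associate them with
    tracks, update track statuses. on_tracking stands in for: run
    single-object tracking once per existing track.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    tracks: Tracks = {}
    for index, frame in enumerate(frames):
        if frame_type(index, interval) is FrameType.DETECTION:
            tracks = on_detection(frame, tracks)
        else:
            tracks = on_tracking(frame, tracks)
        # One location record per frame, regardless of frame type.
        yield index, dict(tracks)
```

Passing the detector and the single-object tracker in as callbacks keeps the sketch independent of any particular model; a location record is emitted for every frame, matching the abstract's statement that the data stream is provided for each frame received.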

First claim

Opening claim text (preview).

What is claimed is:

1. A system for tracking multiple objects in a real-time video stream, the system comprising: at least one processor; at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the at least one processor to: determine a frame type of an individual frame of a plurality of frames of the real-time video stream as an object-detection frame type, detect a set of object proposals in the individual frame, the set of object proposals including a first subset and a second subset, associate the first subset of the set of object proposals with a set of active object tracks and the second subset of the set of object proposals with a set of passive object tracks, update a status of the set of object tracks based on the association of the first subset of the set of object proposals and the second subset of the set of object proposals, and process the individual frame in real-time, and an output device configured to provide an object location data stream based on the processing.

2. The system of claim 1, wherein, to update the status of the set of object tracks, the processor is further configured to convert the set of passive object tracks to the set of active object tracks based on a passive object track of the set of passive object tracks being associated with an object proposal of the second subset.

3. The system of claim 1, wherein, to update the status of the set of object tracks, the processor is further configured to convert the set of active object tracks to the set of passive object tracks based on an active object track of the set of active object tracks being unassociated with an object proposal of the first subset.

4. The system of claim 1, wherein the processor is further configured to determine the frame type of the individual frame as the object-detection frame type based at least in part on the individual frame being a first frame of a frame interval.

5. The system of claim 4, wherein the processor is further configured to: determine the frame type of a second individual frame in the plurality of frames as an object-tracking frame type based on the second frame being a frame other than the first frame of the frame interval.

6. The system of claim 5, wherein the processor is further configured to track the detected set of object proposals from the individual frame to the second individual frame.

7. The system of claim 1, wherein: the provided object location data stream includes object location data that includes a state of a tracked object, and the state of the tracked object includes a current location of the tracked object and an occlusion of the tracked object.

8. A computerized method for tracking multiple objects in a real-time video stream, the computerized method comprising: determining a frame type of an individual frame of a plurality of frames of the real-time video stream as an object-detection frame type, detecting a set of object proposals in the individual frame, the set of object proposals including a first subset and a second subset, associating the first subset of the set of object proposals with a set of active object tracks and the second subset of the set of object proposals with a set of passive object tracks, updating a status of the set of object tracks based on the association of the first subset of the set of object proposals and the second subset of the set of object proposals, processing the individual frame in real-time, and providing an object location data stream based on the processing.

9. The computerized method of claim 8, wherein updating the status of the set of object tracks further comprises converting the set of passive object tracks to the set of active object tracks based on a passive object track of the set of passive object tracks being associated with an object proposal of the second subset.

10. The computerized method of claim 8, wherein updating the status of the set of object tracks further comprises converting the set of active object tracks to the set of passive object tracks based on an active object track of the set of active object tracks being unassociated with an object proposal of the first subset.

11. The computerized method of claim 8, further comprising determining the frame type of the individual frame as the object-detection frame type based at least in part on the individual frame being a first frame of a frame interval.

12. The computerized method of claim 11, further comprising determining the frame type of a second individual frame in the plurality of frames as an object-tracking frame type based on the second frame being a frame other than the first frame of the frame interval.

13. The computerized method of claim 12, further comprising tracking the detected set of object proposals from the individual frame to the second individual frame.

14. The computerized method of claim 8, wherein: the provided object location data stream includes object location data that includes a state of a tracked object, and the state of the tracked object includes a current location of the tracked object and an occlusion of the tracked object.

15. One or more computer storage media having computer-executable instructions for tracking multiple objects in a real-time video stream that, upon execution by a processor, cause the processor to: determine a frame type of an individual frame of a plurality of frames of the real-time video stream as an object-detection frame type; detect a set of object proposals in the individual frame, the set of object proposals including a first subset and a second subset; associate the first subset of the set of object proposals with a set of active object tracks and the second subset of the set of object proposals with a set of passive object tracks; update a status of the set of object tracks based on the association of the first subset of the set of object proposals and the second subset of the set of object proposals; process the individual frame in real-time; and control an output device to provide an object location data stream based on the processing.

16. The one or more computer storage media of claim 15, wherein, to update the status of the set of object tracks, the instructions further cause the processor to: convert the set of passive object tracks to the set of active object tracks based on a passive object track of the set of passive object tracks being associated with an object proposal of the second subset.

17. The one or more computer storage media of claim 15, wherein, to update the status of the set of object tracks, the instructions further cause the processor to: convert the set of active object tracks to the set of passive object tracks based on an active object track of the set of active object tracks being unassociated with an object proposal of the first subset.

18. The one or more computer storage media of claim 15, wherein the instructions further cause the processor to: determine the frame type of the individual frame as the object-detection frame type based at least in part on the individual frame being a first frame of a frame interval.

19. The one or more computer storage media of claim 18, wherein the instructions further cause the processor to: determine the frame type of a second individual frame in the plurality of frames as an object-tracking frame type based on the second frame being a frame other than the first frame of the frame interval; and track the
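
The active/passive status updates in claims 1 to 3 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the claim text shown on this page does not say how proposals are matched to tracks, so the sketch uses greedy IoU matching with a hypothetical `threshold`, matches active tracks before passive ones, and applies the status change per track. The names `Track`, `iou`, `associate`, and `update_on_detection` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Track:
    track_id: int
    box: Box
    active: bool = True  # False means passive (e.g. lost or occluded)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: List[Track], proposals: List[Box],
              used: Set[int], threshold: float) -> Dict[int, int]:
    """Greedily map track_id -> proposal index, skipping used proposals."""
    matches: Dict[int, int] = {}
    for track in tracks:
        best, best_score = None, threshold
        for j, proposal in enumerate(proposals):
            score = iou(track.box, proposal)
            if j not in used and score >= best_score:
                best, best_score = j, score
        if best is not None:
            matches[track.track_id] = best
            used.add(best)
    return matches


def update_on_detection(tracks: List[Track], proposals: List[Box],
                        threshold: float = 0.3) -> List[Track]:
    """Update track statuses in place on a detection frame."""
    used: Set[int] = set()
    active = [t for t in tracks if t.active]
    passive = [t for t in tracks if not t.active]
    # Proposals matched here play the role of the "first subset" ...
    active_matches = associate(active, proposals, used, threshold)
    # ... and proposals matched here play the role of the "second subset".
    passive_matches = associate(passive, proposals, used, threshold)
    for t in active:
        if t.track_id in active_matches:
            t.box = proposals[active_matches[t.track_id]]
        else:
            t.active = False  # unassociated active track becomes passive
    for t in passive:
        if t.track_id in passive_matches:
            t.box = proposals[passive_matches[t.track_id]]
            t.active = True   # re-associated passive track becomes active
    return tracks
```

For example, with two active tracks and one passive track, a proposal overlapping the first active track keeps it active, the second active track with no overlapping proposal turns passive, and a proposal overlapping the passive track turns it active again.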

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • G06T7/246 (Primary)

    using feature-based methods, e.g. the tracking of corners or segments

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting

  • Determination of region of interest [ROI] or a volume of interest [VOI]

  • Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11880985B2 cover?
The disclosure herein enables tracking of multiple objects in a real-time video stream. For each individual frame received from the video stream, a frame type of the frame is determined. Based on the individual frame being an object detection frame type, a set of object proposals is detected in the individual frame, associations between the set of object proposals and a set of object tracks are…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06T7/246. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).