Object tracking by an unmanned aerial vehicle using visual sensors

US12367670B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12367670-B2
Application number  US-202318400113-A
Country             US
Kind code           B2
Filing date         Dec 29, 2023
Priority date       Dec 1, 2016
Publication date    Jul 22, 2025
Grant date          Jul 22, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are disclosed for tracking objects in a physical environment using visual sensors onboard an autonomous unmanned aerial vehicle (UAV). In certain embodiments, images of the physical environment captured by the onboard visual sensors are processed to extract semantic information about detected objects. Processing of the captured images may involve applying machine learning techniques such as a deep convolutional neural network to extract semantic cues regarding objects detected in the images. The object tracking can be utilized, for example, to facilitate autonomous navigation by the UAV or to generate and display augmentative information regarding tracked objects to users.
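
To make "semantic information" more concrete, the claims on this page describe a dense per-pixel segmentation in which each pixel carries a likelihood of belonging to a class, with one such map per class stacked into a tensor, and a grouping step that assigns pixels to individual object instances. The following is a minimal sketch of that idea, not the patented implementation. The class names, likelihood values, the 0.5 threshold, and the function name `group_instances` are all assumptions made for illustration, and simple 4-connected flood fill stands in for the "spatially clustered" grouping criterion. A real system would obtain the likelihoods from a trained network rather than hard-coded lists.

```python
from collections import deque

# Hypothetical class list; names and order are assumptions for illustration.
CLASSES = ["person", "vehicle"]

# A tiny 4x5 "segmentation tensor": scores[c][row][col] is the likelihood
# that the pixel belongs to CLASSES[c]. All values are made up.
person = [
    [0.9, 0.8, 0.1, 0.0, 0.0],
    [0.9, 0.7, 0.0, 0.0, 0.8],
    [0.1, 0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0, 0.9],
]
vehicle = [[0.0] * 5 for _ in range(4)]
scores = [person, vehicle]


def group_instances(likelihood, threshold=0.5):
    """Group above-threshold pixels into 4-connected components.

    Returns a list of instances, each a sorted list of (row, col) pixels.
    """
    rows, cols = len(likelihood), len(likelihood[0])
    seen = [[False] * cols for _ in range(rows)]
    instances = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or likelihood[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and likelihood[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            instances.append(sorted(pixels))
    return instances


person_instances = group_instances(scores[CLASSES.index("person")])
```

With the made-up values above, the "person" map splits into two separate instances (a 2x2 blob in the top-left corner and a vertical strip on the right edge), while the "vehicle" map yields none. Similarity-based or appearance-model grouping, which the claims also mention, would need different logic.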

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by a computer system of an autonomous vehicle, images of a physical environment captured by one or more image capture devices coupled to the autonomous vehicle; processing, by the computer system, the received images to: detect an object in the physical environment associated with a particular class of objects; and extract semantic information including information related to the detected object in the physical environment and information related to the physical environment itself; predicting, by the computer system, a trajectory of the detected object in three-dimensional (3D) space of the physical environment based, at least in part, on the extracted semantic information; and causing, by the computer system, the autonomous vehicle to track the object through the 3D space of the physical environment based, at least in part, on the predicted trajectory.

2. The method of claim 1, wherein processing the received images comprises: generating a dense per-pixel segmentation based on the received images, wherein each pixel in the dense per-pixel segmentation is associated with a value indicative of a likelihood that the pixel corresponds with the particular class of objects.

3. The method of claim 2, the dense per-pixel segmentation is one of a plurality of dense per-pixel segmentations comprising a tensor, each of the plurality of dense per-pixel segmentations associated with a different class of objects.

4. The method of claim 2, wherein processing the received images to detect the object in the physical environment further comprises: detecting additional physical objects in the physical environment; and distinguishing between one or more instances of the detected physical objects, the distinguishing including: analyzing the dense per-pixel segmentation generated based on the received images to associate pixels corresponding to the particular class of objects with a particular instance of the particular class of objects.

5. The method of claim 4, wherein to associate pixels corresponding to the particular class of objects with the particular instance of the particular class includes: applying a grouping process to group: pixels that are substantially similar to other pixels associated with the particular instance; pixels that are spatially clustered with other pixels associated with the particular instance; and/or pixels that fit an appearance-based model for the particular class of objects.

6. The method of claim 1, wherein the semantic information related to the object comprises a particular class of physical object.

7. The method of claim 1, wherein the semantic information includes information regarding any of a position, orientation, shape, size, scale, appearance, or pixel segmentation of the object.

8. The method of claim 1, wherein the semantic information includes information regarding activity of the object.

9. The method of claim 1, further comprising: receiving, by the computer system, sensor data from one or more other sensors coupled to the autonomous vehicle; and processing, by the computer system, the received sensor data with the received images using a spatiotemporal factor graph to predict the trajectory of the object through the 3D space of the physical environment.

10. The method of claim 1, wherein causing the autonomous vehicle to track the object through the 3D space of the physical environment based on the predicted trajectory comprises: generating, by the computer system, control commands configured to cause the autonomous vehicle to maneuver along the predicted trajectory.

11. The method of claim 10, wherein generating the control commands comprises: generating control commands configured to cause a gimbal mechanism to adjust an orientation of the image capture device relative to the autonomous vehicle so as to keep the tracked object within a field of view of the image capture device.

12. The method of claim 10, wherein generating the control commands comprises: generating, by the computer system, an augmentation based on the object; and causing, by the computer system, the generated augmentation to be presented with the object at a display device.

13. The method of claim 1, wherein the autonomous vehicle is an unmanned aerial vehicle (UAV).

14. The method of claim 1, wherein the particular class of objects is selected from a list of classes of objects comprising people, animal, vehicles, buildings, landscape features, and plants.

15. An unmanned aerial vehicle (UAV) configured for autonomous flight through a physical environment, the UAV comprising: a first image capture device; a second image capture device; and a tracking system configured to: receive images of the physical environment captured by any of the first image capture device or second image capture device; process the received images to: detect an object in the physical environment extract semantic information including information related to the detected object in the physical environment and information related to the physical environment itself; predict a trajectory of the detected object in the three-dimensional (3D) space of the physical environment based, at least in part, on the extracted semantic information; and cause the UAV to track the object through the 3D space of the physical environment based, at least in part, on the predicted trajectory.

16. The UAV of claim 15, wherein the semantic information includes information regarding any of a position, orientation, shape, size, scale, appearance, or pixel segmentation of the object.

17. The UAV of claim 15, wherein the semantic information includes information regarding activity of the object.

18. An apparatus comprising: one or more computer-readable media; and program instructions stored on the one or more computer-readable storage media that, when executed by one or more processors onboard an aerial vehicle, direct the one or more processors to at least: process images of the physical environment captured by one or more image capture devices coupled to an autonomous vehicle to: detect an object in the physical environment associated with a particular class of objects; and extract semantic information including information related to the detected object in the physical environment and information related to the physical environment itself; predict a trajectory of the detected object in three-dimensional (3D) space of the physical environment based, at least in part, on the extracted semantic information; and cause the autonomous vehicle to track the object through the 3D space of the physical environment based, at least in part, on the predicted trajectory.

19. The apparatus of claim 18, wherein the semantic information includes information regarding any of a position, orientation, shape, size, scale, appearance, or pixel segmentation of the object.

20. The apparatus of claim 18, wherein the semantic information includes information regarding activity of the object.
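
The independent claims recite predicting a trajectory of the detected object in 3D space and then causing the vehicle to track it, and claim 11 adds gimbal commands that keep the object in the camera's field of view. The sketch below illustrates that predict-then-point loop under strong simplifying assumptions. It uses a plain constant-velocity extrapolation as a stand-in, whereas the claims describe prediction based on extracted semantic information (and, in claim 9, a spatiotemporal factor graph), neither of which is modeled here. The function names `predict_trajectory` and `gimbal_angles`, the coordinate convention, and all numeric values are hypothetical.

```python
import math


def predict_trajectory(observations, horizon, dt):
    """Extrapolate future 3D positions with a constant-velocity model.

    observations: list of (t, (x, y, z)) samples, oldest first (at least two).
    Returns `horizon` predicted (x, y, z) points spaced `dt` seconds apart.
    """
    (t0, p0), (t1, p1) = observations[-2], observations[-1]
    span = t1 - t0
    # Velocity estimated from the two most recent samples only.
    velocity = tuple((b - a) / span for a, b in zip(p0, p1))
    return [
        tuple(p + v * dt * k for p, v in zip(p1, velocity))
        for k in range(1, horizon + 1)
    ]


def gimbal_angles(vehicle_pos, target_pos):
    """Pan and tilt (degrees) that point a camera from vehicle to target.

    Assumes x forward, y left, z up, and a gimbal whose zero pose
    looks along +x. This convention is an assumption of the sketch.
    """
    dx, dy, dz = (t - v for t, v in zip(target_pos, vehicle_pos))
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt


# Two made-up observations of a tracked object, one second apart.
track = [(0.0, (0.0, 0.0, 0.0)), (1.0, (2.0, 1.0, 0.0))]
future = predict_trajectory(track, horizon=3, dt=0.5)

# Aim the camera at the last predicted point from a vehicle hovering 10 m up.
pan, tilt = gimbal_angles((0.0, 0.0, 10.0), future[-1])
```

Here `future` holds three points continuing the object's motion at 2 m/s in x and 1 m/s in y, and the resulting tilt is negative because the vehicle looks down at the object. A practical tracker would replace the two-sample velocity estimate with a filtered or learned motion model and would account for gimbal limits.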

Assignees

Skydio Inc

Inventors

Classifications

  • Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Remote-control arrangements · CPC title

  • Pointing payloads towards fixed or moving targets (positioning towed, pushed or suspended implements G05D1/672) · CPC title

  • of the remote controlled vehicle type, i.e. RPV · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12367670B2 cover?
Systems and methods are disclosed for tracking objects in a physical environment using visual sensors onboard an autonomous unmanned aerial vehicle (UAV). In certain embodiments, images of the physical environment captured by the onboard visual sensors are processed to extract semantic information about detected objects. Processing of the captured images may involve applying machine learning te…
Who is the assignee on this patent?
Skydio Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/13. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 22, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).