System and method for three-dimensional (3D) object detection

US12033396B2 · US · B2

Patent metadata
Publication number: US-12033396-B2
Application number: US-202318339961-A
Country: US
Kind code: B2
Filing date: Jun 22, 2023
Priority date: Sep 12, 2018
Publication date: Jul 9, 2024
Grant date: Jul 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for three-dimensional (3D) object detection is disclosed. A particular embodiment can be configured to: receive image data from a camera associated with a vehicle, the image data representing an image frame; use a machine learning module to determine at least one pixel coordinate of a two-dimensional (2D) bounding box around an object in the image frame; use the machine learning module to determine at least one vertex of a three-dimensional (3D) bounding box around the object; obtain camera calibration information associated with the camera; and determine 3D attributes of the object using the 3D bounding box and the camera calibration information.
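
The calibration step in the abstract can be illustrated with a small sketch. It assumes a standard pinhole camera model, a 3x4 extrinsic matrix of the form [R | t], and a 3x3 intrinsic matrix; the function names `world_to_camera` and `camera_to_pixel` and all numeric values are hypothetical and are not taken from the patent.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat = Sequence[Sequence[float]]


def world_to_camera(point: Vec3, extrinsic: Mat) -> Vec3:
    """Apply a 3x4 extrinsic matrix [R | t] to a 3D point (assumed convention)."""
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3] for row in extrinsic
    )


def camera_to_pixel(point_cam: Vec3, intrinsic: Mat) -> Tuple[float, float]:
    """Project a camera-space point to the image plane with a 3x3 intrinsic matrix."""
    # Homogeneous image coordinates: h = K * p
    h = [sum(row[j] * point_cam[j] for j in range(3)) for row in intrinsic]
    if h[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective divide gives pixel coordinates (u, v).
    return (h[0] / h[2], h[1] / h[2])


# Hypothetical calibration: focal length 1000 px, principal point (640, 360),
# camera axes aligned with the world and shifted 5 units along the optical axis.
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
E = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 5.0]]

vertex_world = (1.0, 0.5, 5.0)  # one vertex of a 3D bounding box
vertex_cam = world_to_camera(vertex_world, E)
vertex_px = camera_to_pixel(vertex_cam, K)
```

In this sketch one bounding-box vertex is carried from a 3D space into the camera coordinate space and then onto the image plane; a real system would repeat this for every vertex of the box.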

First claim

Opening claim text (preview).

What is claimed is:
1. A method of determining an object-related attribute from image data, comprising: obtaining an image from a camera located on a vehicle; obtaining a first set of coordinates of a plurality of points on a three-dimensional (3D) bounding box of an object in the image; transforming the first set of coordinates of the plurality of points to a second set of coordinates of the plurality of points using an extrinsic matrix of the camera; obtaining a third set of coordinates of the plurality of points in an image plane by projecting the plurality of points of the 3D bounding box to the image plane using an intrinsic matrix of the camera; and determining one or more attributes of the object by using the second set of coordinates of the plurality of points and the third set of coordinates of the plurality of points; wherein the first set of coordinates are in a 3D space, wherein the second set of coordinates are in a camera coordinate space, and wherein the third set of coordinates are in the image plane of the camera.
2. The method of claim 1, wherein the one or more attributes of the object is determined by minimizing a difference between a value of a point from the second set of coordinates and another value of the point from the third set of coordinates.
3. The method of claim 2, wherein the difference is minimized by maintaining the object as a cuboid in 3D space.
4. The method of claim 1, wherein the first set of coordinates are in a 3D space, wherein the second set of coordinates are in a camera coordinate space, and wherein the third set of coordinates are in the image plane of the camera.
5. The method of claim 1, wherein the one or more attributes of the object including a length, a height, and a width of the object.
6. The method of claim 1, wherein the one or more attributes of the object includes a 3D location in a camera coordinate space.
7. The method of claim 1, wherein the one or more attributes of the object includes an orientation of the object measured as an angle between a direction in which the object is heading and an axis of the camera.
8. A system, comprising: a processor configured to: obtain an image from a camera located on a vehicle; obtain a first set of coordinates of a plurality of points on a three-dimensional (3D) bounding box of an object in the image; transform the first set of coordinates of the plurality of points to a second set of coordinates of the plurality of points using an extrinsic matrix of the camera; obtain a third set of coordinates of the plurality of points in an image plane by a projection of the plurality of points of the 3D bounding box to the image plane using an intrinsic matrix of the camera; and determine one or more attributes of the object by using the second set of coordinates of the plurality of points and the third set of coordinates of the plurality of points; wherein the first set of coordinates are in a 3D space, wherein the second set of coordinates are in a camera coordinate space, and wherein the third set of coordinates are in the image plane of the camera.
9. The system of claim 8, wherein the first set of coordinates of the plurality of points on the 3D bounding box is obtained from a terrain map that includes global positioning system (GPS) locations of a terrain in which the vehicle is operating.
10. The system of claim 8, wherein a size of the object and a location of the object is determined using three-dimension information of the object.
11. The system of claim 8, wherein the one or more attributes of the object including a distance of the object to the camera.
12. The system of claim 8, wherein the plurality of points on the 3D bounding box of the object includes eight corners of the 3D bounding box.
13. The system of claim 8, wherein the one or more attributes includes a 3D attribute with pre-defined bounds.
14. A non-transitory computer readable storage medium embodying instructions which, when executed by a processor, causes the processor to perform a method, comprising: obtaining an image from a camera located on a vehicle; obtaining a first set of coordinates of a plurality of points on a three-dimensional (3D) bounding box of an object in the image; transforming the first set of coordinates of the plurality of points to a second set of coordinates of the plurality of points using an extrinsic matrix of the camera; obtaining a third set of coordinates of the plurality of points in an image plane by projecting the plurality of points of the 3D bounding box to the image plane using an intrinsic matrix of the camera; and determining one or more attributes of the object by using the second set of coordinates of the plurality of points and the third set of coordinates of the plurality of points; wherein the first set of coordinates are in a 3D space, wherein the second set of coordinates are in a camera coordinate space, and wherein the third set of coordinates are in the image plane of the camera.
15. The non-transitory computer readable storage medium of claim 14, wherein the one or more attributes of the object is determined by minimizing a difference between a value of a point from the second set of coordinates and another value of the point from the third set of coordinates.
16. The non-transitory computer readable storage medium of claim 15, wherein the difference is minimized by maintaining the object as a cuboid in 3D space.
17. The non-transitory computer readable storage medium of claim 14, wherein the one or more attributes of the object including a length, a height, and a width of the object.
18. The non-transitory computer readable storage medium of claim 14, wherein the one or more attributes of the object includes a 3D location in a camera coordinate space.
19. The non-transitory computer readable storage medium of claim 14, wherein the one or more attributes of the object includes an orientation of the object measured as an angle between a direction in which the object is heading and an axis of the camera.
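
The attributes named in the dependent claims (length, height, and width; 3D location; distance to the camera; orientation) can be read off the eight box corners once they are expressed in the camera coordinate space. The sketch below is illustrative only. It assumes a corner ordering (bottom face front-left, front-right, rear-right, rear-left, then the top face in the same order), camera axes with x to the right, y down, and z along the optical axis, and the optical axis as the reference axis for orientation. None of these conventions, nor the name `box_attributes`, comes from the patent.

```python
import math
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dist(a: Vec3, b: Vec3) -> float:
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def _mean(points: Sequence[Vec3]) -> Vec3:
    """Component-wise mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def box_attributes(corners: Sequence[Vec3]) -> Dict[str, object]:
    """Derive size, location, distance, and yaw from eight cuboid corners.

    Assumed (hypothetical) ordering: indices 0-3 are the bottom face
    (front-left, front-right, rear-right, rear-left); 4-7 are the top face.
    """
    if len(corners) != 8:
        raise ValueError("expected eight corners")
    fl, fr, rr, rl = corners[0], corners[1], corners[2], corners[3]
    center = _mean(corners)
    front = _mean([fl, fr])
    rear = _mean([rr, rl])
    # Heading vector in the x-z (ground) plane, pointing rear to front.
    heading_x = front[0] - rear[0]
    heading_z = front[2] - rear[2]
    return {
        "length": _dist(fl, rl),
        "width": _dist(fl, fr),
        "height": _dist(fl, corners[4]),
        "location": center,
        "distance": math.sqrt(sum(c * c for c in center)),
        # Angle between the heading and the camera's optical (z) axis.
        "yaw": math.atan2(heading_x, heading_z),
    }


# Hypothetical box: 2 wide, 1.5 high, 4 long, heading straight away from the camera.
bottom = [(-1.0, 1.5, 14.0), (1.0, 1.5, 14.0), (1.0, 1.5, 10.0), (-1.0, 1.5, 10.0)]
top = [(x, 0.0, z) for (x, _, z) in bottom]
attrs = box_attributes(bottom + top)
```

The claims that minimize a difference between coordinate sets while keeping the object a cuboid describe an optimization step that this sketch does not attempt; it only shows how attributes follow from corners that are already known.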

Assignees

Tusimple Inc

Inventors

Classifications

  • Geographical information databases · CPC title

  • Machine learning · CPC title

  • of area, perimeter, diameter or volume · CPC title

  • Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration · CPC title

  • exterior to a vehicle by using sensors mounted on the vehicle · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12033396B2 cover?
A system and method for three-dimensional (3D) object detection is disclosed. A particular embodiment can be configured to: receive image data from a camera associated with a vehicle, the image data representing an image frame; use a machine learning module to determine at least one pixel coordinate of a two-dimensional (2D) bounding box around an object in the image frame; use the machine lear…
Who is the assignee on this patent?
Tusimple Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/58. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 9, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).