Systems and methods for image based perception

US11966452B2 · US · B2

Patent metadata
Publication number: US-11966452-B2
Application number: US-202117394973-A
Country: US
Kind code: B2
Filing date: Aug 5, 2021
Priority date: Aug 5, 2021
Publication date: Apr 23, 2024
Grant date: Apr 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for image-based perception. The methods comprise: capturing images by a plurality of cameras with overlapping fields of view; generating, by a computing device, spatial feature maps indicating locations of features in the images; identifying, by the computing device, overlapping portions of the spatial feature maps; generating, by the computing device, at least one combined spatial feature map by combining the overlapping portions of the spatial feature maps together; and/or using, by the computing device, the at least one combined spatial feature map to define a predicted cuboid for at least one object in the images.
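The combining step in the abstract can be illustrated with a small sketch. The abstract does not say how overlapping portions are identified or merged, so everything here is an assumption: the function name combine_overlapping is hypothetical, the two feature maps are plain nested lists with equal row counts, the overlap is given as a known number of columns, and the merge is an element-wise average.

```python
from typing import List

# A spatial feature map as rows x columns of activations (illustrative type).
FeatureMap = List[List[float]]


def combine_overlapping(left: FeatureMap, right: FeatureMap, overlap: int) -> FeatureMap:
    """Stitch two feature maps from adjacent cameras.

    Assumption: the last `overlap` columns of `left` and the first `overlap`
    columns of `right` describe the same scene region. Averaging them is an
    illustrative choice, not something the patent text specifies.
    """
    if len(left) != len(right):
        raise ValueError("feature maps must have the same number of rows")
    if overlap < 0 or any(overlap > len(row) for row in left + right):
        raise ValueError("overlap must fit inside both feature maps")

    combined: FeatureMap = []
    for left_row, right_row in zip(left, right):
        split = len(left_row) - overlap
        left_only = left_row[:split]  # seen only by the left camera
        shared = [
            (a + b) / 2.0  # merge the region seen by both cameras
            for a, b in zip(left_row[split:], right_row[:overlap])
        ]
        right_only = right_row[overlap:]  # seen only by the right camera
        combined.append(left_only + shared + right_only)
    return combined


if __name__ == "__main__":
    left_map = [[1.0, 2.0, 3.0]]
    right_map = [[5.0, 6.0, 7.0]]
    print(combine_overlapping(left_map, right_map, overlap=1))
    # prints [[1.0, 2.0, 4.0, 6.0, 7.0]]
```

In a real multi-camera system the overlap would more likely be derived from camera calibration than passed in as a fixed column count; the fixed count only keeps the sketch short.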

First claim

Opening claim text (preview).

What is claimed is:

1. A method for image-based perception, comprising: capturing images by a plurality of cameras with overlapping fields of view; generating, by a computing device, spatial feature maps indicating locations of features in the images; defining, by the computing device for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identifying, by the computing device for each image, a plurality of visual features of each object in the image using the spatial feature maps; determining, by the computing device, a value indicating likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associating, by the computing device, the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and tracking, the computing device, the plurality of objects using the predicted cuboids associated with the object.

2. The method according to claim 1, wherein the spatial feature maps are generated using a feature extraction module.

3. The method according to claim 2, wherein the feature extraction module comprises a convolutional neural network.

4. The method according to claim 1, further comprising using the track of the plurality of objects to control autonomous operations of a vehicle.

5. A system, comprising: a processor; a non-transitory computer-readable storage medium comprising programming instructions that are configured to cause the processor to implement a method for image-based perception, wherein the programming instructions comprise instructions to: obtain images captured by a plurality of cameras with overlapping fields of view; generate spatial feature maps indicating locations of features in the images; define, for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identify, for each image, a plurality of visual features of each object in the image using the spatial feature maps; determine a value indicating a likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associate the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and track the plurality of objects using the predicted cuboids associated with the object.

6. The system according to claim 5, wherein the spatial feature maps are generated using a feature extraction module.

7. The system according to claim 5, wherein the programming instructions further comprise instruction to cause autonomous operations of a vehicle to be controlled using the track of the plurality of object.

8. A computer program product comprising a memory storing programming instructions thereon, which when executed by a processor cause the processor to: obtain images captured by a plurality of cameras with overlapping fields of view; generate spatial feature maps indicating locations of features in the images; define for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identify, for each image a plurality of visual features of each object in the image using the spatial feature maps; determine a value indicating a likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associate the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and track the plurality of objects using the predicted cuboids associated with the object.

9. The computer program product according to claim 8, wherein the programming instructions further cause the processor to use the track of the plurality of objects to control autonomous operations of a vehicle.

10. The method according to claim 1, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

11. The method according to claim 1, wherein the plurality of visual features are identified using a triplet loss algorithm.

12. The method according to claim 1, determining the value indicating likelihood the object in the first image and the object in the second image are same object further comprises selecting an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

13. The method according to claim 1, wherein determining the value indicating likelihood the object in the first image and the object in the second image are same object further comprises: for each visual feature: assigning a similarity value of zero when the visual feature is not same, and assigning the similarity value of one when the visual feature is same; and adding the similarity values for the plurality of visual features to determine the value indicating the likelihood the object in the first image and the object in the second image are same object.

14. The system according to claim 5, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

15. The system according to claim 5, wherein the plurality of visual features are identified using a triplet loss algorithm.

16. The system according to claim 5, wherein to determine the value indicating likelihood the object in the first image and the object in the second image are same object, the programming instructions further comprise instruction to select an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

17. The system according to claim 5, wherein to determine the value indicating likelihood the object in the first image and the object in the second image are same object, the programming instructions further comprise instruction to: for each visual feature: assign a similarity value of zero when the visual feature is not same, and assign the similarity value of one when the visual feature is same; and add the similarity values for the plurality of visual features to determine the value indicating the likelihood the object in the first image and the object in the second image are same object.

18. The computer program product according to claim 8, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

19. The computer program product according to claim 8, wherein to determine the value indicating likelihood the object in the first image and the object in the second image are same object, the programming instructions further cause the processor to select an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

20. The computer program product according to claim 8, wherein to determine the value indicating likelihood t
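The scoring rule in claim 13 (one point per matching visual feature, summed) and the threshold test in claim 1 can be sketched as follows. This is an illustrative reading only: the names similarity_value and associate, the string-valued feature dictionaries, the object identifiers, and the threshold of 2 are all assumptions for the example, not details from the patent.

```python
from typing import Dict, List, Tuple

# Visual features of one detected object, e.g. color, size, shape (claim 10).
VisualFeatures = Dict[str, str]


def similarity_value(a: VisualFeatures, b: VisualFeatures) -> int:
    """Add 1 for each shared feature that is the same and 0 for each that differs."""
    shared_names = a.keys() & b.keys()
    return sum(1 if a[name] == b[name] else 0 for name in shared_names)


def associate(
    first_image: Dict[str, VisualFeatures],
    second_image: Dict[str, VisualFeatures],
    threshold: int,
) -> List[Tuple[str, str]]:
    """Pair objects across two images when the value is greater than the threshold."""
    pairs: List[Tuple[str, str]] = []
    for id_first, feats_first in first_image.items():
        for id_second, feats_second in second_image.items():
            if similarity_value(feats_first, feats_second) > threshold:
                pairs.append((id_first, id_second))
    return pairs


if __name__ == "__main__":
    camera_1 = {
        "c1_obj0": {"color": "red", "size": "large", "shape": "sedan"},
        "c1_obj1": {"color": "blue", "size": "small", "shape": "cyclist"},
    }
    camera_2 = {
        "c2_obj0": {"color": "red", "size": "large", "shape": "sedan"},
        "c2_obj1": {"color": "red", "size": "small", "shape": "cyclist"},
    }
    print(associate(camera_1, camera_2, threshold=2))
    # prints [('c1_obj0', 'c2_obj0')]
```

With three features the summed value ranges from 0 to 3, so a threshold of 2 accepts only a full match. In the claims, each associated pair would then link the two predicted cuboids as the same object for tracking; that step is not modeled here.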

Assignees

Ford Global Tech Llc

Inventors

Classifications

  • G06F18/213 (Primary)

    Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods

  • Three-dimensional [3D] imaging with simultaneous measurement of time-of-flight at a two-dimensional [2D] array of receiver pixels, e.g. time-of-flight cameras or flash lidar

  • Matching criteria, e.g. proximity measures

  • Architecture, e.g. interconnection topology

  • using feature-based methods

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11966452B2 cover?
Systems and methods for image-based perception. The methods comprise: capturing images by a plurality of cameras with overlapping fields of view; generating, by a computing device, spatial feature maps indicating locations of features in the images; identifying, by the computing device, overlapping portions of the spatial feature maps; generating, by the computing device, at least one combined …
Who is the assignee on this patent?
Ford Global Tech Llc
What technology area does this patent fall under?
Primary CPC classification G06F18/213. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 23, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).