Method for generating residual image of multi-view video and apparatus using the same
US-2021409726-A1 · Dec 30, 2021 · US
US11966452B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11966452-B2 |
| Application number | US-202117394973-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 5, 2021 |
| Priority date | Aug 5, 2021 |
| Publication date | Apr 23, 2024 |
| Grant date | Apr 23, 2024 |
Official abstract text for this publication.
Systems and methods for image-based perception. The methods comprise: capturing images by a plurality of cameras with overlapping fields of view; generating, by a computing device, spatial feature maps indicating locations of features in the images; identifying, by the computing device, overlapping portions of the spatial feature maps; generating, by the computing device, at least one combined spatial feature map by combining the overlapping portions of the spatial feature maps together; and/or using, by the computing device, the at least one combined spatial feature map to define a predicted cuboid for at least one object in the images.
Opening claim text (preview).
What is claimed is:

1. A method for image-based perception, comprising: capturing images by a plurality of cameras with overlapping fields of view; generating, by a computing device, spatial feature maps indicating locations of features in the images; defining, by the computing device for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identifying, by the computing device for each image, a plurality of visual features of each object in the image using the spatial feature maps; determining, by the computing device, a value indicating a likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associating, by the computing device, the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and tracking, by the computing device, the plurality of objects using the predicted cuboids associated with the object.

2. The method according to claim 1, wherein the spatial feature maps are generated using a feature extraction module.

3. The method according to claim 2, wherein the feature extraction module comprises a convolutional neural network.

4. The method according to claim 1, further comprising using the track of the plurality of objects to control autonomous operations of a vehicle.

5. A system, comprising: a processor; a non-transitory computer-readable storage medium comprising programming instructions that are configured to cause the processor to implement a method for image-based perception, wherein the programming instructions comprise instructions to: obtain images captured by a plurality of cameras with overlapping fields of view; generate spatial feature maps indicating locations of features in the images; define, for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identify, for each image, a plurality of visual features of each object in the image using the spatial feature maps; determine a value indicating a likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associate the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and track the plurality of objects using the predicted cuboids associated with the object.

6. The system according to claim 5, wherein the spatial feature maps are generated using a feature extraction module.

7. The system according to claim 5, wherein the programming instructions further comprise instructions to cause autonomous operations of a vehicle to be controlled using the track of the plurality of objects.

8. A computer program product comprising a memory storing programming instructions thereon, which when executed by a processor cause the processor to: obtain images captured by a plurality of cameras with overlapping fields of view; generate spatial feature maps indicating locations of features in the images; define, for each image, a predicted cuboid at each location of an object in the image based on the spatial feature map of the image; identify, for each image, a plurality of visual features of each object in the image using the spatial feature maps; determine a value indicating a likelihood an object in a first image and an object in a second image are a same object based on the plurality of visual features; associate the predicted cuboids for the object in the first image and for the object in the second image with each other as being the same object in response to the value being greater than a selected threshold; and track the plurality of objects using the predicted cuboids associated with the object.

9. The computer program product according to claim 8, wherein the programming instructions further cause the processor to use the track of the plurality of objects to control autonomous operations of a vehicle.

10. The method according to claim 1, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

11. The method according to claim 1, wherein the plurality of visual features are identified using a triplet loss algorithm.

12. The method according to claim 1, wherein determining the value indicating the likelihood the object in the first image and the object in the second image are the same object further comprises selecting an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

13. The method according to claim 1, wherein determining the value indicating the likelihood the object in the first image and the object in the second image are the same object further comprises: for each visual feature: assigning a similarity value of zero when the visual feature is not the same, and assigning the similarity value of one when the visual feature is the same; and adding the similarity values for the plurality of visual features to determine the value indicating the likelihood the object in the first image and the object in the second image are the same object.

14. The system according to claim 5, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

15. The system according to claim 5, wherein the plurality of visual features are identified using a triplet loss algorithm.

16. The system according to claim 5, wherein to determine the value indicating the likelihood the object in the first image and the object in the second image are the same object, the programming instructions further comprise instructions to select an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

17. The system according to claim 5, wherein to determine the value indicating the likelihood the object in the first image and the object in the second image are the same object, the programming instructions further comprise instructions to: for each visual feature: assign a similarity value of zero when the visual feature is not the same, and assign the similarity value of one when the visual feature is the same; and add the similarity values for the plurality of visual features to determine the value indicating the likelihood the object in the first image and the object in the second image are the same object.

18. The computer program product according to claim 8, wherein the plurality of visual features includes a color, a size, a shape, or a combination thereof.

19. The computer program product according to claim 8, wherein to determine the value indicating the likelihood the object in the first image and the object in the second image are the same object, the programming instructions further cause the processor to select an overall similarity value ranging from zero to ten, wherein zero indicates no similarity between the plurality of visual features and ten indicates the greatest degree of similarity of the plurality of visual features.

20. The computer program product according to claim 8, wherein to determine the value indicating likelihood t
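
Illustrative note (not part of the patent text): the per-feature scoring recited in claims 13 and 17, followed by the threshold test of claim 1, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation. The names `VisualFeatures`, `similarity_value`, and `is_same_object` are hypothetical, the features are represented as plain strings for simplicity, and the claims do not say how feature equality is decided in practice.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VisualFeatures:
    """Hypothetical descriptor; claims 10, 14, and 18 name color, size, and shape."""
    color: str
    size: str
    shape: str


def similarity_value(a: VisualFeatures, b: VisualFeatures) -> int:
    """Score 1 for each feature that is the same, 0 otherwise, and add the scores."""
    return sum(
        1 if getattr(a, f.name) == getattr(b, f.name) else 0
        for f in fields(VisualFeatures)
    )


def is_same_object(a: VisualFeatures, b: VisualFeatures, threshold: int) -> bool:
    """Associate two detections only when the value is strictly greater than the threshold."""
    return similarity_value(a, b) > threshold


# Assumed example data: one object as seen by two cameras with overlapping views.
first_view = VisualFeatures(color="red", size="large", shape="sedan")
second_view = VisualFeatures(color="red", size="large", shape="truck")

score = similarity_value(first_view, second_view)
same = is_same_object(first_view, second_view, threshold=1)
```

With three features the summed value ranges from zero to three, so the selected threshold must lie below three for any association to occur. The comparison is strict because claim 1 recites the value "being greater than" the threshold.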
Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods · CPC title
Three-dimensional [3D] imaging with simultaneous measurement of time-of-flight at a two-dimensional [2D] array of receiver pixels, e.g. time-of-flight cameras or flash lidar · CPC title
Matching criteria, e.g. proximity measures · CPC title
Architecture, e.g. interconnection topology · CPC title
using feature-based methods · CPC title