Sequence-of-sequences model for 3D object recognition

US11410439B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11410439-B2
Application number  US-202016870138-A
Country             US
Kind code           B2
Filing date         May 8, 2020
Priority date       May 9, 2019
Publication date    Aug 9, 2022
Grant date          Aug 9, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are disclosed for capturing multiple sequences of views of a three-dimensional object using a plurality of virtual cameras. The systems and methods generate aligned sequences from the multiple sequences based on an arrangement of the plurality of virtual cameras in relation to the three-dimensional object. Using a convolutional network, the systems and methods classify the three-dimensional object based on the aligned sequences and identify the three-dimensional object using the classification.
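The capture-and-align idea in the abstract can be illustrated with a small sketch. It is not taken from the patent: it assumes the virtual cameras sit in a ring around an upright object, that each captured sequence is a cyclic rotation of the ring with a different starting view, and that alignment means rotating each sequence back to a chosen canonical view. The names `capture_sequences`, `align_sequence`, and the view labels are hypothetical.

```python
from typing import List, Sequence


def capture_sequences(views: Sequence[str]) -> List[List[str]]:
    """Return one sequence per starting view.

    Assumption: each sequence is a cyclic rotation of the camera ring,
    so every sequence has a unique order of the same views.
    """
    n = len(views)
    return [[views[(start + i) % n] for i in range(n)] for start in range(n)]


def align_sequence(seq: Sequence[str], canonical_view: str) -> List[str]:
    """Rotate a sequence so that it begins at the canonical view.

    Assumption: this cyclic shift stands in for the patent's alignment
    function, whose actual form is not given in the abstract.
    """
    items = list(seq)
    k = items.index(canonical_view)
    return items[k:] + items[:k]


# Hypothetical labels for four camera positions around the object.
ring = ["v0", "v1", "v2", "v3"]
sequences = capture_sequences(ring)
aligned = [align_sequence(s, "v0") for s in sequences]
```

Under these assumptions every aligned sequence has the same view order, which is what lets a downstream classifier treat the sequences consistently regardless of where each one started.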

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: capturing, using a plurality of virtual cameras, multiple sequences of views of a three-dimensional object, each sequence of the multiple sequences representing a unique order of views; generating, using one or more processors, aligned sequences from the multiple sequences based on an arrangement of the plurality of virtual cameras in relation to the three-dimensional object; using a convolutional neural network, classifying the three-dimensional object based on the aligned sequences; and identifying the three-dimensional object using the classification.
2. The method of claim 1, wherein the arrangement of the plurality of virtual cameras corresponds to an upright orientation and a known rotation axis.
3. The method of claim 1, wherein each sequence of the multiple sequences has a different starting view.
4. The method of claim 1, wherein generating aligned sequences further comprises: aligning each sequence of the multiple sequences using an alignment function to align each sequence of the multiple sequences to a canonical viewpoint.
5. The method of claim 4, wherein before aligning each sequence, the method further comprises: generating the alignment function based on a plurality of different starting views.
6. The method of claim 1, further comprising: concatenating the aligned sequences; and classifying the three-dimensional object based on the concatenated aligned sequences.
7. The method of claim 1, further comprising: training the convolutional neural network by updating the convolutional neural network with the aligned sequences.
8. The method of claim 1, further comprising: classifying the three-dimensional object based on the aligned sequences using a gated recurrent unit.
9. A system comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the system to perform operations comprising: capturing, using a plurality of virtual cameras, multiple sequences of views of a three-dimensional object, each sequence of the multiple sequences representing a unique order of views; generating, using one or more processors, aligned sequences from the multiple sequences based on an arrangement of the plurality of virtual cameras in relation to the three-dimensional object; using a convolutional neural network, classifying the three-dimensional object based on the aligned sequences; and identifying the three-dimensional object using the classification.
10. The system of claim 9, wherein the arrangement of the plurality of virtual cameras corresponds to an upright orientation and a known rotation axis.
11. The system of claim 9, wherein each sequence of the multiple sequences has a different starting view.
12. The system of claim 9, wherein generating aligned sequences further comprises: aligning each sequence of the multiple sequences using an alignment function to align each sequence to a canonical viewpoint.
13. The system of claim 12, wherein before aligning each sequence, the operations further comprise: generating the alignment function based on a plurality of different starting views.
14. The system of claim 9, further comprising: concatenating the aligned sequences; and classifying the three-dimensional object based on the concatenated aligned sequences.
15. The system of claim 9, further comprising: training the convolutional neural network by updating the convolutional neural network with the aligned sequences.
16. The system of claim 9, further comprising: classifying the three-dimensional object based on the aligned sequences using a gated recurrent unit.
17. A non-transitory computer-readable storage medium, the computer-readable storage medium including instructions that when executed by a computer, cause the computer to perform operations comprising: capturing, using a plurality of virtual cameras, multiple sequences of views of a three-dimensional object, each sequence of the multiple sequences representing a unique order of views; generating, using one or more processors, aligned sequences from the multiple sequences based on an arrangement of the plurality of virtual cameras in relation to the three-dimensional object; using a convolutional neural network, classifying the three-dimensional object based on the aligned sequences; and identifying the three-dimensional object using the classification.
18. The computer-readable storage medium of claim 17, wherein the arrangement of the plurality of virtual cameras corresponds to an upright orientation and a known rotation axis.
19. The computer-readable storage medium of claim 17, wherein each sequence of the multiple sequences has a different starting view.
20. The computer-readable storage medium of claim 17, wherein generating aligned sequences further comprises: aligning each sequence of the multiple sequences using an alignment function to align each sequence to a canonical viewpoint.
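Claims 6 and 8 mention concatenating the aligned sequences and classifying with a gated recurrent unit. The toy sketch below shows one way those two steps could fit together. It is an illustration only, not the patented model: per-view features are reduced to single numbers (in practice they would come from a convolutional network), the GRU has a scalar hidden state, the weights are arbitrary, and the final threshold and class labels are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x: float, h: float, w: Dict[str, float]) -> float:
    """One step of a scalar GRU cell (standard update/reset gate form)."""
    z = sigmoid(w["wz"] * x + w["uz"] * h)             # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h)             # reset gate
    cand = math.tanh(w["wh"] * x + w["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * cand


def run_gru(features: Sequence[float], w: Dict[str, float], h0: float = 0.0) -> float:
    """Feed features through the GRU in order and return the final hidden state."""
    h = h0
    for x in features:
        h = gru_step(x, h, w)
    return h


def concatenate(aligned: Sequence[Sequence[float]]) -> List[float]:
    """Join aligned sequences end to end into one long sequence."""
    return [x for seq in aligned for x in seq]


# Hypothetical per-view feature values for two aligned sequences.
aligned_features = [[0.2, 0.9, 0.4], [0.2, 0.9, 0.4]]
# Arbitrary illustrative weights; a real model would learn these.
weights = {"wz": 1.0, "uz": 0.5, "wr": 1.0, "ur": 0.5, "wh": 1.5, "uh": 0.5}

final_state = run_gru(concatenate(aligned_features), weights)
# Hypothetical decision rule: the sign of the final state picks the label.
label = "class_a" if final_state > 0.0 else "class_b"
```

A real system would use vector-valued features and a learned output layer; the scalar form is used here only to keep the gate arithmetic easy to follow.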

Assignees

  • Snap Inc

Inventors

Classifications

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

  • in augmented reality scenes · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • G06V20/64 (Primary)

    Three-dimensional [3D] objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11410439B2 cover?
Systems and methods are disclosed for capturing multiple sequences of views of a three-dimensional object using a plurality of virtual cameras. The systems and methods generate aligned sequences from the multiple sequences based on an arrangement of the plurality of virtual cameras in relation to the three-dimensional object. Using a convolutional network, the systems and methods classify the t…
Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/64. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 9, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).