US9940553B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9940553-B2 |
| Application number | US-201313774145-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 22, 2013 |
| Priority date | Feb 22, 2013 |
| Publication date | Apr 10, 2018 |
| Grant date | Apr 10, 2018 |
Official abstract text for this publication.
Camera or object pose calculation is described, for example, to relocalize a mobile camera (such as on a smart phone) in a known environment or to compute the pose of an object moving relative to a fixed camera. The pose information is useful for robotics, augmented reality, navigation and other applications. In various embodiments where camera pose is calculated, a trained machine learning system associates image elements from an image of a scene, with points in the scene's 3D world coordinate frame. In examples where the camera is fixed and the pose of an object is to be calculated, the trained machine learning system associates image elements from an image of the object with points in an object coordinate frame. In examples, the image elements may be noisy and incomplete and a pose inference engine calculates an accurate estimate of the pose.
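As a rough illustration of what a "pose" means here, the following minimal sketch builds a rigid transform from six parameters (three rotation angles, three position values, matching the six degrees of freedom named in claim 2) and uses it to map a camera-space point into the scene's world frame. The function names `pose_matrix` and `camera_to_world`, the 4x4 matrix layout, and the Z-Y-X rotation order are assumptions made for this example, not details taken from the patent.

```python
import math


def pose_matrix(rx, ry, rz, tx, ty, tz):
    """Build a 4x4 rigid transform from three angles (radians) and a translation.

    Assumption: rotation is composed as Rz * Ry * Rx; the patent text shown
    here does not specify a parameterization.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx, tx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx, ty],
        [-sy, cy * sx, cy * cx, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def camera_to_world(H, x):
    """Map a camera-space point x = (x, y, z) into world coordinates using pose H."""
    return tuple(
        H[r][0] * x[0] + H[r][1] * x[1] + H[r][2] * x[2] + H[r][3]
        for r in range(3)
    )


# Hypothetical pose: a quarter turn about the z axis, then a shift of (1, 2, 3).
H = pose_matrix(0.0, 0.0, math.pi / 2, 1.0, 2.0, 3.0)
world_point = camera_to_world(H, (1.0, 0.0, 0.0))
```

With this hypothetical pose, the camera-space point (1, 0, 0) lands at approximately (1, 3, 3) in the world frame. Estimating the pose is the inverse problem: given predicted world points for image elements, find the transform that best explains them.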
Opening claim text (preview).
The invention claimed is:
1. A method of calculating pose of an entity comprising: receiving, at a processor, at least one image where the image is of a scene captured by an entity comprising a mobile camera; applying image elements of the at least one image to a trained machine learning system to obtain a plurality of associations between image elements and three-dimensional (3D) points in a scene space, the trained machine learning system optimizing an energy function comprising the 3D points in the scene space predicted by at least one tree in at least one random decision forest and 3D coordinates in camera space; determining whether a pose of the entity has been calculated; based on a determination that the pose has been calculated, refining the pose of the entity from the plurality of associations and the optimized function; and based on a determination that the pose of the entity has not been calculated, calculating an initial pose of the entity from the plurality of associations and the optimized function; and generating map display data based at least in part on the initial pose of the entity, wherein the energy function comprises: $E(H) = \sum_{i \in I} \rho\left(\min_{m \in M_i} \lVert m - H x_i \rVert_2\right)$ wherein $i \in I$ is an index of the image elements, $\rho$ is an error function, $m \in M_i$ represents the predicted 3D points in the scene space, $x_i$ are the 3D coordinates in the camera space, and $H$ is the pose of the entity.
2. A method as claimed in claim 1, further comprising calculating the initial pose of the entity as parameters having six degrees of freedom, three indicating rotation of the entity and three indicating position of the entity.
3. A method as claimed in claim 1, the machine learning system having been trained using images with image elements labeled with scene coordinates.
4. A method as claimed in claim 1, wherein the machine learning system comprises a plurality of trained random forests and the method further comprises: applying the image elements of the at least one image to the plurality of trained random forests, the trained random forests having been trained using images from a different one of a plurality of scenes; and calculating which of the scenes the mobile camera was in when the at least one image was captured.
5. A method as claimed in claim 1, wherein the machine learning system is trained using images of a plurality of scenes with image elements labeled with scene identifiers and labeled with scene coordinates of points in the scene the image elements depict.
6. A method as claimed in claim 1, further comprising calculating the pose by searching amongst a set of possible pose candidates and using samples of the plurality of associations between image elements and points to assess the set of possible pose candidates.
7. A method as claimed in claim 1, further comprising receiving at the processor, a stream of images, and calculating the pose by searching amongst a set of possible pose candidates which includes a second pose calculated from another image in the stream.
8. A method as claimed in claim 1 at least partially carried out using hardware logic selected from one or more of the following: a field-programmable gate array, a program-specific integrated circuit, a program-specific standard product, a system-on-a-chip, a complex programmable logic device, and a graphics processing unit.
9. A method as claimed in claim 1, wherein the entity is a mobile camera and the pose of the mobile camera is calculated, the method further comprising accessing a 3D model of the scene and refining the pose of the mobile camera using the accessed 3D model.
10. A pose tracker comprising: a processor arranged to: receive at least one image of a scene captured by an entity comprising a mobile camera; and apply image elements of the at least one image to a trained machine learning system to obtain a plurality of associations between image elements and three-dimensional (3D) points in a scene space; and a pose inference engine arranged to: optimize an energy function comprising the 3D points in the scene space predicted by at least one tree in at least one random decision forest and 3D coordinates in camera space; determine whether a pose of the entity has been calculated; based on a determination that the pose has been calculated, refining the pose of the entity from the plurality of associations and the optimized function; and based on a determination that the pose of the entity has not been calculated, calculate an initial pose of the mobile camera from the plurality of associations, the calculation being based at least in part on the optimized function; wherein the energy function comprises: $E(H) = \sum_{i \in I} \rho\left(\min_{m \in M_i} \lVert m - H x_i \rVert_2\right)$ wherein $i \in I$ is an index of the image elements, $\rho$ is an error function, $m \in M_i$ represents the predicted 3D points in the scene space, $x_i$ are the 3D coordinates in the camera space, and $H$ is the pose of the entity.
11. The pose tracker as claimed in claim 10, the pose inference engine further arranged to calculate the initial pose by searching amongst a set of possible pose candidates and using samples of the plurality of associations between image elements and points in scene coordinates to assess the set of possible pose candidates.
12. The pose tracker as claimed in claim 10, the processor further arranged to receive a stream of images, and the pose tracker further comprising a pose inference engine arranged to calculate the initial pose by searching amongst a set of possible pose candidates which includes a second pose calculated from another image in the stream of images.
13. The pose tracker as claimed in claim 10 at least partially implemented using hardware logic selected from one or more of the following: a field-programmable gate array, a program-specific integrated circuit, a program-specific standard product, a system-on-a-chip, a complex programmable logic device, and a graphics processing unit.
14. The method as claimed in claim 1, further comprising prior to applying the image elements, removing a set of image elements that are spurious or noisy image elements.
15. One or more computer-readable storage devices having computer-executable instructions that when executed by a processor, cause the processor to: receive at least one image that is of a scene captured by an entity comprising a mobile camera; apply image elements of the at least one image to a trained machine learning system to obtain a plurality of associations between a set of image elements and three dimensional (3D) points in a scene space, the trained machine learning system optimizing an energy function comprising the 3D points in the scene space predicted by at least one tree in at least one random decision forest and 3D coordinates in camera space; determine whether a pose of the entity has been calculated; based on a determination that the pose has been calculated, refine the pose of the entity from the plurality of associations and the optimized function; based on a determination that the pose of the entity has not been calculated, calculate an initial pose of the entity from the plurality of associations and the optimized function; and generate map display data based at least in part on the initial pose of the entity; wherein the energy function comprises: $E(H) = \sum_{i \in I} \rho\left(\min_{m \in M_i} \lVert m - H x_i \rVert_2\right)$ wherein $i \in I$ is an index of the image elements, $\rho$ is an error function, $m \in M_i$ represents the predicted 3D points in the scene space, $x_i$ are the 3D coordinates in the camera space, and $H$ is the pose of the entity.
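The energy function recited in claims 1, 10, and 15, together with the candidate search of claim 6, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the pose H is represented as a 4x4 row-major matrix, the error function ρ is taken to be a simple top-hat with a hypothetical `width` threshold (the claims only say "an error function"), and the names `apply_pose`, `top_hat`, `energy`, and `best_candidate` are invented for this example. The forest predictions are hard-coded stand-ins rather than the output of a trained random decision forest.

```python
import math


def apply_pose(H, x):
    """Apply a 4x4 rigid transform H to a camera-space point x = (x, y, z)."""
    return tuple(
        H[r][0] * x[0] + H[r][1] * x[1] + H[r][2] * x[2] + H[r][3]
        for r in range(3)
    )


def top_hat(distance, width=0.1):
    """Assumed error function rho: 0 for an inlier, 1 for an outlier."""
    return 0.0 if distance < width else 1.0


def energy(H, predictions, camera_points, rho=top_hat):
    """E(H): sum over image elements i of rho(min over m in M_i of ||m - H x_i||)."""
    total = 0.0
    for M_i, x_i in zip(predictions, camera_points):
        hx = apply_pose(H, x_i)
        total += rho(min(math.dist(m, hx) for m in M_i))
    return total


def best_candidate(candidates, predictions, camera_points):
    """Pick the pose candidate with the lowest energy (a simplified search)."""
    return min(candidates, key=lambda H: energy(H, predictions, camera_points))


def translation(tx, ty, tz):
    """Helper for the example: a pose with no rotation, only a shift."""
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]


# Hypothetical data: three image elements with camera-space coordinates x_i,
# each with a set M_i of predicted scene points (the last set is an outlier).
camera_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
predictions = [
    [(1.0, 0.0, 1.0), (5.0, 5.0, 5.0)],
    [(2.0, 0.0, 1.0)],
    [(9.0, 9.0, 9.0)],
]
candidates = [translation(0.0, 0.0, 0.0), translation(1.0, 0.0, 0.0)]
chosen = best_candidate(candidates, predictions, camera_points)
```

In this toy setup the candidate shifted by one unit along x explains two of the three image elements and so has the lower energy. A robust ρ of this kind keeps the outlier prediction from dominating the sum, which fits the abstract's remark that image elements may be noisy and incomplete.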
Hierarchical techniques, i.e. dividing or merging patterns to obtain a tree-like representation; Dendograms · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
using classification, e.g. of video objects · CPC title
in augmented reality scenes · CPC title
Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram · CPC title