Autonomous task performance based on visual embeddings

US11288883B2 · US · B2

Patent metadata
Publication number: US-11288883-B2
Application number: US-201916570618-A
Country: US
Kind code: B2
Filing date: Sep 13, 2019
Priority date: Jul 23, 2019
Publication date: Mar 29, 2022
Grant date: Mar 29, 2022

Abstract

A method for controlling a robotic device is presented. The method includes capturing an image corresponding to a current view of the robotic device. The method also includes identifying a keyframe image comprising a first set of pixels matching a second set of pixels of the image. The method further includes performing, by the robotic device, a task corresponding to the keyframe image.
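
The keyframe lookup described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `match_fraction` and `identify_keyframe`, the `tol` and `min_fraction` parameters, and the choice to compare pixels at identical coordinates are all assumptions made for the example. The patent only requires that pixels of the keyframe match corresponding pixels of the current image.

```python
from typing import Dict, List, Optional, Sequence, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]               # rows of (r, g, b) pixels
Region = Sequence[Tuple[int, int]]    # (row, col) coordinates of the first set of pixels


def match_fraction(keyframe: Image, current: Image, region: Region, tol: int = 0) -> float:
    """Fraction of region pixels whose RGB values agree within `tol` per channel.

    Assumption: the corresponding pixel sits at the same (row, col) in both images.
    """
    if not region:
        return 0.0
    hits = 0
    for row, col in region:
        key_pixel, cur_pixel = keyframe[row][col], current[row][col]
        if all(abs(a - b) <= tol for a, b in zip(key_pixel, cur_pixel)):
            hits += 1
    return hits / len(region)


def identify_keyframe(
    current: Image,
    keyframes: Dict[str, Tuple[Image, Region]],
    tol: int = 0,
    min_fraction: float = 1.0,
) -> Optional[str]:
    """Return the task name of the best-matching keyframe, or None if none qualifies."""
    best_task: Optional[str] = None
    best_score = 0.0
    for task, (keyframe, region) in keyframes.items():
        score = match_fraction(keyframe, current, region, tol)
        # Keep only keyframes that clear the (hypothetical) threshold.
        if score >= min_fraction and score > best_score:
            best_task, best_score = task, score
    return best_task
```

With `min_fraction=1.0` every pixel in the selected region must match, which is the strictest reading of the abstract; a lower value would tolerate partial occlusion at the cost of more false matches.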

Claims

What is claimed is:

1. A method for controlling a robotic device, comprising:
    capturing an image corresponding to a current view of the robotic device;
    identifying a keyframe image based on a red-green-blue (RGB) value of each pixel of a first set of pixels of the keyframe image matching an RGB value of a corresponding pixel of a second set of pixels of the image;
    determining one or both of:
        a first difference between a first pose of the robotic device in relation to a first object in the keyframe image and a second pose of the robotic device in relation to a second object in the image; or
        a second difference between a first distance of the robotic device to the first object and a second distance of the robotic device to the second object;
    adjusting one or more of a velocity or a position associated with a task to be performed by an effecter of the robotic device based on determining one or both of the first difference or the second difference, the task being associated with the keyframe image; and
    performing, via the effecter of the robotic device, the task based on adjusting one or both of the velocity or the position.

2. The method of claim 1, further comprising capturing the keyframe image while the robotic device is trained to perform the task.

3. The method of claim 1, in which each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor.

4. The method of claim 3, in which each pixel descriptor comprises a set of values corresponding to pixel level information and depth information, the pixel level information comprising an RGB value of the pixel associated with the pixel descriptor.

5. The method of claim 1, in which the task comprises at least one of interacting with an object, navigating through an environment, or a combination thereof.

6. The method of claim 1, in which an area corresponding to the first set of pixels is selected by a user.

7. The method of claim 1, in which the robotic device is trained to perform the task based on a human demonstration.

8. A robotic device, comprising:
    a memory; and
    at least one processor, the at least one processor configured:
        to capture an image corresponding to a current view of the robotic device;
        to identify a keyframe image based on a red-green-blue (RGB) value of each pixel of a first set of pixels of the keyframe image matching an RGB value of a corresponding pixel of a second set of pixels of the image;
        to determine one or both of:
            a first difference between a first pose of the robotic device in relation to a first object in the keyframe image and a second pose of the robotic device in relation to a second object in the image; or
            a second difference between a first distance of the robotic device to the first object and a second distance of the robotic device to the second object;
        to adjust one or more of a velocity or a position associated with a task to be performed by an effecter of the robotic device based on determining one or both of the first difference or the second difference, the task being associated with the keyframe image; and
        to perform, via the effecter of the robotic device, the task based on adjusting one or both of the velocity or the position.

9. The robotic device of claim 8, in which the at least one processor is further configured to capture the keyframe image while the robotic device is trained to perform the task.

10. The robotic device of claim 8, in which each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor.

11. The robotic device of claim 10, in which each pixel descriptor comprises a set of values corresponding to pixel level information and depth information, the pixel level information comprising an RGB value of the pixel associated with the pixel descriptor.

12. The robotic device of claim 8, in which the task comprises at least one of interacting with an object, navigating through an environment, or a combination thereof.

13. The robotic device of claim 8, in which an area corresponding to the first set of pixels is selected by a user.

14. The robotic device of claim 8, in which the robotic device is trained to perform the task based on a human demonstration.

15. A non-transitory computer-readable medium having program code recorded thereon for controlling a robotic device, the program code comprising:
    program code to capture an image corresponding to a current view of the robotic device;
    program code to identify a keyframe image based on a red-green-blue (RGB) value of each pixel of a first set of pixels of the keyframe image matching an RGB value of a corresponding pixel of a second set of pixels of the image;
    program code to determine one or both of:
        a first difference between a first pose of the robotic device in relation to a first object in the keyframe image and a second pose of the robotic device in relation to a second object in the image; or
        a second difference between a first distance of the robotic device to the first object and a second distance of the robotic device to the second object;
    program code to adjust one or more of a velocity or a position associated with a task to be performed by an effecter of the robotic device based on determining one or both of the first difference or the second difference, the task being associated with the keyframe image; and
    program code to perform, via the effecter of the robotic device, the task based on adjusting one or both of the velocity or the position.

16. The non-transitory computer-readable medium of claim 15, in which the program code further comprises program code to capture the keyframe image while the robotic device is trained to perform the task.

17. The non-transitory computer-readable medium of claim 15, in which each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor.

18. The non-transitory computer-readable medium of claim 17, in which each pixel descriptor comprises a set of values corresponding to pixel level information and depth information, the pixel level information comprising an RGB value of the pixel associated with the pixel descriptor.

19. The non-transitory computer-readable medium of claim 15, in which the task comprises at least one of interacting with an object, navigating through an environment, or a combination thereof.

20. The non-transitory computer-readable medium of claim 15, in which the robotic device is trained to perform the task based on a human demonstration.
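
The determining and adjusting steps of claim 1 can be illustrated with a short Python sketch. The claims do not say how the pose or distance difference is turned into an adjustment, so everything below is an assumption chosen for clarity: the planar `RelativePose` representation, the `TaskCommand` fields, shifting the taught target by the pose offset, and scaling speed by the ratio of distances are hypothetical and are not taken from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class RelativePose:
    """Pose of the robot relative to an object: planar offset (m) and heading (rad)."""
    x: float
    y: float
    yaw: float

    @property
    def distance(self) -> float:
        """Straight-line distance from the robot to the object."""
        return math.hypot(self.x, self.y)


@dataclass
class TaskCommand:
    """Hypothetical effecter command recorded with the keyframe during training."""
    x: float
    y: float
    yaw: float
    speed: float


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def adjust_command(
    taught: TaskCommand, keyframe_pose: RelativePose, current_pose: RelativePose
) -> TaskCommand:
    """Adjust the taught position and velocity using the two claimed differences."""
    # First difference: pose at keyframe time versus pose now.
    dx = current_pose.x - keyframe_pose.x
    dy = current_pose.y - keyframe_pose.y
    dyaw = wrap_angle(current_pose.yaw - keyframe_pose.yaw)

    # Second difference: distance to the object, used here as a speed ratio
    # (an illustrative rule, not one stated in the claims).
    key_dist = keyframe_pose.distance
    scale = current_pose.distance / key_dist if key_dist > 0 else 1.0

    return TaskCommand(
        x=taught.x + dx,
        y=taught.y + dy,
        yaw=wrap_angle(taught.yaw + dyaw),
        speed=taught.speed * scale,
    )
```

For example, if the robot was 1 m from the object when the keyframe was recorded and is now 2 m away along the same axis, this sketch moves the target 1 m further out and doubles the commanded speed. A real controller would also clamp the result to the effecter's limits.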

Assignees

Toyota Res Inst Inc

Inventors

Classifications

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • using neural networks

  • using classification, e.g. of video objects

  • Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Frequently asked questions

Who is the assignee on this patent?
Toyota Res Inst Inc
What technology area does this patent fall under?
Primary CPC classification B25J9/1605. Mapped technology areas include Operations & Transport.
When was this patent published?
Publication date: Mar 29, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).