Autonomous task performance based on visual embeddings

US11741701B2 · US · B2

Patent metadata
Publication number: US-11741701-B2
Application number: US-202217667217-A
Country: US
Kind code: B2
Filing date: Feb 8, 2022
Priority date: Jul 23, 2019
Publication date: Aug 29, 2023
Grant date: Aug 29, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for controlling a robotic device is presented. The method includes capturing an image corresponding to a current view of the robotic device. The method also includes identifying a keyframe image comprising a first set of pixels matching a second set of pixels of the image. The method further includes performing, by the robotic device, a task corresponding to the keyframe image.
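The keyframe identification step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each image has already been reduced to a list of per-pixel descriptors (tuples of floats), and it treats a keyframe as matched when enough of its descriptors have a close counterpart in the current image. The function names, the task names, the Euclidean distance metric, and both thresholds are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical descriptor layout: a few appearance values plus a depth value.
Descriptor = Tuple[float, ...]


def count_matches(keyframe_desc: List[Descriptor],
                  image_desc: List[Descriptor],
                  max_dist: float = 0.1) -> int:
    """Count keyframe descriptors with a counterpart within max_dist in the image."""
    matches = 0
    for kd in keyframe_desc:
        # Euclidean distance to the nearest image descriptor (inf if image is empty).
        nearest = min(
            (sum((a - b) ** 2 for a, b in zip(kd, d)) ** 0.5 for d in image_desc),
            default=float("inf"),
        )
        if nearest <= max_dist:
            matches += 1
    return matches


def identify_keyframe(keyframes: Dict[str, List[Descriptor]],
                      image_desc: List[Descriptor],
                      min_fraction: float = 0.5,
                      max_dist: float = 0.1) -> Optional[str]:
    """Return the task name of the best-matching keyframe, or None if none qualifies."""
    best_name: Optional[str] = None
    best_fraction = 0.0
    for name, desc in keyframes.items():
        if not desc:
            continue  # a keyframe without descriptors cannot be matched
        fraction = count_matches(desc, image_desc, max_dist) / len(desc)
        if fraction >= min_fraction and fraction > best_fraction:
            best_name, best_fraction = name, fraction
    return best_name


# Hypothetical keyframes recorded during training, keyed by task name.
keyframes = {
    "open_drawer": [(0.1, 0.2, 1.0), (0.4, 0.4, 1.2)],
    "pick_cup": [(0.9, 0.9, 0.5), (0.8, 0.7, 0.6)],
}
current_view = [(0.12, 0.21, 1.02), (0.41, 0.38, 1.18), (0.0, 0.0, 3.0)]
print(identify_keyframe(keyframes, current_view))
```

A brute-force nearest-descriptor search is used only to keep the sketch short; a real system would more plausibly use an indexed or learned matching scheme.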

First claim

Opening claim text (preview).

What is claimed is:

1. A method for controlling a robotic device, comprising: capturing an image corresponding to a current view of the robotic device; identifying a keyframe image based on matching one or more pixels of a first set of pixels associated with the keyframe image with one or more pixels of a second set of pixels associated with the image; determining, based on identifying the keyframe, a difference between a first relationship of a keyframe robotic device to a first object in the keyframe image and a second relationship of the robotic device to a second object in the image; adjusting a parameter associated with the task based on determining the difference between the first relationship and the second relationship, the task being associated with the keyframe image; and controlling the robotic device to perform a task based on adjusting the parameter.

2. The method of claim 1, further comprising capturing the keyframe image while training the robotic device to perform the task.

3. The method of claim 1, in which the task comprises at least one of interacting with an object, navigating through an environment, or a combination thereof.

4. The method of claim 1, in which: the first relationship corresponds to a pose of the keyframe robotic device in relation to a first object in the keyframe image; and the second relationship corresponds to a pose of the robotic device in relation to a second object in the image.

5. The method of claim 1, in which: the first relationship corresponds to a distance between the keyframe robotic device and a first object in the keyframe image; and the second relationship corresponds to a distance between the robotic device and a second object in the image.

6. The method of claim 1, in which: each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor; and each pixel descriptor is associated with a set of values corresponding to pixel level information and depth information.

7. The method of claim 1, in which the parameter is associated with one or more of a velocity or a position associated with the task.

8. An apparatus for controlling a robotic device, comprising: a processor; and a memory coupled with the processor and storing instructions operable, when executed by the processor, to cause the apparatus to: capture an image corresponding to a current view of the robotic device; identify a keyframe image based on matching one or more pixels of a first set of pixels associated with the keyframe image with one or more pixels of a second set of pixels associated with the image; determine, based on identifying the keyframe, a difference between a first relationship of a keyframe robotic device to a first object in the keyframe image and a second relationship of the robotic device to a second object in the image; adjust a parameter associated with the task based on determining the difference between the first relationship and the second relationship, the task being associated with the keyframe image; and control the robotic device to perform a task based on adjusting the parameter.

9. The apparatus of claim 8, in which execution of the instructions further cause the apparatus to capture the keyframe image while training the robotic device to perform the task.

10. The apparatus of claim 8, in which the task comprises at least one of interacting with an object, navigating through an environment, or a combination thereof.

11. The apparatus of claim 8, in which: the first relationship corresponds to a pose of the keyframe robotic device in relation to a first object in the keyframe image; and the second relationship corresponds to a pose of the robotic device in relation to a second object in the image.

12. The apparatus of claim 8, in which: the first relationship corresponds to a distance between the keyframe robotic device and a first object in the keyframe image; and the second relationship corresponds to a distance between the robotic device and a second object in the image.

13. The apparatus of claim 8, in which: each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor; and each pixel descriptor is associated with a set of values corresponding to pixel level information and depth information.

14. The apparatus of claim 8, in which the parameter is associated with one or more of a velocity or a position associated with the task.

15. A non-transitory computer-readable medium having program code recorded thereon for controlling a robotic device, the program code comprising: program code to capture an image corresponding to a current view of the robotic device; program code to identify a keyframe image based on matching one or more pixels of a first set of pixels associated with the keyframe image with one or more pixels of a second set of pixels associated with the image; program code to determine, based on identifying the keyframe, a difference between a first relationship of a keyframe robotic device to a first object in the keyframe image and a second relationship of the robotic device to a second object in the image; program code to adjust a parameter associated with the task based on determining the difference between the first relationship and the second relationship, the task being associated with the keyframe image; and program code to control the robotic device to perform a task based on adjusting the parameter.

16. The non-transitory computer-readable medium of claim 15, in which: the first relationship corresponds to a pose of the keyframe robotic device in relation to a first object in the keyframe image; and the second relationship corresponds to a pose of the robotic device in relation to a second object in the image.

17. The non-transitory computer-readable medium of claim 15, in which: the first relationship corresponds to a distance between the keyframe robotic device and a first object in the keyframe image; and the second relationship corresponds to a distance between the robotic device and a second object in the image.

18. The non-transitory computer-readable medium of claim 15, in which: each pixel of the first set of pixels and the second set of pixels is associated with a pixel descriptor; and each pixel descriptor is associated with a set of values corresponding to pixel level information and depth information.

19. The non-transitory computer-readable medium of claim 15, in which the parameter is associated with one or more of a velocity or a position associated with the task.
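The parameter adjustment recited in the independent claims, read together with the distance and velocity/position dependent claims, can be pictured with a small sketch. This is an illustration under stated assumptions, not the claimed method itself: the relationship is taken to be a scalar distance to the object, the adjusted parameters are a reach length and a speed, and the `TaskParams` type, the `adjust_params` function, the proportional speed scaling, and the speed cap are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskParams:
    reach_m: float    # position-type parameter recorded with the keyframe (metres)
    speed_mps: float  # velocity-type parameter recorded with the keyframe (m/s)


def adjust_params(params: TaskParams,
                  keyframe_dist_m: float,
                  current_dist_m: float,
                  max_speed_mps: float = 0.5) -> TaskParams:
    """Shift the taught reach by the change in robot-to-object distance.

    Assumption: the task was taught at keyframe_dist_m from the object and the
    robot now observes a similar object at current_dist_m.
    """
    # Difference between the current relationship and the keyframe relationship.
    delta = current_dist_m - keyframe_dist_m
    reach = max(0.0, params.reach_m + delta)
    # Scale speed with reach so the motion takes roughly the same time, then cap it.
    scale = reach / params.reach_m if params.reach_m > 0 else 1.0
    speed = min(params.speed_mps * scale, max_speed_mps)
    return TaskParams(reach_m=reach, speed_mps=speed)


# Taught at 0.50 m from the object; the robot is now 0.60 m away.
taught = TaskParams(reach_m=0.40, speed_mps=0.20)
print(adjust_params(taught, keyframe_dist_m=0.50, current_dist_m=0.60))
```

When the two distances are equal the taught parameters come back unchanged, which corresponds to the case where the current view matches the keyframe view exactly.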

Assignees

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • G06V20/10 · Primary

    Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

  • B25J9/1605 · Primary

    Simulation of manipulator lay-out, design, modelling of manipulator · CPC title

  • characterised by task planning, object-oriented languages · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11741701B2 cover?
A method for controlling a robotic device is presented. The method includes capturing an image corresponding to a current view of the robotic device. The method also includes identifying a keyframe image comprising a first set of pixels matching a second set of pixels of the image. The method further includes performing, by the robotic device, a task corresponding to the keyframe image.

Who is the assignee on this patent?
Toyota Res Inst Inc

What technology area does this patent fall under?
Primary CPC classification: G06V20/10. Mapped technology areas include Physics.

When was this patent published?
Publication date: Aug 29, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).