Generalizable robot approach control techniques

US11449079B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11449079-B2
Application number  US-201916262448-A
Country             US
Kind code           B2
Filing date         Jan 30, 2019
Priority date       Jan 30, 2019
Publication date    Sep 20, 2022
Grant date          Sep 20, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and techniques are described that provide for generalizable approach policy learning and implementation for robotic object approaching. Described techniques provide fast and accurate approaching of a specified object, or type of object, in many different environments. The described techniques enable a robot to receive an identification of an object or type of object from a user, and then navigate to the desired object, without further control from the user. Moreover, the approach of the robot to the desired object is performed efficiently, e.g., with a minimum number of movements. Further, the approach techniques may be used even when the robot is placed in a new environment, such as when the same type of object must be approached in multiple settings.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product, the computer program product being tangibly embodied on a non-transitory computer-readable storage medium and comprising instructions that, when executed by at least one computing device, are configured to cause the at least one computing device to: receive an approach request identifying a target object within an environment, wherein the approach request indicates that the target object is to be approached using a movement system of a robot; obtain an image of the target object within the environment of the robot; determine, from the image, a semantic segmentation in which image pixels of the image corresponding to the target object are labeled with a semantic label corresponding to the target object; determine a depth map in which the image pixels of the image with the semantic label corresponding to the target object are associated with a distance of the target object from the robot; generate an attention mask based on the semantic segmentation and the target object identified in the approach request; select a movement action based on the attention mask and on the depth map using a navigation policy generator, wherein the navigation policy generator is trained using reinforcement learning; and execute the movement action to move the robot toward the target object within the environment.

2. The computer program product of claim 1, wherein the instructions, when executed, are further configured to cause the at least one computing device to: train at least one convolutional neural network to generate the semantic segmentation and the depth map using a single encoder structure, with a first decoder branch trained to generate the semantic segmentation using the encoder structure and a second decoder branch trained to generate the depth map using the encoder structure.

3. The computer program product of claim 1, wherein the instructions, when executed to select the movement action, are further configured to cause the at least one computing device to: train the navigation policy generator to use at least one additional convolutional neural network to utilize ground truth semantic information and depth information to predict at least one movement action from among a plurality of available movement actions available to the robot within the environment.

4. The computer program product of claim 3, wherein the instructions, when executed to select the movement action, are further configured to cause the at least one computing device to: use the trained navigation policy generator to represent a state of the robot, relative to the target object, using the attention mask and the depth map, wherein the movement action is selected based on the state of the robot.

5. The computer program product of claim 4, wherein the instructions, when executed to select the movement action, are further configured to cause the at least one computing device to: execute the trained navigation policy generator to generate a probabilistic distribution of movement actions from among the plurality of available movement actions, based on the state of the robot; and select the movement action from the probabilistic distribution of movement actions.

6. The computer program product of claim 5, wherein the instructions, when executed, are further configured to cause the at least one computing device to: obtain a second image of the target object from a camera of the robot, following execution of the movement action; determine a second semantic segmentation and a second depth map, using at least one convolutional neural network; select, using the trained navigation policy generator, a second movement action, based on the second semantic segmentation, the second depth map, and the state of the robot following the movement action; and execute the second movement action to move the robot toward the target object within the environment.

7. The computer program product of claim 1, wherein the instructions, when executed, are further configured to cause the at least one computing device to: obtain a second image of the target object from a camera of the robot, following execution of the movement action; determine a second semantic segmentation and a second depth map, using at least one convolutional neural network and the second image; select a second movement action, based on the second semantic segmentation, the second depth map, and a state of the robot following the movement action, relative to a preceding state of the robot prior to the movement action; and execute the second movement action to move the robot toward the target object within the environment.

8. The computer program product of claim 7, wherein the instructions, when executed, are further configured to cause the at least one computing device to: continue further iterations of obtaining a current image of the target object following a preceding movement action, determining a current semantic segmentation, current depth map, and current state, based on the current image, selecting a current movement action, based on the current semantic segmentation, the current depth map, and the current state, executing the current movement action, and evaluating whether the current movement action achieves a success condition of an approach request; and complete the approach request when the evaluation indicates the success condition has been reached.

9. The computer program product of claim 1, wherein the attention mask comprises a 2-dimensional matrix whose values indicate a focus on the target object.

10. A robot comprising: a movement system configured to receive an approach request identifying a target object within an environment, wherein the approach request includes an instruction to move the robot and approach the target object; a camera configured to capture an image of the target object within the environment; and a control system configured to fulfill the approach request including executing iterations of moving the robot towards the target object through a plurality of iterative movements until an approach success condition is reached, the iterations including determining, from a first image from the camera and using at least one convolutional neural network, a first semantic segmentation in which image pixels of the image corresponding to the target object are labeled with a semantic label corresponding to the target object in the approach request, a first depth map, and a first state of the robot, relative to the target object; generate a first attention mask based on the first semantic segmentation and the target object identified in the approach request; determining a first movement action of the robot, using the first attention mask, the first depth map, and the first state, wherein the first movement action is selected using a navigation policy generator, and wherein the navigation policy generator and the at least one convolutional neural network are trained using reinforcement learning; executing the first movement action of the robot toward the target object within the environment; determining a second semantic segmentation, a second depth map, and a second state of the robot, relative to the target object, using a second image from the camera; generate a second attention mask based on the second semantic segmentation and the target object identified in the approach request; determining a second movement action of the robot, using the second attention mask, the second depth map, and the second state, relative to the first state; and executing the second movement action of the robot toward the target object within the environment.

11. The robot of claim 10,
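
As a reading aid for claims 1, 5, and 9, the following is a minimal sketch of one perception-to-action step: a 2-dimensional attention mask built from a semantic segmentation, a target distance read from the depth map under that mask, and a movement action sampled from a probability distribution. It is an illustration under stated assumptions, not the patented implementation. The action names, the `action_scores` heuristic, and the `stop_distance` parameter are all hypothetical; in particular, `action_scores` is a hand-written stand-in for the navigation policy generator, which the claims say is trained with reinforcement learning.

```python
import math
import random

# Hypothetical action set; the claims only refer to "a plurality of
# available movement actions".
ACTIONS = ["forward", "turn_left", "turn_right", "stop"]


def attention_mask(segmentation, target_label):
    """Return a 2-D matrix: 1.0 where a pixel carries the target label, else 0.0."""
    return [[1.0 if label == target_label else 0.0 for label in row]
            for row in segmentation]


def target_distance(mask, depth_map):
    """Mean depth over attended pixels, or None when the target is not visible."""
    values = [d for mask_row, depth_row in zip(mask, depth_map)
              for m, d in zip(mask_row, depth_row) if m > 0]
    return sum(values) / len(values) if values else None


def target_offset(mask):
    """Horizontal offset of the mask centroid from the image centre, in [-1, 1].

    Assumes the image is at least two pixels wide. Returns None when the
    target is not visible.
    """
    columns = [col for row in mask for col, m in enumerate(row) if m > 0]
    if not columns:
        return None
    centre = (len(mask[0]) - 1) / 2
    return (sum(columns) / len(columns) - centre) / centre


def action_scores(distance, offset, stop_distance=0.5):
    """Hand-written stand-in for a trained policy: one score per entry in ACTIONS."""
    if distance is None:
        return [0.0, 1.0, 1.0, 0.0]      # target not in view: turn to search
    if distance <= stop_distance:
        return [0.0, 0.0, 0.0, 3.0]      # close enough: prefer stopping
    return [2.0 - 4.0 * abs(offset),     # forward when the target is centred
            4.0 * max(-offset, 0.0),     # target left of centre: turn left
            4.0 * max(offset, 0.0),      # target right of centre: turn right
            -2.0]                        # stopping is unlikely while far away


def softmax(scores):
    """Convert scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(segmentation, depth_map, target_label, rng):
    """One step: mask -> state (distance, offset) -> distribution -> sampled action."""
    mask = attention_mask(segmentation, target_label)
    probs = softmax(action_scores(target_distance(mask, depth_map),
                                  target_offset(mask)))
    return rng.choices(ACTIONS, weights=probs, k=1)[0], probs


if __name__ == "__main__":
    # Toy 3x4 "image": the chair occupies the upper-right pixels, 2.0 m away.
    segmentation = [["floor", "floor", "chair", "chair"],
                    ["floor", "floor", "chair", "chair"],
                    ["floor", "floor", "floor", "floor"]]
    depth_map = [[3.0, 3.0, 2.0, 2.0],
                 [3.0, 3.0, 2.0, 2.0],
                 [1.0, 1.0, 1.0, 1.0]]
    action, probs = select_action(segmentation, depth_map, "chair", random.Random(0))
    print(action, [round(p, 3) for p in probs])
```

Sampling from the distribution, rather than always taking the highest-scoring action, mirrors the wording of claim 5. The iterative behaviour in claims 6 to 8 would correspond to calling `select_action` again on each new camera image until a success condition is met.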

Assignees

Adobe Inc

Inventors

Classifications

  • Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • using classification, e.g. of video objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11449079B2 cover?
Systems and techniques are described that provide for generalizable approach policy learning and implementation for robotic object approaching. Described techniques provide fast and accurate approaching of a specified object, or type of object, in many different environments. The described techniques enable a robot to receive an identification of an object or type of object from a user, and the…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G05D1/12. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 20, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).