What technology area does this patent fall under?

Primary CPC classification G06N5/00. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 15, 2015 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Robot controller learning system

US9135554B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9135554-B2
Application number	US-201313741902-A
Country	US
Kind code	B2
Filing date	Jan 15, 2013
Priority date	Mar 23, 2012
Publication date	Sep 15, 2015
Grant date	Sep 15, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A threshold learning control system for learning a controller of a robot. The system includes a threshold learning module, a regime classifier, and an exploratory controller, each receiving sensory inputs from a sensor system of the robot. The regime classifier determines a control regime based on the received sensor inputs and communicates the control regime to the threshold learning module. The exploratory controller also receives control parameters from the threshold learning module. A control arbiter receives commands from the exploratory controller and limits from the threshold learning module. The control arbiter issues modified commands based on the received limits to the robot controller.
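As a rough illustration of the last step in the abstract, the sketch below shows one way a control arbiter might limit an exploratory command before passing it to the robot controller. The names `Limits` and `arbitrate`, and the use of a single scalar command with a lower and upper bound, are assumptions made for this example; the patent text does not specify this representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical bounds a threshold learning module might send to the arbiter."""
    low: float
    high: float


def arbitrate(command: float, limits: Limits) -> float:
    """Return the exploratory command clamped into [limits.low, limits.high]."""
    if limits.low > limits.high:
        raise ValueError("low limit must not exceed high limit")
    return max(limits.low, min(command, limits.high))


# Example: an exploratory command above the upper limit is reduced to that limit.
limits = Limits(low=-0.5, high=1.2)
modified = arbitrate(1.8, limits)
print(modified)
```

Clamping is only one possible way to "modify" a command; a real arbiter could also scale, delay, or reject commands.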

First claim

Opening claim text (preview).

What is claimed is:

1. A threshold learning control system for learning a robot controller of a robot, the threshold learning control system comprising: a threshold learning module executing on a data processing apparatus and receiving sensor inputs from a sensor system of the robot; a regime classifier executing on the data processing apparatus and receiving sensor inputs from the sensor system of the robot and determining a control regime based on the received sensor inputs, the regime classifier communicating the control regime to the threshold learning module, the regime classifier receiving at least one state-action map, each state-action map having control regimes arranged contiguously with boundaries therebetween, and each control regime providing a state-action space of possible robot states and robot actions in a corresponding control space; an exploratory controller executing on the data processing apparatus and receiving sensor inputs from the sensor system of the robot and control parameters from the threshold learning module; and a control arbiter executing on the data processing apparatus and receiving exploratory commands from the exploratory controller and limits from the threshold learning module, the control arbiter issuing modified exploratory commands based on the received limits to the robot controller, wherein the threshold learning module learns the boundaries between control regimes within the state-action space of the at least one state-action map using at least one of the received sensor inputs, control regime classifications of the regime classifier, anchor points of the at least one state-action map, and feedback of the modified exploratory commands issued by the control arbiter.

2. The threshold learning control system of claim 1, wherein the threshold learning module sets the limits for the commands issued by the exploratory controller based on the received sensor inputs, the control arbiter modifying the commands received from the exploratory controller based on the limits received from the threshold learning module.

3. The threshold learning control system of claim 1, wherein the threshold learning module issues control parameters to the exploratory controller based on the received sensor inputs and the feedback received from the control arbiter of executed modified exploratory commands.

4. The threshold learning control system of claim 1, wherein the threshold learning module issues control parameters that cause the exploratory controller to issue exploratory commands that alter a control state of the robot toward a desired control regime.

5. The threshold learning control system of claim 4, wherein the threshold learning module issues control parameters that cause the exploratory controller to issue exploratory commands that alter a control state of the robot away from an undesirable control regime.

6. The threshold learning control system of claim 1, wherein the threshold learning module receives one or more anchor points corresponding to the received at least one state-action map, each anchor point associated with a control regime and defining a point in control space, the regime classifier determining the control regime of the robot using the one or more anchor points.

7. The threshold learning control system of claim 6, wherein the regime classifier determines the control regime of the robot by determining whether a current robot state and action values are closer to an anchor point than to previously observed state-action pairs observed to be within the corresponding control regime for the anchor point.

8. A method of learning a robot controller of a robot, the method comprising: receiving sensor inputs from a sensor system of the robot; determining a control regime of the robot within a control space of a state-action map based on the received sensor inputs, the state-action map comprising control regimes arranged contiguously with boundaries therebetween, and each control regime providing a state-action space of possible robot states and robot actions in a corresponding control space; determining control parameters for exploratory commands based on the received sensor inputs and determined control regime; issuing exploratory commands to a control arbiter of the robot controller based on the control parameters, the control arbiter modifying the exploratory commands based on received control limits; receiving feedback from the control arbiter of executed modified exploratory commands for determining the control parameters; and learning the boundaries between control regimes within the state-action space of the state-action map using at least one of the received sensor inputs, determined control regime, anchor points of the state-action map, and feedback of the modified exploratory commands issued by the control arbiter.

9. The method of claim 8, further comprising determining the control limits based on at least one of the received sensor inputs and the received feedback of the executed modified exploratory commands.

10. The method of claim 8, further comprising determining the control parameters based on the received sensor inputs and the received feedback of executed modified exploratory commands.

11. The method of claim 8, further comprising determining control parameters that cause issuance of exploratory commands that alter a control state of the robot toward a desired control regime.

12. The method of claim 11, further comprising determining control parameters that cause issuance of exploratory commands that alter a control state of the robot away from an undesirable control regime.

13. The method of claim 8, further comprising: receiving one or more anchor points corresponding to the state-action map, each anchor point associated with a control regime and defining a point in control space; and determining the control regime of the robot using the one or more anchor points.

14. The method of claim 13, wherein determining the control regime of the robot comprises determining whether a current robot state and action values are closer to an anchor point than to previously observed state-action pairs observed to be within the corresponding control regime for the anchor point using the received sensor inputs.

15. A computer program product encoded on a non-transitory computer readable storage medium comprising instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations comprising: receiving sensor inputs from a sensor system of a robot; determining a control regime of the robot within a control space of a state-action map based on the received sensor inputs, the state-action map comprising control regimes arranged contiguously with boundaries therebetween, and each control regime providing a state-action space of possible robot states and robot actions in a corresponding control space; determining control parameters for exploratory commands based on the received sensor inputs and determined control regime; issuing exploratory commands to a control arbiter of the robot based on the control parameters, the control arbiter modifying the exploratory commands based on received control limits; receiving feedback from the control arbiter of executed modified exploratory commands for determining the control parameters; and learning the boundaries between control regimes within the state-action space of the state-action map using at least one of the received sensor inputs, determined control regime, anchor points of the state-action map, and feedback of the modified exploratory commands issued by the contro
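The claims that mention anchor points describe classifying the robot's control regime by distance in control space. The sketch below is a simplified nearest-neighbour reading of that idea, not the exact test recited in the claims: it labels the current state-action vector with the regime of the closest anchor point or previously labelled observation. The function name `classify_regime`, the regime labels, the tuple encoding of state-action pairs, and the use of Euclidean distance are all assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]  # hypothetical (state..., action...) vector in control space


def classify_regime(
    current: Point,
    anchors: Dict[str, Point],
    observed: Sequence[Tuple[Point, str]] = (),
) -> str:
    """Return the regime label of the nearest anchor or labelled observation."""
    if not anchors:
        raise ValueError("at least one anchor point is required")
    # Treat anchors and past labelled observations as one pool of reference points.
    candidates: List[Tuple[Point, str]] = [(p, r) for r, p in anchors.items()]
    candidates.extend(observed)
    nearest = min(candidates, key=lambda c: math.dist(current, c[0]))
    return nearest[1]


# Hypothetical regimes with one anchor point each.
anchors = {"stable": (0.0, 0.0), "tipping": (1.0, 1.0)}
print(classify_regime((0.2, 0.1), anchors))
```

Passing labelled observations through `observed` lets earlier classifications pull the effective boundary away from the midpoint between anchors, which loosely mirrors the idea of learning boundaries between regimes over time.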

Assignees

iRobot Corp

Inventors

Yamauchi Brian Masao

Classifications

G05B2219/39376
Hierarchical, learning, recognition and skill level and adaptation servo level · CPC title
G06N5/00 · Primary
Computing arrangements using knowledge-based models · CPC title
G05B2219/39093
On collision, ann, bam, learns path on line, used next time for same command · CPC title
B25J9/161 · Primary
Hardware, e.g. neural networks, fuzzy logic, interfaces, processor · CPC title
G05D1/0088
characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours (using knowledge based models G06N5/00) · CPC title

Patent family

Related publications grouped by family.

View patent family 53797304

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9135554B2 cover?: A threshold learning control system for learning a controller of a robot. The system includes a threshold learning module, a regime classifier, and an exploratory controller, each receiving sensory inputs from a sensor system of the robot. The regime classifier determines a control regime based on the received sensor inputs and communicates the control regime to the threshold learning module. T…
Who is the assignee on this patent?: iRobot Corp