Who is the assignee on this patent?

Sony Corp, Sony Corp America, Sony Group Corp

What technology area does this patent fall under?

Primary CPC classification G06N20/20. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 13, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for continual learning in an intelligent artificial agent

US11443229B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11443229-B2
Application number	US-201816120111-A
Country	US
Kind code	B2
Filing date	Aug 31, 2018
Priority date	Aug 31, 2018
Publication date	Sep 13, 2022
Grant date	Sep 13, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for teaching an artificial intelligent agent includes giving the agent several examples where it can learn to identify what is important about these example states. Once the agent has the ability to recognize a goal configuration, it can use that information to then learn how to achieve the goal states on its own. An agent may be provided with positive and negative examples to demonstrate a goal configuration. Once the agent has learned certain goal configurations, the agent can learn an option to achieve the goal configuration and a distance function that predicts at least one of a distance and a duration to the goal configuration under the learned option. This distance function prediction may be incorporated as a state feature of the agent.
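The pipeline in the abstract (examples, then a goal recognizer, then a distance function under an option) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the two-feature state `(position, light_on)`, the rule that a feature is "key" when it is constant across positive examples, the hand-written `option_step` standing in for a learned option, and the use of TD(0) to estimate steps to the goal.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # hypothetical state: (position, light_on)


def learn_goal_features(positives: List[State], negatives: List[State]) -> Dict[int, int]:
    """Keep the features whose value is identical across all positive examples."""
    key: Dict[int, int] = {}
    for i in range(len(positives[0])):
        values = {s[i] for s in positives}
        if len(values) == 1:
            key[i] = values.pop()
    # Sanity check: every negative example must violate at least one key feature.
    for s in negatives:
        assert any(s[i] != v for i, v in key.items()), f"negative {s} looks like a goal"
    return key


def is_goal(state: State, key: Dict[int, int]) -> bool:
    """Goal recognizer built from the extracted key features."""
    return all(state[i] == v for i, v in key.items())


def option_step(state: State, key: Dict[int, int]) -> State:
    """Hypothetical stand-in for a learned option: step the position toward the goal."""
    pos, light = state
    target = key[0]  # assumes feature 0 (position) was found to be a key feature
    return (pos + (1 if pos < target else -1), light)


def learn_distance(key: Dict[int, int], start: State,
                   episodes: int = 200, alpha: float = 0.5) -> Dict[State, float]:
    """TD(0) estimate of steps-to-goal under the option (cost 1 per step, 0 at the goal)."""
    dist: Dict[State, float] = {}
    for _ in range(episodes):
        s = start
        while not is_goal(s, key):
            nxt = option_step(s, key)
            target = 1.0 + (0.0 if is_goal(nxt, key) else dist.get(nxt, 0.0))
            old = dist.get(s, 0.0)
            dist[s] = old + alpha * (target - old)
            s = nxt
    return dist


positives = [(4, 0), (4, 1)]              # agent shown at the goal, light on or off
negatives = [(2, 0), (3, 1), (0, 1)]      # agent shown away from the goal
key = learn_goal_features(positives, negatives)    # position matters, light does not
dist = learn_distance(key, start=(0, 0))
augmented_state = (0, 0) + (round(dist[(0, 0)]),)  # prediction appended as a state feature
```

In this sketch the distance estimate for the start state converges to four steps, and appending it to the state tuple mirrors the abstract's last sentence about using the prediction as a state feature.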

First claim

Opening claim text (preview).

What is claimed is:
1. A method for training an artificial intelligent agent, comprising: defining, within the agent, a first continual learning block to include a first skill to achieve a first goal configuration for the agent and a first knowledge feature providing a first prediction of at least one of a distance and duration to achieve the first goal configuration; using the first skill to move the agent in the first goal configuration; defining, within the agent, a second continual learning block, including a second goal configuration, distinct from the first goal configuration, and a second knowledge feature providing a second prediction of at least one of a distance and duration to achieve the second goal configuration, wherein the second continual learning block builds upon the first continual learning block, and using the first prediction by the second continual learning block to move the agent to the second goal configuration.
2. The method of claim 1, further comprising: using features of the first goal configuration for achievement of the second goal configuration.
3. The method of claim 1, wherein the first knowledge feature is a value function based on the first goal configuration as a termination condition.
4. The method of claim 1, further comprising: providing positive examples via an interface to the agent when the agent is in the first goal configuration; providing negative examples via the interface to the agent when the agent is not in the first goal configuration; and extracting key state features to determine what features are important during receipt of positive examples to the agent.
5. The method of claim 1, further comprising incorporating the first prediction as a state feature of the agent.
6. The method of claim 1, wherein the first knowledge feature is selected from the group consisting of a distance function, a time to completion, a time to initiation of something else, and a prediction of a value of a feature at the time of completion.
7. The method of claim 1, wherein the first knowledge feature is learned, either before, in conjunction with, interleaved with, or after a policy.
8. A method of learning to achieve a goal configuration of an artificial agent, comprising: defining, within the agent, the goal configuration for the agent as part of a continual learning block; determining a knowledge feature as a prediction of at least one of a distance and duration required to achieve the goal configuration; relying on a previous learned continual learning block, having a previously learned distinct goal configuration, to move the agent in the goal configuration; determining a first knowledge feature as a first prediction of a number of steps required to achieve the goal configuration; and relying on a previous knowledge feature to achieve the goal configuration, the previous knowledge feature being a previous prediction of at least one of a distance and duration required to achieve the previous learned goal configuration.
9. The method of claim 8, wherein a previous knowledge feature is used to achieve the goal configuration, the previous knowledge feature being a previous prediction of at least one of a distance and duration required to achieve the previous learned goal configuration.
10. The method of claim 8, wherein the previous learned goal configuration is an element of a previous continual learning block.
11. The method of claim 10, wherein the previous continual learning block includes a plurality of previous continual learning blocks, each having a respective previous learned goal configuration and a respective previous knowledge feature.
12. The method of claim 11, further comprising planning ahead, by the agent, to determine how to most efficiently achieve the respective previous learned goal configurations in order to achieve the goal configuration.
13. The method of claim 8, wherein the first knowledge feature is selected from the group consisting of a distance function, a time to completion, a time to initiation of something else, and a prediction of a value of a feature at the time of completion.
14. A method of learning to achieve a goal configuration of an artificial agent, comprising: defining, within the agent, the goal configuration for the agent as part of a continual learning block; determining, within the agent, a knowledge feature as a prediction of at least one of a duration and a distance required to achieve the goal configuration, the knowledge feature being a component of the continual learning block; and relying, by the agent, on a previous learned distinct goal configuration, of a previously learned continual learning block, to move the agent in the goal configuration; wherein a previous knowledge feature is used to achieve the goal configuration, wherein the previous knowledge feature is a previous prediction of at least one of a duration and a distance required to achieve the previous learned goal configuration, and wherein the previous knowledge feature, along with the previous goal configuration, are components of a previous continual learning block.
15. The method of claim 14, wherein the previous continual learning block includes a plurality of previous continual learning blocks, each having a respective previous learned goal configuration and a respective previous knowledge feature.
16. The method of claim 15, wherein each of the plurality of the previous continual learning blocks are relied upon to achieve the goal configuration.
17. The method of claim 16, further comprising planning ahead, by the agent, to determine how to most efficiently achieve the respective previous learned goal configurations in order to achieve the goal configuration.
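The block structure recited in claims 1 and 5 (a goal configuration, a skill, and a knowledge feature per block, with a later block reusing an earlier block's prediction) can be pictured with a small sketch. It is illustrative only and not the patented implementation: the `ContinualLearningBlock` class, the corridor with a key and a door, the constants `KEY_POS` and `DOOR_POS`, and the hand-coded skills and predictions are all hypothetical stand-ins for components that the claims describe as learned.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[int, bool]  # hypothetical state: (position, has_key)


@dataclass
class ContinualLearningBlock:
    goal: Callable[[State], bool]      # goal configuration test
    skill: Callable[[State], State]    # one step of the skill (option)
    knowledge: Callable[[State], int]  # predicted number of steps to the goal


KEY_POS, DOOR_POS = 2, 5  # illustrative layout, not from the patent


def step_toward(pos: int, target: int) -> int:
    """Move one cell toward the target, or stay put if already there."""
    return pos + (target > pos) - (target < pos)


# Block 1: obtain the key.
def goal1(s: State) -> bool:
    return s[1]

def skill1(s: State) -> State:
    pos = step_toward(s[0], KEY_POS)
    return (pos, s[1] or pos == KEY_POS)

def know1(s: State) -> int:
    # Picking up the key always costs at least one step when it is not yet held.
    return 0 if s[1] else max(1, abs(s[0] - KEY_POS))

block1 = ContinualLearningBlock(goal1, skill1, know1)


# Block 2: stand at the door while holding the key; builds on block 1.
def goal2(s: State) -> bool:
    return s[1] and s[0] == DOOR_POS

def skill2(s: State) -> State:
    # Uses block 1's prediction: while the key is still some steps away, defer to skill 1.
    if block1.knowledge(s) > 0:
        return block1.skill(s)
    return (step_toward(s[0], DOOR_POS), s[1])

def know2(s: State) -> int:
    if s[1]:
        return abs(s[0] - DOOR_POS)
    # Block 1's prediction is consumed here as an input feature.
    return block1.knowledge(s) + abs(KEY_POS - DOOR_POS)

block2 = ContinualLearningBlock(goal2, skill2, know2)


def run(block: ContinualLearningBlock, s: State) -> Tuple[State, int]:
    """Execute the block's skill until its goal configuration holds."""
    steps = 0
    while not block.goal(s):
        s = block.skill(s)
        steps += 1
    return s, steps


start = (0, False)
predicted = block2.knowledge(start)
final, steps = run(block2, start)
```

Here the second block never re-derives how to fetch the key. It reads `block1.knowledge` both to decide when to hand control to the first skill and to form its own prediction, which is one way to interpret "builds upon the first continual learning block".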

Assignees

Sony Corp, Sony Corp America, Sony Group Corp

Inventors

Classifications

G06N3/006
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
G06N3/088
Non-supervised learning, e.g. competitive learning · CPC title
G06N20/20 · Primary
Ensemble learning · CPC title
G06N5/043
Distributed expert systems; Blackboards · CPC title
G06N20/00 · Primary
Machine learning · CPC title

Patent family

Related publications grouped by family.

View patent family 69641713
