Assistance generation

US10878337B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10878337-B2
Application number: US-201615212893-A
Country: US
Kind code: B2
Filing date: Jul 18, 2016
Priority date: Jul 18, 2016
Publication date: Dec 29, 2020
Grant date: Dec 29, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An assistance strategy may be generated with a generating apparatus including a processor, and one or more computer readable mediums collectively including instructions that, when executed by the processor, cause the processor to create a reward estimation model for estimating a reward for assisting at least one subject by analyzing a history of input by the subject, create a decision making model including a plurality of forms of assistance and estimated rewards for each form of assistance based on the reward estimation model and the history of input by the subject, and generate an assistance strategy based on the decision making model.
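The three steps named in the abstract (reward estimation model, decision making model, assistance strategy) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the `HISTORY` records, the event names, the `experience_level` heuristic, the averaging used as a "model", and the greedy strategy.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical history: (inputs by the subject, form of assistance, observed reward).
HISTORY = [
    (["open", "undo", "undo", "help"], "tutorial", 0.9),
    (["open", "undo", "help", "undo"], "tooltip", 0.4),
    (["open", "save", "export"], "tutorial", 0.1),
    (["open", "save", "export", "macro"], "shortcut_hint", 0.8),
    (["open", "save"], "tooltip", 0.5),
]
FORMS = ["tutorial", "tooltip", "shortcut_hint"]


def experience_level(inputs):
    """Assumed heuristic: many 'undo'/'help' events suggest a novice."""
    struggles = sum(1 for event in inputs if event in ("undo", "help"))
    return "novice" if struggles * 2 >= len(inputs) else "expert"


def create_reward_estimation_model(history):
    """Average the observed reward per (experience level, form of assistance)."""
    samples = defaultdict(list)
    for inputs, form, reward in history:
        samples[(experience_level(inputs), form)].append(reward)
    table = {key: mean(values) for key, values in samples.items()}
    overall = mean(reward for _, _, reward in history)

    def estimate(inputs, form):
        # Fall back to the overall mean for combinations never observed.
        return table.get((experience_level(inputs), form), overall)

    return estimate


def create_decision_making_model(estimate, inputs, forms):
    """Pair every form of assistance with its estimated reward."""
    return {form: estimate(inputs, form) for form in forms}


def generate_assistance_strategy(decision_model):
    """Greedy strategy: choose the form with the highest estimated reward."""
    return max(decision_model, key=decision_model.get)


estimate = create_reward_estimation_model(HISTORY)
novice_model = create_decision_making_model(estimate, ["open", "help", "undo"], FORMS)
expert_model = create_decision_making_model(estimate, ["open", "save", "export"], FORMS)
print(generate_assistance_strategy(novice_model))
print(generate_assistance_strategy(expert_model))
```

The greedy choice is the simplest possible strategy; the claims additionally describe balancing exploration and exploitation, which this sketch leaves out.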

First claim

Opening claim text (preview).

What is claimed is:

1. A generating apparatus comprising: a processor; and one or more non-transitory computer readable mediums collectively including instructions that, when executed by the processor, cause the processor to: create a reward estimation model for estimating an estimated reward for assisting at least one subject by analyzing a history of input corresponding to the subject to determine a level of experience of the subject in operating a particular device; create a decision making model including a plurality of forms of assistance and estimated rewards for each form of assistance based on the reward estimation model and the history of input corresponding to the subject; and generate an assistance strategy based on the decision making model, the assistance strategy comprising maximizing a total reward when a confidence level is determined to be sufficient and increasing a dispersion of the total reward when the confidence level is determined to be insufficient, the assistance strategy further comprising strengthening the confidence level by iteratively evaluating each of the plurality of forms of assistance and balancing exploration of uncertain forms of assistance and exploitation of forms of assistance associated with a highest reward based on feedback from the subject positively indicating whether a particular form of assistance from the plurality of forms of assistance improves an ability of the subject to operate the particular device.

2. The generating apparatus of claim 1, wherein the creation of the reward estimation model includes training the reward estimation model by using first learning data including a history of first sets, wherein each first set includes a form of assistance, the history of input and a corresponding reward indicated by at least one of the subject and an observer of the subject.

3. The generating apparatus of claim 2, wherein: the instructions further cause the processor to create a state estimation model for estimating an inner state of the subject by analyzing the history of input corresponding to the subject; and the decision making model is based further on a plurality of states, the state estimation model, and a plurality of probability distributions for state transitions caused by forms of assistance.

4. The generating apparatus of claim 3, wherein the creation of the state estimation model includes training the state estimation model by using second learning data including a history of second sets, wherein each second set includes the history of input and a corresponding inner state indicated by at least one of the subject and an observer of the subject.

5. The generating apparatus of claim 3, wherein the plurality of states is based on a combination of a class of assistance and a class of inner states.

6. The generating apparatus of claim 3, wherein the instructions further cause the processor to select an assistance to be presented to a target subject based on the assistance strategy.

7. The generating apparatus of claim 6, wherein the instructions further cause the processor to: observe input corresponding to the target subject; and estimate an estimated inner state of the target subject based on the state estimation model and the history of input corresponding to the target subject, and wherein the selection of the assistance is further based on the estimated inner state.

8. The generating apparatus of claim 6, wherein the selection of the assistance is further based on: a strategy to optimize a balance between exploitation of forms of assistance among the plurality of forms of assistance that are expected to be successful; and an exploration of forms of assistance that have not been validated to be successful.

9. The generating apparatus of claim 8, wherein the balance is optimized based on a Bayesian posterior distribution for the decision making model.

10. The generating apparatus of claim 6, wherein the instructions further cause the processor to update the decision making model based on a result of the selection of assistance.

11. The generating apparatus of claim 1, wherein: the input is to the particular device, and the assistance presents a help message through an interface of the particular device.

12. The generating apparatus of claim 11, wherein the reward is based on input selected from the group consisting of a difficulty in operating the particular device and an improvement of operation skill.

13. The generating apparatus of claim 1, wherein the creation of the decision making model includes decreasing the reward if mismatched assistance is presented.

14. A computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform the steps of: creating a reward estimation model for estimating a reward for assisting at least one subject by analyzing a history of input corresponding to the subject to determine a level of experience of the subject in operating a particular device; creating a decision making model including a plurality of forms of assistance and estimated rewards for each form of assistance based on the reward estimation model and the history of input corresponding to the subject; and generating an assistance strategy based on the decision making model, the assistance strategy comprising maximizing a total reward when a confidence level is determined to be sufficient and increasing a dispersion of the total reward when the confidence level is determined to be insufficient, the assistance strategy further comprising strengthening the confidence level by iteratively evaluating each of the plurality of forms of assistance and balancing exploration of uncertain forms of assistance and exploitation of forms of assistance associated with a highest reward based on feedback from the subject positively indicating whether a particular form of assistance from the plurality of forms of assistance improves an ability of the subject to operate the particular device.

15. The computer program product of claim 14, further comprising: training the reward estimation model by using a first learning data including a history of first sets, each first set including a form of assistance, the history of input and a corresponding reward indicated by at least one of the subject and an observer of the subject.

16. The computer program product of claim 15, further comprising creating a state estimation model for estimating an inner state of the subject by analyzing the history of input corresponding to the subject, wherein the decision making model is based further on: a plurality of states; the state estimation model; and a plurality of probability distributions for state transitions cause by forms of assistance.

17. A computer-implemented method comprising: creating a reward estimation model for estimating a reward for assisting at least one subject by analyzing a history of input corresponding to the subject to determine a level of experience of the subject in operating a particular device; creating a decision making model including a plurality of forms of assistance and estimated rewards for each form of assistance based on the reward estimation model and the history of input corresponding to the subject; and generating an assistance strategy, using a processor, based on the decision making model, the assistance strategy comprising maximizing a total reward when a confidence level is determined to be sufficient and increasing a dispersi…
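Claims 8 to 10 describe balancing exploitation and exploration using a Bayesian posterior and updating the model after each selection. One well-known technique with that shape is Thompson sampling over Beta posteriors; the claims do not name it, so the sketch below is an illustration under that assumption rather than the patented method. The class name, the form names, and the simulated "help rates" are all hypothetical.

```python
import random


class BetaBernoulliAssistant:
    """Thompson sampling over binary 'did this help?' feedback (assumed setup)."""

    def __init__(self, forms, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per form, stored as [alpha, beta].
        self.posterior = {form: [1, 1] for form in forms}

    def select(self):
        # Draw one plausible success rate per form from its posterior and pick
        # the best draw. Uncertain forms have wide posteriors, so they still win
        # sometimes (exploration); well-validated forms win most of the time
        # (exploitation).
        draws = {
            form: self.rng.betavariate(alpha, beta)
            for form, (alpha, beta) in self.posterior.items()
        }
        return max(draws, key=draws.get)

    def update(self, form, helped):
        # Positive feedback increments alpha; negative feedback increments beta.
        self.posterior[form][0 if helped else 1] += 1

    def posterior_mean(self, form):
        alpha, beta = self.posterior[form]
        return alpha / (alpha + beta)


def simulate(true_help_rate, rounds=2000, seed=0):
    """Run the select/feedback/update loop against a simulated subject."""
    env = random.Random(seed + 1)  # separate generator for the simulated subject
    assistant = BetaBernoulliAssistant(list(true_help_rate), seed=seed)
    counts = {form: 0 for form in true_help_rate}
    for _ in range(rounds):
        form = assistant.select()
        helped = env.random() < true_help_rate[form]
        assistant.update(form, helped)
        counts[form] += 1
    return assistant, counts


# Hypothetical forms of assistance and their hidden probabilities of helping.
assistant, counts = simulate({"tooltip": 0.2, "tutorial": 0.5, "walkthrough": 0.8})
print(counts)
```

In a run like this the selection counts typically concentrate on the form with the highest hidden help rate while the others are still tried occasionally, which mirrors the claimed shift from exploring uncertain forms to exploiting the best-rewarded one as confidence grows.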

Assignees

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10878337B2 cover?
An assistance strategy may be generated with a generating apparatus including a processor, and one or more computer readable mediums collectively including instructions that, when executed by the processor, cause the processor to create a reward estimation model for estimating a reward for assisting at least one subject by analyzing a history of input by the subject, create a decision making mo…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
It was published on Dec 29, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).