Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06N3/006. Mapped technology areas include Physics.

When was this patent published?

Published February 12, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Efficient dialogue policy learning

US10204097B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10204097-B2
Application number	US-201715619314-A
Country	US
Kind code	B2
Filing date	Jun 9, 2017
Priority date	Aug 16, 2016
Publication date	Feb 12, 2019
Grant date	Feb 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Efficient exploration of natural language conversations associated with dialogue policy learning may be performed using probabilistic distributions. Exploration may comprise identifying key terms associated with the received natural language input utilizing the structured representation. Identifying key terms may include converting raw text of the received natural language input into a structured representation. Exploration may also comprise mapping at least one of the key terms to an action to be performed by the computer system in response to receiving natural language input associated with the at least one key term. Mapping may then be performed using a probabilistic distribution. The action may then be performed by the computer system. A replay buffer may also be utilized by the computer system to track what has occurred in previous conversations. The replay buffer may then be pre-filled with one or more successful dialogues to jumpstart exploration.
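
The replay buffer pre-filling described at the end of the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: the `Transition` fields, the `ReplayBuffer` class, the `spike` method name, and the example dialogue are all hypothetical and chosen only to show the idea of seeding the buffer with successful dialogues before exploration starts.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One dialogue turn as seen by the learner (hypothetical layout)."""
    state: tuple
    action: str
    reward: float
    next_state: tuple
    done: bool


class ReplayBuffer:
    """Fixed-capacity store of past transitions; oldest entries are evicted first."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, transition):
        self._items.append(transition)

    def spike(self, successful_dialogues):
        """Pre-fill the buffer with every turn of each successful dialogue."""
        for dialogue in successful_dialogues:
            for transition in dialogue:
                self.add(transition)

    def sample(self, batch_size, rng=random):
        """Return up to batch_size transitions chosen uniformly without replacement."""
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


# A hypothetical two-turn dialogue that ended in success (reward 1.0 on the last turn).
demo_dialogue = [
    Transition(("greeting",), "request(date)", 0.0, ("date=friday",), False),
    Transition(("date=friday",), "inform(ticket_booked)", 1.0, ("done",), True),
]

buffer = ReplayBuffer(capacity=1000)
buffer.spike([demo_dialogue])  # the buffer now holds the two demonstration turns
```

Seeding the buffer this way means the first training batches already contain at least some rewarded turns, which is the "jumpstart" the abstract refers to.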

First claim

Opening claim text (preview).

What is claimed:
1. A computer system comprising: one or more processors; and one or more computer-readable storage media having stored thereon computer-executable instructions that are executable by the one or more processors to cause the computer system to perform efficient exploration of natural language conversations associated with dialogue policy learning of the computer system, the computer-executable instructions including instructions that are executable to cause the computer system to perform at least the following: in response to receiving natural language input, perform at least the following: identifying key terms associated with the received natural language input, wherein identifying the key terms includes converting raw text of the received natural language input into a structured representation; performing exploration of a natural language conversation associated with the received natural language input, the exploration comprising at least the following: based on the received natural language input, determining a plurality of potential actions that are to be performed by the computer system in response to the received natural language input by performing Thompson sampling using Monte Carlo samples that are associated with the received natural language input; mapping at least one of the key terms to an action selected from among the plurality of potential actions to be performed by the computer system in response to receiving the natural language input associated with the at least one key term, wherein the mapping is performed using a probabilistic distribution; and performing the action.
2. The computer system of claim 1, wherein exploration is performed by Thompson sampling using Monte Carlo samples from a Bayes-by-Back Propagation Q Network (BBQN).
3. The computer system of claim 1, wherein key terms comprise at least one of an act or a key=value pair.
4. The computer system of claim 1, wherein the probabilistic distribution is dynamically learned, such that identified key terms of received natural language input are more accurately mapped to actions to be performed by the system.
5. The computer system of claim 4, wherein the probabilistic distribution is dynamically learned using periodically created target networks.
6. The computer system of claim 1, wherein exploration is performed in an offline environment, such that natural language input is received from a simulated user.
7. The computer system of claim 1, wherein exploration is performed in an online environment, such that natural language input is received from an end user.
8. The computer system of claim 1, wherein a replay buffer is utilized by the computer system to track what has occurred in previous conversations.
9. The computer system of claim 8, wherein replay buffer spiking that comprises pre-filling the replay buffer with one or more successful dialogues is performed.
10. A method, implemented at a computer system that includes one or more processors, for performing efficient exploration of natural language conversations associated with dialogue policy learning, the method comprising: in response to receiving natural language input, performing at least the following: identifying key terms associated with the received natural language input, wherein identifying the key terms includes converting raw text of the received natural language input into a structured representation; performing exploration of a natural language conversation associated with the received natural language input, the exploration being performed using Thompson sampling from a Bayes-by-Back Propagation Q Network (BBQN), the exploration comprising at least the following: mapping at least one of the key terms to an action to be performed by the computer system in response to receiving the natural language input associated with the at least one key term, wherein the mapping is performed using a probabilistic distribution; and performing the action.
11. The method of claim 10, wherein the exploration is performed by Thompson sampling using Monte Carlo samples from the BBQN.
12. The method of claim 10, wherein key terms comprise at least one of an act or a key=value pair.
13. The method of claim 10, wherein the probabilistic distribution is dynamically learned, such that identified key terms of received natural language input are more accurately mapped to actions to be performed by the system.
14. The method of claim 13, wherein the probabilistic distribution is dynamically learned using periodically created target networks.
15. The method of claim 10, wherein exploration is performed in an offline environment, such that natural language input is received from a simulated user.
16. The method of claim 10, wherein exploration is performed in an online environment, such that natural language input is received from an end user.
17. The method of claim 10, wherein a replay buffer is utilized by the computer system to track what has occurred in previous conversations.
18. The method of claim 17, wherein replay buffer spiking that comprises pre-filling the replay buffer with one or more successful dialogues is performed.
19. A computer system comprising: one or more processors; and one or more hardware storage devices having stored thereon computer-executable instructions that are executable by the one or more processors to perform efficient exploration of natural language conversations associated with dialogue policy learning, the computer-executable instructions including instructions that are executable to cause the computer system to perform at least the following: in response to receiving natural language input, perform at least the following: identifying key terms associated with the received natural language input, wherein identifying the key terms includes converting raw text of the received natural language input into a structured representation; performing exploration of a natural language conversation associated with the received natural language input, wherein the exploration is performed by Thompson sampling using Monte Carlo samples from a Bayes-by-back Propagation Q Network (BBQN), the exploration comprising at least the following: exploration comprising at least the following: mapping at least one of the key terms to an action to be performed by the computer system in response to receiving natural language input associated with the at least one key term, wherein mapping is performed using a probabilistic distribution; and performing the action.
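
The Thompson sampling step named in the claims can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed system: it replaces the BBQN with a single linear layer, assumes an independent Gaussian over each weight with standard deviation `log(1 + exp(rho))` (a common parameterization in Bayes-by-Backprop style models), and uses hypothetical action names and a hypothetical feature encoding of the key terms.

```python
import math
import random


def sample_weights(mean, rho, rng):
    """Draw one Monte Carlo sample w = mu + sigma * eps, with sigma = log(1 + exp(rho))."""
    return [
        [m + math.log1p(math.exp(r)) * rng.gauss(0.0, 1.0) for m, r in zip(m_row, r_row)]
        for m_row, r_row in zip(mean, rho)
    ]


def q_values(weights, features):
    """Linear Q-function (an assumption): one row of weights per action."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def thompson_select(mean, rho, features, actions, rng):
    """Sample one set of weights, then act greedily with respect to that sample."""
    weights = sample_weights(mean, rho, rng)
    qs = q_values(weights, features)
    best = max(range(len(actions)), key=qs.__getitem__)
    return actions[best]


# Hypothetical setup: two actions, two indicator features derived from key terms.
actions = ["request(date)", "inform(price)"]
mean = [[0.5, 0.1], [0.2, 0.4]]    # posterior means, one row per action
rho = [[0.0, 0.0], [0.0, 0.0]]     # sigma = log(2) for every weight
features = [1.0, 0.0]              # e.g. "date slot still unfilled"

rng = random.Random(0)
chosen = thompson_select(mean, rho, features, actions, rng)
```

Because the action is greedy only with respect to one random draw of the weights, actions whose value is still uncertain are tried more often, and exploration narrows as the learned standard deviations shrink.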

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N7/01
Probabilistic graphical models, e.g. probabilistic networks · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/00
Computing arrangements based on biological models · CPC title
G06N3/006 (Primary)
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
G06F40/35
Discourse or dialogue representation · CPC title

Patent family

Related publications grouped by family.

View patent family 61191723

Related patents

Reactive learning for efficient dialog tree expansion

Device and method for a spoken dialogue system

Discriminative Policy Training for Dialog Systems

Deep belief network for large vocabulary continuous speech recognition