Method and apparatus for generating dialogue model

US11537798B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11537798-B2
Application number: US-202016895297-A
Country: US
Kind code: B2
Filing date: Jun 8, 2020
Priority date: Dec 27, 2019
Publication date: Dec 27, 2022
Grant date: Dec 27, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample including the input information, the target response information, and a discrete hidden variable; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.
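
The sample-construction step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the three response directions, the keyword rules in `classify_direction`, and all class and function names are hypothetical stand-ins, since the page does not say how samples are classified. The only property taken from the text is that each value of the discrete hidden variable corresponds to one class of target response.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CorpusSample:
    input_info: str
    target_response: str


@dataclass(frozen=True)
class TrainingSample:
    input_info: str
    target_response: str
    hidden_variable: int  # discrete value, one per response class


# Hypothetical response directions (an assumption, not from the patent text).
DIRECTIONS = ["agree", "decline", "ask_back"]
DECLINE_WORDS = {"no", "not", "sorry", "can't"}


def classify_direction(sample: CorpusSample) -> str:
    """Toy keyword classifier standing in for the unspecified classification step."""
    text = sample.target_response.lower()
    if "?" in text:
        return "ask_back"
    words = set(re.findall(r"[a-z']+", text))
    if words & DECLINE_WORDS:
        return "decline"
    return "agree"


def build_training_set(corpus: List[CorpusSample]) -> List[TrainingSample]:
    """Attach a discrete hidden variable to every corpus sample."""
    training_set = []
    for sample in corpus:
        z = DIRECTIONS.index(classify_direction(sample))
        training_set.append(
            TrainingSample(sample.input_info, sample.target_response, z)
        )
    return training_set


corpus = [
    CorpusSample("Do you like hot pot?", "Yes, I love it."),
    CorpusSample("Do you like hot pot?", "Sorry, I can't eat spicy food."),
    CorpusSample("Do you like hot pot?", "Which restaurant do you mean?"),
]
training_set = build_training_set(corpus)
```

The same input appears three times with different target responses, so the hidden variable is what distinguishes the samples during training.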

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating a dialogue model, the method comprising: acquiring a corpus sample set, a corpus sample comprising input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample comprising the input information, the target response information, and a discrete hidden variable, wherein the corpus samples in the corpus sample set are classified according to a response direction of the target response information in the corpus sample set, and each value of the discrete hidden variables corresponds to a given response direction of the target response information; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.

2. The method according to claim 1, wherein the preset neural network is a conversion neural network, and the conversion neural network is a neural network supporting unidirectional encoding and bidirectional encoding of text information; the training the preset neural network using the training sample set to obtain the dialogue model, comprises: for the training sample in the training sample set, using the input information and the discrete hidden variable in the training sample as input, using the target response information in the training sample as expected output, training the conversion neural network based on one-way attention mechanism using a preset loss function, and updating a parameter of the conversion neural network to obtain an updated conversion neural network; and using the input information and the target response information in the training sample as input of the updated conversion neural network, using the discrete hidden variable in the training sample as expected output, and training the updated conversion neural network based on two-way attention mechanism using a loss function to obtain the dialogue model.

3. The method according to claim 2, wherein the loss function comprises at least one of: a negative log likelihood loss function, a bag of words loss function, or a response selection loss function; wherein, the bag-of-word loss function is used to represent a degree of difference between a discrete hidden variable outputted by the conversion neural network and the discrete hidden variable in the training sample; and the response selection loss function is used to represent a degree of difference between target response information outputted by the conversion neural network and the target response information in the training sample.

4. The method according to claim 1, wherein the training the preset neural network using the training sample set to obtain the dialogue model, comprises: converting the training sample into a text vector for the training sample in the training sample set; and inputting the text vector into the preset neural network for training to obtain the dialogue model.

5. The method according to claim 4, wherein the converting the training sample into the text vector for the training sample in the training sample set, comprises: performing word segmentation on the input information and the target response information in the training sample for the training sample in the training sample set; and converting the training sample into the text vector, based on role information, word type information, dialogue round information and position information of each of the segmented words in the training sample.

6. An apparatus for generating a dialogue model, the apparatus comprising: at least one processor; and a memory storing instructions, the instructions when executed by the at least one processor, causing the at least one processor to perform operations, the operations comprising: acquiring a corpus sample set, a corpus sample comprising input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample comprising the input information, the target response information, and a discrete hidden variable, wherein the corpus samples in the corpus sample set are classified according to a response direction of the target response information in the corpus sample set, and each value of the discrete hidden variables corresponds to a given response direction of the target response information; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.

7. The apparatus according to claim 6, wherein the preset neural network is a conversion neural network, and the conversion neural network is a neural network supporting unidirectional encoding and bidirectional encoding of text information; the training the preset neural network using the training sample set to obtain the dialogue model, comprises: for the training sample in the training sample set, using the input information and the discrete hidden variable in the training sample as input, using the target response information in the training sample as expected output, training the conversion neural network based on one-way attention mechanism using a preset loss function, and updating a parameter of the conversion neural network to obtain an updated conversion neural network; and using the input information and the target response information in the training sample as input of the updated conversion neural network, using the discrete hidden variable in the training sample as expected output, and training the updated conversion neural network based on two-way attention mechanism using a loss function to obtain the dialogue model.

8. The apparatus according to claim 7, wherein the loss function comprises at least one of: a negative log likelihood loss function, a bag of words loss function, or a response selection loss function; wherein, the bag-of-word loss function is used to represent a degree of difference between a discrete hidden variable outputted by the conversion neural network and the discrete hidden variable in the training sample; and the response selection loss function is used to represent a degree of difference between target response information outputted by the conversion neural network and the target response information in the training sample.

9. The apparatus according to claim 6, wherein the training the preset neural network using the training sample set to obtain the dialogue model, comprises: converting the training sample into a text vector for the training sample in the training sample set; and inputting the text vector into the preset neural network for training to obtain the dialogue model.

10. The apparatus according to claim 9, wherein the converting the training sample into the text vector for the training sample in the training sample set, comprises: performing word segmentation on the input information and the target response information in the training sample for the training sample in the training sample set; and converting the training sample into the text vector, based on role information, word type information, dialogue round information and position information of each of the segmented words in the training sample.

11. A non-transitory computer readable medium, storing a computer prog
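
The two training passes in the claims differ in what is fed in, what is expected out, and which attention pattern is used. The sketch below illustrates only that arrangement. It is an assumption-laden illustration: the function names, the `[z=...]` marker token, and the choice to let every position see the whole prefix during the one-way pass are hypothetical, and no network or loss function is implemented.

```python
from typing import Dict, List


def attention_mask(prefix_len: int, response_len: int,
                   bidirectional: bool) -> List[List[bool]]:
    """Return mask[i][j], True when position i may attend to position j.

    Two-way: every position sees every position.
    One-way (assumed layout): all positions see the prefix; a response
    position additionally sees itself and earlier response positions.
    """
    n = prefix_len + response_len
    mask = []
    for i in range(n):
        row = []
        for j in range(n):
            if bidirectional or j < prefix_len:
                row.append(True)
            else:
                row.append(i >= prefix_len and j <= i)
        mask.append(row)
    return mask


def training_phases(input_tokens: List[str], response_tokens: List[str],
                    z: int) -> List[Dict]:
    """Arrange one training sample for the two passes described in the claims."""
    # Pass 1: input information + hidden variable in, target response out,
    # one-way attention.
    generation = {
        "inputs": [f"[z={z}]"] + input_tokens,
        "expected": response_tokens,
        "mask": attention_mask(1 + len(input_tokens), len(response_tokens),
                               bidirectional=False),
    }
    # Pass 2: input information + target response in, hidden variable out,
    # two-way attention.
    recognition = {
        "inputs": input_tokens + response_tokens,
        "expected": z,
        "mask": attention_mask(len(input_tokens), len(response_tokens),
                               bidirectional=True),
    }
    return [generation, recognition]


generation, recognition = training_phases(
    ["how", "are", "you"], ["fine", "thanks"], z=1
)
```

In the one-way mask the first response token cannot see the second, while in the two-way mask every entry is True, which is the practical difference between the two passes.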

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • Natural language query formulation

  • G06F40/35 (Primary): Discourse or dialogue representation

  • Combinations of networks

  • Using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages

  • Speech to text systems (G10L15/08 takes precedence)

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11537798B2 cover?
Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a trai…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/35. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 27, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).