US2022130378A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022130378-A1 |
| Application number | US-201917418679-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 20, 2019 |
| Priority date | Dec 27, 2018 |
| Publication date | Apr 28, 2022 |
| Grant date | — |
Official abstract text for this publication.
A method and speech processing system for communicating with a user is provided. A speech signal may be received. The received speech signal may be processed by a first unified neural network to extract one or more of intents and entities. The one or more of intents and entities may be analyzed to generate a dialogue response. A second unified neural network may generate a speech output corresponding to the dialogue response for the user. In another example, a single unified neural network may process the received speech signal to extract one or more of intents and entities. The one or more of intents and entities may be analyzed, by the single unified neural network, to generate a dialogue response. The single unified neural network may generate a speech output corresponding to the dialogue response for the user.
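The first arrangement in the abstract (speech in, intents and entities out of one network, a dialogue response, then speech out of a second network) can be sketched as a simple data flow. This is only an illustrative sketch, not the applicant's implementation: the function names, the `Understanding` type, the toy keyword rules, and the use of UTF-8 bytes as a stand-in for audio are all assumptions made so the example runs without a trained model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Understanding:
    """Output of the first unified network: intents plus entities."""
    intents: List[str]
    entities: Dict[str, str]


def first_unified_network(speech_signal: bytes) -> Understanding:
    """Stand-in for the ASR+NLU network (hypothetical toy logic).

    A trained model would hand an internal network state from its ASR
    section to its NLU section; here the 'audio' is just UTF-8 text.
    """
    words = speech_signal.decode("utf-8").lower().split()
    intents = ["get_weather"] if "weather" in words else ["unknown"]
    entities = {"city": words[-1]} if "in" in words else {}
    return Understanding(intents, entities)


def dialogue_manager(u: Understanding) -> Dict[str, str]:
    """Analyze intents and entities and produce an abstract dialogue response."""
    if "get_weather" in u.intents and "city" in u.entities:
        return {"act": "inform_weather", "city": u.entities["city"]}
    return {"act": "ask_repeat"}


def second_unified_network(response: Dict[str, str]) -> bytes:
    """Stand-in for the NLG+TTS network; returns fake 'audio' bytes."""
    if response["act"] == "inform_weather":
        text = f"Here is the weather for {response['city']}."
    else:
        text = "Sorry, could you repeat that?"
    return text.encode("utf-8")


def handle_turn(speech_signal: bytes) -> bytes:
    """One user turn: speech in, speech out."""
    understanding = first_unified_network(speech_signal)
    response = dialogue_manager(understanding)
    return second_unified_network(response)
```

Calling `handle_turn(b"what is the weather in paris")` walks one utterance through all three stages. In the single-network variant described in the abstract, the three functions would be sections of one model rather than separate components.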
Opening claim text (preview).
1. A speech processing system for communicating with a user, comprising: an input interface configured to receive a speech signal; a first unified neural network comprising an automatic speech recognition (ASR) section and a natural language understanding (NLU) section, the first unified neural network configured to process the speech signal to extract one or more of intents and entities; a dialogue manager configured to analyze the one or more of intents and entities to generate a dialogue response; and a second unified neural network comprising a natural language generator (NLG) section and a text-to-speech (TTS) section, the second unified neural network configured to generate a speech output corresponding to the dialogue response for the user.
2. The speech processing system according to claim 1, wherein the ASR section of the first unified neural network is configured to convert the speech signal into a first network state and the NLU section of the first unified neural network is configured to extract the one or more of intents and entities from the first network state.
3. The speech processing system according to claim 2, wherein the NLG section of the second unified neural network is configured to generate a second network state corresponding to the dialogue response and the TTS section of the second unified neural network is configured to convert the second network state into the speech output.
4. The speech processing system according to claim 1, wherein the input interface is further configured to receive one or more events and transmit the one or more events to the dialogue manager.
5. The speech processing system according to claim 4, wherein the dialogue manager is further configured to generate one or more control parameters for the first unified neural network based on the one or more events, wherein the first unified neural network is configured to implement one or more models based on the one or more control parameters.
6. The speech processing system according to claim 1, wherein the dialogue manager is further configured to fetch data from an external database for analyzing the one or more of intents and entities to generate the dialogue response.
7. The speech processing system according to claim 1, wherein the first unified neural network and the second unified neural network are configured to implement at least one of one or more bi-directional Long Short Term Memory (LSTM) neural networks and one or more transformer neural networks.
8. The speech processing system according to claim 1, wherein the first unified neural network is further configured to extract a semantic relationship between the one or more of intents and entities; and the second unified neural network is further configured to analyze the semantic relationship between the one or more of intents and entities to generate the speech output corresponding to the dialogue response for the user.
9. The speech processing system according to claim 1, wherein the entities are composite entities.
10. A speech processing system for communicating with a user, comprising: an input interface configured to receive a speech signal; and a single unified neural network configured to: process the speech signal to extract one or more of intents and entities; analyze the one or more of intents and entities to generate a dialogue response; and generate a speech output corresponding to the dialogue response for the user.
11. The speech processing system according to claim 10, wherein the single unified neural network comprises: an automatic speech recognition (ASR) section configured to convert the speech signal into a first network state; a natural language understanding (NLU) section configured to extract the one or more of intents and entities from the first network state; a natural language generator (NLG) section configured to generate a second network state corresponding to the dialogue response; and a text-to-speech (TTS) section configured to convert the second network state into the speech output.
12. The speech processing system according to claim 10, wherein the single unified neural network comprises a dialogue manager section configured to fetch data from an external database for analyzing the one or more of intents and entities to generate the dialogue response.
13. The speech processing system according to claim 10, wherein the single neural network is further configured to process the speech signal to extract a semantic relationship between the one or more of intents and entities, wherein the single neural network is further configured to analyze the semantic relationship between the one or more of intents and entities to generate the dialogue response.
14. The speech processing system according to claim 10, wherein the entities are composite entities.
15. A computer implemented method for speech processing, comprising: receiving a speech signal; processing, by a first unified neural network, the speech signal to extract one or more of intents and entities; analyzing the one or more of intents and entities to generate a dialogue response; and generating, by a second unified neural network, a speech output corresponding to the dialogue response for the user.
16. The computer implemented method according to claim 15, further comprising converting, by an automatic speech recognition (ASR) section of the first unified neural network, the speech signal into a first network state and extracting, by a natural language understanding (NLU) section of the first unified neural network, the one or more of intents and entities from the first network state.
17. The computer implemented method according to claim 16, further comprising generating, by a natural language generator (NLG) section of the second unified neural network, a second network state corresponding to the dialogue response and converting, by a text-to-speech (TTS) section of the second unified neural network, the second network state into the speech output.
18. The computer implemented method according to claim 15, wherein the entities are composite entities.
19-29. (canceled)
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Supervised learning · CPC title
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title