System and method for communicating with a user with speech processing

US2022130378A1 · US · A1

Patent metadata
Publication number: US-2022130378-A1
Application number: US-201917418679-A
Country: US
Kind code: A1
Filing date: Dec 20, 2019
Priority date: Dec 27, 2018
Publication date: Apr 28, 2022
Grant date: not listed

Abstract

Official abstract text for this publication.

A method and speech processing system for communicating with a user is provided. A speech signal may be received. The received speech signal may be processed by a first unified neural network to extract one or more of intents and entities. The one or more of intents and entities may be analyzed to generate a dialogue response. A second unified neural network may generate a speech output corresponding to the dialogue response for the user. In another example, a single unified neural network may process the received speech signal to extract one or more of intents and entities. The one or more of intents and entities may be analyzed, by the single unified neural network, to generate a dialogue response. The single unified neural network may generate a speech output corresponding to the dialogue response for the user.
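The two-network flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `Understanding`, `DialogueResponse`, `run_turn`, and the `toy_*` functions are hypothetical, the structured response format is an assumption (the abstract does not define it), and the toy stages are plain functions standing in for trained neural networks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Understanding:
    """What the first unified network extracts from the speech signal."""
    intents: List[str] = field(default_factory=list)
    entities: Dict[str, str] = field(default_factory=dict)


@dataclass
class DialogueResponse:
    """Assumed structured response; the source does not define its format."""
    act: str
    slots: Dict[str, str] = field(default_factory=dict)


# Hypothetical stage signatures. Real stages would wrap trained networks.
FirstNetwork = Callable[[bytes], Understanding]
DialogueManager = Callable[[Understanding], DialogueResponse]
SecondNetwork = Callable[[DialogueResponse], bytes]


def run_turn(speech: bytes, first: FirstNetwork,
             manager: DialogueManager, second: SecondNetwork) -> bytes:
    """One dialogue turn: speech signal in, speech output out."""
    understanding = first(speech)        # extract intents and entities
    response = manager(understanding)    # analyze them into a dialogue response
    return second(response)              # generate the speech output


# Toy stand-ins so the sketch runs; they are not speech models.
def toy_first(speech: bytes) -> Understanding:
    # Pretends the "audio" bytes are UTF-8 text, purely for illustration.
    words = speech.decode("utf-8").lower().split()
    if "weather" in words:
        return Understanding(["get_weather"], {"city": words[-1].capitalize()})
    return Understanding(["unknown"])


def toy_manager(u: Understanding) -> DialogueResponse:
    if "get_weather" in u.intents:
        return DialogueResponse("inform_weather", dict(u.entities))
    return DialogueResponse("ask_repeat")


def toy_second(r: DialogueResponse) -> bytes:
    # Pretends the returned bytes are synthesized audio.
    if r.act == "inform_weather":
        return f"Here is the weather for {r.slots['city']}.".encode("utf-8")
    return b"Sorry, could you repeat that?"


audio_out = run_turn(b"weather in boston", toy_first, toy_manager, toy_second)
fallback_out = run_turn(b"hello", toy_first, toy_manager, toy_second)
```

In the single-network example from the abstract, the three callables would collapse into one model that maps the speech signal directly to the speech output.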

Claims

Claim text for this publication. Claim 1 is the first independent claim.

1. A speech processing system for communicating with a user, comprising: an input interface configured to receive a speech signal; a first unified neural network comprising an automatic speech recognition (ASR) section and a natural language understanding (NLU) section, the first unified neural network configured to process the speech signal to extract one or more of intents and entities; a dialogue manager configured to analyze the one or more of intents and entities to generate a dialogue response; and a second unified neural network comprising a natural language generator (NLG) section and a text-to-speech (TTS) section, the second unified neural network configured to generate a speech output corresponding to the dialogue response for the user.

2. The speech processing system according to claim 1, wherein the ASR section of the first unified neural network is configured to convert the speech signal into a first network state and the NLU section of the first unified neural network is configured to extract the one or more of intents and entities from the first network state.

3. The speech processing system according to claim 2, wherein the NLG section of the second unified neural network is configured to generate a second network state corresponding to the dialogue response and the TTS section of the second unified neural network is configured to convert the second network state into the speech output.

4. The speech processing system according to claim 1, wherein the input interface is further configured to receive one or more events and transmit the one or more events to the dialogue manager.

5. The speech processing system according to claim 4, wherein the dialogue manager is further configured to generate one or more control parameters for the first unified neural network based on the one or more events, wherein the first unified neural network is configured to implement one or more models based on the one or more control parameters.

6. The speech processing system according to claim 1, wherein the dialogue manager is further configured to fetch data from an external database for analyzing the one or more of intents and entities to generate the dialogue response.

7. The speech processing system according to claim 1, wherein the first unified neural network and the second unified neural network are configured to implement at least one of one or more bi-directional Long Short Term Memory (LSTM) neural networks and one or more transformer neural networks.

8. The speech processing system according to claim 1, wherein the first unified neural network is further configured to extract a semantic relationship between the one or more of intents and entities; and the second unified neural network is further configured to analyze the semantic relationship between the one or more of intents and entities to generate the speech output corresponding to the dialogue response for the user.

9. The speech processing system according to claim 1, wherein the entities are composite entities.

10. A speech processing system for communicating with a user, comprising: an input interface configured to receive a speech signal; and a single unified neural network configured to: process the speech signal to extract one or more of intents and entities; analyze the one or more of intents and entities to generate a dialogue response; and generate a speech output corresponding to the dialogue response for the user.

11. The speech processing system according to claim 10, wherein the single unified neural network comprises: an automatic speech recognition (ASR) section configured to convert the speech signal into a first network state; a natural language understanding (NLU) section configured to extract the one or more of intents and entities from the first network state; a natural language generator (NLG) section configured to generate a second network state corresponding to the dialogue response; and a text-to-speech (TTS) section configured to convert the second network state into the speech output.

12. The speech processing system according to claim 10, wherein the single unified neural network comprises a dialogue manager section configured to fetch data from an external database for analyzing the one or more of intents and entities to generate the dialogue response.

13. The speech processing system according to claim 10, wherein the single neural network is further configured to process the speech signal to extract a semantic relationship between the one or more of intents and entities, wherein the single neural network is further configured to analyze the semantic relationship between the one or more of intents and entities to generate the dialogue response.

14. The speech processing system according to claim 10, wherein the entities are composite entities.

15. A computer implemented method for speech processing, comprising: receiving a speech signal; processing, by a first unified neural network, the speech signal to extract one or more of intents and entities; analyzing the one or more of intents and entities to generate a dialogue response; and generating, by a second unified neural network, a speech output corresponding to the dialogue response for the user.

16. The computer implemented method according to claim 15, further comprising converting, by an automatic speech recognition (ASR) section of the first unified neural network, the speech signal into a first network state and extracting, by a natural language understanding (NLU) section of the first unified neural network, the one or more of intents and entities from the first network state.

17. The computer implemented method according to claim 16, further comprising generating, by a natural language generator (NLG) section of the second unified neural network, a second network state corresponding to the dialogue response and converting, by a text-to-speech (TTS) section of the second unified neural network, the second network state into the speech output.

18. The computer implemented method according to claim 15, wherein the entities are composite entities.

19-29. (canceled)
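Two of the dependent claims lend themselves to a small illustration: composite entities, and events that lead the dialogue manager to issue control parameters selecting a model in the first network. The sketch below is only one possible reading. The nested encoding of a composite entity, the event name `incoming_call`, the `model` parameter key, and the classes `CompositeEntity` and `FirstNetworkStub` are all assumptions made for illustration; the claims do not specify any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Entity:
    name: str
    value: str


@dataclass
class CompositeEntity:
    """An entity made of other entities (one assumed encoding)."""
    name: str
    parts: List[Union[Entity, "CompositeEntity"]] = field(default_factory=list)

    def flatten(self, prefix: str = "") -> Dict[str, str]:
        """Return dotted-path keys for every leaf entity."""
        path = f"{prefix}{self.name}."
        flat: Dict[str, str] = {}
        for part in self.parts:
            if isinstance(part, CompositeEntity):
                flat.update(part.flatten(path))
            else:
                flat[path + part.name] = part.value
        return flat


def control_parameters(events: List[str]) -> Dict[str, str]:
    """Hypothetical dialogue-manager rule mapping events to control parameters."""
    if "incoming_call" in events:
        return {"model": "telephony"}
    return {"model": "general"}


class FirstNetworkStub:
    """Stand-in for the first unified network; holds named models."""

    def __init__(self, models: Dict[str, str]) -> None:
        self.models = models
        self.active = "general"

    def apply(self, params: Dict[str, str]) -> str:
        """Switch the active model if the requested one is available."""
        requested = params.get("model", self.active)
        if requested in self.models:
            self.active = requested
        return self.active


order = CompositeEntity("order", [
    Entity("size", "large"),
    CompositeEntity("drink", [Entity("type", "latte")]),
])
flat = order.flatten()

net = FirstNetworkStub({"general": "base model", "telephony": "narrowband model"})
active = net.apply(control_parameters(["incoming_call"]))
```

Here `flat` maps `order.size` and `order.drink.type` to their values, and an unknown model name leaves the active model unchanged.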

Assignees

Telepathy Labs Inc

Inventors

Classifications

CPC titles for this publication.

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Auto-encoder networks; Encoder-decoder networks

  • Supervised learning

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022130378A1 cover?
A method and speech processing system for communicating with a user is provided. A speech signal may be received. The received speech signal may be processed by a first unified neural network to extract one or more of intents and entities. The one or more of intents and entities may be analyzed to generate a dialogue response. A second unified neural network may generate a speech output corresp…
Who is the assignee on this patent?
Telepathy Labs Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/1815. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 28, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).