Voice processing method based on artificial intelligence

US11790893B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11790893-B2
Application number	US-202017039169-A
Country	US
Kind code	B2
Filing date	Sep 30, 2020
Priority date	Nov 22, 2019
Publication date	Oct 17, 2023
Grant date	Oct 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A voice processing method is disclosed. The voice processing method applies first and second sentence vectors extracted from first and second utterances, that are included in one dialog group and are separated from each other, to a learning model and generates an output from which at least one word having an overlapping meaning is removed. The voice processing method can be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
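
The overlap-removal step in the abstract can be illustrated with a small sketch. The first claim narrows it to a similarity test: each per-word sub-vector of the second utterance is compared with the first sentence vector, and words whose similarity is equal to or greater than a threshold are dropped. The code below is only an illustration under assumptions. The function names, the use of cosine similarity, the toy 2-dimensional vectors, and the 0.8 threshold are all hypothetical. The patent itself relies on a pre-trained learning model and does not name a similarity measure in the text shown here.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (an assumed measure; the claim only says "a similarity")."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def remove_overlapping_words(
    first_sentence_vec: Sequence[float],
    second_words: List[str],
    second_sub_vecs: List[Sequence[float]],
    threshold: float = 0.8,  # hypothetical value, not taken from the patent
) -> List[str]:
    """Keep only the second-utterance words that do not overlap with the first utterance.

    A word is treated as overlapping when the similarity between its sub-vector
    and the first sentence vector is equal to or greater than the threshold.
    """
    kept = []
    for word, sub_vec in zip(second_words, second_sub_vecs):
        if cosine_similarity(first_sentence_vec, sub_vec) >= threshold:
            continue  # overlapping meaning: drop the word
        kept.append(word)
    return kept


# Toy example with made-up 2-dimensional vectors.
first_vec = [1.0, 0.0]
words = ["weather", "tomorrow"]
sub_vecs = [[0.9, 0.1], [0.0, 1.0]]
print(remove_overlapping_words(first_vec, words, sub_vecs))
```

In this toy input the sub-vector for "weather" points almost in the same direction as the first sentence vector, so it is removed, while "tomorrow" is orthogonal to it and is kept.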

First claim

Opening claim text (preview).

What is claimed is:

1. A voice processing method for controlling an artificial intelligence device, the voice processing method comprising: in response to detecting, by a processor in the artificial intelligence device, a stop signal during a reception of a first utterance, temporarily pausing the reception of the first utterance; receiving, by the processor, a sub-utterance while the reception of the first utterance is temporarily paused; outputting, by the processor, a first result corresponding to the sub-utterance while the reception of the first utterance is temporarily paused; receiving, by the processor, a second utterance after a termination of a temporary pause state based on the stop signal; applying, by the processor, a concatenated vector concatenating first and second sentence vectors extracted from the first and second utterances to a pre-trained learning model to generate an output from which at least one word having an overlapping meaning is removed; and outputting, by the processor, a second result according to the output generated by the pre-trained learning model, the second result being different than the first result, wherein the stop signal is a voice signal corresponding to one of a hesitation word, a silent delay, or a preset temporary pause keyword or sound, wherein the artificial intelligence device is prevented from providing an answer to the first utterance while the reception of the first utterance is paused, wherein the second sentence vector is a vector concatenating a plurality of sub-vectors extracted from at least one word included in the second utterance, wherein generating the output comprises: calculating a similarity between the first sentence vector and at least one of the plurality of sub-vectors constituting the second sentence vector; and in response to determining that the first sentence vector and the at least one of the plurality of sub-vectors have the overlapping meaning based on the similarity, generating the output from which the at least one word having the overlapping meaning is removed, and wherein the at least one word having the overlapping meaning is a word corresponding to at least one of the plurality of sub-vectors having a calculated similarity that is equal to or greater than a threshold.

2. The voice processing method of claim 1, further comprising, if the reception of the first utterance is temporarily paused, waiting for an additional voice input for the first utterance that is input before the temporary pause state.

3. The voice processing method of claim 1, wherein the first sentence vector is a vector representing an overall content of the first utterance.

4. The voice processing method of claim 1, wherein the first and second sentence vectors are extracted by a convolutional neural network (CNN).

5. The voice processing method of claim 1, wherein the learning model is a learning model based on an artificial neural network, wherein the artificial neural network includes an input layer, a hidden layer, and an output layer each having at least one node.

6. The voice processing method of claim 5, wherein the learning model is a learning model based on a recurrent neural network (RNN).

7. The voice processing method of claim 5, wherein at least some nodes in the artificial neural network have different weights in order to generate the output.

8. The voice processing method of claim 1, wherein the second utterance is an utterance belonging to a same dialog group as the first utterance.

9. A non-transitory computer readable recording medium on which a program for implementing the method according to claim 1 is recorded.

10. A voice processing method for controlling an artificial intelligence device, the voice processing method comprising: in response to detecting, by a processor in the artificial intelligence device, a stop signal while a first utterance is transmitted to a server, temporarily pausing transmission of the first utterance; receiving, by the processor, a sub-utterance while the transmission of the first utterance is temporarily paused; outputting, by the processor, a first result corresponding to the sub-utterance while the transmission of the first utterance is temporarily paused; transmitting, by the processor, a second utterance to the server after a termination of a temporary pause state based on the stop signal; applying, by the processor, a concatenated vector concatenating first and second sentence vectors extracted from the first and second utterances to a pre-trained learning model and receiving, from the server, an output from which at least one word having an overlapping meaning is removed; and outputting, by the processor, a second result according to the output from the server, the second result being different than the first result, wherein the stop signal is a voice signal corresponding to one of a hesitation word, a silent delay, or a preset temporary pause keyword or sound, wherein the artificial intelligence device is prevented from providing an answer to the first utterance while the reception of the first utterance is paused, wherein the second sentence vector is a vector concatenating a plurality of sub-vectors extracted from at least one word included in the second utterance, wherein generating the output comprises: calculating a similarity between the first sentence vector and at least one of the plurality of sub-vectors constituting the second sentence vector; and in response to determining that the first sentence vector and the at least one of the plurality of sub-vectors have the overlapping meaning based on the similarity, generating the output from which the at least one word having the overlapping meaning is removed, and wherein the at least one word having the overlapping meaning is a word corresponding to at least one of the plurality of sub-vectors having a calculated similarity that is equal to or greater than a threshold.

11. The voice processing method of claim 10, further comprising: receiving, from a network, downlink control information (DCI) used to schedule the transmission of the first and second utterances; and transmitting the first and second utterances to the network based on the DCI.

12. The voice processing method of claim 11, further comprising: performing an initial access procedure with the network based on a synchronization signal block (SSB); and transmitting the first and second utterances to the network via a physical uplink shared channel (PUSCH), wherein the SSB and a demodulation reference signal (DM-RS) of the PUSCH are QCLed for QCL (quasi co-located) type D.

13. The voice processing method of claim 12, further comprising: controlling a communication module to transmit the first and second utterances to an AI processor included in the network; and controlling the communication module to receive AI processing information from the AI processor, wherein the AI processing information is voice information synthesized based on the output from which the at least one word having the overlapping meaning is removed.

14. An artificial intelligence device for voice processing, comprising: a memory configured to store utterances from a user; and a processor configured to: detect a stop signal during a reception of a first utterance, and temporarily pause the reception of the first utterance, receive a sub-utterance while the reception of the first utterance is temporarily paused, output a first result corresponding to the sub-utterance while the reception of the first utterance is temporarily paused, receive a second utterance after a termination of a temp
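
The pause-and-resume flow recited in claim 1 can be pictured as a small state machine: a stop signal interrupts the first utterance, sub-utterances are answered while the pause lasts, and a second utterance is collected once the pause ends. The following is a minimal sketch under assumptions, not the patented implementation. The token-based input, the example hesitation words, the pause keyword, the `<silence>` marker, and the `resume` keyword that ends the pause are all hypothetical, since the claim text shown here does not say how the pause state terminates.

```python
from enum import Enum, auto
from typing import List

# All of the concrete signals below are hypothetical placeholders.
HESITATION_WORDS = {"um", "uh"}
PAUSE_KEYWORDS = {"hold-on"}
SILENCE_MARKER = "<silence>"   # stands in for a detected silent delay
RESUME_KEYWORD = "resume"      # assumed way of ending the pause state


class State(Enum):
    RECEIVING_FIRST = auto()
    PAUSED = auto()
    RECEIVING_SECOND = auto()


def is_stop_signal(token: str) -> bool:
    """A hesitation word, a silent delay, or a preset pause keyword."""
    return (
        token in HESITATION_WORDS
        or token in PAUSE_KEYWORDS
        or token == SILENCE_MARKER
    )


class DialogSession:
    def __init__(self) -> None:
        self.state = State.RECEIVING_FIRST
        self.first_utterance: List[str] = []
        self.second_utterance: List[str] = []
        self.sub_results: List[str] = []

    def feed(self, token: str) -> None:
        if self.state is State.RECEIVING_FIRST:
            if is_stop_signal(token):
                # Pause reception; no answer to the first utterance is given yet.
                self.state = State.PAUSED
            else:
                self.first_utterance.append(token)
        elif self.state is State.PAUSED:
            if token == RESUME_KEYWORD:
                self.state = State.RECEIVING_SECOND
            else:
                # Sub-utterance: produce the "first result" during the pause.
                self.sub_results.append(f"answer to sub-utterance: {token}")
        else:
            self.second_utterance.append(token)


session = DialogSession()
for tok in ["call", "um", "who-is-the-lead", "resume", "call", "Kim"]:
    session.feed(tok)
print(session.first_utterance, session.sub_results, session.second_utterance)
```

After the loop, the first and second utterances are both available, which is the point at which the claimed method would extract sentence vectors and pass them to the learning model.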

Assignees

LG Electronics Inc.

Classifications

G10L15/16 · Primary
using artificial neural networks · CPC title
H04W72/1268
of uplink data flows · CPC title
H04W72/23
in the downlink direction of a wireless link, i.e. towards a terminal · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L25/48
specially adapted for particular use · CPC title
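
For readers unfamiliar with CPC symbols, the codes above follow a fixed layout: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup after the slash. Section G is Physics and section H is Electricity, which is why this record maps to Physics through its primary code. The sketch below splits a symbol into those parts. The function and type names are hypothetical, and only the two sections that appear on this page are included in the lookup table.

```python
import re
from typing import NamedTuple

# Only the sections that appear on this page; other CPC sections are omitted.
SECTION_TITLES = {"G": "Physics", "H": "Electricity"}

CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


class CpcSymbol(NamedTuple):
    section: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol such as 'G10L15/16' into its components."""
    match = CPC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


for code in ["G10L15/16", "H04W72/1268", "H04W72/23", "G10L15/22", "G10L25/48"]:
    parsed = parse_cpc(code)
    print(code, parsed, SECTION_TITLES.get(parsed.section, "unknown"))
```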

Patent family

Related publications grouped by family.

View patent family 75975450

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11790893B2 cover?
A voice processing method is disclosed. The voice processing method applies first and second sentence vectors extracted from first and second utterances, that are included in one dialog group and are separated from each other, to a learning model and generates an output from which at least one word having an overlapping meaning is removed. The voice processing method can be associated with an a…

Who is the assignee on this patent?
LG Electronics Inc.

What technology area does this patent fall under?
Primary CPC classification G10L15/16. Mapped technology areas include Physics.

When was this patent published?
Publication date Oct 17, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).