What technology area does this patent fall under?

Primary CPC classification G10L15/16. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 22, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Artificial intelligence apparatus and method for recognizing speech in consideration of utterance style

US11508358B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11508358-B2
Application number	US-201916709087-A
Country	US
Kind code	B2
Filing date	Dec 10, 2019
Priority date	Sep 30, 2019
Publication date	Nov 22, 2022
Grant date	Nov 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein an artificial intelligence apparatus for recognizing speech in consideration of an utterance style including a microphone, and a processor configured to obtain, via the microphone, speech data including speech of a user, extract an utterance feature vector from the obtained speech data, determine an utterance style corresponding to the speech based on the extracted utterance feature vector, and generate a speech recognition result using a speech recognition model corresponding to the determined utterance style.
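The four steps in the abstract (obtain speech data, extract an utterance feature vector, determine the utterance style, recognize with the model for that style) can be sketched as a simple dispatch. This is only an illustrative sketch: the names `recognize`, `toy_extract`, `toy_style`, and `toy_models` are hypothetical and do not come from the patent, and the toy feature (payload length standing in for speech speed) is an assumption made so the example runs.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type aliases; the patent does not define concrete data types.
FeatureVector = Sequence[float]
Recognizer = Callable[[bytes], str]


def recognize(
    speech_data: bytes,
    extract_features: Callable[[bytes], FeatureVector],
    determine_style: Callable[[FeatureVector], str],
    models: Dict[str, Recognizer],
) -> str:
    """Extract features, determine the style, then use that style's model."""
    features = extract_features(speech_data)
    style = determine_style(features)
    return models[style](speech_data)


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_extract(speech_data: bytes) -> FeatureVector:
    # Assumption: payload length stands in for a "speech speed" feature.
    return [float(len(speech_data))]


def toy_style(features: FeatureVector) -> str:
    return "fast" if features[0] > 10 else "slow"


toy_models: Dict[str, Recognizer] = {
    "fast": lambda data: "fast-model transcript",
    "slow": lambda data: "slow-model transcript",
}

result = recognize(b"short", toy_extract, toy_style, toy_models)
```

In a real system the extractor and the style determination model would be learned components; here they are plain functions so that the selection of a per-style model is the only thing on display.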

First claim

Opening claim text (preview).

What is claimed is:

1. An artificial intelligence apparatus for recognizing speech in consideration of an utterance style, comprising: a microphone; a memory configured to store a plurality of speech recognition models corresponding to a plurality of utterance styles, respectively; and a processor configured to: obtain, via the microphone, speech data including speech of a user, extract an utterance feature vector from the obtained speech data, determine an utterance style corresponding to the speech based on the extracted utterance feature vector and an utterance style determination model, determine whether the determined utterance style corresponding to the speech is a new utterance style, based on a determination that the determined utterance style corresponding to the speech is not the new utterance style, generate a first speech recognition result by using a speech recognition model corresponding to the determined utterance style, the speech recognition model having been selected from among the plurality of speech recognition models stored in the memory, based on a determination that the determined utterance style corresponding to the speech is the new utterance style, generate a new speech recognition model and generate a second speech recognition result by using the generated new speech recognition model, wherein the processor is configured to: map the extracted utterance feature vector to an utterance feature space, determine a cluster closest to the mapped utterance feature vector among clusters corresponding to a plurality of previously learned utterance styles, determine the utterance style corresponding to the extracted utterance feature vector as an utterance style corresponding to the closest cluster based on a distance from the mapped utterance feature vector to the closest cluster being less than a predetermined reference value, the utterance style corresponding to the closest cluster being stored in the memory, and determine the utterance style corresponding to the utterance feature vector as a new utterance style based on the distance being equal to or greater than the predetermined reference value, wherein the utterance style corresponding to the closest cluster is not one of the plurality of speech recognition models stored in the memory.

2. The artificial intelligence apparatus of claim 1, wherein the speech recognition model is learned using training data including speech data corresponding to an utterance style corresponding to the speech recognition model.

3. The artificial intelligence apparatus of claim 1, wherein the first or second speech recognition result includes intent information corresponding to the speech.

4. The artificial intelligence apparatus of claim 1, wherein the utterance style determination model includes an artificial neural network and is learned using a machine learning algorithm or a deep learning algorithm.

5. The artificial intelligence apparatus of claim 1, wherein the processor is configured to: generate training data corresponding to the extracted utterance feature vector if the determined utterance style is a new utterance style, learn the new speech recognition model corresponding to the new utterance style using the generated training data, and generate the second speech recognition result corresponding to the speech using the learned new speech recognition model.

6. The artificial intelligence apparatus of claim 5, wherein the processor is configured to: generate speech data corresponding to the extracted utterance feature vector from a predetermined text set using a text-to-speech (TTS) engine, and generate training data including the predetermined text set and the generated speech data.

7. The artificial intelligence apparatus of claim 1, wherein the utterance feature vector includes at least one of a gender of a speaker, a speech speed, a pronunciation, a pronunciation stress, a pause interval, a pitch, a tone, an intonation, a rhyme or an emotion.

8. A method of recognizing speech in consideration of an utterance style, the method comprising: obtaining, via a microphone, speech data including speech of a user; extracting an utterance feature vector from the obtained speech data; determining an utterance style corresponding to the speech based on the extracted utterance feature vector and an utterance style determination model; determining whether the determined utterance style corresponding to the speech is a new utterance style; based on a determination that the determined utterance style corresponding to the speech is not the new utterance style, generating a first speech recognition result by using a speech recognition model corresponding to the determined utterance style, the speech recognition model having been selected from among a plurality of speech recognition models stored in a memory; based on a determination that the determined utterance style corresponding to the speech is the new utterance style, generating a new speech recognition model and generating a second speech recognition result by using the generated new speech recognition model, wherein the memory is configured to store the plurality of speech recognition models corresponding to a plurality of utterance styles, respectively, and wherein determining whether the determined utterance style corresponding to the speech is the new utterance style comprises: mapping the extracted utterance feature vector to an utterance feature space, determining a cluster closest to the mapped utterance feature vector among clusters corresponding to a plurality of previously learned utterance styles, determining the utterance style corresponding to the extracted utterance feature vector as an utterance style corresponding to the closest cluster based on a distance from the mapped utterance feature vector to the closest cluster being less than a predetermined reference value, the utterance style corresponding to the closest cluster being stored in the memory, and determining the utterance style corresponding to the utterance feature vector as a new utterance style based on the distance being equal to or greater than the predetermined reference value, wherein the utterance style corresponding to the closest cluster is not one of the plurality of speech recognition models stored in the memory.

9. A non-transitory computer-readable medium having recorded thereon a program for performing a method of recognizing speech in consideration of an utterance style, the method comprising: obtaining, via a microphone, speech data including speech of a user; extracting an utterance feature vector from the obtained speech data; determining an utterance style corresponding to the speech based on the extracted utterance feature vector and an utterance style determination model; determining whether the determined utterance style corresponding to the speech is a new utterance style; based on a determination that the determined utterance style corresponding to the speech is not the new utterance style, generating a first speech recognition result by using a speech recognition model corresponding to the determined utterance style, the speech recognition model having been selected from among a plurality of speech recognition models stored in a memory; based on a determination that the determined utterance style corresponding to the speech is the new utterance style, generating a new speech recognition model and generating a second speech recognition result by using the generated new speech recognition model, wherein the memory is configured to store the plurality of speech recognition models corresponding to a plurality of utterance styles, respectively, and wherein determining
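The cluster test recited in the independent claims (find the closest cluster in the utterance feature space, then compare the distance against a predetermined reference value) can be illustrated with a small sketch. Several things here are assumptions, not taken from the patent: each cluster is represented by a single centroid, distance is Euclidean, and the names `StyleRegistry`, `closest`, and `classify` are hypothetical. Registering the incoming vector as the seed of a new cluster is also a simplification of the claimed step of generating a new speech recognition model.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Vector = Tuple[float, ...]


@dataclass
class StyleRegistry:
    """Known utterance styles, each represented by one centroid (assumption)."""

    threshold: float  # the "predetermined reference value"
    centroids: Dict[str, Vector] = field(default_factory=dict)

    def closest(self, v: Vector) -> Tuple[Optional[str], float]:
        """Return the nearest style and its Euclidean distance to v."""
        best_style, best_dist = None, math.inf
        for style, centroid in self.centroids.items():
            d = math.dist(v, centroid)
            if d < best_dist:
                best_style, best_dist = style, d
        return best_style, best_dist

    def classify(self, v: Vector) -> Tuple[str, bool]:
        """Return (style, is_new). Distance below threshold means known style."""
        style, dist = self.closest(v)
        if style is not None and dist < self.threshold:
            return style, False
        # Distance is equal to or greater than the threshold: treat as new.
        new_style = f"style_{len(self.centroids)}"
        self.centroids[new_style] = v
        return new_style, True


registry = StyleRegistry(
    threshold=1.0,
    centroids={"calm": (0.0, 0.0), "excited": (5.0, 5.0)},
)
known = registry.classify((0.3, 0.4))   # distance 0.5 to "calm"
novel = registry.classify((2.0, 2.0))   # about 2.83 from "calm", the closest
```

A caller would use the boolean to choose between selecting a stored recognition model and training a new one, for example from TTS-generated speech as described in claims 5 and 6.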

Assignees

LG Electronics Inc

Inventors

Chae Jonghoon

Classifications

G10L15/16 · Primary
using artificial neural networks · CPC title
G10L25/51 · Primary
for comparison or discrimination · CPC title
G06N3/042
Knowledge-based neural networks; Logical representations of neural networks · CPC title
G06F18/23
Clustering techniques · CPC title
G06N3/045
Combinations of networks · CPC title

Patent family

Related publications grouped by family.

View patent family 68462810

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11508358B2 cover?: Disclosed herein an artificial intelligence apparatus for recognizing speech in consideration of an utterance style including a microphone, and a processor configured to obtain, via the microphone, speech data including speech of a user, extract an utterance feature vector from the obtained speech data, determine an utterance style corresponding to the speech based on the extracted utterance fe…
Who is the assignee on this patent?: LG Electronics Inc

Related patents

Intent recognition and emotional text-to-speech learning

Method and apparatus for processing voice data of speech

Speech recognition method and apparatus

Speech processing device, speech processing method, and computer program product

Cluster specific speech model
