Method and apparatus for classifying lexical stress

US9928832B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9928832-B2
Application number	US-201414320152-A
Country	US
Kind code	B2
Filing date	Jun 30, 2014
Priority date	Dec 16, 2013
Publication date	Mar 27, 2018
Grant date	Mar 27, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for classifying lexical stress in an utterance includes generating a feature vector representing stress characteristics of a syllable occurring in the utterance, wherein the feature vector includes a plurality of features based on prosodic information and spectral information, computing a plurality of scores, wherein each of the plurality of scores is related to a probability of a given class of lexical stress, and classifying the lexical stress of the syllable based on the plurality of scores.
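The three steps in the abstract (build a feature vector for a syllable, compute one score per stress class, classify from the scores) can be sketched as follows. This is a minimal illustration and not the patented implementation: the three feature dimensions, the diagonal Gaussian class models, their parameters, and the function names `score_syllable` and `classify` are all hypothetical, chosen only so the example runs.

```python
import math

# Hypothetical stress classes; the abstract only refers to "a given class of lexical stress".
CLASSES = ("primary", "secondary", "unstressed")

# Hypothetical diagonal Gaussian models (mean, variance) per class over an assumed
# 3-dimensional feature vector: [normalized pitch, normalized energy, spectral tilt].
MODELS = {
    "primary": ([1.0, 1.0, 0.5], [0.5, 0.5, 0.5]),
    "secondary": ([0.3, 0.4, 0.2], [0.5, 0.5, 0.5]),
    "unstressed": ([-0.8, -0.9, -0.4], [0.5, 0.5, 0.5]),
}


def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def score_syllable(features):
    """Return one score per class: a posterior probability assuming equal priors."""
    logs = {c: log_likelihood(features, *MODELS[c]) for c in CLASSES}
    top = max(logs.values())  # subtract the max for numerical stability
    exps = {c: math.exp(value - top) for c, value in logs.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def classify(scores):
    """Pick the class with the highest score."""
    return max(scores, key=scores.get)


features = [0.9, 1.1, 0.4]  # made-up feature vector for one syllable
scores = score_syllable(features)
label = classify(scores)
```

Any model that yields per-class scores could stand in for the Gaussians here; they were picked only because they need nothing beyond the standard library.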

First claim

Opening claim text (preview).

What is claimed is:
1. A method for classifying lexical stress in a speech sample to enable a computing device to provide stress pronunciation feedback usable by a speaker whose speech is represented in the speech sample, the method comprising: determining a plurality of syllables in the speech sample; locating vowels associated with the syllables of the speech sample; executing a feature extractor on the computing system; with the feature extractor, over a duration of a vowel located in a syllable of the speech sample, determining a plurality of features of the vowel that are usable by the computing system to determine a level of stress associated with the syllable; executing a modeling engine on the computing system; with the modeling engine, using the features of the vowel and at least one model created using other speech samples, computing a plurality of scores for the vowel relating to at least two of a likelihood that the syllable is primary stressed and a likelihood that the syllable is secondary stressed and a likelihood that the syllable is unstressed; executing a classifier on the computing system; using as inputs to the classifier a threshold, the plurality of scores for the vowel, and a canonical stress level for the syllable, determining a stress label for the syllable; tuning the threshold to improve stress pronunciation feedback output by the computing device in response to the speech sample.
2. The method of claim 1, wherein determining the plurality of features of the vowel comprises computing a segmental feature on the vowel.
3. The method of claim 2, wherein determining the plurality of features of the vowel comprises computing a plurality of spectral features over a time frame associated with the vowel.
4. The method of claim 3, comprising including both the segmental features and the spectral features in a feature vector.
5. The method of claim 4, comprising using the feature vector to compute the plurality of scores.
6. The method of claim 1, wherein the plurality of scores comprises at least three scores relating to a likelihood that the syllable is primary stressed and a likelihood that the syllable is secondary stressed and a likelihood that the syllable is unstressed.
7. The method of claim 1, wherein the threshold comprises a plurality of thresholds corresponding to the at least two of the likelihood that the syllable is primary stressed and the likelihood that the syllable is secondary stressed and the likelihood that the syllable is unstressed.
8. The method of claim 1, comprising outputting the stress pronunciation feedback to an output device.
9. The method of claim 1, comprising outputting the stress pronunciation feedback for use by a language learning system.
10. A system for enabling a computing device to interpret an un-interpreted portion of natural language captured by an audio input device coupled to the computing device so that the computing device can execute an action in response to the un-interpreted portion of the natural language, the system comprising: one or more processors; a communication interface coupled to the one or more processors; one or more non-transitory computer-readable storage media coupled to the one or more processors and storing sequences of instructions, which when executed by the one or more processors, cause the one or more processors to perform operations comprising: determining a plurality of syllables in the speech sample; locating vowels associated with the syllables of the speech sample; executing a feature extractor on the computing system; with the feature extractor, over a duration of a vowel located in a syllable of the speech sample, determining a plurality of features of the vowel that are usable by the computing system to determine a level of stress associated with the syllable; executing a modeling engine on the computing system; with the modeling engine, using the features of the vowel and at least one model created using other speech samples, computing a plurality of scores for the vowel relating to at least two of a likelihood that the syllable is primary stressed and a likelihood that the syllable is secondary stressed and a likelihood that the syllable is unstressed; executing a classifier on the computing system; using as inputs to the classifier a threshold, the plurality of scores for the vowel, and a canonical stress level for the syllable, determining a stress label for the syllable; tuning the threshold to improve stress pronunciation feedback output by the computing device in response to the speech sample.
11. The system of claim 10, wherein determining the plurality of features of the vowel comprises computing a segmental feature on the vowel.
12. The system of claim 11, wherein determining the plurality of features of the vowel comprises computing a plurality of spectral features over a time frame associated with the vowel.
13. The system of claim 12, wherein the instructions, when executed by the one or more processors, cause the one or more processors to perform operations comprising including both the segmental features and the spectral features in a feature vector.
14. The system of claim 13, wherein the instructions, when executed by the one or more processors, cause the one or more processors to perform operations comprising using the feature vector to compute the plurality of scores.
15. The system of claim 10, wherein the plurality of scores comprises at least three scores relating to a likelihood that the syllable is primary stressed and a likelihood that the syllable is secondary stressed and a likelihood that the syllable is unstressed.
16. The system of claim 10, wherein the threshold comprises a plurality of thresholds corresponding to the at least two of the likelihood that the syllable is primary stressed and the likelihood that the syllable is secondary stressed and the likelihood that the syllable is unstressed.
17. The system of claim 10, wherein the instructions, when executed by the one or more processors, cause the one or more processors to perform operations comprising outputting the stress pronunciation feedback to an output device.
18. The system of claim 10, wherein the instructions, when executed by the one or more processors, cause the one or more processors to perform operations comprising outputting the stress pronunciation feedback for use by a language learning system.
19. A computer program product for enabling a computing device to interpret an un-interpreted portion of natural language captured by an audio input device coupled to the computing device so that the computing device can execute an action in response to the un-interpreted portion of the natural language, the computer program product comprising one or more non-transitory computer readable storage media storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising: determining a plurality of syllables in the speech sample; locating vowels associated with the syllables of the speech sample; executing a feature extractor on the computing system; with the feature extractor, over a duration of a vowel located in a syllable of the speech sample, determining a plurality of features of the vowel that are usable to determine a level of stress associated with the syllable; executing a modeling engine on the computing system; with the modeling engine, using the features of the vowel and at least one model created using other speech samples, computin
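The last two steps of claim 1 feed a threshold, the per-class scores, and the canonical stress level into a classifier, then tune the threshold. The claim text shown here does not state the decision rule or the tuning criterion, so the sketch below makes both up for illustration: `decide_label` keeps the canonical label unless another class outscores it by more than the threshold, and `tune_threshold` picks the candidate with the best label accuracy on a small hand-made development set. All names and numbers are hypothetical.

```python
def decide_label(scores, canonical, threshold):
    """Assumed rule: keep the canonical stress label unless the top-scoring
    class beats the canonical class by more than `threshold`."""
    best = max(scores, key=scores.get)
    if best != canonical and scores[best] - scores[canonical] > threshold:
        return best
    return canonical


def tune_threshold(dev_set, candidates):
    """Assumed tuning: return the candidate threshold with the highest label
    accuracy on dev_set, a list of (scores, canonical, true_label) tuples."""
    def accuracy(threshold):
        hits = sum(
            decide_label(scores, canonical, threshold) == truth
            for scores, canonical, truth in dev_set
        )
        return hits / len(dev_set)

    return max(candidates, key=accuracy)


# Made-up development data: (scores, canonical stress, stress actually produced).
dev_set = [
    ({"primary": 0.55, "secondary": 0.05, "unstressed": 0.40}, "unstressed", "unstressed"),
    ({"primary": 0.80, "secondary": 0.05, "unstressed": 0.15}, "unstressed", "primary"),
    ({"primary": 0.30, "secondary": 0.10, "unstressed": 0.60}, "primary", "unstressed"),
    ({"primary": 0.70, "secondary": 0.20, "unstressed": 0.10}, "primary", "primary"),
]
candidates = [0.0, 0.2, 0.5, 0.7]
best_threshold = tune_threshold(dev_set, candidates)
```

With this rule, a higher threshold makes the system slower to tell a speaker that a syllable was mis-stressed, which trades missed errors for fewer false corrections.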

Assignees

Stanford Res Inst Int

Inventors

Classifications

G10L25/48
specially adapted for particular use · CPC title
G10L15/1807 · Primary
using prosody or stress · CPC title
G10L25/24
the extracted parameters being the cepstrum · CPC title

Patent family

Related publications grouped by family.

View patent family 53369239

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9928832B2 cover?: A method for classifying lexical stress in an utterance includes generating a feature vector representing stress characteristics of a syllable occurring in the utterance, wherein the feature vector includes a plurality of features based on prosodic information and spectral information, computing a plurality of scores, wherein each of the plurality of scores is related to a probability of a give…
Who is the assignee on this patent?: Stanford Res Inst Int
What technology area does this patent fall under?: Primary CPC classification G10L15/1807. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 27, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).