System and method of improving speech recognition using context

US9626963B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9626963-B2
Application number	US-201313874304-A
Country	US
Kind code	B2
Filing date	Apr 30, 2013
Priority date	Apr 30, 2013
Publication date	Apr 18, 2017
Grant date	Apr 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method are provided for improving speech recognition accuracy. Contextual information about user speech may be received, and then speech recognition analysis can be performed on the user speech using the contextual information. This allows the system and method to improve accuracy when performing tasks like searching and navigating using speech recognition.

First claim

Opening claim text (preview).

The invention claimed is:
1. A system, comprising: a processor; a single microphone configured to both record user speech and to record ambient sounds; and a speech recognition module configured to: identify that the ambient sounds are of a particular type by comparing the ambient sounds to stored waveforms; select a dictionary based on the identified particular type of ambient sounds; identify, as contextual information, terms related to the identified particular type of ambient sounds based on identification of the identified particular type of ambient sounds, the terms being generated as contextual information; alter, in response to identification of the terms related to the identified particular type of ambient sounds, the dictionary such that the dictionary includes the terms related to the identified particular type of ambient sounds; assign, in the dictionary, score values to the terms related to the identified particular type of ambient sounds based on identifying that the terms are related to the identified particular type of ambient sounds; and analyze the user speech by comparing each potential output word or phoneme in the user speech to waveforms stored for the dictionary to attempt to match the potential output word or phoneme to a waveform corresponding to a particular word or phoneme in the dictionary, an analysis varying based on the assigned scores to the terms identified as contextual information.
2. The system of claim 1, wherein the ambient sounds include music and the identification that the ambient sounds are of the particular type includes identifying that the ambient sounds are music and identifying the music, wherein the speech recognition module is further configured to retrieve identify, as the contextual information, terms related to the identified music.
3. The system of claim 1, further comprising a sensor, and wherein the contextual information includes information identified from sensor information detected by the sensor.
4. The system of claim 3, wherein the sensor is a global positioning system module and the contextual information includes location.
5. The system of claim 3, wherein the sensor is a global positioning system module and the contextual information includes speed.
6. A method comprising: recording sounds using a single microphone; identifying, using one or more processors, potential output words and phonemes as well as ambient sounds in the sounds recorded by the single microphone; identifying that the ambient sounds are of a particular type by comparing the ambient sounds to stored waveforms; selecting a dictionary based on the identified particular type of ambient sounds; identifying, as contextual information, terms related to the identified particular type of ambient sounds based on identification of the identified particular type of ambient sounds, the terms being generated as contextual information; assigning, in the dictionary, score values to the terms related to the identified particular type of ambient sounds based on identifying that the terms are related to the identified particular type of ambient sounds; and analyzing user speech by comparing each potential output word or phoneme in the user speech to waveforms stored for the dictionary to attempt to match the potential output word or phoneme to a waveform corresponding to a particular word or phoneme in the dictionary, the analyzing varying based on the assigned scores to the terms identified as contextual information.
7. The method of claim 6, wherein the contextual information includes user location.
8. The method of claim 6, wherein the contextual information includes speed of movement of a user.
9. The method of claim 6, wherein the ambient sounds include music and the identification that the ambient sounds are of the particular type includes identifying that the ambient sounds are music and identifying the music, wherein the speech recognition module is further configured to identify, as the contextual information, terms related to the identified music.
10. The method of claim 6, further comprising altering the dictionary based on the contextual information such that the dictionary includes the terms related to the identified particular type of ambient sounds.
11. The method of claim 10, wherein the dictionary is altered by replacing the dictionary with a different dictionary.
12. The method of claim 10, wherein the dictionary is altered by adding words pertaining to the contextual information to the dictionary.
13. A non-transitory machine-readable storage medium comprising a set of instructions which, when executed by a processor, causes execution of operations comprising: recording sounds using a single microphone; identifying potential output words and phonemes as well as ambient sounds in the sounds recorded by the single microphone; identifying that the ambient sounds are of a particular type by comparing the ambient sounds to stored waveforms; selecting a dictionary based on the identified particular type of ambient sounds; identifying, as contextual information, terms related to the identified particular type of ambient sounds based on identification of the identified particular type of ambient sounds, the terms being generated as contextual information; assigning, in the dictionary, score values to the terms related to the identified particular type of ambient sounds based on identifying that the terms are related to the identified particular type of ambient sounds; and analyzing the user speech by comparing each potential output word or phoneme in the user speech to waveforms stored for the dictionary to attempt to match the potential output word or phoneme to a waveform corresponding to a particular word or phoneme in the dictionary, the analyzing varying based on the assigned scores to the terms identified as contextual information.
14. The non-transitory machine-readable storage medium of claim 13, wherein the speech recognition analysis includes utilizing a hidden Markov model.
15. The non-transitory machine-readable storage medium of claim 13, wherein the ambient sounds include music and the identification that the ambient sounds are of the particular type includes identifying that the ambient sounds are music and identifying the music, wherein the speech recognition module is further configured to identify, as the contextual information, terms related to the identified music.
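
The steps recited in the independent claims (classify the ambient sound against stored waveforms, select a dictionary, add related terms with score values, then let those scores influence matching) can be illustrated with a minimal Python sketch. This is not the patented implementation: the templates, dictionaries, term lists, the cosine-similarity comparison, the multiplicative scoring rule, and every name below are hypothetical choices made for illustration. The acoustic matching of speech to dictionary waveforms is reduced to a precomputed per-word match score.

```python
from math import sqrt

# Hypothetical stored waveforms for ambient sound types (toy sample sequences).
AMBIENT_TEMPLATES = {
    "music": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
    "traffic": [0.5, 0.4, 0.6, 0.5, 0.4, 0.6, 0.5, 0.4],
}

# Hypothetical base dictionaries per ambient type: word -> score value.
DICTIONARIES = {
    "music": {"play": 1.0, "pause": 1.0},
    "traffic": {"navigate": 1.0, "route": 1.0},
}

# Hypothetical terms related to each ambient type (the contextual information).
RELATED_TERMS = {
    "music": ["album", "artist"],
    "traffic": ["detour", "exit"],
}

CONTEXT_SCORE = 2.0  # assumed score value assigned to contextual terms


def similarity(a, b):
    """Cosine similarity between two equal-length sample sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_ambient_type(ambient):
    """Pick the stored waveform type that best matches the ambient sound."""
    return max(AMBIENT_TEMPLATES,
               key=lambda kind: similarity(ambient, AMBIENT_TEMPLATES[kind]))


def build_dictionary(ambient_type):
    """Select the dictionary for the type, then add scored contextual terms."""
    dictionary = dict(DICTIONARIES[ambient_type])  # copy; keep the base intact
    for term in RELATED_TERMS[ambient_type]:
        dictionary[term] = CONTEXT_SCORE
    return dictionary


def recognize(candidates, dictionary):
    """Choose the in-dictionary candidate with the best weighted match.

    `candidates` maps each potential output word to an acoustic match score
    (a stand-in for comparing speech against the dictionary's waveforms).
    Returns None when no candidate is in the dictionary.
    """
    in_dictionary = (word for word in candidates if word in dictionary)
    return max(in_dictionary,
               key=lambda word: candidates[word] * dictionary[word],
               default=None)


ambient = [0.1, 0.9, 0.0, -1.1, 0.1, 1.0, -0.1, -0.9]
ambient_type = identify_ambient_type(ambient)
dictionary = build_dictionary(ambient_type)
result = recognize({"album": 0.45, "pause": 0.5, "elbow": 0.6}, dictionary)
```

In this toy run the raw acoustic score favors "elbow", but it is not in the selected dictionary, and the contextual score lifts "album" above "pause". That is the kind of effect the claims describe when the analysis varies with the assigned scores.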

Assignees

Paypal Inc

Inventors

Farraro Eric J

Classifications

G10L15/06
Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)
G10L15/063 (Primary)
Training
G10L15/30 (Primary)
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L2015/226
using non-speech characteristics
G10L15/22 (Primary)
Procedures used during a speech recognition process, e.g. man-machine dialogue
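
The classification symbols above follow the CPC layout of section letter, two-digit class, subclass letter, main group, and subgroup. As a rough illustration of how a symbol such as G10L15/063 maps to the "Physics" technology area, the sketch below splits a symbol into those parts. The regular expression, the `CpcSymbol` type, and the `parse_cpc` function are hypothetical helpers, not part of any official CPC tooling, and only section G is included in the lookup table.

```python
import re
from typing import NamedTuple


class CpcSymbol(NamedTuple):
    section: str     # e.g. "G"
    cls: str         # two-digit class, e.g. "10"
    subclass: str    # e.g. "L"
    main_group: str  # e.g. "15" or "2015"
    subgroup: str    # e.g. "063"


# Assumed pattern for symbols written without spaces, as on this page.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")

# Only the section that appears on this page; other sections are omitted.
SECTION_TITLES = {"G": "Physics"}


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol into its hierarchical parts."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


primary = parse_cpc("G10L15/063")
area = SECTION_TITLES.get(primary.section, "unknown")
```

Symbols in the 2000-series indexing range, such as G10L2015/226, parse the same way with a four-digit main group.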

Patent family

Related publications grouped by family.

Patent family 51789968

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9626963B2 cover?: A system and method are provided for improving speech recognition accuracy. Contextual information about user speech may be received, and then speech recognition analysis can be performed on the user speech using the contextual information. This allows the system and method to improve accuracy when performing tasks like searching and navigating using speech recognition.
Who is the assignee on this patent?: Paypal Inc
What technology area does this patent fall under?: Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?: Publication date Apr 18, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).