What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 23, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Automatic speech recognition (ASR) utilizing GPS and sensor data

US10360910B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10360910-B2
Application number	US-201715687228-A
Country	US
Kind code	B2
Filing date	Aug 25, 2017
Priority date	Aug 29, 2016
Publication date	Jul 23, 2019
Grant date	Jul 23, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An automatic speech recognition (ASR) system is disclosed that compensates for different noise environments and types of speech. The ASR system may be implemented as part of an action camera that collects status data, such as geographic location data and/or sensor data. The ASR system may perform speech recognition using an acoustic model and a speech recognition model, which are trained for operation in specific noise environments and/or for specific types of speech. The computing device may categorize a current status of the action camera, as indicated by the status data, into an action profile, which may represent a particular activity (e.g., running, cycling, etc.) or state of the computing device. The computing device may dynamically switch the acoustic model and/or the speech recognition model to compensate for anticipated changes in the noise environment and speech based upon the action profile to facilitate the recognition of various action camera functions.
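The switching behaviour described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the profile names, the status fields (`speed_mps`, `cadence_spm`), the thresholds, and the model identifiers are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusData:
    """Hypothetical status snapshot derived from location and sensor data."""
    speed_mps: float    # assumed: speed estimated from GPS and/or sensors
    cadence_spm: float  # assumed: step cadence estimated from an accelerometer


@dataclass(frozen=True)
class ModelPair:
    acoustic_model: str
    speech_recognition_model: str


# Hypothetical mapping from action profile to a model pair. The abstract only
# says models are trained for specific noise environments and types of speech.
MODEL_TABLE = {
    "stationary": ModelPair("am_quiet", "srm_strict"),
    "running": ModelPair("am_footfall_noise", "srm_tolerant"),
    "cycling": ModelPair("am_wind_noise", "srm_tolerant"),
}


def categorize(status: StatusData) -> str:
    """Map status data to an action profile using illustrative thresholds."""
    if status.speed_mps < 0.5:
        return "stationary"
    if status.cadence_spm >= 120:
        return "running"
    return "cycling"


def select_models(status: StatusData) -> ModelPair:
    """Pick the acoustic and speech recognition models for the current profile."""
    return MODEL_TABLE[categorize(status)]
```

Calling `select_models` each time new status data arrives would give the dynamic switching the abstract describes; a real device would presumably smooth the profile over time to avoid rapid toggling, which this sketch does not attempt.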

First claim

Opening claim text (preview).

What is claimed is:
1. A computing device, comprising: a location-determining component configured to receive location signals and to generate geographic location data based on the received location signals; a sensor array configured to generate sensor data indicative of movement of the computing device; a memory configured to store a plurality of acoustic models and a plurality of speech recognition models to facilitate speech recognition, each acoustic model from among the plurality of acoustic models being associated with one or more acoustic tuning parameters corresponding to an environment with a unique noise characteristic, and each speech recognition model from among the plurality of speech recognition models being associated with a phonetic match tolerance; a processor unit coupled with the location-determining component, the sensor array, and the memory, the processor unit configured to: receive audible speech including a plurality of words; identify an action profile based on one or more of the geographic location data and the sensor data; select an acoustic model and a speech recognition model from among the plurality of acoustic models and speech recognition models based on the identified action profile; determine a phonetic term associated with each word in the received speech based on the selected acoustic model's acoustic tuning parameters to recognize speech; determine a meaning for each determined phonetic term by searching the selected speech recognition model for a match to the determined phonetic term, and execute a computing device function based on the determined meaning for each word within the received audible recognized speech.
2. The computing device of claim 1, wherein the processor unit is further configured to select a speech recognition model from among the plurality of speech models having a higher phonetic match tolerance when the action profile indicates movement of the computing device in excess of a predetermined movement threshold, the higher phonetic match tolerance resulting in a higher depth and breadth of search for a match to the determined phonetic term.
3. The computing device of claim 1, wherein the action profile indicates an instantaneous velocity of the computing device, and wherein the processor unit is further configured to select an acoustic model and a speech recognition model based on the instantaneous velocity of the computing device.
4. The computing device of claim 3, wherein: each acoustic model from among the plurality of acoustic models is associated with a predetermined range of computing device velocities, each speech recognition model from among the plurality of speech recognition models is associated with a predetermined range of computing device velocities, and the processor unit is further configured to select an acoustic model and a speech recognition model having a respective predetermined range of velocities associated with the instantaneous velocity of the computing device.
5. The computing device of claim 1, wherein the action profile indicates an orientation of the computing device, and wherein the processor unit is further configured to select an acoustic model and a speech recognition model based on the orientation of the computing device.
6. The computing device of claim 1, wherein the one or more acoustic tuning parameters associated with each acoustic model from among the plurality of acoustic models facilitates the determination of phonetic terms in accordance with a different level of noise tolerance.
7. The computing device of claim 1, wherein the acoustic model is trained in accordance with a type of speech resulting from a user performing a type of physical activity matching the identified action profile.
8. The computing device of claim 1, wherein the plurality of acoustic models and the plurality of speech recognition models facilitate speech recognition in accordance with a trigger speech recognizer that facilitates speech recognition of a wake word, and a command speech recognizer that facilitates speech recognition of computing device commands once the wake word is recognized, and wherein the processor unit is further configured to independently select an acoustic model and a speech recognition model for each of the trigger speech recognizer and the command speech recognizer.
9. The computing device of claim 1, further comprising: wherein the processor unit is further configured to control a microphone to receive the audible speech, and to maintain the microphone in an operating state such that audio input is continuously received via the microphone.
10. An action camera, comprising: a location-determining component configured to receive location signals and to generate geographic location data based on the received location signals; a sensor array configured to generate sensor data indicative of movement of the action camera; a memory configured to store a plurality of speech recognition models and a plurality of acoustic models to facilitate speech recognition, wherein each acoustic model from among the plurality of acoustic models is associated with one or more acoustic tuning parameters corresponding to an environment with a unique noise characteristic for a predetermined range of action camera velocities, and wherein each speech recognition model from among the plurality of speech recognition models is associated with a phonetic match tolerance for a predetermined range of action camera velocities, and a processor unit coupled with the location-determining component, the sensor array, and the memory, the processor unit configured to: receive audible speech including a plurality of words; calculate an instantaneous velocity of the action camera based on one or more of the geographic location data and the sensor data; select an acoustic model and a speech recognition model from among the plurality of acoustic models and speech recognition models having a respective predetermined range of action camera velocities that encompass the instantaneous velocity of the action camera; determine a phonetic term associated with each word in the received speech based on the selected acoustic model's acoustic tuning parameters to recognize speech; determine a meaning for each determined phonetic term by searching the selected speech recognition model for a match to the determined phonetic term, and execute a computing device function based on the determined meaning for each word within the received audible recognized speech.
11. The action camera of claim 10, wherein the sensor array includes one or more of an accelerometer, a gyroscope, a magnetometer, and a barometer.
12. The action camera of claim 10, wherein the action profile indicates an orientation of the computing device, and wherein the processor unit is further configured to select an acoustic model and a speech recognition model based on the orientation of the computing device.
13. The action camera of claim 10, wherein the processor unit is further configured to select a speech recognition model from among the plurality of speech models having a higher phonetic match tolerance when the action profile indicates movement of the action camera in excess of a predetermined movement threshold, the higher phonetic match tolerance resulting in a higher depth and breadth of search for a match to the determined phonetic term.
14. The action camera of claim 10, wherein the acoustic model is trained in accordance with a type of speech resulting from a user performing a type of physical activity matching the instantaneous velocity of the action camera.
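Claims 4 and 10 tie each acoustic model and speech recognition model to a predetermined range of velocities and select the pair whose range covers the instantaneous velocity. A minimal sketch of that lookup is shown here; the band boundaries, model names, tolerance values, and the half-open interval convention are assumptions made for illustration and are not specified in the claims.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VelocityBand:
    """Hypothetical record pairing a velocity range with a model pair."""
    min_mps: float                   # inclusive lower bound (assumed convention)
    max_mps: float                   # exclusive upper bound (assumed convention)
    acoustic_model: str
    speech_recognition_model: str
    phonetic_match_tolerance: float  # illustrative scale; higher means a wider search


# Illustrative bands only. Tolerance rises with velocity here, loosely echoing
# claims 2 and 13, where more movement calls for a higher tolerance.
BANDS: Sequence[VelocityBand] = (
    VelocityBand(0.0, 2.0, "am_low", "srm_low", 0.1),
    VelocityBand(2.0, 8.0, "am_mid", "srm_mid", 0.3),
    VelocityBand(8.0, float("inf"), "am_high", "srm_high", 0.5),
)


def select_band(velocity_mps: float,
                bands: Sequence[VelocityBand] = BANDS) -> Optional[VelocityBand]:
    """Return the band whose range contains the velocity, or None if no band does."""
    for band in bands:
        if band.min_mps <= velocity_mps < band.max_mps:
            return band
    return None
```

Keeping both models in one record is a simplification; the claims allow the acoustic model and the speech recognition model to carry their own ranges, which would call for two separate lookups.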

Assignees

Garmin Switzerland GmbH

Inventors

Classifications

H04N23/62
Control of parameters via user interfaces · CPC title
Y02D70/164
Cross-Sectional Technologies · mapped topic
G10L15/1815
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Y02D70/26
Cross-Sectional Technologies · mapped topic
H04M2250/74
with voice recognition means · CPC title

Patent family

Related publications grouped by family.

View patent family 61243274

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10360910B2 cover?: An automatic speech recognition (ASR) system is disclosed that compensates for different noise environments and types of speech. The ASR system may be implemented as part of an action camera that collects status data, such as geographic location data and/or sensor data. The ASR system may perform speech recognition using an acoustic model and a speech recognition model, which are trained for op…
Who is the assignee on this patent?: Garmin Switzerland GmbH