What technology area does this patent fall under?

Primary CPC classification G10L25/78. Mapped technology areas include Physics.

When was this patent published?

Publication date Oct 3, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Augmenting speech segmentation and recognition using head-mounted vibration and/or motion sensors

US9779758B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9779758-B2
Application number	US-201514828483-A
Country	US
Kind code	B2
Filing date	Aug 17, 2015
Priority date	Jul 26, 2012
Publication date	Oct 3, 2017
Grant date	Oct 3, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Example methods and systems use multiple sensors to determine whether a speaker is speaking. Audio data in an audio-channel speech band detected by a microphone can be received. Vibration data in a vibration-channel speech band representative of vibrations detected by a sensor other than the microphone can be received. The microphone and the sensor can be associated with a head-mountable device (HMD). It is determined whether the audio data is causally related to the vibration data. If the audio data and the vibration data are causally related, an indication can be generated that the audio data contains HMD-wearer speech. Causally related audio and vibration data can be used to increase accuracy of text transcription of the HMD-wearer speech. If the audio data and the vibration data are not causally related, an indication can be generated that the audio data does not contain HMD-wearer speech.
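The two-branch decision described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each channel has already been reduced to a short sequence of per-frame energies in its speech band, and it uses a plain Pearson correlation with an arbitrary 0.7 threshold as a stand-in for the "causally related" test. The names `pearson`, `wearer_speech_indication`, and `threshold` are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def wearer_speech_indication(
    audio_energy: Sequence[float],
    vibration_energy: Sequence[float],
    threshold: float = 0.7,  # assumed value, not taken from the patent
) -> str:
    """Return an indication string for the two branches in the abstract."""
    if pearson(audio_energy, vibration_energy) >= threshold:
        return "audio contains HMD-wearer speech"
    return "audio does not contain HMD-wearer speech"


# Hypothetical per-frame speech-band energies from the microphone.
audio = [0.1, 0.9, 0.8, 0.1, 0.7, 0.2]
# Wearer talking: the vibration sensor tracks the microphone.
vibration_wearer = [0.5 * a for a in audio]
# Someone else talking: the vibration sensor shows only a small, unrelated ripple.
vibration_other = [0.05, 0.04, 0.06, 0.05, 0.04, 0.06]
```

Calling `wearer_speech_indication(audio, vibration_wearer)` takes the first branch, while `wearer_speech_indication(audio, vibration_other)` takes the second, because the bystander's speech reaches the microphone without producing matching vibration at the head-mounted sensor.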

First claim

Opening claim text (preview).

The invention claimed is:
1. A method, comprising: determining a correlation delay between a microphone positioned on a head-mountable device (HMD) for detecting audio, and a sensor, other than the microphone, positioned on the HMD for detecting vibrations; subsequent to determining the correlation delay, detecting audio with the microphone and converting the detected audio into audio data comprising audio speech data in an audio-channel speech band; subsequent to determining the correlation delay, detecting vibrations with the sensor and converting the detected vibrations into vibration data comprising vibration speech data in a vibration-channel speech band; based on the determined correlation delay, aligning the audio speech data with the vibration speech data in time; after aligning the audio speech data with the vibration speech data in time, comparing one or more frequency ranges of the audio speech data with one or more frequency ranges of the vibration speech data to determine whether or not the aligned audio speech data and vibration speech data are causally related; and in response to determining that the audio speech data is causally related to the vibration speech data: (i) generating an indication that the audio data contains HMD-wearer speech, and (ii) activating a voice interface of the HMD.
2. The method of claim 1, further comprising: in response to determining that the audio speech data is causally related to the vibration speech data, providing at least the audio speech data to a speech recognizer.
3. The method of claim 1, further comprising: in response to determining that the audio speech data is causally related to the vibration speech data, providing the audio speech data and the vibration speech data to a speech recognizer.
4. The method of claim 3, further comprising: recognizing text corresponding to the HMD-wearer speech in the audio speech data and the vibration speech data using the speech recognizer.
5. The method of claim 1, wherein the sensor comprises a bone-conducting microphone (BCM) and/or an accelerometer.
6. The method of claim 1, further comprising: in response to determining that the audio speech data is causally related to the vibration speech data, conditioning the audio data and/or vibration data as speech.
7. A head-mountable device (HMD), comprising: a processor; a microphone; a sensor; a non-transitory computer-readable medium; and program instructions stored on the non-transitory computer-readable medium, wherein the program instructions are executable by the processor to cause the HMD to perform operations including: determining a correlation delay between the microphone and the sensor; subsequent to determining the correlation delay, detecting audio with the microphone and converting the detected audio into audio data comprising audio speech data in an audio-channel speech band; subsequent to determining the correlation delay, detecting vibrations with the sensor, and converting the detected vibrations into vibration data comprising vibration speech data in a vibration-channel speech band; based on the determined correlation delay, aligning the audio speech data with the vibration speech data in time; after aligning the audio speech data with the vibration speech data in time, comparing one or more frequency ranges of the audio speech data with one or more frequency ranges of the vibration speech data to determine whether or not the aligned audio speech data and vibration speech data are causally related; and in response to determining that the audio speech data is causally related to the vibration speech data: (i) generating an indication that the audio data contains HMD-wearer speech, and (ii) activating a voice interface of the HMD.
8. The HMD of claim 7, wherein the operations further include: in response to determining that the audio speech data is causally related to the vibration speech data, providing at least the audio speech data to a speech recognizer.
9. The HMD of claim 7, wherein the operations further include: in response to determining that the audio speech data is causally related to the vibration speech data, providing the audio speech data and the vibration speech data to a speech recognizer.
10. The HMD of claim 9, wherein the operations further include: recognizing text corresponding to the HMD-wearer speech in the audio speech data and the vibration speech data using the speech recognizer.
11. The HMD of claim 7, wherein the sensor comprises a bone-conducting microphone (BCM) and/or an accelerometer.
12. The HMD of claim 7, wherein the operations further include: in response to determining that the audio speech data is causally related to the vibration speech data, conditioning the audio data and/or vibration data as speech.
13. An article of manufacture including a non-transitory computer-readable medium having instructions stored thereon that, when executed by a computing device, cause the computing device to perform functions comprising: determining a correlation delay between a microphone positioned on a head-mountable device (HMD) for detecting audio, and a sensor, other than the microphone, positioned on the HMD for detecting vibrations; subsequent to determining the correlation delay, detecting audio with the microphone and converting the detected audio into audio data comprising audio speech data in an audio-channel speech band; subsequent to determining the correlation delay, detecting vibrations with the sensor and converting the detected vibrations into vibration data comprising vibration speech data in a vibration-channel speech band; based on the determined correlation delay aligning the audio speech data with the vibration speech data in time; after aligning the audio speech data with the vibration speech data in time, comparing one or more frequency ranges of the audio speech data with one or more frequency ranges of the vibration speech data to determine whether or not the aligned audio speech data and vibration speech data are causally related; and in response to determining that the audio speech data is causally related to the vibration speech data: (i) generating an indication that the audio data contains HMD-wearer speech, and (ii) activating a voice interface of the HMD.
14. The article of manufacture of claim 13, wherein the functions further comprise: in response to determining that the audio speech data is causally related to the vibration speech data, providing the audio speech data and/or the vibration speech data to a speech recognizer.
15. The article of manufacture of claim 14, wherein the functions further comprise: recognizing text corresponding to the HMD-wearer speech in the audio speech data and the vibration speech data using the speech recognizer.
16. The article of manufacture of claim 13, wherein the sensor comprises an accelerometer and/or a bone-conducting microphone (BCM).
17. The article of manufacture of claim 13, wherein the functions further comprise: in response to determining that the audio speech data is causally related to the vibration speech data, conditioning the audio data and/or vibration data as speech.
18. The method of claim 1, further comprising: determining a degree of spectral coherency between the audio speech data and the vibration speech data; and wherein determining that the audio speech data is causally related to the vibration speech data comprises determining that the audio speech data is causally related to the vibration speech data based on the determined degre
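The sequence recited in the independent claims (determine a correlation delay, align the two channels, then compare frequency content) can be illustrated with a small sketch. Everything here is an assumption chosen for illustration rather than the claimed implementation: the delay is taken as the lag that maximizes a raw cross-correlation, a "frequency range" is approximated by a single DFT bin, and the comparison uses a magnitude-squared coherence estimate averaged over non-overlapping frames, loosely in the spirit of the "spectral coherency" mentioned in claim 18. The function names, frame length, bin index, and signals are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def estimate_delay(audio: Sequence[float], vibration: Sequence[float], max_lag: int) -> int:
    """Lag (in samples) in [-max_lag, max_lag] that maximizes the cross-correlation.

    A positive result means the vibration channel trails the audio channel.
    """
    def xcorr(lag: int) -> float:
        return sum(
            audio[i] * vibration[i + lag]
            for i in range(len(audio))
            if 0 <= i + lag < len(vibration)
        )
    return max(range(-max_lag, max_lag + 1), key=xcorr)


def align(audio: Sequence[float], vibration: Sequence[float], lag: int) -> Tuple[List[float], List[float]]:
    """Trim both channels so that audio[i] is paired with vibration[i + lag]."""
    if lag >= 0:
        a, v = audio, vibration[lag:]
    else:
        a, v = audio[-lag:], vibration
    n = min(len(a), len(v))
    return list(a[:n]), list(v[:n])


def dft_bin(frame: Sequence[float], k: int) -> complex:
    """Single DFT coefficient X[k] of one frame."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))


def coherence(a: Sequence[float], v: Sequence[float], frame_len: int, k: int) -> float:
    """Magnitude-squared coherence at bin k, averaged over non-overlapping frames."""
    sxy, sxx, syy = 0j, 0.0, 0.0
    for start in range(0, min(len(a), len(v)) - frame_len + 1, frame_len):
        x = dft_bin(a[start:start + frame_len], k)
        y = dft_bin(v[start:start + frame_len], k)
        sxy += x * y.conjugate()
        sxx += abs(x) ** 2
        syy += abs(y) ** 2
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) ** 2 / (sxx * syy)


# Hypothetical test signals: four 16-sample frames of a tone at DFT bin 2,
# each frame with its own phase so that the signal is not periodic overall.
FRAME, BIN = 16, 2


def tone(phase: float) -> List[float]:
    return [math.cos(2 * math.pi * BIN * i / FRAME + phase) for i in range(FRAME)]


phases = [0.0, 1.0, 2.5, 4.0]
audio = [s for p in phases for s in tone(p)]
# Wearer speech: the sensor sees an attenuated copy arriving 3 samples later.
vib_wearer = [0.0, 0.0, 0.0] + [0.5 * s for s in audio]
# Unrelated source: same band and delay, but the phase relation changes every frame.
vib_other = [0.0, 0.0, 0.0] + [
    s for f, p in enumerate(phases) for s in tone(p + f * math.pi / 2)
]

lag = estimate_delay(audio, vib_wearer, max_lag=8)         # calibration step
coh_wearer = coherence(*align(audio, vib_wearer, lag), FRAME, BIN)
coh_other = coherence(*align(audio, vib_other, lag), FRAME, BIN)
```

In this toy setup the delay is estimated once, as in the claims, and then reused for both comparisons. The attenuated copy keeps a fixed phase relation to the microphone signal from frame to frame, so its coherence is close to 1, while the unrelated signal's frame-to-frame phase drift drives the estimate toward 0. A real system would need windowing, several frequency bins, and a tuned decision threshold.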

Assignees

Google Inc

Inventors

Classifications

G10L25/78 · Primary
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title
G10L15/20
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title
G10L21/16 · Primary
Transforming into a non-visible representation (devices or methods enabling ear patients to replace direct auditory perception by another kind of perception A61F11/04) · CPC title
H04R2460/13
Hearing devices using bone conduction transducers · CPC title
H04R1/46
Special adaptations for use as contact microphones, e.g. on musical instrument, on stethoscope (throat mountings H04R1/14) · CPC title

Patent family

Related publications grouped by family.

View patent family 54063594

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9779758B2 cover?: Example methods and systems use multiple sensors to determine whether a speaker is speaking. Audio data in an audio-channel speech band detected by a microphone can be received. Vibration data in a vibration-channel speech band representative of vibrations detected by a sensor other than the microphone can be received. The microphone and the sensor can be associated with a head-mountable device…
Who is the assignee on this patent?: Google Inc