Multisensory speech detection

US10720176B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10720176-B2
Application number: US-201816108512-A
Country: US
Kind code: B2
Filing date: Aug 22, 2018
Priority date: Nov 10, 2008
Publication date: Jul 21, 2020
Grant date: Jul 21, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
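
The abstract's chain of orientation, operating mode, and speech detection parameters can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the mode names, the pitch-angle ranges, the parameter fields, and the numeric values are hypothetical and do not come from the patent text shown on this page.

```python
from dataclasses import dataclass
from enum import Enum


class OperatingMode(Enum):
    # Hypothetical mode names; the abstract does not enumerate modes.
    HELD_TO_EAR = "held_to_ear"
    HELD_IN_FRONT = "held_in_front"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class SpeechDetectionParams:
    # Illustrative parameters governing when detection begins and ends.
    start_energy: float   # energy above which detection begins
    end_silence_ms: int   # trailing silence after which detection ends


# Assumed mapping from mode to parameters (placeholder values).
PARAMS_BY_MODE = {
    OperatingMode.HELD_TO_EAR: SpeechDetectionParams(0.02, 800),
    OperatingMode.HELD_IN_FRONT: SpeechDetectionParams(0.05, 500),
    OperatingMode.UNKNOWN: SpeechDetectionParams(0.10, 300),
}


def operating_mode(pitch_deg: float) -> OperatingMode:
    """Map a device pitch angle to a mode using assumed angle ranges."""
    if 60.0 <= pitch_deg <= 120.0:
        return OperatingMode.HELD_TO_EAR
    if 0.0 <= pitch_deg < 60.0:
        return OperatingMode.HELD_IN_FRONT
    return OperatingMode.UNKNOWN


def params_for_orientation(pitch_deg: float) -> SpeechDetectionParams:
    """Orientation -> operating mode -> speech detection parameters."""
    return PARAMS_BY_MODE[operating_mode(pitch_deg)]
```

A caller would read the pitch from whatever orientation sensor is available and pass the resulting parameters to the speech detector; the lookup table keeps the mode-to-parameter policy in one place so it can be tuned without touching the detection logic.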

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at data processing hardware of a mobile device, an interaction indication indicating a user interaction with a button of the mobile device; in response to receiving the interaction indication: initiating, by the data processing hardware, execution of an audio recording process using a microphone of the mobile device; and notifying, by the data processing hardware, a user of the mobile device when execution of the audio recording process starts by: generating a visual notification that indicates to the user when execution of the audio recording process starts; and displaying the visual notification on a user interface of the mobile device, wherein the visual notification comprises a waveform graphic; receiving, at the data processing hardware, a speech utterance from the user captured by the microphone during execution of the audio recording process; and generating, by the data processing hardware, a transcription of the speech utterance captured by the microphone during the audio recording process.

2. The method of claim 1, wherein notifying the user of the mobile device when execution of the audio recording process starts comprises: generating an audio notification that indicates to the user when execution of the audio recording process starts; and outputting the audio notification through an audio output device of the mobile device.

3. The method of claim 1, further comprising, in response to receiving the speech utterance of the user captured by the microphone during execution of the audio recording process: generating, by the data processing hardware, a visual notification that indicates detection of the speech utterance of the user; and displaying, by the data processing hardware, the visual notification on a user interface of the mobile device.

4. The method of claim 1, wherein receiving the speech utterance of the user comprises: receiving audio input data captured by the microphone during execution of the audio recording process; determining whether the audio input data captured by the microphone exceeds a speech energy threshold; and when the audio input data captured by the microphone exceeds the speech energy threshold, detecting that the audio input data includes the speech utterance of the user.

5. The method of claim 1, further comprising, in response to initiating execution of the audio recording process: determining, by the data processing hardware, a speech energy threshold for comparing to the speech utterance of the user received during execution of the audio recording process; and ceasing, by the data processing hardware, execution of the audio recording process when an energy of the speech utterance of the user received during the audio recording process is less than the speech energy threshold.

6. The method of claim 1, further comprising: determining, by the data processing hardware, when execution of the audio recording process ceases; and in response to determining when execution of the audio recording process ceases, displaying, by the data processing hardware, a visual notification on a user interface of the mobile device, the visual notification indicating to the user that execution of the audio recording process has ceased.

7. The method of claim 1, further comprising: determining, by the data processing hardware, when execution of the audio recording process ceases; and in response to determining when execution of the audio recording process ceases, outputting, by the data processing hardware, an audio notification through an audio output device of the mobile device, the audio notification indicating to the user that execution of the audio recording process has ceased.

8. The method of claim 1, further comprising: determining, by the data processing hardware, when execution of the audio recording process ceases; and in response to determining when execution of the audio recording process ceases, outputting, by the data processing hardware, tactical feedback through the mobile device, the tactical feedback indicating to the user that execution of the audio recording process has ceased.

9. The method of claim 1, further comprising, displaying, by the data processing hardware, the transcription of the speech utterance on a user interface of the mobile device.

10. The method of claim 1, wherein the button of the mobile device comprises a physical button located on a side portion of the mobile device.

11. A mobile device comprising: data processing hardware; and memory hardware in communication with the data processing hardware and storing instructions that when executed, cause the data processing hardware to perform operations comprising: receiving an interaction indication indicating a user interaction with a button of the mobile device; in response to receiving the interaction indication: initiating execution of an audio recording process using a microphone of the mobile device; and notifying a user of the mobile device when execution of the audio recording process starts by: generating a visual notification that indicates to the user when execution of the audio recording process starts; and displaying the visual notification on a user interface of the mobile device, wherein the visual notification comprises a waveform graphic; receiving a speech utterance from the user captured by the microphone during execution of the audio recording process; and generating a transcription of the utterance captured by the microphone during the audio recording process.

12. The mobile device of claim 11, wherein notifying the user of the mobile device when execution of the audio recording process starts comprises: generating an audio notification that indicates to the user when execution of the audio recording process starts; and outputting the audio notification through an audio output device of the mobile device.

13. The mobile device of claim 11, wherein the operations further comprise, in response to receiving the speech utterance of the user captured by the microphone during execution of the audio recording process: generating a visual notification that indicates detection of the speech utterance of the user; and displaying the visual notification on a user interface of the mobile device.

14. The mobile device of claim 11, wherein receiving the speech utterance of the user comprises: receiving audio input data captured by the microphone during execution of the audio recording process; determining whether the audio input data captured by the microphone exceeds a speech energy threshold; and when the audio input data captured by the microphone exceeds the speech energy threshold, detecting that the audio input data includes the speech utterance of the user.

15. The mobile device of claim 11, wherein the operations further comprise, in response to initiating execution of the audio recording process: determining a speech energy threshold for comparing to the speech utterance of the user received during execution of the audio recording process; and ceasing execution of the audio recording process when an energy of the speech utterance of the user received during the audio recording process is less than the speech energy threshold.

16. The mobile device of claim 11, wherein the operations further comprise: determining when execution of the audio recording process ceases; and in response to determining when execution of the audio recording process ceases, displaying a visual notification on a user interface of the mobile device, the visual notific
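
Claims 4 and 5 (and their device counterparts, claims 14 and 15) compare captured audio against a speech energy threshold, first to detect an utterance and then to cease recording. The sketch below is one hedged reading of that logic, not the patented implementation. It assumes RMS as the energy measure, fixed-size frames of float samples, and a stop on the first below-threshold frame; none of these details are stated in the claims, and the function names are hypothetical.

```python
import math
from typing import Iterable, List, Sequence


def rms_energy(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame (assumed energy measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def record_until_quiet(
    frames: Iterable[Sequence[float]], threshold: float
) -> List[Sequence[float]]:
    """Capture frames from the first one above the threshold until the
    first later frame whose energy falls below it."""
    captured: List[Sequence[float]] = []
    speech_detected = False
    for frame in frames:
        energy = rms_energy(frame)
        if not speech_detected:
            # Claim 4 reading: energy above the threshold means speech.
            if energy > threshold:
                speech_detected = True
                captured.append(frame)
        elif energy < threshold:
            # Claim 5 reading: cease recording once energy drops below it.
            break
        else:
            captured.append(frame)
    return captured
```

A practical detector would likely require several consecutive quiet frames before stopping, so that brief pauses between words do not end the recording; the single-frame rule here is a simplification.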

Assignees

Inventors

Classifications

  • according to context-related or environment-related conditions · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • with detection of the device orientation or free movement in a three-dimensional [3D] space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10720176B2 cover?
A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L25/78. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 21, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).