Multisensory speech detection

US10026419B2 · US · B2

Patent metadata
Publication number: US-10026419-B2
Application number: US-201514645802-A
Country: US
Kind code: B2
Filing date: Mar 12, 2015
Priority date: Nov 10, 2008
Publication date: Jul 17, 2018
Grant date: Jul 17, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
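
The flow described in the abstract (orientation, then operating mode, then speech detection parameters) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the mode names, the pitch-angle cutoffs, the parameter fields, and every numeric value are assumptions introduced here, since the abstract does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class OperatingMode(Enum):
    # Hypothetical mode names; the abstract does not enumerate the modes.
    AT_EAR = "at_ear"
    AT_MOUTH = "at_mouth"
    HANDHELD = "handheld"


@dataclass(frozen=True)
class SpeechDetectionParams:
    listen: bool              # whether speech detection is active in this mode
    energy_threshold: float   # energy above which a frame counts as speech
    trailing_silence_ms: int  # quiet time after which speech is treated as ended


# Illustrative values only; none of these numbers come from the patent.
PARAMS_BY_MODE = {
    OperatingMode.AT_EAR: SpeechDetectionParams(True, 0.02, 800),
    OperatingMode.AT_MOUTH: SpeechDetectionParams(True, 0.01, 500),
    OperatingMode.HANDHELD: SpeechDetectionParams(False, 0.05, 300),
}


def operating_mode(pitch_degrees: float) -> OperatingMode:
    """Map a device pitch angle (0 = flat, 90 = upright) to an operating mode.

    The 60 and 20 degree cutoffs are assumptions made for this sketch.
    """
    if pitch_degrees >= 60.0:
        return OperatingMode.AT_EAR
    if pitch_degrees >= 20.0:
        return OperatingMode.AT_MOUTH
    return OperatingMode.HANDHELD


def speech_detection_params(pitch_degrees: float) -> SpeechDetectionParams:
    """Look up the detection parameters for the mode implied by the orientation."""
    return PARAMS_BY_MODE[operating_mode(pitch_degrees)]
```

In this sketch a speech detector would call `speech_detection_params` whenever a new orientation reading arrives and apply the returned parameters to decide when detection begins or ends.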

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: initiating, by a mobile computing device, an audio recording process using a microphone of the mobile computing device; identifying, by the mobile computing device, a first orientation of the mobile computing device at a beginning of the audio recording process; determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to a second orientation; in response to determining that the mobile computing device has transitioned from the first orientation to a second orientation, determining a speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation; comparing an energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation to the speech energy threshold; determining an end of speech condition based on (i) the energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation and (ii) the speech energy threshold; and stopping, based on determining the end of speech condition, the audio recording process.

2. The method of claim 1, wherein identifying the first orientation of the mobile computing device is performed in response to a proximity sensor of the mobile computing device detecting an object within a predetermined distance of the proximity sensor.

3. The method of claim 1, wherein identifying the first orientation of the mobile computing device comprises detecting that a user is within a predetermined distance of a proximity sensor of the mobile computing device, and wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation includes determining that the user is no longer within the predetermined distance of the proximity sensor.

4. The method of claim 1, wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation is performed using one or more accelerometers of the mobile computing device.

5. The method of claim 1, comprising: classifying the transition of the mobile computing device from the first orientation to the second orientation as one of a “to mouth” gesture, a “from mouth” gesture, a “to ear” gesture, or a “from ear” gesture, wherein determining the speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation is based on classifying the transition of the mobile computing device from the first orientation to the second orientation as one of the “to mouth” gesture, the “from mouth” gesture, the “to ear” gesture, or the “from ear” gesture.

6. The method of claim 1, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that an angle between a user and the mobile computing device is outside of a selected set of angles.

7. The method of claim 1, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that the mobile computing device is beyond a predetermined distance from a portion of a user.

8. The method of claim 1, comprising: while the mobile computing device is in the first orientation, comparing the energy of the speech received during the audio recording process to a different speech energy threshold; determining that the energy of the speech received during the audio recording process and while the mobile computing device is in the first orientation does is greater than the different speech energy threshold; and based on determining that the energy of the speech received during the audio recording process and while the mobile computing device is in the first orientation is greater than the different speech energy threshold, continuing the audio recording process.

9. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising: initiating, by a mobile computing device, an audio recording process using a microphone of the mobile computing device; identifying, by the mobile computing device, a first orientation of the mobile computing device at a beginning of the audio recording process; determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to a second orientation; in response to determining that the mobile computing device has transitioned from the first orientation to a second orientation, determining a speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation; comparing an energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation to the speech energy threshold; determining an end of speech condition based on (i) the energy of the speech received during the audio recording process and while the mobile computing device is in the second orientation and (ii) the speech energy threshold; and stopping, based on determining the end of speech condition, the audio recording process.

10. The computer storage medium of claim 9, wherein identifying the first orientation of the mobile computing device is performed in response to a proximity sensor of the mobile computing device detecting an object within a predetermined distance of the proximity sensor.

11. The computer storage medium of claim 9, wherein identifying the first orientation of the mobile computing device comprises detecting that a user is within a predetermined distance of a proximity sensor of the mobile computing device, and wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation includes determining that the user is no longer within the predetermined distance of the proximity sensor.

12. The computer storage medium of claim 9, wherein determining that the mobile computing device has transitioned from the first orientation to the second orientation is performed using one or more accelerometers of the mobile computing device.

13. The computer storage medium of claim 9, comprising: classifying the transition of the mobile computing device from the first orientation to the second orientation as one of a “to mouth” gesture, a “from mouth” gesture, a “to ear” gesture, or a “from ear” gesture, wherein determining the speech energy threshold for comparing to speech received during the audio recording process and while the mobile computing device is in the second orientation is based on classifying the transition of the mobile computing device from the first orientation to the second orientation as one of the “to mouth” gesture, the “from mouth” gesture, the “to ear” gesture, or the “from ear” gesture.

14. The computer storage medium of claim 9, wherein determining, during the audio recording process, that the mobile computing device has transitioned from the first orientation to the second orientation comprises determining that an angle between a user and the mobile computing device is outside of a selected set of angles.

15. The co
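
The steps of claim 1 can be illustrated with a small sketch: record frames, note the orientation at the start, pick an energy threshold once the device leaves that orientation, and stop when the energy stays below the threshold. Several details here are assumptions made for illustration and are not stated in the claims: the orientation labels, the threshold values, the use of mean-square energy, and the rule that three consecutive quiet frames signal the end of speech.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical per-orientation thresholds (mean-square energy).
# The claims give no numeric values; these are placeholders.
THRESHOLDS = {"at_mouth": 0.01, "away": 0.05}

# Assumption: the claims do not say how long energy must stay low.
QUIET_FRAMES_TO_STOP = 3


def frame_energy(samples: Sequence[float]) -> float:
    """Mean-square energy of one audio frame (0.0 for an empty frame)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def record_until_end_of_speech(
    frames: Iterable[Tuple[str, Sequence[float]]],
) -> int:
    """Consume (orientation, samples) frames and return how many were recorded.

    Recording stops early once the device has left its starting orientation
    and the frame energy stays below that orientation's threshold.
    """
    first_orientation = None
    quiet_frames = 0
    recorded = 0
    for orientation, samples in frames:
        recorded += 1
        if first_orientation is None:
            first_orientation = orientation  # orientation at start of recording
        if orientation == first_orientation:
            quiet_frames = 0  # no end-of-speech check in the first orientation
            continue
        # Device is in a second orientation: choose the threshold for it.
        threshold = THRESHOLDS[orientation]
        if frame_energy(samples) < threshold:
            quiet_frames += 1
        else:
            quiet_frames = 0
        if quiet_frames >= QUIET_FRAMES_TO_STOP:
            break  # end-of-speech condition met; stop recording
    return recorded
```

For example, feeding two loud frames at the mouth, one loud frame after the device moves away, and then a run of quiet frames stops the recording on the third quiet frame. A real device would derive the orientation labels from sensors such as the proximity sensor or accelerometers mentioned in the dependent claims.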

Assignees

Google LLC

Inventors

Classifications

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • with voice recognition means · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G10L25/78 · Primary

    Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

  • Arrangements for converting the position or the displacement of a member into a coded form · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10026419B2 cover?
A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L25/78. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 17, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).