Multisensory speech detection

US10020009B1 · US · B1

Patent metadata
Field	Value
Publication number	US-10020009-B1
Application number	US-201615392448-A
Country	US
Kind code	B1
Filing date	Dec 28, 2016
Priority date	Nov 10, 2008
Publication date	Jul 10, 2018
Grant date	Jul 10, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
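The four steps in the abstract (orientation, operating mode, speech detection parameters, speech detection) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the mode names, the 60-degree tilt cutoff, the parameter values, and the energy-based detector are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence, Tuple


class OperatingMode(Enum):
    # Hypothetical mode names; the abstract does not enumerate the modes.
    HELD_TO_EAR = "held_to_ear"
    HELD_IN_FRONT = "held_in_front"


@dataclass(frozen=True)
class SpeechDetectionParams:
    energy_threshold: float    # frame energy at or above this counts as speech
    max_trailing_silence: int  # consecutive quiet frames that end detection


# Assumed values, chosen only to make the two modes behave differently.
PARAMS_BY_MODE = {
    OperatingMode.HELD_TO_EAR: SpeechDetectionParams(0.5, 2),
    OperatingMode.HELD_IN_FRONT: SpeechDetectionParams(0.3, 5),
}


def operating_mode(tilt_degrees: float) -> OperatingMode:
    """Map a device tilt to an operating mode (the 60-degree cutoff is assumed)."""
    if tilt_degrees >= 60.0:
        return OperatingMode.HELD_TO_EAR
    return OperatingMode.HELD_IN_FRONT


def detect_speech(
    energies: Sequence[float], params: SpeechDetectionParams
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indexes of the first speech segment, or None."""
    start = None
    end = 0
    silence = 0
    for i, energy in enumerate(energies):
        if energy >= params.energy_threshold:
            if start is None:
                start = i      # speech detection begins here
            end = i + 1        # end index is exclusive
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= params.max_trailing_silence:
                break          # enough trailing silence: detection ends
    return None if start is None else (start, end)


energies = [0.1, 0.1, 0.8, 0.9, 0.1, 0.1, 0.1, 0.9]
params = PARAMS_BY_MODE[operating_mode(75.0)]
print(detect_speech(energies, params))
```

In this sketch the same audio yields a shorter segment in one mode than in the other, because the mode selected from the orientation controls both the threshold and how much silence is tolerated before detection ends.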

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method comprising: receiving, by a given mobile device, audio data corresponding to a user utterance; while receiving the audio data corresponding to the user utterance, determining, by the given mobile device, that the given mobile device has changed position from a first pose to a second pose; in response to determining that the given mobile device has changed position from the first pose to the second pose, determining endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose; using the endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose, endpointing the received audio data; generating, by an automated speech recognizer, a transcription of the endpointed audio data; and providing, for output by the given mobile device, the transcription.
2. The method of claim 1, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that an angle of the given mobile device relative to a reference plane has changed from a first angle to a second angle.
3. The method of claim 1, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that a distance between the given mobile device and a user of the mobile device has changed from a first distance to a second distance.
4. The method of claim 1, wherein determining endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose comprises determining a speech energy threshold for endpointing audio data received by a mobile device changing from the first pose to the second pose.
5. The method of claim 1, wherein: the second pose is a walkie-talkie pose in which the mobile device operates in half-duplex, and determining the endpointing parameters comprises detecting the selection of a talk button on the mobile device.
6. The method of claim 1, comprising: in response to endpointing the audio data, generating a user interface indicating that the audio data has been endpointed; and providing, for display by the given mobile device, the user interface.
7. The method of claim 1, comprising: after generating the transcription, generating, by the given mobile device, a user interface that indicates a recommended pose; and providing, for display by the given mobile device, the user interface.
8. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving, by a given mobile device, audio data corresponding to a user utterance; while receiving the audio data corresponding to the user utterance, determining, by the given mobile device, that the given mobile device has changed position from a first pose to a second pose; in response to determining that the given mobile device has changed position from the first pose to the second pose, determining endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose; using the endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose, endpointing the received audio; generating, by an automated speech recognizer, a transcription of the endpointed audio data; and providing, for output by the given mobile device, the transcription.
9. The system of claim 8, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that an angle of the given mobile device relative to a reference plane has changed from a first angle to a second angle.
10. The system of claim 8, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that a distance between the given mobile device and a user of the mobile device has changed from a first distance to a second distance.
11. The system of claim 8, wherein determining the endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose comprises determining a speech energy threshold for endpointing audio data received by a mobile device changing from the first pose to the second pose.
12. The system of claim 8, wherein the operations further comprise: the second pose is a walkie-talkie pose in which the mobile device operates in half-duplex, and determining the endpointing parameters comprises detecting the selection of a talk button on the mobile device.
13. The system of claim 8, wherein the operations further comprise: in response to endpointing the audio data, generating a user interface indicating that the audio data has been endpointed; and providing, for display by the given mobile device, the user interface.
14. The system of claim 8, wherein the operations further comprise: after generating the transcription, generating, by the given mobile device, a user interface that indicates a recommended pose; and providing, for display by the given mobile device, the user interface.
15. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: receiving, by a given mobile device, audio data corresponding to a user utterance; while receiving the audio data corresponding to the user utterance, determining, by the given mobile device, that the given mobile device has changed position from a first pose to a second pose; in response to determining that the given mobile device has changed position from the first pose to the second pose, determining endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose; using the endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose, endpointing the received audio data; generating, by an automated speech recognizer, a transcription of the endpointed audio data; and providing, for output by the given mobile device, the transcription.
16. The medium of claim 15, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that an angle of the given mobile device relative to a reference plane has changed from a first angle to a second angle.
17. The medium of claim 15, wherein determining that the given mobile device has changed position from a first pose to a second pose comprises: determining that a distance between the given mobile device and a user of the mobile device has changed from a first distance to a second distance.
18. The medium of claim 15, wherein determining the endpointing parameters for endpointing audio data received by a mobile device changing from the first pose to the second pose comprises determining a speech energy threshold for endpointing audio data received by a mobile device changing from the first pose to the second pose.
19. The medium of claim 15, wherein the operations further comprise: the second pose is a walkie-talkie pose in which the mobile device operates
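As a reading aid for claim 1 (with the angle test of claim 2 and the speech energy threshold of claim 4), the flow might be sketched as below. Everything named here is hypothetical: the pose labels, the 45-degree cutoff, the threshold table keyed by pose transition, and the stand-in recognizer are assumptions for illustration, and the sketch makes no statement about what the claims legally cover.

```python
from typing import Callable, List, Sequence, Tuple

# One sample = (audio frame, frame energy, device angle in degrees).
Sample = Tuple[str, float, float]

DEFAULT_THRESHOLD = 0.5  # assumed threshold before any pose change is seen

# Hypothetical endpointing parameters, keyed by (first pose, second pose).
TRANSITION_THRESHOLDS = {
    ("flat", "upright"): 0.3,
    ("upright", "flat"): 0.6,
}


def pose_from_angle(angle_degrees: float) -> str:
    """Classify a pose from the angle to a reference plane (cutoff assumed)."""
    return "upright" if angle_degrees >= 45.0 else "flat"


def endpoint(samples: Sequence[Sample]) -> List[str]:
    """Return the frames between the first and last frame judged to be speech."""
    threshold = DEFAULT_THRESHOLD
    pose = None
    first = last = None
    for i, (_, energy, angle) in enumerate(samples):
        new_pose = pose_from_angle(angle)
        if pose is not None and new_pose != pose:
            # Pose changed while audio is being received: switch parameters.
            threshold = TRANSITION_THRESHOLDS.get((pose, new_pose), DEFAULT_THRESHOLD)
        pose = new_pose
        if energy >= threshold:
            if first is None:
                first = i
            last = i
    if first is None:
        return []
    return [frame for frame, _, _ in samples[first:last + 1]]


def transcribe(samples: Sequence[Sample], recognizer: Callable[[List[str]], str]) -> str:
    """Endpoint the audio, then hand the endpointed frames to a recognizer."""
    return recognizer(endpoint(samples))


samples = [("a", 0.1, 10.0), ("b", 0.4, 10.0), ("c", 0.4, 60.0),
           ("d", 0.35, 60.0), ("e", 0.1, 60.0)]
# Stand-in recognizer that just joins frame labels; a real one would decode audio.
print(transcribe(samples, lambda frames: " ".join(frames)))
```

Frame "b" has the same energy as frame "c" but is excluded, because the lower threshold only applies once the device has moved from the first pose to the second.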

Assignees

Google Inc, Google LLC

Inventors

Classifications

G06F3/0346
with detection of the device orientation or free movement in a three-dimensional [3D] space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors · CPC title
G06F3/167
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
G06F3/03
Arrangements for converting the position or the displacement of a member into a coded form · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
H04B1/40
Circuits · CPC title
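CPC symbols such as the ones listed above follow a fixed layout: a section letter, a two-digit class, a subclass letter, then a main group and a subgroup separated by a slash. The sketch below splits a symbol into those parts and looks up the section title, which is how a symbol starting with G maps to "Physics". The function and class names are made up for this example, and the title for section Y is abbreviated.

```python
import re
from dataclasses import dataclass

# CPC section titles (the Y title is shortened here).
SECTION_TITLES = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General tagging (cross-sectional)",
}

# section letter, 2-digit class, subclass letter, main group "/" subgroup
_CPC_PATTERN = re.compile(r"([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,})")


@dataclass(frozen=True)
class CpcSymbol:
    section: str      # e.g. "G"
    class_code: str   # e.g. "G06"
    subclass: str     # e.g. "G06F"
    main_group: str   # e.g. "3"
    subgroup: str     # e.g. "0346"

    @property
    def section_title(self) -> str:
        return SECTION_TITLES[self.section]


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol like 'G06F3/0346' into its hierarchical parts."""
    match = _CPC_PATTERN.fullmatch(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, sub, main_group, subgroup = match.groups()
    return CpcSymbol(section, section + cls, section + cls + sub, main_group, subgroup)


for text in ["G06F3/0346", "G06F3/167", "G06F3/03", "G10L15/22", "H04B1/40"]:
    parsed = parse_cpc(text)
    print(text, parsed.subclass, parsed.section_title)
```

Run over the five symbols on this page, the sketch reports four in section G (Physics) and one, H04B1/40, in section H (Electricity).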

Patent family

Related publications grouped by family.

View patent family 41531538

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10020009B1 cover?: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating…
Who is the assignee on this patent?: Google Inc, Google LLC
What technology area does this patent fall under?: Primary CPC classification G10L25/78. Mapped technology areas include Physics.
When was this patent published?: Publication date Jul 10, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).