US9093075B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9093075-B2 |
| Application number | US-201213452031-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 20, 2012 |
| Priority date | Apr 20, 2012 |
| Publication date | Jul 28, 2015 |
| Grant date | Jul 28, 2015 |
Official abstract text for this publication.
A method is disclosed herein for recognizing a repeated utterance in a mobile computing device via a processor. A first utterance is detected being spoken into a first mobile computing device. Likewise, a second utterance is detected being spoken into a second mobile computing device within a predetermined time period. The second utterance substantially matches the first spoken utterance and the first and second mobile computing devices are communicatively coupled to each other. The processor enables capturing, at least temporarily, a matching utterance for performing a subsequent processing function. The performed subsequent processing function is based on a type of captured utterance.
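The matching step described in the abstract can be illustrated with a minimal sketch. It assumes exact string equality (the wording used in claim 1; the abstract says only "substantially matches"), a hypothetical 5-second window, and hypothetical regular expressions for deciding the "type" of the captured utterance. The names `Utterance`, `is_repeated`, `classify`, and `route` are illustrative and do not come from the patent.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Utterance:
    device_id: str       # which device captured the speech
    transcription: str   # text produced by speech recognition
    spoken_at: float     # capture time in seconds


def is_repeated(first: Utterance, second: Utterance, window_s: float = 5.0) -> bool:
    """Return True when `second` repeats `first` on another device in time.

    The 5-second default is an assumed value; the patent only says
    "predetermined period of time".
    """
    delay = second.spoken_at - first.spoken_at
    return (
        first.device_id != second.device_id
        and first.transcription == second.transcription
        and 0 <= delay <= window_s
    )


# Hypothetical format checks, for illustration only.
PHONE_RE = re.compile(r"^\+?[\d\s\-()]{7,}$")
ADDRESS_RE = re.compile(r"^\d+\s+\w+")


def classify(transcription: str) -> str:
    """Guess which kind of application the string looks suited to."""
    if PHONE_RE.match(transcription):
        return "telephone"
    if ADDRESS_RE.match(transcription):
        return "mapping"
    return "search"


def route(first: Utterance, second: Utterance,
          window_s: float = 5.0) -> Optional[Tuple[str, str]]:
    """Return (target application, input string), or None if not a repeat."""
    if not is_repeated(first, second, window_s):
        return None
    return classify(first.transcription), first.transcription


said = Utterance("device-1", "555 123 4567", spoken_at=10.0)
echoed = Utterance("device-2", "555 123 4567", spoken_at=12.5)
print(route(said, echoed))
```

In this sketch a repeat that arrives too late, or that differs by even one character, yields `None`, so nothing is handed to a second application. A real system would presumably need a more tolerant comparison and more robust format detection.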
Opening claim text (preview).
We claim:
1. A computer-implemented method comprising: obtaining, by a first device, a first transcription of a first utterance that is spoken into the first device by a first user; obtaining, by the first device and from a second device, a second transcription of a second utterance that is spoken into the second device by a second user; determining, by a first application executing on the first device, that (i) the first transcription exactly matches the second transcription, and (ii) the second utterance was spoken within a predetermined period of time after the first utterance was spoken; and based on determining, by the first application executing on the first device, that (i) the first transcription exactly matches the second transcription and (ii) the second utterance was spoken within the predetermined period of time after the first utterance was spoken, providing, by the first device, the first transcription as an input string to a second application.
2. The method of claim 1, wherein the second application is a telephone application, a mapping application, or a search application.
3. The method of claim 2, comprising: determining, by the first application executing on the first device, that the first transcription corresponds to an input string format for the telephone application or the mapping application, wherein providing, by the first device, the first transcription as the input string to the second application is further based on determining, by the first application executing on the first device, that the first transcription corresponds to the input string format for the telephone application or the mapping application.
4. The method of claim 1, comprising: determining, by the first device, that the first user and the second user are engaged in a telephone conversation using the first device and the second device, wherein providing, by the first device, the first transcription as the input string to the second application is further based on determining, by the first device, that the first user and the second user are engaged in the telephone conversation using the first device and the second device.
5. The method of claim 1, comprising: based at least in part on determining, by the first application executing on the first device, that (i) the first transcription exactly matches the second transcription and (ii) the second utterance was spoken within a predetermined period of time after the first utterance was spoken, providing, to the second device and by the first device, data for displaying a selectable representation of the first transcription, wherein a selection of the selectable representation provides a request to the second device to execute the second application with the input string.
6. The method of claim 1, comprising: obtaining, by the first device, a third transcription of a third utterance that is spoken into the first device by the first user; obtaining, by the first device, a fourth transcription of a fourth utterance that is spoken into the second device by the second user; determining, by the first application executing on the first device, that (i) the third transcription exactly matches the fourth transcription, (ii) the fourth utterance was spoken within the predetermined period of time after the third utterance was spoken, and (iii) the third utterance was spoken within a second predetermined period of time after the second utterance was spoken; and based at least in part on determining, by the first application executing on the first device, that (i) the third transcription exactly matches the fourth transcription, (ii) the fourth utterance was spoken within the predetermined period of time after the third utterance was spoken, and (iii) the third utterance was spoken within the second predetermined period of time after the second utterance was spoken, providing, by the first device, the first transcription and the third transcription as a second input string to the second application.
7. The method of claim 1, comprising: obtaining, by the first device, a third transcription of a third utterance that is spoken into the first device by the first user; obtaining, by the first device, a fourth transcription of a fourth utterance that is spoken into the second device by the second user; determining, by the first application executing on the first device, that (i) the third transcription does not match the fourth transcription, (ii) the fourth utterance was spoken within the predetermined period of time after the third utterance was spoken, and (iii) the third utterance was spoken within a second predetermined period of time after the second utterance was spoken; and based at least in part on determining, by the first application executing on the first device, that (i) the first transcription exactly matches the second transcription and (ii) the second utterance was spoken within the predetermined period of time after the first utterance was spoken, (iii) the third transcription does not match the fourth transcription, (iv) the fourth utterance was spoken within the predetermined period of time after the third utterance was spoken, and (v) the third utterance was spoken within the second predetermined period of time after the second utterance was spoken, providing, to the second device and by the first device, data for displaying a representation of the third transcription and the fourth transcription.
8. The method of claim 7, wherein the representation is a selectable representation that when selected provides a request to the second device to request a correction from the second user.
9. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining, by a first device, a first transcription of a first utterance that is spoken into the first device by a first user; obtaining, by the first device and from a second device, a second transcription of a second utterance that is spoken into the second device by a second user; determining, by a first application executing on the first device, that (i) the first transcription exactly matches the second transcription, and (ii) the second utterance was spoken within a predetermined period of time after the first utterance was spoken; and based on determining, by the first application executing on the first device, that (i) the first transcription exactly matches the second transcription and (ii) the second utterance was spoken within the predetermined period of time after the first utterance was spoken, providing, by the first device, the first transcription as an input string to a second application.
10. The system of claim 9, wherein the second application is a telephone application, a mapping application, or a search application.
11. The system of claim 10, wherein the operations further comprise: determining, by the first application executing on the first device, that the first transcription corresponds to an input string format for the telephone application or the mapping application, wherein providing, by the first device, the first transcription as the input string to the second application is further based on determining, by the first application executing on the first device, that the first transcription corresponds to the input string format for the telephone application or the mapping application.
12. The system of claim 9, wherein the operations further comprise: determining, by the first device, that the first user and the second user are engaged in a telephone conversation
Conversation recording systems (at the subscriber's set H04M1/656) · CPC title
for recording conversations · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
with voice recognition means · CPC title
Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls · CPC title