Language model selection for speech-to-text conversion

US9495127B2 · US · B2

Patent metadata
Field                Value
Publication number   US-9495127-B2
Application number   US-97692010-A
Country              US
Kind code            B2
Filing date          Dec 22, 2010
Priority date        Dec 23, 2009
Publication date     Nov 15, 2016
Grant date           Nov 15, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
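
The flow in the abstract can be illustrated with a minimal sketch. It assumes that the acoustic stage has already produced a few candidate transcriptions, and that each language model is a toy word-frequency table. The context identifiers, the training text, and every function name here are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter

# Hypothetical context identifiers and toy training text; none of these
# names come from the patent.
CORPORA = {
    "maps.search": ["route to the airport", "directions to main street"],
    "gardening.forum": ["trim the root of the plant", "root rot in wet soil"],
}
DEFAULT_CONTEXT = "default"


def train_unigram(sentences):
    """Count word frequencies; a stand-in for a real language model."""
    return Counter(word for s in sentences for word in s.lower().split())


# One model per context, plus a fallback trained on all of the text.
MODELS = {ctx: train_unigram(text) for ctx, text in CORPORA.items()}
MODELS[DEFAULT_CONTEXT] = train_unigram(
    [s for text in CORPORA.values() for s in text]
)
VOCAB_SIZE = len(MODELS[DEFAULT_CONTEXT])


def select_language_model(context_id):
    """Pick the model for the context, falling back to the default model."""
    return MODELS.get(context_id, MODELS[DEFAULT_CONTEXT])


def log_prob(model, text):
    """Add-one smoothed unigram log-probability of a candidate text."""
    total = sum(model.values())
    return sum(
        math.log((model[w] + 1) / (total + VOCAB_SIZE))
        for w in text.lower().split()
    )


def convert_speech_to_text(candidates, context_id):
    """Return the candidate transcription the selected model scores highest."""
    model = select_language_model(context_id)
    return max(candidates, key=lambda c: log_prob(model, c))


if __name__ == "__main__":
    heard = ["root to the airport", "route to the airport"]
    print(convert_speech_to_text(heard, "maps.search"))
    print(convert_speech_to_text(heard, "gardening.forum"))
```

With the same two candidates, the maps context favors "route" and the gardening context favors "root", which is the effect that selecting a language model by context is meant to achieve. A production recognizer would combine acoustic and language model scores rather than rescoring finished strings.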

First claim

Opening claim text (preview).

What is claimed is:

1. A method of converting speech to text, comprising: generating a language model by analyzing textual content for a first web page to determine a topic of the first web page, determining other pages determined to be directed to the same topic of the first web page, and analyzing content of queries previously submitted to the other pages determined to be directed to the same topic of the first web page, wherein the queries previously submitted to the other pages include queries submitted to respective search capabilities of at least some of the other pages; receiving, at a computer server system and from an electronic device, sound information from a user of the electronic device, and a context identifier that indicates a context within which the user provided the sound information; using the context identifier to select the generated language model from among a plurality of language models; converting speech in the sound information to text using the selected language model; and providing the text for use by the electronic device.

2. The method claim 1, wherein the sound information was provided to the electronic device upon identifying that the user selected a selectable virtual control displayed along with a virtual keyboard interface on the electronic device, the user selection having caused the electronic device to begin listening for spoken input using an application programmed to convert user spoken and typed input into text to be provided to other applications on the electronic device, and wherein an operating system of the electronic device makes available, to the user of the electronic device, the virtual keyboard interface and a mechanism for speaking input for entering data to multiple applications on the electronic device, and provides, to a particular one of the multiple applications determined to represent the context within which the user provided the sound information, text that corresponds to a user input.

3. The method of claim 1, wherein the context identifier identifies a topic for a web page that was being presented by the electronic device when the sound information was input by the user, and wherein using the context identifier to select the generated language model comprises selecting the generated language model based on a match between the topic identified by the context identifier and the topic of the first web page and the other pages.

4. The method of claim 1, wherein the context identifier identifies a web page that was being presented by the electronic device when the sound information was input by the user, and wherein using the context identifier to select the generated language model comprises selecting the generated language model based on a match between the web page that was being presented by the electronic device and the first web page or one of the other pages.

5. A method of converting speech to text, comprising: generating a plurality of language models by analyzing textual content for a web page to determine a topic of the web page, determining other pages determined to be directed to the same topic of the web page, and analyzing textual content of queries previously submitted by a plurality of users to the other pages determined to be directed to the same topic of the web page, wherein the queries previously submitted by the plurality of users to the other pages include queries submitted to respective search capabilities of at least some of the other pages; receiving, at a computer server system and from an electronic device, sound information spoken by a user of the electronic device, and a context identifier of the web page, wherein the web page was being presented by the electronic device when the sound information was spoken by the user; selecting, using the context identifier and from among the plurality of language models, a language model appropriate for the context identifier; converting speech in the sound information to text using the selected language model; and providing the text for use by the electronic device.

6. The method of claim 5, further comprising selecting the other pages by a clustering analysis on a graph having pages as nodes that are connected to each other by queries for which the other pages are responsive.

7. The method of claim 6, wherein a web page is determined to be responsive to a query if the web page is a top n ranked search result for the query in a set of ranked search results relevant to the query, wherein n is a predetermined integer.

8. A system comprising: a data processing apparatus; and storage coupled to the data processing apparatus storing code that when executed by the data processing apparatus causes the data processing apparatus to perform operations comprising: generating a language model by analyzing textual content for a first web page to determine a topic of the first web page, determining other pages determined to be directed to the same topic of the first web page, and analyzing content of queries previously submitted to the other pages determined to be directed to the same topic of the first web page, wherein the queries previously submitted to the other pages include queries submitted to respective search capabilities of at least some of the other pages; receiving, at a computer server system and from an electronic device, sound information from a user of the electronic device, and a context identifier that indicates a context within which the user provided the sound information; using the context identifier to select the generated language model from among a plurality of language models; converting speech in the sound information to text using the selected language model; and providing the text for use by the electronic device.

9. The system of claim 8, wherein the context identifier identifies a field of a form in which input on the electronic device is received that corresponds to the sound information.

10. The system of claim 8, wherein the context identifier identifies a web page that was being presented by the electronic device when the sound information was input by the user.

11. The system of claim 8, wherein the sound information was provided to the electronic device upon identifying that the user selected a selectable virtual control displayed along with a virtual keyboard interface on the electronic device, the user selection having caused the electronic device to begin listening for spoken input using an application programmed to convert user spoken and typed input into text to be provided to other applications on the electronic device.

12. The system of claim 11, wherein generating the language model comprises generating the language model by analyzing textual content for queries to which the first web page, and the other pages are responsive.

13. The system of claim 12, wherein determining the other pages comprises selecting other pages that are related to the first web page by a clustering analysis on a graph having pages as nodes that are connected to each other by queries for which the other pages are responsive.

14. The system of claim 13, wherein a web page is determined to be responsive to a query if the web page is a top n ranked search result for the query in a set of ranked search results relevant to the query, wherein n is a predetermined integer.

15. A computer-readable storage device encoded with a computer program product, the computer program product including instructions that, when executed, cause data processing apparatus to perform operations comprising: generating a language model by analyzing textual content for a first web page to dete
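
The page-grouping idea in claims 6, 7, 13, and 14 can be sketched as follows. The claims do not say which clustering analysis is used, so this sketch substitutes plain connected components as a simple stand-in. The ranked results, page names, and function names are hypothetical illustration data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical ranked search results: query -> pages, best match first.
RANKED_RESULTS = {
    "cheap flights": ["air.example", "trips.example", "news.example"],
    "airport parking": ["trips.example", "park.example", "air.example"],
    "tomato blight": ["garden.example", "seeds.example", "news.example"],
}


def responsive_pages(ranked_results, n):
    """A page is responsive to a query if it is a top-n result for it."""
    return {query: set(pages[:n]) for query, pages in ranked_results.items()}


def build_page_graph(ranked_results, n):
    """Pages are nodes; two pages are linked when one query returns both."""
    graph = defaultdict(set)
    for pages in responsive_pages(ranked_results, n).values():
        for page in pages:
            graph[page]  # register the node even if it has no neighbors
        for a, b in combinations(sorted(pages), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def cluster_pages(graph):
    """Connected components, used here as an assumed clustering method."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


def queries_for_cluster(cluster, ranked_results, n):
    """Queries whose top-n results touch the cluster: LM training text."""
    hits = responsive_pages(ranked_results, n)
    return sorted(q for q, pages in hits.items() if pages & cluster)


if __name__ == "__main__":
    clusters = cluster_pages(build_page_graph(RANKED_RESULTS, n=2))
    for cluster in clusters:
        print(sorted(cluster), queries_for_cluster(cluster, RANKED_RESULTS, 2))
```

With n set to 2, the travel pages and the gardening pages fall into separate groups, and each group's queries could then serve as training text for that group's language model. Raising n to 3 pulls in news.example, which links every page into a single group, so the choice of n controls how strictly pages are grouped.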

Classifications

  • G10L15/30 (Primary)

    Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Language recognition · CPC title

  • of application context · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9495127B2 cover?
Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among mul…
Who is the assignee on this patent?
Ballinger Brandon M, Schalkwyk Johan, Cohen Michael H, and 2 more
What technology area does this patent fall under?
Primary CPC classification G10L15/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 15, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).