Adjusting language models using context information

US9076445B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9076445-B1
Application number  US-201213705228-A
Country             US
Kind code           B1
Filing date         Dec 5, 2012
Priority date       Dec 30, 2010
Publication date    Jul 7, 2015
Grant date          Jul 7, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
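The adjustment step described in the abstract can be illustrated with a minimal sketch. It assumes a toy unigram language model stored as a dictionary, a multiplicative boost followed by renormalization, and precomputed similarity scores in the range 0 to 1. The function names, the `max_boost` parameter, and the sample data are hypothetical and do not come from the patent, which does not prescribe a particular model or formula.

```python
import math


def adjust_language_model(unigram_probs, term_scores, max_boost=2.0):
    """Return a copy of the model with context-matched terms boosted.

    Assumption: each term's probability is multiplied by a factor that grows
    linearly with its similarity score (1.0 at score 0, max_boost at score 1),
    and the distribution is then renormalized.
    """
    adjusted = dict(unigram_probs)
    for term, score in term_scores.items():
        if term in adjusted:
            adjusted[term] *= 1.0 + (max_boost - 1.0) * score
    total = sum(adjusted.values())
    return {term: p / total for term, p in adjusted.items()}


def pick_transcription(candidates, acoustic_log_scores, lm):
    """Select the candidate with the best acoustic plus language-model log score."""
    def total_score(candidate):
        lm_score = sum(math.log(lm.get(word, 1e-9)) for word in candidate.split())
        return acoustic_log_scores[candidate] + lm_score
    return max(candidates, key=total_score)


# Hypothetical data: two acoustically identical candidates.
base_lm = {"call": 0.4, "pizza": 0.2, "pisa": 0.3, "palace": 0.1}
candidates = ["call pizza", "call pisa"]
acoustic = {"call pizza": -5.0, "call pisa": -5.0}

# "pizza" was previously typed in a context very similar to the current one.
adjusted_lm = adjust_language_model(base_lm, {"pizza": 0.9})

before = pick_transcription(candidates, acoustic, base_lm)      # "call pisa"
after = pick_transcription(candidates, acoustic, adjusted_lm)   # "call pizza"
```

In this sketch the size of the boost depends on the similarity score, so a term seen in a closely matching context shifts the result while a term from a dissimilar context barely changes the model.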

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining audio data; accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time; accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time; determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time; adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data; determining a transcription of the audio data using the adjusted language model; and outputting the transcription that was determined using the adjusted language model.

2. The method of claim 1, wherein obtaining the audio data comprises receiving the audio data over a network from client device; and wherein outputting the transcription determined using the adjusted language model comprises providing the transcription to the client device over the network.

3. The method of claim 1, wherein accessing the first context information comprises accessing information that indicates a geographical location where the audio data was recorded and a time when the audio data was recorded.

4. The method of claim 1, wherein accessing the second context information comprises accessing second context information that is associated with one or more terms previously transcribed for other audio, the second context information indicating (i) a particular geographical location where the other audio was input, and (ii) a time when the other audio was input at the particular geographical location.

5. The method of claim 1, wherein obtaining the audio data comprises obtaining audio data for an utterance of a user; wherein accessing the first context information comprises accessing information that indicates a geographical location of a device when the audio data was recorded by the device and a time when the audio data was recorded by the device; and wherein accessing the second context information comprises accessing second context information associated with one or more previously transcribed terms that were previously transcribed from previously received audio data for a previous utterance of the user, the second context information indicating a geographical location of the device when the previous utterance of the user was input to the device and a time when the previous utterance of the user was input to the device.

6. The method of claim 1, wherein the first time indicates a first day of week when the audio data was recorded and the second time indicates a second day of week when the one or more previously typed or previously transcribed terms were input; and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second day of week to the first day of week.

7. The method of claim 1, wherein the first time indicates a first time of day when the audio data was recorded and the second time indicates a second time of day when the one or more previously typed or previously transcribed terms were input; and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second time of day to the first time of day.

8. The method of claim 1, wherein determining the similarity score comprises determining the similarity score based on a distance between the second geographical location and the first geographical location.

9. The method of claim 1, wherein accessing the first context information comprises accessing information that indicates a geographical location indicated by a Global Positioning System (GPS) receiver of a device that receives the audio data.

10. The method of claim 1, wherein adjusting the language model based on the similarity score comprises changing one or more weighting values in the language model that correspond to the one or more previously typed or previously transcribed terms.

11. The method of claim 10, wherein changing the one or more weighting values comprises changing the one or more weighting values such that a magnitude of the change in the one or more weighting values is based on the similarity score.

12. The method of claim 1, wherein adjusting the language model based on the similarity score comprises increasing the likelihood by an amount that is based on the similarity score.

13. A system comprising: one or more processors; and a non-transitory computer-readable medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the system to perform operations comprising: obtaining audio data; accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time; accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time; determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time; adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data; determining a transcription of the audio data using the adjusted language model; and outputting the transcription that was determined using the adjusted language model.

14. The system of claim 13, wherein the first time indicates a first day of week when the audio data was recorded and the second time indicates a second day of week when the one or more previously typed or previously transcribed terms were input; and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second day of week to the first day of week.

15. The system of claim 13, wherein the first time indicates a first time of day when the audio data was recorded and the second time indicates a second time of day when the one or more previously typed or previously transcribed terms were input; and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second time of day to the first time of day.

16. The system of claim 13, wherein determining the similarity score comprises determining the similarity score based on a distance between the second geographical location and the first geographical location.

17. A non-transitory computer storage medium storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: obtaining audio data; accessing first context information associated with the audio data, wherein the first context information that
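The claims name three inputs to the similarity score: geographical distance, time of day, and day of week. The following is a minimal sketch of one way such a score could be computed. The exponential distance decay, the `scale_km` value, the circular time-of-day comparison, the exact-match day-of-week rule, and the weights are all illustrative assumptions; the claims do not specify any formula, and every function name here is hypothetical.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def location_similarity(loc1, loc2, scale_km=5.0):
    """Assumed decay: 1.0 at zero distance, falling toward 0 as distance grows."""
    return math.exp(-haversine_km(*loc1, *loc2) / scale_km)


def time_of_day_similarity(t1, t2):
    """Compare clock times on a 24-hour circle so 23:30 and 00:30 count as close."""
    m1 = t1.hour * 60 + t1.minute
    m2 = t2.hour * 60 + t2.minute
    diff = abs(m1 - m2)
    diff = min(diff, 1440 - diff)  # wrap around midnight; at most 720 minutes
    return 1.0 - diff / 720.0


def day_of_week_similarity(t1, t2):
    """Assumed rule: 1.0 for the same weekday, 0.0 otherwise."""
    return 1.0 if t1.weekday() == t2.weekday() else 0.0


def similarity_score(ctx1, ctx2, w_loc=0.5, w_tod=0.3, w_dow=0.2):
    """Weighted combination of the three components; each context is (location, time)."""
    (loc1, t1), (loc2, t2) = ctx1, ctx2
    return (w_loc * location_similarity(loc1, loc2)
            + w_tod * time_of_day_similarity(t1, t2)
            + w_dow * day_of_week_similarity(t1, t2))


# Hypothetical contexts: same place, one week apart, clock times one hour apart.
current = ((37.422, -122.084), datetime(2012, 12, 5, 23, 30))
earlier = ((37.422, -122.084), datetime(2012, 12, 12, 0, 30))
score = similarity_score(current, earlier)
```

A score computed this way stays between 0 and 1 when the weights sum to 1, which makes it easy to use as the magnitude of a weighting change of the kind described in claims 10 to 12.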

Assignees

Google Inc

Inventors

Classifications

  • G10L15/183 · Primary

    using context dependencies, e.g. language models · CPC title

  • G10L15/18 · Primary

    using natural language modelling · CPC title

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • for comparison or discrimination · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9076445B1 cover?
Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/183. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 7, 2015 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).