Language model training method and device

US2017125013A1 · US · A1

Patent metadata
Publication number: US-2017125013-A1
Application number: US-201615242065-A
Country: US
Kind code: A1
Filing date: Aug 19, 2016
Priority date: Oct 29, 2015
Publication date: May 4, 2017
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides a language model training method and device, including: obtaining a universal language model in an offline training mode, and clipping the universal language model to obtain a clipped language model; obtaining a log language model of logs within a preset time period in an online training mode; fusing the clipped language model with the log language model to obtain a first fusion language model used for carrying out first time decoding; and fusing the universal language model with the log language model to obtain a second fusion language model used for carrying out second time decoding. The method is used for solving the problem that a language model obtained offline in the prior art has poor coverage on new corpora, resulting in a reduced language recognition rate.
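The abstract does not state how two models are "fused". One common choice, assumed here purely for illustration, is linear interpolation of n-gram probabilities. The Python sketch below uses hypothetical names (`fuse`, `weight`) and toy probability tables; it ignores backoff and renormalization, which a real n-gram toolkit would handle.

```python
from typing import Dict, Tuple

NGram = Tuple[str, ...]      # (context words..., predicted word)
Model = Dict[NGram, float]   # n-gram -> conditional probability (toy representation)


def fuse(base: Model, log: Model, weight: float = 0.5) -> Model:
    """Linearly interpolate two n-gram tables: weight*base + (1-weight)*log.

    Assumption: an n-gram missing from one table contributes probability 0
    from that table (no backoff), which keeps the sketch short.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    fused: Model = {}
    for ngram in base.keys() | log.keys():
        fused[ngram] = (weight * base.get(ngram, 0.0)
                        + (1.0 - weight) * log.get(ngram, 0.0))
    return fused


# Toy tables (hypothetical values, not from the patent).
universal: Model = {("play", "the", "new", "song"): 0.5}   # offline, full model
clipped: Model = {("the", "new", "song"): 0.4}             # offline, clipped
log_model: Model = {("the", "new", "drama"): 0.6,          # online, recent logs
                    ("the", "new", "song"): 0.2}

# First fusion model: clipped + log, used for first-pass decoding.
first_pass = fuse(clipped, log_model)
# Second fusion model: universal + log, used for second-pass decoding.
second_pass = fuse(universal, log_model)
```

In this toy example the n-gram `("the", "new", "drama")` appears only in the log model, yet both fused models assign it non-zero probability, which is the coverage gain the abstract describes.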

First claim

Opening claim text (preview).

What is claimed is:

1. A language model training method, comprising: obtaining a universal language model in an offline training mode, and clipping the universal language model to obtain a clipped language model; obtaining a log language model of logs within a preset time period in an online training mode; fusing the clipped language model with the log language model to obtain a first fusion language model used for carrying out first time decoding; and fusing the universal language model with the log language model to obtain a second fusion language model used for carrying out second time decoding.

2. The method of claim 1, wherein the obtaining a log language model of logs within a preset time period in an online training mode comprises: obtaining log information within the preset time period, filtering the log information, and carrying out word segmentation processing on the filtered log information to obtain a log model training corpus within the preset time period; and training the log model training corpus to obtain the log language model.

3. The method of claim 1, wherein the clipped language model is a tri-gram language model, and correspondingly, the first fusion language model is a tri-gram fusion language model; and the universal language model is a tetra-gram language model, and correspondingly, the second fusion language model is a tetra-gram fusion language model.

4. The method of any one of claim 1, wherein the obtaining a universal language model in an offline training mode comprises: collecting a model training corpus of each field; for each field, training the model training corpus of the field to obtain the language model of the field; and generating the collected language models corresponding to all fields into the universal language model in the interpolation mode.

5. The method of claim 4, wherein the clipping the universal language model to obtain a clipped language model comprises: clipping the universal language model in a language model clipping mode based on entropy to obtain a second language model LM2; clipping the second language model LM2 in the language model clipping mode based on entropy to obtain a third language model LM3; and extracting the tri-gram language model from the third language model LM3, and clipping the extracted tri-gram language model to obtain the clipped language model LM4.

6. An electronic device, comprising: at least one processor; and a memory communicably connected with the at least one processor for storing instructions executable by the at least one processor, wherein execution of the instructions by the at least one processor causes the at least one processor to: obtain a universal language model in an offline training mode; clip the universal language model to obtain a clipped language model; obtain a log language model of logs within a preset time period in an online training mode; fuse the clipped language model with the log language model to obtain a first fusion language model used for carrying out first time decoding; and fuse the universal language model with the log language model to obtain a second fusion language model used for carrying out second time decoding.

7. The device of claim 6, wherein the processor is further configured to perform the following steps: obtaining log information within the preset time period, filtering the log information, and carrying out word segmentation processing on the filtered log information to obtain a log model training corpus within the preset time period; and training the log model training corpus to obtain the log language model.

8. The device of claim 6, wherein the clipped language model is a tri-gram language model, and correspondingly, the first fusion language model is a tri-gram fusion language model; and the universal language model is a tetra-gram language model, and correspondingly, the second fusion language model is a tetra-gram fusion language model.

9. The device of claim 6, wherein the processor is further configured to perform the following steps: collecting a model training corpus of each field; for each field, training the model training corpus of the field to obtain the language model of the field; and generating the collected language models corresponding to all fields into the universal language model in the interpolation mode.

10. The device of claim 9, wherein the processor is further configured to perform the following steps: clipping the universal language model in a language model clipping mode based on entropy to obtain a second language model LM2; clipping the second language model LM2 in the language model clipping mode based on entropy to obtain a third language model LM3; and extracting the tri-gram language model from the third language model LM3, and clipping the extracted tri-gram language model to obtain the clipped language model LM4.

11. A non-transitory computer-readable storage medium storing executable instructions that, when executed by an electronic device with a touch-sensitive display, cause the electronic device to: obtain a universal language model in an offline training mode; clip the universal language model to obtain a clipped language model; obtain a log language model of logs within a preset time period in an online training mode; fuse the clipped language model with the log language model to obtain a first fusion language model used for carrying out first time decoding; and fuse the universal language model with the log language model to obtain a second fusion language model used for carrying out second time decoding.

12. The non-transitory computer-readable storage medium of claim 11, wherein the electronic device is further configured to perform the following steps: obtaining log information within the preset time period, filtering the log information, and carrying out word segmentation processing on the filtered log information to obtain a log model training corpus within the preset time period; and training the log model training corpus to obtain the log language model.

13. The non-transitory computer-readable storage medium of claim 11, wherein the clipped language model is a tri-gram language model, and correspondingly, the first fusion language model is a tri-gram fusion language model; and the universal language model is a tetra-gram language model, and correspondingly, the second fusion language model is a tetra-gram fusion language model.

14. The non-transitory computer-readable storage medium of claim 11, wherein the electronic device is further configured to perform the following steps: collecting a model training corpus of each field; for each field, training the model training corpus of the field to obtain the language model of the field; and generating the collected language models corresponding to all fields into the universal language model in the interpolation mode.

15. The non-transitory computer-readable storage medium of claim 14, wherein the electronic device is further configured to perform the following steps: clipping the universal language model in a language model clipping mode based on entropy to obtain a second language model LM2; clipping the second language model LM2 in the language model clipping mode based on entropy to obtain a third language model LM3; and extracting the tri-gram language model from the third language model LM3, and clipping the extracted tri-gram language model to obtain the clipped language model LM4.
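Claims 2, 7 and 12 describe building the log language model in three steps: filter the logs, segment them into words, and train on the resulting corpus. The Python sketch below is one possible reading under stated assumptions, not the patented implementation: the filter rule (drop entries shorter than `min_chars`), the whitespace segmenter, and the unsmoothed maximum-likelihood tri-gram estimate are all placeholders chosen for brevity, and every function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Trigram = Tuple[str, str, str]


def filter_logs(lines: Iterable[str], min_chars: int = 2) -> List[str]:
    """Normalize case and whitespace; drop empty or very short entries.

    Assumption: the claims do not say what "filtering" removes.
    """
    kept = []
    for line in lines:
        text = " ".join(line.lower().split())
        if len(text) >= min_chars:
            kept.append(text)
    return kept


def segment(text: str) -> List[str]:
    """Stand-in word segmenter: split on whitespace.

    A real system for unsegmented text would use a dedicated segmenter.
    """
    return text.split()


def train_trigram(corpus: Iterable[List[str]]) -> Dict[Trigram, float]:
    """Unsmoothed maximum-likelihood estimate of P(w3 | w1, w2)."""
    tri_counts: Counter = Counter()
    ctx_counts: Counter = Counter()
    for words in corpus:
        padded = ["<s>", "<s>"] + words + ["</s>"]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            tri_counts[context + (padded[i],)] += 1
            ctx_counts[context] += 1
    return {ng: count / ctx_counts[ng[:2]] for ng, count in tri_counts.items()}


# Toy log entries standing in for "logs within a preset time period".
logs = ["play the new drama", "Play the  new drama ", "", "play the new song", "x"]
corpus = [segment(t) for t in filter_logs(logs)]
log_lm = train_trigram(corpus)
```

Three of the five toy entries survive filtering, so the context `("the", "new")` is seen three times: twice followed by `drama` and once by `song`. A production model would add smoothing and backoff, which this sketch omits.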

Assignees

Le Holdings Beijing Co Ltd, Le Shi Zhi Xin Electronic Tech (Tianjin) Ltd

Inventors

Classifications

  • using lexical or orthographic knowledge sources · CPC title

  • G10L15/063 · Primary

    Training · CPC title

  • updating or merging of old and new templates; Mean values; Weighting · CPC title

  • using context dependencies, e.g. language models · CPC title

  • Probabilistic grammars, e.g. word n-grams · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017125013A1 cover?
The present disclosure provides a language model training method and device, including: obtaining a universal language model in an offline training mode, and clipping the universal language model to obtain a clipped language model; obtaining a log language model of logs within a preset time period in an online training mode; fusing the clipped language model with the log language model to obtai…
Who is the assignee on this patent?
Le Holdings Beijing Co Ltd, Le Shi Zhi Xin Electronic Tech (Tianjin) Ltd
What technology area does this patent fall under?
Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 4, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).