Training systems and methods for sequence taggers

US9792560B2 · US · B2

Patent metadata
Publication number: US-9792560-B2
Application number: US-201514623846-A
Country: US
Kind code: B2
Filing date: Feb 17, 2015
Priority date: Feb 17, 2015
Publication date: Oct 17, 2017
Grant date: Oct 17, 2017

Abstract

Official abstract text for this publication.

Systems and methods for training a sequence tagger, such as a conditional random field model. More specifically, the systems and methods train a sequence tagger utilizing partially labeled data from crowd-sourced data for a specific application and partially labeled data from search logs. Further, the systems and methods disclosed herein train a sequence tagger utilizing only partially labeled data by utilizing a constrained lattice where each input value within the constrained lattice can have multiple candidate tags with confidence scores. Accordingly, the systems and methods provide for a more accurate sequence tagging system, a more reliable sequence tagging system, and a more efficient sequence tagging system in comparison to sequence taggers trained utilizing at least some fully-labeled training data.
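The constrained lattice described in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the tag schema, the example query, the `merge_into_lattice` helper, and the rule for combining confidence scores (keep the highest score seen per tag) are all assumptions made for illustration. The sketch only shows the data structure: one cell per word, each cell mapping candidate tags to confidence scores, with unlabeled words opened up to every tag in the schema.

```python
from typing import Dict, List, Optional

# Hypothetical tag schema; the page does not list the real one.
SCHEMA = ["O", "B-movie", "I-movie", "B-actor", "I-actor"]

# A partial labeling holds, for each word, either None (no label) or a
# dict mapping candidate tags to confidence scores.
PartialLabels = List[Optional[Dict[str, float]]]
Lattice = List[Dict[str, float]]


def merge_into_lattice(crowd: PartialLabels, logs: PartialLabels,
                       schema: List[str] = SCHEMA) -> Lattice:
    """Merge two partial labelings of one word sequence into a lattice."""
    if len(crowd) != len(logs):
        raise ValueError("both sources must label the same word sequence")
    lattice: Lattice = []
    for crowd_tags, log_tags in zip(crowd, logs):
        merged: Dict[str, float] = {}
        for source in (crowd_tags, log_tags):
            for tag, score in (source or {}).items():
                # Assumption: keep the highest confidence seen for a tag.
                merged[tag] = max(score, merged.get(tag, 0.0))
        if not merged:
            # Missing or uncertain tag: allow every tag in the schema,
            # here with a uniform placeholder score (an assumption).
            merged = {tag: 1.0 / len(schema) for tag in schema}
        lattice.append(merged)
    return lattice


# Made-up example query and labels from the two sources.
words = ["play", "avatar", "with", "sam", "worthington"]
crowd = [{"O": 0.9}, {"B-movie": 0.8}, None, None, None]
logs = [None, {"B-movie": 0.6, "O": 0.2}, None,
        {"B-actor": 0.7}, {"I-actor": 0.7}]

lattice = merge_into_lattice(crowd, logs)
for word, cell in zip(words, lattice):
    print(word, sorted(cell))
```

In this example the word "with" is unlabeled in both sources, so its cell contains all five schema tags, while "avatar" keeps only the tags that at least one source proposed.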

First claim

Opening claim text (preview).

The invention claimed is:

1. A training system for a conditional random field, the training system comprising: a computing device including a processing unit and a memory, the processing unit implementing a constrained lattice system, the constrained lattice system is operable to: obtain partially labeled data from crowd-sourced data for a specific application; obtain partially labeled data from search logs; merge the partially labeled data from the crowd-sourced data and from the search logs into a constrained lattice, wherein each word within the constrained lattice has a plurality of candidate tags with confidence scores; run a training algorithm based on the constrained lattice to estimate model parameters.

2. The training system of claim 1, wherein the partially labeled data from the search logs is generated from unlabeled data from a commercial search engine.

3. The training system of claim 1, wherein when a word in the constrained lattice has an uncertain tag, the constrained lattice assigns all candidate tags from a schema to the word.

4. The training system of claim 1, wherein the constrained lattice is constrained because each word has a set of allowed candidate tag types and because the plurality of candidate tags is structured.

5. The training system of claim 4, wherein the plurality of candidate tags is structured because some candidate tags types cannot follow certain other candidate tag types.

6. The training system of claim 1, wherein the training algorithm minimizes an energy gap between a candidate tag from the constrained lattice and a corresponding candidate tag from an unconstrained lattice.

7. The training system of claim 1, wherein the constrained lattice system creates a more accurate conditional random field and a more reliable conditional random field in comparison to conditional random fields that are trained with at least some fully-labeled data.

8. The training system of claim 1, wherein the training system builds a language understanding model without needing to obtain any fully-labeled crowd-sourced data for the specific application.

9. The training system of claim 1, wherein the constrained lattice system is implemented on at least one of: a mobile telephone; a smart phone; a tablet; a smart watch; a wearable computer; a personal computer; a desktop computer; a gaming system; and a laptop computer.

10. The training system of claim 1, wherein the specific application is at least one of: a digital assistant application; a voice recognition application; an email application; a social networking application; a collaboration application; an enterprise management application; a messaging application; a word processing application; a spreadsheet application; a database application; a presentation application; a contacts application; a gaming application; an e-commerce application; an e-business application; a transactional application; an exchange application; and a calendaring application.

11. A method for training a sequence tagger utilizing machine learning techniques, the method comprising: obtaining partially labeled data from a first source for a specific application; obtaining partially labeled data from a second source, wherein the second source is search logs; merging the partially labeled data from the first source and from the search logs into a constrained lattice, wherein each input value within the constrained lattice has a plurality of candidate tags with confidence scores, and running a training algorithm based on the constrained lattice to estimate model parameters, wherein the method provides for a more accurate sequence tagger and a more reliable sequence tagger in comparison to sequence taggers that are trained with at least some fully-labeled data.

12. The method of claim 11, wherein the sequence tagger is a conditional random field.

13. The method of claim 11, wherein when an input value in the constrained lattice has a missing or uncertain tag, the constrained lattice assigns all candidate tags from a schema to the input value.

14. The method of claim 11, wherein the constrained lattice is constrained because every input value has a set of allowed candidate tag types and because the plurality of candidate tags is structured.

15. The method of claim 14, wherein the plurality of candidate tags is structured because some candidate tags types cannot follow certain other candidate tag types.

16. The method of claim 11, wherein the training algorithm minimizes an energy gap between a candidate tag from the constrained lattice and a corresponding candidate tag from an unconstrained lattice.

17. The method of claim 11, wherein the method provides a platform for building language understanding models without needing any fully-labeled data for the specific application.

18. The method of claim 11, wherein the specific application is at least one of: a digital assistant application; a voice recognition application; an email application; a social networking application; a collaboration application; an enterprise management application; a messaging application; a word processing application; a spreadsheet application; a database application; a presentation application; a contacts application; a gaming application; an e-commerce application; an e-business application; a transactional application; an exchange application; and a calendaring application.

19. The method of claim 11, wherein the partially labeled data from the search logs is generated from unlabeled data from a commercial search engine by: constructing a query-knowledge click graph from unlabeled click-through data via linking query click logs and knowledge extraction; applying a string-based alignment algorithm to align semantic tags with the unlabeled click-through data on the query-knowledge click graph to form an aligned query-knowledge click graph; removing less-confident alignments from the aligned query-knowledge click graph to form an updated aligned graph; and partially labeling the unlabeled click-through data based on the semantic tags aligned with the unlabeled click-through data on the updated aligned graph.

20. A system for building a language understanding model utilizing machine learning techniques, the system comprising: at least one processor; and one or more system memories including computer-executable instructions stored thereon that, responsive to execution by the at least one processor, cause the system to perform operations including: obtaining partially labeled data from crowd-sourced data for a specific application; obtaining partially labeled data from search logs; merging the partially labeled data from the crowd-sourced data and from the search logs into a constrained lattice, wherein each word within the constrained lattice has a plurality of candidate tags with confidence scores, and wherein the constrained lattice is constrained because every word has a set of allowed candidate tag types and because the plurality of candidate tags is structured; and running a training algorithm based on the constrained lattice to estimate model parameters, wherein the language understanding model is a trained conditional random field.
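The claims describe a lattice that restricts which tags each word may take, a structure in which some tag types cannot follow others, and a training algorithm that minimizes a gap between the constrained and unconstrained lattices. One common way to realize this idea for a linear-chain conditional random field is the marginal-likelihood loss for partially labeled data: the log-partition over all tag paths minus the log-partition over paths that stay inside the lattice. The sketch below uses that standard technique as a stand-in. It is an assumption that this corresponds to the "energy gap" of the claims, and the tag set, the BIO-style transition rule, and all function names are hypothetical. Confidence scores are ignored here, and the structural transition rule is applied to both lattices.

```python
import math
from typing import Dict, List, Set, Tuple

TAGS = ["O", "B-x", "I-x"]  # hypothetical schema


def allowed_transition(prev: str, cur: str) -> bool:
    """Structural constraint: I-x may only follow B-x or I-x."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


def logsumexp(values: List[float]) -> float:
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emit: List[Dict[str, float]],
                  trans: Dict[Tuple[str, str], float],
                  allowed: List[Set[str]]) -> float:
    """Forward algorithm restricted to the allowed tags at each position."""
    alpha = {tag: emit[0][tag] for tag in allowed[0]}
    for i in range(1, len(emit)):
        nxt: Dict[str, float] = {}
        for cur in allowed[i]:
            scores = [alpha[prev] + trans[(prev, cur)] + emit[i][cur]
                      for prev in alpha if allowed_transition(prev, cur)]
            if scores:
                nxt[cur] = logsumexp(scores)
        alpha = nxt
    return logsumexp(list(alpha.values())) if alpha else -math.inf


def partial_label_loss(emit: List[Dict[str, float]],
                       trans: Dict[Tuple[str, str], float],
                       lattice: List[Set[str]]) -> float:
    """log Z(all tags per word) - log Z(lattice tags per word), always >= 0."""
    unconstrained = [set(TAGS) for _ in emit]
    return (log_partition(emit, trans, unconstrained)
            - log_partition(emit, trans, lattice))


# Toy model with all scores zero, so each log-partition is log(#valid paths).
n_words = 3
emit = [{tag: 0.0 for tag in TAGS} for _ in range(n_words)]
trans = {(a, b): 0.0 for a in TAGS for b in TAGS}

# Words 1 and 2 are labeled; word 3 is uncertain, so it allows every tag.
lattice = [{"O"}, {"B-x"}, set(TAGS)]
loss = partial_label_loss(emit, trans, lattice)
print(loss)
```

A training loop would adjust the emission and transition scores by gradient descent on this loss, which moves probability mass toward tag paths that are consistent with the lattice. When the lattice allows every tag at every position, the loss is zero and the example carries no training signal.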

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • G06F40/205 · Primary

    Parsing · CPC title

  • Indexing; Web crawling techniques · CPC title

  • Semantic analysis · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9792560B2 cover?
Systems and methods for training a sequence tagger, such as a conditional random field model. More specifically, the systems and methods train a sequence tagger utilizing partially labeled data from crowd-sourced data for a specific application and partially labeled data from search logs. Further, the systems and methods disclosed herein train a sequence tagger utilizing only partially labele…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/205. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 17, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).