Contextual language understanding for multi-turn language tasks
US-9690776-B2 · Jun 27, 2017 · US
US9792560B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9792560-B2 |
| Application number | US-201514623846-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 17, 2015 |
| Priority date | Feb 17, 2015 |
| Publication date | Oct 17, 2017 |
| Grant date | Oct 17, 2017 |
Official abstract text for this publication.
Systems and methods for training a sequence tagger, such as a conditional random field model. More specifically, the systems and methods train a sequence tagger utilizing partially labeled data from crowd-sourced data for a specific application and partially labeled data from search logs. Further, the systems and methods disclosed herein train a sequence tagger utilizing only partially labeled data by utilizing a constrained lattice where each input value within the constrained lattice can have multiple candidate tags with confidence scores. Accordingly, the systems and methods provide for a more accurate sequence tagging system, a more reliable sequence tagging system, and a more efficient sequence tagging system in comparison to sequence taggers trained utilizing at least some fully-labeled training data.
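The constrained lattice described in the abstract can be pictured as one set of candidate tags, each with a confidence score, per input word. The following is a minimal sketch of that data structure, not the patented implementation. The tag schema, the function name `build_constrained_lattice`, the example query, and the rule of keeping the highest confidence when two sources propose the same tag are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical tag schema; the patent text here does not list concrete tags.
SCHEMA = ["O", "B-artist", "I-artist", "B-song", "I-song"]


def build_constrained_lattice(words, *partial_labelings, schema=SCHEMA):
    """Merge partial labelings into a lattice: one {tag: score} dict per word.

    Each partial labeling maps a word position to {tag: confidence}.
    Positions that no source labels receive every tag in the schema.
    """
    lattice = []
    for i, _ in enumerate(words):
        merged = defaultdict(float)
        for labeling in partial_labelings:
            for tag, score in labeling.get(i, {}).items():
                # Assumption: keep the highest confidence seen for a tag.
                merged[tag] = max(merged[tag], score)
        if not merged:
            # Missing or uncertain tag: allow all candidate tags from the schema.
            merged = {tag: 1.0 / len(schema) for tag in schema}
        lattice.append(dict(merged))
    return lattice


words = ["play", "hey", "jude"]
# Hypothetical partial labels from the two sources named in the abstract.
crowd = {1: {"B-song": 0.9}}
logs = {1: {"B-song": 0.6, "B-artist": 0.3}, 2: {"I-song": 0.7, "O": 0.2}}
lattice = build_constrained_lattice(words, crowd, logs)
```

In this sketch the unlabeled word "play" keeps every tag in the schema, while "hey" and "jude" are restricted to the tags that at least one source proposed.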
Opening claim text (preview).
The invention claimed is:

1. A training system for a conditional random field, the training system comprising: a computing device including a processing unit and a memory, the processing unit implementing a constrained lattice system, the constrained lattice system is operable to: obtain partially labeled data from crowd-sourced data for a specific application; obtain partially labeled data from search logs; merge the partially labeled data from the crowd-sourced data and from the search logs into a constrained lattice, wherein each word within the constrained lattice has a plurality of candidate tags with confidence scores; run a training algorithm based on the constrained lattice to estimate model parameters.

2. The training system of claim 1, wherein the partially labeled data from the search logs is generated from unlabeled data from a commercial search engine.

3. The training system of claim 1, wherein when a word in the constrained lattice has an uncertain tag, the constrained lattice assigns all candidate tags from a schema to the word.

4. The training system of claim 1, wherein the constrained lattice is constrained because each word has a set of allowed candidate tag types and because the plurality of candidate tags is structured.

5. The training system of claim 4, wherein the plurality of candidate tags is structured because some candidate tags types cannot follow certain other candidate tag types.

6. The training system of claim 1, wherein the training algorithm minimizes an energy gap between a candidate tag from the constrained lattice and a corresponding candidate tag from an unconstrained lattice.

7. The training system of claim 1, wherein the constrained lattice system creates a more accurate conditional random field and a more reliable conditional random field in comparison to conditional random fields that are trained with at least some fully-labeled data.

8. The training system of claim 1, wherein the training system builds a language understanding model without needing to obtain any fully-labeled crowd-sourced data for the specific application.

9. The training system of claim 1, wherein the constrained lattice system is implemented on at least one of: a mobile telephone; a smart phone; a tablet; a smart watch; a wearable computer; a personal computer; a desktop computer; a gaming system; and a laptop computer.

10. The training system of claim 1, wherein the specific application is at least one of: a digital assistant application; a voice recognition application; an email application; a social networking application; a collaboration application; an enterprise management application; a messaging application; a word processing application; a spreadsheet application; a database application; a presentation application; a contacts application; a gaming application; an e-commerce application; an e-business application; a transactional application; an exchange application; and a calendaring application.

11. A method for training a sequence tagger utilizing machine learning techniques, the method comprising: obtaining partially labeled data from a first source for a specific application; obtaining partially labeled data from a second source, wherein the second source is search logs; merging the partially labeled data from the first source and from the search logs into a constrained lattice, wherein each input value within the constrained lattice has a plurality of candidate tags with confidence scores, and running a training algorithm based on the constrained lattice to estimate model parameters, wherein the method provides for a more accurate sequence tagger and a more reliable sequence tagger in comparison to sequence taggers that are trained with at least some fully-labeled data.

12. The method of claim 11, wherein the sequence tagger is a conditional random field.

13. The method of claim 11, wherein when an input value in the constrained lattice has a missing or uncertain tag, the constrained lattice assigns all candidate tags from a schema to the input value.

14. The method of claim 11, wherein the constrained lattice is constrained because every input value has a set of allowed candidate tag types and because the plurality of candidate tags is structured.

15. The method of claim 14, wherein the plurality of candidate tags is structured because some candidate tags types cannot follow certain other candidate tag types.

16. The method of claim 11, wherein the training algorithm minimizes an energy gap between a candidate tag from the constrained lattice and a corresponding candidate tag from an unconstrained lattice.

17. The method of claim 11, wherein the method provides a platform for building language understanding models without needing any fully-labeled data for the specific application.

18. The method of claim 11, wherein the specific application is at least one of: a digital assistant application; a voice recognition application; an email application; a social networking application; a collaboration application; an enterprise management application; a messaging application; a word processing application; a spreadsheet application; a database application; a presentation application; a contacts application; a gaming application; an e-commerce application; an e-business application; a transactional application; an exchange application; and a calendaring application.

19. The method of claim 11, wherein the partially labeled data from the search logs is generated from unlabeled data from a commercial search engine by: constructing a query-knowledge click graph from unlabeled click-through data via linking query click logs and knowledge extraction; applying a string-based alignment algorithm to align semantic tags with the unlabeled click-through data on the query-knowledge click graph to form an aligned query-knowledge click graph; removing less-confident alignments from the aligned query-knowledge click graph to form an updated aligned graph; and partially labeling the unlabeled click-through data based on the semantic tags aligned with the unlabeled click-through data on the updated aligned graph.

20. A system for building a language understanding model utilizing machine learning techniques, the system comprising: at least one processor; and one or more system memories including computer-executable instructions stored thereon that, responsive to execution by the at least one processor, cause the system to perform operations including: obtaining partially labeled data from crowd-sourced data for a specific application; obtaining partially labeled data from search logs; merging the partially labeled data from the crowd-sourced data and from the search logs into a constrained lattice, wherein each word within the constrained lattice has a plurality of candidate tags with confidence scores, and wherein the constrained lattice is constrained because every word has a set of allowed candidate tag types and because the plurality of candidate tags is structured; and running a training algorithm based on the constrained lattice to estimate model parameters, wherein the language understanding model is a trained conditional random field.
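The claims describe tags that are structured (some tag types cannot follow others) and a training algorithm that minimizes an energy gap between the constrained and unconstrained lattices. One common way to train a conditional random field on partially labeled data, offered here only as a possible reading and not as the method the patent specifies, is to compare the log-partition function over all structurally valid tag paths with the log-partition function over the paths the constrained lattice allows. The sketch below assumes a hypothetical BIO-style schema, zero transition scores, and made-up emission scores; the names `allowed_transition` and `log_partition` are illustrative.

```python
import math

TAGS = ["O", "B-song", "I-song"]  # hypothetical schema


def allowed_transition(prev, cur):
    # Structured tags (assumed BIO convention): an I- tag may only follow
    # the B- or I- tag of the same type.
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


def logsumexp(values):
    values = list(values)
    if not values:
        return -math.inf
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emissions, allowed_tags):
    """Forward algorithm over paths that stay inside allowed_tags[i].

    emissions[i][tag] is a log-potential (hypothetical model score);
    transition scores are taken as zero for brevity.
    """
    # Assumption: a sequence cannot start with an I- tag.
    alpha = {t: emissions[0][t] for t in allowed_tags[0] if not t.startswith("I-")}
    for i in range(1, len(emissions)):
        alpha = {
            cur: emissions[i][cur] + logsumexp(
                a for prev, a in alpha.items() if allowed_transition(prev, cur)
            )
            for cur in allowed_tags[i]
        }
    return logsumexp(alpha.values())


# Three words, uniform scores, so each valid path has weight 1.
emissions = [{t: 0.0 for t in TAGS} for _ in range(3)]
unconstrained = [TAGS] * 3
constrained = [["O"], ["B-song"], ["I-song", "O"]]
gap = log_partition(emissions, unconstrained) - log_partition(emissions, constrained)
```

Under these assumptions `gap` is the negative log of the probability mass the model places on lattice-consistent paths, so it is never negative and shrinks toward zero as the model learns to prefer the tags the lattice allows.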
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Parsing · CPC title
Indexing; Web crawling techniques · CPC title
Semantic analysis · CPC title
Physics · mapped topic