Systems and methods for automatic semantic token tagging
US-10534863-B2 · Jan 14, 2020 · US
US10747958B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10747958-B2 |
| Application number | US-201816226132-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 19, 2018 |
| Priority date | Dec 19, 2018 |
| Publication date | Aug 18, 2020 |
| Grant date | Aug 18, 2020 |
Official abstract text for this publication.
Examples of automatically generating natural language pipelines to process an input to generate tags, semantic or syntactic, are described. In an example, on receiving a request to process input data to generate tags, a dependency graph, based on identified dependees and further dependees, may be created to satisfy the request, the dependency graph including natural language operations arranged in order of their dependencies on each other. Based on the dependency graph, a pipeline for the tags may be automatically generated, which includes a series of natural language operations such that the operations for dependee tags are processed before any of their associated depender tags. Further, the dependency graph and the automated pipeline generation allow for automated optimization of the pipeline, training, re-training, testing and regression testing of the semantic tags and supporting machine learning models, and provide a framework to efficiently manage the sharing and reuse of semantic understanding operations.
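The ordering idea in the abstract (collect a requested tag's dependees and further dependees, then run dependee operations before their dependers) can be illustrated with a topological sort. The following is a minimal sketch under stated assumptions, not the patented implementation: the tag names, the `DEPENDEES` registry, and the functions `build_dependency_graph` and `generate_pipeline` are all hypothetical, and Python's standard `graphlib` module (3.9+) stands in for whatever ordering mechanism the system actually uses.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: each tag maps to the tags (dependees) it needs as input.
DEPENDEES = {
    "tokens": [],
    "lowercase": ["tokens"],
    "pos": ["tokens"],
    "person_name": ["lowercase", "pos"],
    "date": ["tokens"],
}


def build_dependency_graph(requested, registry):
    """Collect the requested tags plus their dependees and further dependees."""
    graph = {}
    stack = list(requested)
    while stack:
        tag = stack.pop()
        if tag in graph:
            continue  # already visited
        graph[tag] = set(registry[tag])
        stack.extend(registry[tag])  # walk down to further dependees
    return graph


def generate_pipeline(requested, registry):
    """Order operations so every dependee comes before its dependers."""
    graph = build_dependency_graph(requested, registry)
    # TopologicalSorter expects {node: predecessors}; static_order() yields
    # predecessors (dependees) first and raises CycleError on a cycle.
    return list(TopologicalSorter(graph).static_order())


pipeline = generate_pipeline(["person_name"], DEPENDEES)
print(pipeline)
```

In this sketch, requesting only `person_name` pulls in `tokens`, `lowercase`, and `pos`, while the unrelated `date` tag is left out of the pipeline.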
Opening claim text (preview).
What is claimed is:
1. A system comprising: a processor; a data reader coupled to the processor to receive a request to process input data to generate tags, the tags being one of semantic tags and syntactic tags; a natural language processor comprising natural language processing operations to tag the input data with the tags to provide for interpreting of the input data, wherein the natural language processing operations include depender operations and dependee operations, and wherein the depender operations require tagged output of the dependee operations as input; a dependency graph generator, coupled to the processor to: identify dependees of the tags, and further dependees of the dependees of the tags, a dependee being one of a tag and a natural language operation upon which the tag depends, wherein at least one of natural language processing operations and resources required to provide an input for a natural language operation to generate the tags is identified for each dependent and further dependents; and create a dependency graph, based on the identified dependents and the further dependents, the dependency graph including the natural language processing operations, corresponding dependents, and corresponding further dependents arranged in an order of and linked by their dependencies; and a pipeline generator coupled to the processor to: generate a pipeline including a series of natural language operations ordered as they appear in the dependency graph such that the natural language operations for dependee tags are processed before any of their associated depender tags, a depender tag being a tag which depends on a dependee tag, wherein the pipeline includes a plurality of natural language processing operations to be executed in a predefined order to generate the tags; and provide the pipeline to generate the tags for interpreting the input content.
2. The system of claim 1, wherein, upon generating the pipeline, the pipeline generator creates an optimized pipeline using the dependency graph merging a set of potentially mergeable operations in the series of natural language operations.
3. The system of claim 2, wherein to create the optimized pipeline, the pipeline generator is to: identify a set of potentially mergeable operations in the series of natural language operations, the set of potentially mergeable operations including the natural language operations having at least one of: identical functionality; identical processing, component sub-operations, which take the same input and produce the same output; and identical processing with one of different underlying configuration and different resources databases, which when merged together produce a same result as if they were processed separately; identify a set of fully mergeable operations using the dependency graph as operations from the set of potentially mergeable operations which are capable of being merged without causing a dependency conflict, the dependency conflict being caused when a merged operation provides semantic understanding that would be computed after the natural language operation that depends on that semantic understanding based on the dependency graph; generate an optimized pipeline with fully mergeable operations merged together; and provide the optimized pipeline to generate the tags for interpreting the input content.
4. The system of claim 1, wherein the system further comprises an output processor coupled to the processor to: process natural language content corresponding to the input data to obtain the tags, based on the pipeline; and provide the tags to a client device requesting the tags.
5. The system of claim 1, wherein the system further includes an automated tag trainer coupled to the processor to: receive a notification that a tag, from among the tags, has been modified; and automatically retrain a modified tag and corresponding depender tags to reflect modifications made to the tag, a depender tag including one of a tag which depends on the modified tag and another tag, which depends on any tag which is retrained including dependencies of dependencies to any level of dependency nesting.
6. The system of claim 5, wherein the tag is considered to be modified when a modification event occurs, the modification event includes at least one of: changing an underlying natural language operation that produces the modified tag from natural language content including one of changing the software code which implemented the natural language operation, changing the configuration of the natural language operation, changing a resource which supplies data to the natural language operation, the resource including one of a database, a file, and an external system; and changing natural language text processing operations that produce modified representations of the input data that are required by at least one natural language processing operation associated with the modified tag.
7. The system of claim 5, wherein the automated tag trainer to retrain the modified tag is to: identify the depender tag corresponding to the modified tag; construct a tag modification pipeline for each depender tag; reprocess training content for the depender tag; re-run machine learning training for the depender tag; and perform a quality evaluation to determine whether the depender tag has been trained correctly.
8. The system of claim 5, wherein, when the modified tag comprises multiple depender tags, the automatic tag trainer is to retrain the multiple depender tags in parallel, based on the dependency graph, wherein parallel retraining is performed such that each tag is retrained after the tag upon which it depends, including dependencies of dependencies to any level of nesting.
9. A method comprising: receiving a request to process input data to generate tags, the tags being one of semantic tags and syntactic tags; identifying dependees of the tags and further dependees of the dependees of the tags, a dependee being one of a tag and a natural language operation upon which the tag depends, wherein at least one of natural language processing operations and resources required to provide an input for a natural language operation to generate the tags is identified for each dependee and further dependees, wherein the natural language processing operations include depender operations and dependee operations, and wherein the depender operations require tagged output of the dependee operations as input; creating a dependency graph, based on the identified dependents and the further dependents, the dependency graph including the natural language processing operations, corresponding dependents, and corresponding further dependents arranged in an order of and linked by their dependencies; and generating a pipeline including a series of natural language operations in the order as they appear in the dependency graph such that the natural language operations for dependee tags are processed before any of their associated depender tags, a depender tag being a tag which depends on a dependee tag, wherein the pipeline includes a plurality of natural language processing operations to be executed in a predefined order to generate the tags; and providing the pipeline to generate the tags for interpreting the input content.
10. The method of claim 9, wherein the method further comprises, upon generating the pipeline, creating an optimized pipeline using the dependency graph merging a set of potentially mergeable operations in the series of natural language operations.
11. The method of claim 10, wherein creating the optimized pipeline comprises: identifying a set of potentially mergeable operations in the series
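Claims 5 and 8 describe retraining a modified tag together with its depender tags, to any level of nesting, in parallel where the dependency graph permits and always after the tags they depend on. A minimal sketch of that scheduling is shown here, assuming a hypothetical `DEPENDEES` registry and hypothetical helpers `affected_tags` and `retraining_batches`; the actual retraining, training-content reprocessing, and quality evaluation steps of claim 7 are not modeled. Each batch holds tags that could be retrained concurrently.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: each tag maps to the tags (dependees) it needs as input.
DEPENDEES = {
    "tokens": [],
    "pos": ["tokens"],
    "person_name": ["pos"],
    "date": ["tokens"],
    "meeting_request": ["person_name", "date"],
}


def affected_tags(modified, registry):
    """Return the modified tag plus every depender, to any nesting level."""
    affected = {modified}
    changed = True
    while changed:
        changed = False
        for tag, deps in registry.items():
            if tag not in affected and affected.intersection(deps):
                affected.add(tag)
                changed = True
    return affected


def retraining_batches(modified, registry):
    """Group affected tags into batches; tags in one batch can run in parallel."""
    affected = affected_tags(modified, registry)
    # Keep only edges between affected tags; untouched dependees need no retraining.
    graph = {t: {d for d in registry[t] if d in affected} for t in affected}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tags whose dependees are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = retraining_batches("tokens", DEPENDEES)
print(batches)
```

In a real system each batch would likely be handed to a worker pool; the sketch only computes the order in which that could safely happen.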
Semantic analysis · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
Recognition of textual entities · CPC title
Dictionaries · CPC title
Machine learning · CPC title