US11475222B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11475222-B2 |
| Application number | US-202016797430-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 21, 2020 |
| Priority date | Feb 21, 2020 |
| Publication date | Oct 18, 2022 |
| Grant date | Oct 18, 2022 |
Official abstract text for this publication.
A controller accesses an initial taxonomy for a domain comprising one or more existing terms for the domain identified in a hierarchical structure. The controller analyzes a corpus of documents for a domain to identify a selection of one or more documents with glossaries. The controller extracts, from the glossaries, one or more pairs each comprising a term and a definition. The controller attempts to map a respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of a respective definition of each of the one or more pairs to generate an updated taxonomy for the domain.
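The extraction step in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical glossary layout of one `Term: definition` entry per line; the patent text shown here does not specify a layout, and the names `ENTRY` and `extract_pairs` are illustrative only.

```python
import re

# Assumption: each glossary entry sits on one line as "Term: definition".
# Real glossaries vary (tables, bold terms, dashes), so this pattern is illustrative.
ENTRY = re.compile(r"^\s*([^:\n]{1,80}?)\s*:\s+(\S.*?)\s*$")


def extract_pairs(glossary_text):
    """Return (term, definition) pairs found in a glossary section."""
    pairs = []
    for line in glossary_text.splitlines():
        match = ENTRY.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


sample = """Glossary
Annuity: A contract that pays a fixed sum each year.
Premium: The payment made for an insurance policy.
"""
pairs = extract_pairs(sample)
```

Lines without a colon, such as the `Glossary` heading, are skipped, so only term and definition pairs reach the mapping step.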
Opening claim text (preview).
What is claimed is:

1. A method comprising:
   applying, by a computer, rule-based annotators and statistical annotators to automate document annotation;
   annotating, by the computer, a plurality of documents in a corpus such that the plurality of documents are recognizable by a machine;
   using, by the computer, annotated documents as a dataset in machine learning for building natural language processing models used in a question answering system in the computer;
   accessing, by the computer, an initial taxonomy for a domain comprising one or more existing terms for the domain identified in a hierarchical structure;
   analyzing, by the computer, the corpus of the plurality of documents for a domain to identify a selection of one or more documents with glossaries;
   extracting, by the computer, from the glossaries, one or more pairs each comprising a term and a definition;
   attempting to map, by the computer, a respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of a respective definition of each of the one or more pairs to generate an updated taxonomy for the domain;
   extracting, by the computer, a head noun phrase of the respective definition of a current entry from among the one or more pairs;
   evaluating, by the computer, whether the head noun phrase is present in the initial taxonomy;
   responsive to the head noun phrase being present in the initial taxonomy, mapping, by the computer, the respective term of the current entry to the initial taxonomy to generate the updated taxonomy;
   responsive to the head noun phrase not being present in the initial taxonomy, evaluating, by the computer, whether the head noun phrase is present in a particular definition from among the one or more pairs;
   responsive to evaluating the head noun phrase is present in the particular definition from among the one or more pairs, building, by the computer, a tiny taxonomy with the respective term of the current entry as a child node and another term paired with the particular definition as the parent node; and
   responsive to mapping the another term to the initial taxonomy to generate the updated taxonomy, mapping, by the computer system, the tiny taxonomy to the updated taxonomy.

2. The method according to claim 1, wherein accessing, by a computer, an initial taxonomy for a domain comprising one or more existing terms for the domain identified in a hierarchical structure further comprises:
   accessing, by the computer, the initial taxonomy comprising the one or more existing terms for the domain identified in the hierarchical structure comprising a parent node and one or more levels of child nodes.

3. The method according to claim 1, wherein attempting to map a respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of a respective definition of each of the one or more pairs to generate an updated taxonomy for the domain further comprises:
   marking, by the computer, one or more selections of the one or more pairs that are related; and
   attempting to map, by the computer, the respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of the respective definition of each of the one or more pairs and the marked one or more selections of the one or more pairs that are related to generate the updated taxonomy for the domain.

4. The method according to claim 1, further comprising:
   responsive to the head noun phrase not being present in the initial taxonomy, evaluating, by the computer, whether a last word of the head noun phrase is present the initial taxonomy; and
   responsive to evaluating the last word of the head noun phrase is present in the initial taxonomy, mapping, by the computer, the respective term of the current entry to the updated taxonomy.

5. The method according to claim 1, further comprising:
   responsive to the head noun phrase not being present in the initial taxonomy, evaluating, by the computer, whether a see also term is present in a particular definition of another entry from among the one or more pairs;
   responsive to detecting the see also term in the particular definition of the another entry, attempting to map, by the computer system, the another entry to the initial taxonomy to generate the updated taxonomy; and
   responsive to mapping the another entry to the initial taxonomy to generate the updated taxonomy, mapping the current entry to a same node as the another entry in the updated taxonomy.

6. The method according to claim 1, further comprising:
   identifying, by the computer, a remainder collection of one or more unmapped pairs from among the plurality of pairs that are not mapped to generate the updated taxonomy; and
   clustering, by the computer, one or more clusters from among the one or more unmapped pairs based on the text of the respective definition of each of the one or more unmapped pairs into one or more groups of semantically similar terms.

7. The method according to claim 1, wherein attempting to map, by the computer, a respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of a respective definition of each of the one or more pairs to generate an updated taxonomy for the domain further comprises:
   identifying, by the computer, a remainder collection of one or more unmapped pairs from among the plurality of pairs that are not mapped to generate the updated taxonomy; and
   iteratively attempting to map, by the computer, the remainder collection of the one or more unmapped pairs to the updated taxonomy based on the text of the respective definition of the one or more unmapped pairs.

8. The method according to claim 6, further comprising:
   evaluating, by the computer, a top N terms from each of the one or more clusters;
   selecting, by the computer, a best match term from each selection of top N terms as a candidate concept label for the respective cluster from the one or more clusters; and
   automatically adding, by the computer, each candidate concept label to the initial taxonomy to generate the updated taxonomy.

9. A computer system comprising one or more processors, one or more computer-readable memories, one or more computer-readable storage devices, and program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising:
   program instruction to apply rule-based annotators and statistical annotators to automate document annotation;
   program instructions to annotate a plurality of documents in a corpus such that the plurality of documents are recognizable by a machine;
   program instructions to use annotated documents as a dataset in machine learning for building natural language processing models used in a question answering system in the computer system;
   program instructions to access an initial taxonomy for a domain comprising one or more existing terms for the domain identified in a hierarchical structure;
   program instructions to analyze the corpus of the plurality of documents for a domain to identify a selection of one or more documents with glossaries;
   program instructions to extract, from the glossaries, one or more pairs each comprising a term and a definition;
   program instructions to attempt to map a respective term of each of the one or more pairs into the initial taxonomy for the domain based on text of a respective definition of each of the one or more pairs to generate an updated taxonomy for the domain;
   program instructions to extract a head noun phrase of the respective definition of a current entry from among the one or more pairs;
   program instructions to
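The mapping cascade recited in claims 1 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `head_noun_phrase` is a naive word-boundary heuristic standing in for a real parser, the taxonomy is modelled as a hypothetical node-to-parent dictionary, the order in which the fallbacks are tried is a choice made for this sketch, and the sample entries are invented.

```python
import re

ARTICLES = {"a", "an", "the"}
# Assumption: these words end the head noun phrase; a real system would use a parser.
BOUNDARY = {"that", "which", "who", "of", "for", "in", "to", "with", "by", "used", "made"}


def head_noun_phrase(definition):
    """Naive heuristic: words after the leading article, up to a boundary word or punctuation."""
    words = []
    for token in re.findall(r"[a-z]+|[^a-z\s]", definition.lower()):
        if not token.isalpha() or token in BOUNDARY:
            break
        if not words and token in ARTICLES:
            continue
        words.append(token)
    return " ".join(words)


def map_pairs(taxonomy, pairs):
    """Return (updated taxonomy, unmapped terms). A taxonomy maps node -> parent."""
    updated = dict(taxonomy)
    tiny = {}      # child term -> parent term, held until the parent is mapped
    unmapped = []
    for term, definition in pairs:
        phrase = head_noun_phrase(definition)
        if not phrase:
            unmapped.append(term)
        elif phrase in taxonomy:                  # head noun phrase is an existing node
            updated[term] = phrase
        elif phrase.split()[-1] in taxonomy:      # last-word fallback (claim 4)
            updated[term] = phrase.split()[-1]
        else:
            # Look for another entry whose definition mentions the phrase (tiny taxonomy).
            parent = next((t for t, d in pairs if t != term and phrase in d.lower()), None)
            if parent is None:
                unmapped.append(term)
            else:
                tiny[term] = parent
    # Attach each tiny taxonomy once its parent term has been mapped.
    progress = True
    while progress:
        progress = False
        for child, parent in list(tiny.items()):
            if parent in updated:
                updated[child] = parent
                del tiny[child]
                progress = True
    unmapped.extend(tiny)
    return updated, unmapped


# Invented sample data for illustration.
initial = {"insurance": None, "contract": "insurance", "payment": "insurance"}
glossary = [
    ("deferred annuity", "An annuity product that begins payments at a later date."),
    ("annuity", "A contract, also called an annuity product, that pays a fixed sum each year."),
    ("term policy", "An insurance contract that covers one named event."),
    ("escrow", "A holding arrangement managed by a third party."),
]
updated, unmapped = map_pairs(initial, glossary)
```

In this sample, "annuity" maps directly under "contract", "term policy" maps through the last-word fallback, and "deferred annuity" is attached under "annuity" once that parent is placed. "escrow" stays in the remainder collection, which claims 6 and 7 handle by clustering or further mapping passes.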
Semantic analysis · CPC title
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Phrasal analysis, e.g. finite state techniques or chunking · CPC title
Thesauruses; Synonyms · CPC title
Grammatical analysis; Style critique · CPC title