Unsupervised learning of deep patterns for semantic parsing

US9292490B2 · US · B2

Patent metadata
Publication number: US-9292490-B2
Application number: US-201313968462-A
Country: US
Kind code: B2
Filing date: Aug 16, 2013
Priority date: Aug 16, 2013
Publication date: Mar 22, 2016
Grant date: Mar 22, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Using exemplary sentences, usage patterns and thematic roles ascribed in VerbNet to generate “deep pattern trees” for the exemplary sentences. Then, when an arbitrary natural language subject sentence is input, these deep pattern trees can be matched to the natural language subject sentence in order to assign thematic roles to at least some of the “grammatical portions” of the natural language subject sentence.
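The matching step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the nested-tuple tree encoding, the `LABEL:Role` notation for pattern nodes, and the function names `words` and `match` are all assumptions made for this example, and the matcher is greedy (no backtracking). `Agent` and `Theme` are thematic role names used in VerbNet.

```python
# Trees are nested tuples: (label, child, child, ...); a word is a plain string.
# A pattern node may carry a thematic role as "LABEL:Role" (hypothetical notation).

def words(tree):
    """Collect the words under a constituency tree node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

def match(pattern, tree):
    """Return {role: text} if `pattern` matches at the root of `tree`, else None."""
    if isinstance(tree, str):
        return None
    label, _, role = pattern[0].partition(":")
    if label != tree[0]:
        return None
    found = {}
    pos = 1
    # Each pattern child must match a later tree child, in left-to-right order.
    for p_child in pattern[1:]:
        sub = None
        while pos < len(tree):
            sub = match(p_child, tree[pos])
            pos += 1
            if sub is not None:
                break
        if sub is None:
            return None
        found.update(sub)
    if role:
        found[role] = " ".join(words(tree))
    return found

# Parse of "The cat chased a mouse" (hand-written for illustration).
sentence = ("S",
            ("NP", ("DT", "The"), ("NN", "cat")),
            ("VP", ("VBD", "chased"),
                   ("NP", ("DT", "a"), ("NN", "mouse"))))

# A deep pattern tree for an "NP V NP" usage pattern with roles attached.
pattern = ("S", ("NP:Agent",), ("VP", ("VBD",), ("NP:Theme",)))

print(match(pattern, sentence))
```

A pattern that does not fit the sentence, such as `("S", ("VP",), ("NP",))`, yields `None`, so a caller could try each learned pattern in turn and keep the first one that matches.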

First claim

Opening claim text (preview).

What is claimed is:

1. A method of unsupervised learning of deep patterns for semantic parsing, using an algorithm that operates over a VerbNet corpus, the method comprising: receiving a set of exemplary sentences, including a first exemplary sentence, from a language net; extracting, from VerbNet, a plurality of markup language files corresponding to a usage pattern P; parsing each of the exemplary sentences to yield a set of training constituency trees respectively corresponding to the exemplary sentences; performing maximal frequent subtree analysis on the set of training constituency trees to yield an unfiltered set of deep pattern trees; and filtering at least one irrelevant tree(s) from the set of deep pattern trees to obtain a filtered set of deep pattern trees, with the at least one irrelevant tree(s) do not contain a part of speech that is included in the usage pattern P; deleting redundant leaves from at least one of the following: the filtered set of deep pattern, and/or the unfiltered set of deep pattern trees; and matching nodes of each tree pattern, of the filtered set of deep tree patterns, with items of the usage pattern P using machine logic based rules to facilitate translation of the filtered set of deep pattern trees into annotation query language rules; At least the parsing step is performed by computer software running on computer hardware.

2. The method of claim 1 further comprising: responsive to the matching of the nodes, translating the filtered set of deep pattern trees into annotation query language (AQL) rules.

3. The method of claim 1 wherein: the language net organizes the exemplary sentences into a set of usage patterns; and the performing maximal subtree analysis step is performed separately for each sub-set of training constituency trees respectively corresponding to each usage pattern in the language net.

4. A computer program product, for unsupervised learning of deep patterns for semantic parsing, using an algorithm that operates over a VerbNet corpus, the computer program product comprising software stored on a software storage device, the software comprising: first program instructions programmed to extract, from VerbNet, a plurality of markup language files corresponding to a usage pattern P; second program instructions programmed to parse each of the exemplary sentences to yield a set of training constituency trees respectively corresponding to the exemplary sentences; third program instructions programmed to perform maximal frequent subtree analysis on the set of training constituency trees to yield an unfiltered set of deep pattern trees; and fourth program instructions programmed to filter at least one irrelevant tree(s) from the set of deep pattern trees to obtain a filtered set of deep pattern trees, with the at least one irrelevant tree(s) do not contain a part of speech that is included in the usage pattern P; fifth program instructions programmed to delete redundant leaves from at least one of the following: the filtered set of deep pattern, and/or the unfiltered set of deep pattern trees; and sixth program instructions programmed to match nodes of each tree pattern, of the filtered set of deep tree patterns, with items of the usage pattern P using machine logic based rules to facilitate translation of the filtered set of deep pattern trees into annotation query language (AQL) rules.

5. The product of claim 4 wherein the software further comprises: seventh program instructions programmed to, responsive to the matching of the nodes, translate the filtered set of deep pattern trees into annotation query language (AQL) rules.

6. The product of claim 4 wherein: the language net organizes the exemplary sentences into a set of usage patterns; and the third program instructions are further programmed to perform maximal subtree analysis separately for each sub-set of training constituency trees respectively corresponding to each usage pattern in the language net.

7. A computer system for unsupervised learning of deep patterns for semantic parsing, using an algorithm that operates over a VerbNet corpus, the computer system comprising: a processor(s) set; and a software storage device; wherein: the processor set is structured, located, connected and/or programmed to run software stored on the software storage device; and the software comprises: first program instructions programmed to extract, from VerbNet, a plurality of markup language files corresponding to a usage pattern P; second program instructions programmed to parse each of the exemplary sentences to yield a set of training constituency trees respectively corresponding to the exemplary sentences; third program instructions programmed to perform maximal frequent subtree analysis on the set of training constituency trees to yield an unfiltered set of deep pattern trees; and fourth program instructions programmed to filter at least one irrelevant tree(s) from the set of deep pattern trees to obtain a filtered set of deep pattern trees, with the at least one irrelevant tree(s) do not contain a part of speech that is included in the usage pattern P; fifth program instructions programmed to delete redundant leaves from at least one of the following: the filtered set of deep pattern, and/or the unfiltered set of deep pattern trees; and sixth program instructions programmed to match the nodes of each tree pattern, of the filtered set of deep tree patterns, with items of the usage pattern P using machine logic based rules to facilitate translation of the filtered set of deep pattern trees into annotation query language (AQL) rules.

8. The system of claim 7 wherein the software further comprises: Seventh program instructions, responsive to the matching of the nodes, translate the filtered set of deep pattern trees into annotation query language (AQL) rules.

9. The system of claim 7 wherein: the language net organizes the exemplary sentences into a set of usage patterns; and the third program instructions are further programmed to perform maximal subtree analysis separately for each sub-set of training constituency trees respectively corresponding to each usage pattern in the language net.
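Two of the claimed steps, filtering irrelevant trees and matching tree nodes to items of the usage pattern P, can be sketched as follows. This is an illustrative reading, not the patent's algorithm: it assumes a tree is irrelevant when it lacks a category named in P, that P is an ordered list of (category, role) items, and that Penn Treebank verb tags (`VB*`) map to the category `V`. The function names, the tuple encoding, and the alignment rule are hypothetical. The maximal frequent subtree mining and leaf-deletion steps are not shown.

```python
# Usage pattern P as ordered (category, thematic role) items; the verb has no role.
P = [("NP", "Agent"), ("V", None), ("NP", "Theme")]

def labels(tree):
    """All node labels of a nested-tuple tree (label, child, ...), in preorder."""
    out = [tree[0]]
    for child in tree[1:]:
        out.extend(labels(child))
    return out

def category(label):
    """Assumed mapping from Penn Treebank tags to the categories used in P."""
    return "V" if label.startswith("VB") else label

def is_relevant(tree, usage_pattern):
    """Keep a tree only if it contains every category that P names."""
    present = {category(label) for label in labels(tree)}
    return all(cat in present for cat, _ in usage_pattern)

def align(tree, usage_pattern):
    """Pair each item of P, in order, with the next matching node in preorder.

    Returns a list of (preorder index, node label, role), or None if some
    item of P cannot be placed.
    """
    nodes = labels(tree)
    pairs, pos = [], 0
    for cat, role in usage_pattern:
        while pos < len(nodes) and category(nodes[pos]) != cat:
            pos += 1
        if pos == len(nodes):
            return None
        pairs.append((pos, nodes[pos], role))
        pos += 1
    return pairs

# Candidate deep pattern trees (labels only; hand-written for illustration).
candidates = [
    ("S", ("NP",), ("VP", ("VBD",), ("NP",))),
    ("NP", ("DT",), ("NN",)),  # contains no verb, so it is filtered out
]

filtered = [t for t in candidates if is_relevant(t, P)]
for tree in filtered:
    print(align(tree, P))
```

An alignment like this records which node receives which role, which is the kind of information a later translation into rule form (AQL in claims 2, 5 and 8) would need; that translation is outside the scope of this sketch.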

Assignees

IBM

Inventors

Classifications

  • G06F40/30 · Primary

    Semantic analysis · CPC title

  • G06F40/205 · Primary

    Parsing · CPC title

  • Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9292490B2 cover?
Using exemplary sentences, usage patterns and thematic roles ascribed in VerbNet to generate “deep pattern trees” for the exemplary sentences. Then, when an arbitrary natural language subject sentence is input, these deep pattern trees can be matched to the natural language subject sentence in order to assign thematic roles to at least some of the “grammatical portions” of the natural language …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 22, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).