Computation reduction using a decision tree classifier for faster neural transition-based parsing

US11816581B2 · US · B2

Patent metadata
Publication number: US-11816581-B2
Application number: US-202017014435-A
Country: US
Kind code: B2
Filing date: Sep 8, 2020
Priority date: Sep 8, 2020
Publication date: Nov 14, 2023
Grant date: Nov 14, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A fast neural transition-based parser. The fast neural transition-based parser includes a decision tree-based classifier and a state vector control loss function. The decision tree-based classifier is dynamically used to replace a multilayer perceptron in the fast neural transition-based parser, and the decision tree-based classifier increases speed of neural transition-based parsing. The state vector control loss function trains the fast neural transition-based parser, the state vector control loss function builds a vector space favorable for building a decision tree that is used for the decision tree-based classifier in the neural transition-based parser, and the state vector control loss function maintains accuracy of neural transition-based parsing while the decision tree-based classifier is used to increase the speed of the neural transition-based parsing.
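The "dynamic" replacement described in the abstract is spelled out in claims 1 and 2: the decision tree's prediction is applied unless one of two conditions sends the state to the multilayer perceptron. The sketch below is one possible reading, not the patented implementation. It assumes the Gini value and the sample count are taken from the training samples that reached the selected leaf, and that "Gini coefficient" means the usual Gini impurity. The names `Node`, `predict_action`, `max_gini`, and `min_samples`, and the threshold values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical tree node; a node with `counts` set is a leaf."""
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaf only: number of training state vectors per action class.
    counts: Optional[Dict[str, int]] = None


def gini(counts: Dict[str, int]) -> float:
    """Gini impurity of a leaf: 1 - sum of squared class fractions."""
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def find_leaf(root: Node, state: Sequence[float]) -> Node:
    """Walk axis-aligned splits until a leaf is reached."""
    node = root
    while node.counts is None:
        node = node.left if state[node.feature] <= node.threshold else node.right
    return node


def predict_action(
    root: Node,
    state: Sequence[float],
    mlp: Callable[[Sequence[float]], str],
    max_gini: float = 0.2,   # assumed threshold, not from the patent
    min_samples: int = 5,    # assumed threshold, not from the patent
) -> Tuple[str, str]:
    """Return (action, source). Fall back to the MLP if either condition is met."""
    leaf = find_leaf(root, state)
    n_samples = sum(leaf.counts.values())
    if gini(leaf.counts) > max_gini or n_samples < min_samples:
        return mlp(state), "mlp"
    # Neither condition is met: apply the tree's majority action.
    return max(leaf.counts, key=leaf.counts.get), "tree"


# Toy tree: a pure, well-populated leaf and a mixed, sparse leaf.
tree = Node(
    feature=0,
    threshold=0.5,
    left=Node(counts={"SHIFT": 40, "LEFT-ARC": 1}),
    right=Node(counts={"SHIFT": 2, "RIGHT-ARC": 2}),
)
fallback = lambda state: "RIGHT-ARC"  # stand-in for the real MLP

print(predict_action(tree, [0.1], fallback))
print(predict_action(tree, [0.9], fallback))
```

In this toy tree the left leaf is nearly pure and has many samples, so the tree answers directly. The right leaf is mixed and sparse, so the call is routed to the stand-in MLP.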

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for using a neural transition-based parser to parse a sentence, the method comprising: training, by a server, the neural transition-based parser by clustering state vectors, distributing centroids of the state vectors, gathering the state vectors in a same action class into a hyperrectangle, and determining, based on a set of action classes, an optimized set of trainable parameters of the neural transition-based parser; receiving, by the server, a vector representation of a state of parsing the sentence, the vector representation being in a vector space built by a state vector control loss function in training the neural transition-based parser; predicting, by the server, by using a decision tree-based classifier in the neural transition-based parser, a parsing action based on the vector representation; calculating, by the server, by using the decision tree-based classifier, a Gini coefficient and a number of samples, based on the vector representation; determining, by the server, whether either of two conditions is met, the two conditions being that the Gini coefficient is greater than a predetermined threshold of the Gini coefficient and the number of samples is less than a predetermined threshold of the number of samples; and in response to determining that neither of the two conditions is met, applying, by the server, the parsing action predicted by the decision tree-based classifier to the state of parsing the sentence by using the neural transition-based parser.

2. The computer-implemented method of claim 1, further comprising: in response to determining that either of the two conditions is met, using, by the server, a multilayer perceptron in the neural transition-based parser to predict the parsing action based on the vector representation; and applying, by the server, the parsing action predicted by the multilayer perceptron to the state of parsing the sentence by using the neural transition-based parser.

3. The computer-implemented method of claim 1, wherein the vector space is built by the state vector control loss function such that the state vectors in the same action class are clustered and the centroids of the state vectors are distributed in different action classes.

4. The computer-implemented method of claim 1, wherein the vector space is built by the state vector control loss function such that the vector space is for building a decision tree that is used for the decision tree-based classifier and the state vectors in the same action class are gathered into the hyperrectangle by using an Lp-norm and adjusting p.

5. The computer-implemented method of claim 1, wherein, with each of given sets of trainable parameters of neural networks in the neural transition-based parser, training the neural transition-based parser comprises: calculating, by the server, a centroid vector for an action class by averaging the state vectors in the action class; calculating, by the server, an intra-class distance loss for the action class by calculating an averaged Lp-norm of distances between the centroid vector and each of the state vectors in the action class; calculating, by the server, intra-class distance losses for respective action classes and a sum of the intra-class distance losses; calculating, by the server, an inter-class distance loss between a pair of action classes by considering an Lp-norm of a difference between centroid vectors of the pair of action classes; calculating, by the server, inter-class distance losses for respective pairs of action classes and a sum of the inter-class distance losses; calculating, by the server, an additional loss, which includes the sum of the intra-class distance losses and the sum of the inter-class distance losses; and calculating, by the server, a training loss of the neural transition-based parser, which includes the additional loss and a standard cross-entropy loss, wherein the standard cross-entropy loss is computed from action probabilities.

6. The computer-implemented method of claim 5, further comprising: determining, by the server, the optimized set of trainable parameters by minimizing the training loss.

7. A computer program product for using a neural transition-based parser to parse a sentence, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by one or more processors, the program instructions executable to: train, by a server, the neural transition-based parser by clustering state vectors, distributing centroids of the state vectors, gathering the state vectors in a same action class into a hyperrectangle, and determining, based on a set of action classes, an optimized set of trainable parameters of the neural transition-based parser; receive, by the server, a vector representation of a state of parsing the sentence, the vector representation being in a vector space built by a state vector control loss function in training the neural transition-based parser; predict, by the server, by using a decision tree-based classifier in the neural transition-based parser, a parsing action based on the vector representation; calculate, by the server, by using the decision tree-based classifier, a Gini coefficient and a number of samples, based on the vector representation; determine, by the server, whether either of two conditions is met, the two conditions being that the Gini coefficient is greater than a predetermined threshold of the Gini coefficient and the number of samples is less than a predetermined threshold of the number of samples; and in response to determining that neither of the two conditions is met, apply, by the server, the parsing action predicted by the decision tree-based classifier to the state of parsing the sentence by using the neural transition-based parser.

8. The computer program product of claim 7, further comprising the program instructions executable to: in response to determining that either of the two conditions is met, use, by the server, a multilayer perceptron in the neural transition-based parser to predict the parsing action based on the vector representation; and apply, by the server, the parsing action predicted by the multilayer perceptron to the state of parsing the sentence by using the neural transition-based parser.

9. The computer program product of claim 7, wherein the vector space is built by the state vector control loss function such that state vectors in the same action class are clustered and the centroids of the state vectors are distributed in different action classes.

10. The computer program product of claim 7, wherein the vector space is built by the state vector control loss function such that the vector space is for building a decision tree that is used for the decision tree-based classifier and the state vectors in the same action class are gathered into the hyperrectangle by using an Lp-norm and adjusting p.

11. The computer program product of claim 7, for training the neural transition-based parser with each of given sets of trainable parameters of neural networks in the neural transition-based parser, further comprising the program instructions executable to: calculate, by the server, a centroid vector for an action class by averaging the state vectors in the action class; calculate, by the server, an intra-class distance loss for the action class by calculating an averaged Lp-norm of distances between the centroid vector and each of the state vectors in the action class; calculate, by the server, intra-class distance losses for respective action classes and a sum of the intra-class d…
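The loss computation recited in claims 5 and 11 can be illustrated with a small pure-Python sketch. It follows the recited steps (per-class centroids, averaged Lp distance to the centroid, pairwise centroid terms, added to cross-entropy), but several details are assumptions because the claim text does not fix them. The inter-class term is written here as a hinge `max(0, margin - distance)`, the terms are summed without weights, and the cross-entropy is averaged over samples. The function names and the `margin` parameter are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def lp_norm(v, p):
    """Lp-norm of a vector given as a list of floats."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def centroid(vectors):
    """Centroid of an action class: the average of its state vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def state_vector_control_loss(states, actions, p=2.0, margin=1.0):
    """Additional loss: sum of intra-class and inter-class distance losses."""
    by_class = defaultdict(list)
    for s, a in zip(states, actions):
        by_class[a].append(s)
    centroids = {a: centroid(vs) for a, vs in by_class.items()}

    # Intra-class: averaged Lp distance between each vector and its centroid.
    intra = 0.0
    for a, vs in by_class.items():
        c = centroids[a]
        intra += sum(lp_norm([x - m for x, m in zip(v, c)], p) for v in vs) / len(vs)

    # Inter-class: assumed hinge on the Lp distance between centroid pairs,
    # so the penalty vanishes once centroids are at least `margin` apart.
    inter = 0.0
    for a, b in combinations(sorted(centroids), 2):
        diff = [x - y for x, y in zip(centroids[a], centroids[b])]
        inter += max(0.0, margin - lp_norm(diff, p))

    return intra + inter


def cross_entropy(probs, gold):
    """Mean negative log-probability of the gold action."""
    return -sum(math.log(pr[g]) for pr, g in zip(probs, gold)) / len(gold)


def training_loss(states, actions, probs, p=2.0, margin=1.0):
    """Training loss: additional loss plus standard cross-entropy."""
    return state_vector_control_loss(states, actions, p, margin) + cross_entropy(probs, actions)


# Toy data: two action classes, two state vectors each.
states = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
actions = ["SHIFT", "SHIFT", "REDUCE", "REDUCE"]
probs = [{"SHIFT": 0.5, "REDUCE": 0.5}] * 4

print(state_vector_control_loss(states, actions))
print(training_loss(states, actions, probs))
```

On the hyperrectangle wording in claims 4 and 10: as p grows, the Lp-norm approaches the max norm, whose level sets are axis-aligned boxes. That may be why adjusting p is tied to gathering a class into a hyperrectangle that axis-aligned tree splits can separate, though this reading is an interpretation and is not stated on this page.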

Assignees

Inventors

Classifications

  • Feedforward networks · CPC title

  • Supervised learning · CPC title

  • G06N5/01 · Primary

    Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11816581B2 cover?
A fast neural transition-based parser. The fast neural transition-based parser includes a decision tree-based classifier and a state vector control loss function. The decision tree-based classifier is dynamically used to replace a multilayer perceptron in the fast neural transition-based parser, and the decision tree-based classifier increases speed of neural transition-based parsing. The state…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N5/01. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 14, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).