Performing fine-grained question type classification

US11520762B2 · US · B2

Patent metadata
Publication number: US-11520762-B2
Application number: US-201916713776-A
Country: US
Kind code: B2
Filing date: Dec 13, 2019
Priority date: Dec 13, 2019
Publication date: Dec 6, 2022
Grant date: Dec 6, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.
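As a rough illustration of the scoring step described in the abstract, the sketch below multiplies a question vector by a matrix of type vectors and ranks the resulting scores. It is a minimal sketch under stated assumptions, not the patented implementation: the type names, the vector values, the three-dimensional space, the softmax normalization, and the function names `score_types`, `softmax`, and `rank_types` are all hypothetical. The question vector stands in for the output of a trained encoder, which is not modeled here.

```python
import math

# Hypothetical type vectors: one row per ontology type, in a shared 3-d space.
# In the patent these are derived from the ontology; the numbers here are made up.
TYPE_NAMES = ["Person", "Politician", "City", "Country"]
TYPE_MATRIX = [
    [0.9, 0.1, 0.0],  # Person
    [0.8, 0.3, 0.1],  # Politician (placed near Person)
    [0.0, 0.2, 0.9],  # City
    [0.1, 0.1, 0.8],  # Country (placed near City)
]


def score_types(question_vec, type_matrix):
    """Multiply the type matrix by the question vector: one raw score per type."""
    return [sum(t * q for t, q in zip(row, question_vec)) for row in type_matrix]


def softmax(scores):
    """Turn raw scores into probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_types(question_vec, type_matrix=TYPE_MATRIX, names=TYPE_NAMES):
    """Return (type, probability) pairs sorted from most to least likely."""
    probs = softmax(score_types(question_vec, type_matrix))
    return sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)


# Assumed stand-in for the encoder output of some input question.
question_vec = [1.0, 0.4, 0.0]
ranked = rank_types(question_vec)
for name, prob in ranked:
    print(f"{name}: {prob:.3f}")
```

Because similar types sit close together in the vector space, a question that scores highly for one type also tends to score highly for its neighbors, which is what makes a ranked list of fine-grained types useful.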

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: converting, by a neural network, an input question into a vector form using trained word embeddings; constructing, by the neural network, a type similarity matrix using a predetermined ontology; performing, by the neural network, matrix multiplication between the input question in vector form and the type similarity matrix to determine a plurality of possible types for the input question; and scoring, by the neural network, each of the plurality of possible types for the input question to create a ranked list of possible types for the input question, wherein the predetermined ontology includes a plurality of labeled nodes representing entities and their associated types, wherein the labeled nodes within the predetermined ontology are restructured and reformatted into a series of vectors in Euclidean vector space to construct the type similarity matrix, wherein each vector within the series of vectors conveys a predetermined type within the predetermined ontology, wherein a location of each vector within the Euclidean vector space is used to determine a similarity score between the associated types of the labeled nodes within the predetermined ontology, wherein the neural network performs matrix multiplication between the input question in vector form and the type similarity matrix to determine a plurality of possible types for the input question, and wherein the plurality of possible types for the input question is refined utilizing a list of potential types for training data derived from the predetermined ontology.

2. The computer-implemented method of claim 1, wherein the input question is converted into a vector by a bidirectional long short-term memory (LSTM) and a combination layer.

3. The computer-implemented method of claim 1, wherein: the input question includes a text string, and the input question is converted into the vector form by a recurrent neural network (RNN).

4. The computer-implemented method of claim 1, wherein each of a plurality of vectors are created within a Euclidean vector space to represent an associated type within the predetermined ontology, where a proximity of the vectors within the Euclidean vector space is associated with a similarity of the vectors.

5. The computer-implemented method of claim 1, wherein the type similarity matrix is created utilizing a low rank decomposition method such as SVD on top of a co-occurrence matrix.

6. The computer-implemented method of claim 1, wherein: the neural network is trained utilizing a plurality of question/answer pairs, wherein each of the question/answer pairs includes a question string and an associated answer string, and for each of the plurality of question/answer pairs, a type of the associated answer string is determined from the predetermined ontology and is assigned to the question string as its label to create a list of question/type pairs.

7. The computer-implemented method of claim 1, wherein: the input question includes a text string that is converted into the vector form by a recurrent neural network (RNN).

8. The computer-implemented method of claim 1, further comprising: encoding a difference between a vector of possible types for the input question as predicted by a system and a list of potential types as provided by training data by utilizing a loss function such as a weighted negative log-likelihood (NLL) operation; and minimizing the difference during a training phase of a neural network which modifies parameters of the neural network.

9. The computer-implemented method of claim 8, further comprising back-propagating a loss value for the input question through the neural network.

10. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions comprising instructions configured to cause one or more processors to perform a method comprising: converting, by a neural network utilizing the one or more processors, an input question into a vector form using trained word embeddings; constructing, by the neural network utilizing the one or more processors, a type similarity matrix using a predetermined ontology; performing, by the neural network utilizing the one or more processors, matrix multiplication between the input question in vector form and the type similarity matrix to determine a plurality of possible types for the input question; and scoring, by the neural network utilizing the one or more processors, each of the plurality of possible types for the input question to create a ranked list of possible types for the input question, wherein the predetermined ontology includes a plurality of labeled nodes representing entities and their associated types, wherein the labeled nodes within the predetermined ontology are restructured and reformatted into a series of vectors in Euclidean vector space to construct the type similarity matrix, wherein each vector within the series of vectors conveys a predetermined type within the predetermined ontology, wherein a location of each vector within the Euclidean vector space is used to determine a similarity score between the associated types of the labeled nodes within the predetermined ontology, wherein the neural network performs matrix multiplication between the input question in vector form and the type similarity matrix to determine a plurality of possible types for the input question, and wherein the plurality of possible types for the input question is refined utilizing a list of potential types for training data derived from the predetermined ontology.

11. The computer program product of claim 10, wherein the input question is converted into a vector by a bidirectional long short-term memory (LSTM) and a combination layer.

12. The computer program product of claim 10, wherein: the input question includes a text string, and the input question is converted into the vector form by a recurrent neural network (RNN).

13. The computer program product of claim 10, wherein each of a plurality of vectors are created within a Euclidean vector space to represent an associated type within the predetermined ontology, where a proximity of the vectors within the Euclidean vector space is associated with a similarity of the vectors.

14. The computer program product of claim 10, wherein the type similarity matrix is created utilizing a low rank decomposition method such as SVD on top of a co-occurrence matrix.

15. The computer program product of claim 12, wherein: the neural network is trained utilizing a plurality of question/answer pairs, wherein each of the question/answer pairs includes a question string and an associated answer string, and for each of the plurality of question/answer pairs, a type of the associated answer string is determined from the predetermined ontology and is assigned to the question string as its label to create a list of question/type pairs.

16. A system, comprising: a processor; and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, the logic being configured to: convert, by a neural network, an input question into a vector form using trained word embeddings; construct, by the neural network, a type similarity matrix using a predetermined ontology; perform, by the neural network, matrix multiplication between the input question in vector form and the type similarity matrix to determine a plurality of possible types for
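The training steps in claims 6 and 8 can be sketched in a few lines: question/answer pairs are turned into question/type pairs by looking up each answer in the ontology, and a weighted negative log-likelihood measures how far predicted type probabilities are from those labels. This is a hedged sketch only. The toy ontology, the sample questions, the per-type weights, and the function names `make_question_type_pairs` and `weighted_nll` are assumptions for illustration; the claims do not specify a weighting scheme, and the back-propagation step of claim 9 is not modeled.

```python
import math

# Hypothetical ontology: entity -> associated types. Real ontologies are far larger.
ONTOLOGY = {
    "Barack Obama": ["Person", "Politician"],
    "Paris": ["City"],
}
TYPES = ["Person", "Politician", "City"]


def make_question_type_pairs(qa_pairs, ontology):
    """Label each question with the ontology types of its answer string."""
    pairs = []
    for question, answer in qa_pairs:
        types = ontology.get(answer)
        if types:  # skip answers the ontology does not cover
            pairs.append((question, types))
    return pairs


def weighted_nll(probs, gold_types, type_names, weights):
    """Weighted negative log-likelihood over the gold types of one question.

    probs: predicted probability per type, aligned with type_names.
    weights: per-type weight (an assumed scheme, not taken from the patent).
    """
    loss = 0.0
    for name, p in zip(type_names, probs):
        if name in gold_types:
            loss -= weights[name] * math.log(max(p, 1e-12))  # clamp avoids log(0)
    return loss


qa_pairs = [
    ("Who won the 2008 US presidential election?", "Barack Obama"),
    ("What is the capital of France?", "Paris"),
    ("What is the tallest mountain?", "Mount Everest"),  # not in the toy ontology
]
pairs = make_question_type_pairs(qa_pairs, ONTOLOGY)

# Assumed weights: the coarse type "Person" counts less than finer types.
weights = {"Person": 0.5, "Politician": 1.0, "City": 1.0}
good = weighted_nll([0.45, 0.45, 0.10], pairs[0][1], TYPES, weights)
bad = weighted_nll([0.10, 0.10, 0.80], pairs[0][1], TYPES, weights)
print(f"loss for a good prediction: {good:.3f}, for a bad one: {bad:.3f}")
```

In a full system, the loss would be minimized during training by adjusting the network parameters, so that predictions placing probability on the labeled types yield a lower loss than predictions that do not.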

Assignees

IBM

Inventors

Classifications

  • G06F16/35 (Primary)

    Clustering; Classification · CPC title

  • Vectors, bitmaps or matrices · CPC title

  • Distances to prototypes · CPC title

  • Classification techniques · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11520762B2 cover?
A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/35. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 6, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).