Natural language processing matrices

US11113469B2 · US · B2

Patent metadata
Field · Value
Publication number · US-11113469-B2
Application number · US-201916366628-A
Country · US
Kind code · B2
Filing date · Mar 27, 2019
Priority date · Mar 27, 2019
Publication date · Sep 7, 2021
Grant date · Sep 7, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.
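
The abstract's matrix structure can be illustrated with a minimal sketch. It assumes the arrangement of claim 2 (columns follow token order, rows follow dependency levels) and assumes a dependency parse is already available as a list of head indices. The function names `dependency_levels` and `phrase_matrix`, the sample phrase, and its head indices are hypothetical and do not come from the patent.

```python
from typing import List, Optional

def dependency_levels(heads: List[Optional[int]]) -> List[int]:
    """Depth of each token in the dependency tree (root = level 0).

    Assumption: heads[i] is the index of token i's head, None for the root,
    and the heads form a tree (no cycles).
    """
    levels = []
    for i in range(len(heads)):
        depth, node = 0, i
        while heads[node] is not None:
            node = heads[node]
            depth += 1
        levels.append(depth)
    return levels

def phrase_matrix(tokens: List[str],
                  heads: List[Optional[int]]) -> List[List[Optional[str]]]:
    """Rows are dependency levels, columns are token positions."""
    if not tokens:
        return []
    levels = dependency_levels(heads)
    matrix = [[None] * len(tokens) for _ in range(max(levels) + 1)]
    for col, (token, level) in enumerate(zip(tokens, levels)):
        matrix[level][col] = token  # exactly one filled cell per column
    return matrix

# Hypothetical example phrase with a hand-written parse.
tokens = ["The", "patient", "denies", "chest", "pain"]
heads = [1, 2, None, 4, 2]  # "denies" is the root
matrix = phrase_matrix(tokens, heads)
for row in matrix:
    print(row)
```

In this sketch each column holds a single token, which mirrors the single-cell-per-column property described in claims 6 and 7. Reading a row left to right gives the tokens at one dependency level in phrase order.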

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for organizing tokens of a phrase of a corpus of phrases, the method comprising: receiving a phrase that includes a plurality of tokens in a natural language format, wherein the plurality of tokens includes words of the phrase; determining a plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase; and generating, for the phrase, a matrix structure that utilizes a plurality of rows and a plurality of columns to store data of the phrase, wherein the plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

2. The computer-implemented method of claim 1, wherein the plurality of columns indicates the order of tokens of the plurality of tokens and the plurality of rows indicates the levels of the plurality of levels.

3. The computer-implemented method of claim 1, further comprising determining, using natural language processing (NLP) techniques, that a token of the plurality of tokens is a trigger that changes a meaning of other tokens of the plurality of tokens.

4. The computer-implemented method of claim 3, further comprising determining, using NLP techniques, a span of tokens of the plurality of tokens for which a meaning is changed by the trigger.

5. The computer-implemented method of claim 1, wherein each token of the plurality of tokens is in a unique cell of the matrix structure.

6. The computer-implemented method of claim 1, wherein each column of the matrix structure includes only a single cell of data on the phrase.

7. The computer-implemented method of claim 1, wherein the matrix structure is created such that each column stores a single token of the plurality of tokens.

8. The computer-implemented method of claim 1, wherein the matrix structure indicates syntactical information on each token of the phrase.

9. A computer-implemented method for organizing tokens of a phrase of a corpus of phrases, the method comprising: receiving a phrase that includes a plurality of tokens in a natural language format, wherein the plurality of tokens includes words of the phrase; generating a parse tree data structure for the phrase, the parse tree data structure includes a plurality of levels relating to dependencies within the phrase; and generating, for the phrase using the parse tree data structure, a matrix structure that utilizes a plurality of rows and a plurality of columns to store data of the phrase, wherein the plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

10. A system comprising: a processor; and a memory in communication with the processor, the memory containing program instructions that, when executed by the processor, are configured to cause the processor to: receive a phrase that includes a plurality of tokens in a natural language format, wherein the plurality of tokens includes words of the phrase; determine a plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase; and generate, for the phrase, a matrix structure that utilizes a plurality of rows and a plurality of columns to store data of the phrase, wherein the plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

11. The system of claim 10, wherein the plurality of columns indicates the order of tokens of the plurality of tokens and the plurality of rows indicates the levels of the plurality of levels.

12. The system of claim 10, the memory further comprising instructions that, when executed by the processor, cause the processor to determine, using natural language processing (NLP) techniques, that a token of the plurality of tokens is a trigger that changes a meaning of other tokens of the plurality of tokens.

13. The system of claim 12, the memory further comprising instructions that, when executed by the processor, cause the processor to determine, using NLP techniques, a span of tokens of the plurality of tokens for which a meaning is changed by the trigger.

14. The system of claim 10, wherein each token of the plurality of tokens is in a unique cell of the matrix structure.

15. The system of claim 10, wherein each column of the matrix structure includes only a single cell of data on the phrase.

16. The system of claim 10, wherein the matrix structure is created such that each column stores a single token of the plurality of tokens.

17. The system of claim 10, wherein the matrix structure indicates syntactical information on each token of the phrase.

18. A computer program product, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: receive a phrase that includes a plurality of tokens in a natural language format, wherein the plurality of tokens includes words of the phrase; determine a plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase; and generate, for the phrase, a matrix structure that utilizes a plurality of rows and a plurality of columns to store data of the phrase, wherein the plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

19. The computer program product of claim 18, wherein the plurality of columns indicates the order of tokens of the plurality of tokens and the plurality of rows indicates the levels of the plurality of levels.

20. The computer program product of claim 18, the computer readable storage medium further comprising program instructions that, when executed by the computer, cause the computer to determine, using natural language processing (NLP) techniques, that a token of the plurality of tokens is a trigger that changes a meaning of other tokens of the plurality of tokens.

21. The computer program product of claim 20, the computer readable storage medium further comprising program instructions that, when executed by the computer, cause the computer to determine, using NLP techniques, a span of tokens of the plurality of tokens for which a meaning is changed by the trigger.

22. The computer program product of claim 18, wherein each token of the plurality of tokens is in a unique cell of the matrix structure.

23. The computer program product of claim 18, wherein each column of the matrix structure includes only a single cell of data on the phrase.

24. The computer program product of claim 18, wherein the matrix structure is created such that each column stores a single token of the plurality of tokens.

25. The computer program product of claim 18, wherein the matrix structure indicates syntactical information on each token of the phrase.
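
The trigger and span determinations in the dependent claims (a token that changes the meaning of other tokens, and the tokens it affects) are stated only as using "NLP techniques". As one possible reading, the sketch below assumes a small hypothetical negation lexicon and treats the span as the tokens that follow the trigger and descend from it in the dependency tree. The names `NEGATION_TRIGGERS`, `find_triggers`, and `trigger_span`, and the sample parse, are illustrative assumptions and not the patent's method.

```python
from typing import List, Optional

# Hypothetical lexicon; a real system would likely use a trained model
# or a curated domain dictionary instead.
NEGATION_TRIGGERS = {"denies", "no", "not", "without"}

def find_triggers(tokens: List[str]) -> List[int]:
    """Positions of tokens that appear in the assumed trigger lexicon."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in NEGATION_TRIGGERS]

def trigger_span(heads: List[Optional[int]], trigger: int) -> List[int]:
    """Positions after the trigger whose chain of heads reaches the trigger.

    Assumption: heads[i] is the index of token i's head, None for the root.
    """
    span = []
    for i in range(trigger + 1, len(heads)):
        node = heads[i]
        while node is not None and node != trigger:
            node = heads[node]
        if node == trigger:
            span.append(i)
    return span

# Hypothetical example phrase with a hand-written parse.
tokens = ["The", "patient", "denies", "chest", "pain"]
heads = [1, 2, None, 4, 2]

for t in find_triggers(tokens):
    affected = [tokens[i] for i in trigger_span(heads, t)]
    print(tokens[t], "->", affected)
```

Because the span is expressed as token positions, it maps directly onto columns of a matrix whose columns follow token order.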

Assignees

Inventors

Classifications

  • Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • G06F40/211 · Primary

    Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • G06F40/284 · Primary

    Lexical analysis, e.g. tokenisation or collocates · CPC title

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11113469B2 cover?
A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/211. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 7, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).