Incident matching with vector-based natural language processing

US10970491B2 · US · B2

Patent metadata
Publication number: US-10970491-B2
Application number: US-202016809197-A
Country: US
Kind code: B2
Filing date: Mar 4, 2020
Priority date: Mar 15, 2018
Publication date: Apr 6, 2021
Grant date: Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A database may contain a corpus of text strings, the text strings respectively associated with vector representations thereof, where each of the vector representations is an aggregation of vector representations of words in the associated text string. An artificial neural network (ANN) may have been trained with mappings between: (i) the words in the text strings, and (ii) for each respective word, one or more sub strings of the text strings in which the word appears. A server device may be configured to: receive an input text string; generate an input aggregate vector representation of the input text string by applying an encoder of the ANN to words in the input text string; compare the input aggregate vector representation to the vector representations; identify a relevant subset of the vector representations; and transmit the text strings that are associated with the relevant subset of the vector representations.
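The query-time flow in the abstract (encode words, aggregate, compare, return the closest text strings) can be sketched as follows. This is a minimal illustration, not the patented implementation: the three-dimensional `WORD_VECTORS` table is a hypothetical stand-in for the trained ANN encoder, summation is assumed as the aggregation, and cosine similarity with a top-k cutoff is assumed for the comparison step (the abstract itself does not fix either choice).

```python
import math

# Hypothetical word vectors standing in for the output of a trained encoder.
WORD_VECTORS = {
    "email": [0.9, 0.1, 0.0],
    "server": [0.7, 0.3, 0.1],
    "down": [0.2, 0.8, 0.1],
    "printer": [0.0, 0.2, 0.9],
    "jammed": [0.1, 0.3, 0.8],
}
DIMS = 3


def aggregate(text):
    """Sum the vectors of the known words in `text` (assumed aggregation)."""
    total = [0.0] * DIMS
    for word in text.lower().split():
        vec = WORD_VECTORS.get(word)
        if vec is not None:  # unknown words are skipped in this sketch
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match(query, index, top_k=1):
    """Return the `top_k` stored text strings most similar to `query`."""
    query_vec = aggregate(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Precompute and store one aggregate vector per corpus text string.
CORPUS = ["email server down", "printer jammed"]
INDEX = [(text, aggregate(text)) for text in CORPUS]

print(match("email down", INDEX))
```

Storing the aggregate vectors ahead of time means that only the input text string has to be encoded when a query arrives; the comparison then reduces to one similarity computation per stored vector.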

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a database containing a corpus of text strings; an artificial neural network (ANN) comprising an encoder; and a server device configured to: for each text string of the corpus of text strings: identify words of the text string and, for each respective word, identify one or more substrings of the text string that are related to the respective word; adjust weights of the ANN to associate each respective word with the one or more substrings that are related to the respective word; provide each of the words of the text string to the encoder of the ANN to generate a respective vector representation for each of the words; aggregate each respective vector representation of each of the words of the text string to generate an aggregate vector representation of the text string; and store the aggregate vector representation of the text string in the database and associate the aggregate vector representation with the text string within the database.

2. The system of claim 1, wherein the encoder comprises input layer of the ANN, wherein the encoder includes a respective input node for each unique word in the corpus of text strings.

3. The system of claim 1, wherein the ANN comprises a hidden layer including a predetermined number of nodes, wherein the predetermined number of nodes corresponds to a number of entries in each respective vector representation generated by the encoder.

4. The system of claim 3, wherein the predetermined number of nodes is 16 or more.

5. The system of claim 1, wherein the ANN is a feed-forward multilayer, a convolutional, a recurrent, or a recursive ANN.

6. The system of claim 1, wherein, after storing aggregate vector representations for the text strings of the corpus of text strings, the server device is configured to: receive, from a client device, an input text string; provide each of the words of the input text string as inputs to the encoder of the ANN to generate a respective vector representation for each of the words of the input text string; aggregate each respective vector representation of the words of the input text string to generate an aggregate vector representation of the input text string; compare the aggregate vector representation of the input text string to the stored aggregate vector representations, and based on the comparison, identify a relevant subset of the stored aggregate vector representations; and transmit, to the client device, matching text strings from the corpus of text strings, wherein the matching text strings are associated with the relevant subset of the stored aggregate vector representations.

7. The system of claim 6, wherein the input text string is a query of a table of the database that includes the corpus of text strings, and wherein, to transmit the text strings to the client device, the server device us configured to: transmit, to the client device, records of the table of the database that include the matching text strings.

8. The system of claim 6, wherein, to compare the aggregate vector representation of the input text string to the stored aggregate vector representations, the server device is configured to: calculate respective cosine similarities between the aggregate vector representation of the input text string and each of the stored aggregate vector representations; and identify the relevant subset of the stored aggregate vector representations based on the respective cosine similarities.

9. A method comprising: for each text string of a corpus of text strings stored in a database: identifying words of the text string and, for each respective word, identify one or more substrings of the text string that are related to the respective word; adjusting weights of an artificial neural network (ANN) to associate each respective word with the one or more substrings that are related to the respective word; providing each of the words of the text string to an encoder of the ANN to generate a respective vector representation for each of the words; aggregating each respective vector representation of each of the words of the text string to generate an aggregate vector representation of the text string; and storing the aggregate vector representation of the text string in the database and associating the aggregate vector representation with the text string within the database.

10. The method of claim 9, wherein the encoder comprises input layer of the ANN and a decoder comprises an output layer of the ANN, wherein both the encoder and the decoder include a respective node for each unique word in the corpus of text strings.

11. The method of claim 9, wherein adjusting the weights of the ANN comprises: performing at least one feed forward operation and one backpropagation operation to minimize a total error within the ANN with respect to the association between each respective word and the one or more substrings that are related to the respective word.

12. The method of claim 9, comprising, after storing aggregate vector representations for the text strings of the corpus of text strings: receiving, from a client device, an input text string; providing each of the words of the input text string as inputs to the encoder of the ANN to generate a respective vector representation for each of the words of the input text string; aggregating each respective vector representation of the words of the input text string to generate an aggregate vector representation of the input text string; calculating respective cosine similarities between the aggregate vector representation of the input text string and each of the stored aggregate vector representations; identifying a relevant subset of the stored aggregate vector representations based on the respective cosine similarities; and transmitting, to the client device, matching text strings from the corpus of text strings, wherein the matching text strings are associated with the relevant subset of the stored aggregate vector representations.

13. The method of claim 12, wherein the input text string is a query of a database table that includes the corpus of text strings, and wherein transmitting comprises: transmitting, to the client device, records of the table of the database that respectively include the matching text strings.

14. The method of claim 12, wherein identifying the relevant subset of the stored aggregate vector representations comprises: identifying, as the relevant subset, a predetermined number of stored aggregate vector representations that are associated with relatively higher cosine similarities.

15. One or more non-transitory, computer-readable media at least collectively storing instructions executable by a processor of a computer device, the instructions comprising instructions to: for each text string of a corpus of text strings stored in a database: identify words of the text string and, for each respective word, identify one or more substrings of the text string that are related to the respective word; adjust weights of an artificial neural network (ANN) to associate each respective word with the one or more substrings that are related to the respective word; provide each of the words of the text string to an encoder of the ANN to generate a respective vector representation for each of the words; aggregate each respective vector representation of each of the words of the text string to generate an aggregate vector representation of the text string; and store the aggregate vector representation of the text string in the database and associate the aggregate vector representation with the text
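The training side of the claims (one input node per unique word, a hidden layer whose width equals the vector length, and weights adjusted by feed-forward and backpropagation passes) can be illustrated with a small pure-Python network. This is a sketch under stated assumptions, not the claimed system: the "related substrings" are approximated by neighbouring words within a window, the output layer is a softmax trained with cross-entropy, and `context_pairs`, `train`, and all hyperparameters are hypothetical names and values chosen for illustration.

```python
import math
import random


def context_pairs(text, window=1):
    """Pair each word with its neighbours (assumed stand-in for related substrings)."""
    words = text.lower().split()
    pairs = []
    for i, word in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        pairs.extend((word, words[j]) for j in range(lo, hi) if j != i)
    return pairs


def train(corpus, hidden=16, epochs=200, lr=0.1, seed=0):
    """Train a one-hidden-layer network; return (encoder, per-epoch mean loss)."""
    rng = random.Random(seed)
    vocab = sorted({w for text in corpus for w in text.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    # One row of input weights per unique word; each row has `hidden` entries.
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    pairs = [(index[a], index[b]) for t in corpus for a, b in context_pairs(t)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for src, tgt in pairs:
            # Feed forward: a one-hot input selects row `src` as the hidden layer.
            h = w_in[src][:]
            scores = [sum(h[k] * w_out[k][j] for k in range(hidden)) for j in range(n)]
            peak = max(scores)
            exps = [math.exp(s - peak) for s in scores]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            total -= math.log(probs[tgt])
            # Backpropagation of the cross-entropy error through both layers.
            err = [p - (1.0 if j == tgt else 0.0) for j, p in enumerate(probs)]
            grad_h = [sum(err[j] * w_out[k][j] for j in range(n)) for k in range(hidden)]
            for k in range(hidden):
                for j in range(n):
                    w_out[k][j] -= lr * err[j] * h[k]
                w_in[src][k] -= lr * grad_h[k]
        losses.append(total / len(pairs))
    # The encoder maps each word to its row of input-to-hidden weights.
    encoder = {w: w_in[i] for w, i in index.items()}
    return encoder, losses


corpus = ["email server is down", "email server not responding", "printer is jammed"]
encoder, losses = train(corpus)
print(len(encoder["email"]), losses[0] > losses[-1])
```

The hidden width of 16 mirrors the lower bound recited in claim 4. After training, each word's vector is simply its row of input-to-hidden weights, so the decoder half of the network is not needed when new text strings are encoded.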

Assignees

Servicenow Inc

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Combinations of networks

  • Auto-encoder networks; Encoder-decoder networks

  • Learning methods

  • Supervised learning

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10970491B2 cover?
A database may contain a corpus of text strings, the text strings respectively associated with vector representations thereof, where each of the vector representations is an aggregation of vector representations of words in the associated text string. An artificial neural network (ANN) may have been trained with mappings between: (i) the words in the text strings, and (ii) for each respective w…
Who is the assignee on this patent?
Servicenow Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/3329. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 6, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).