Convolutional neural network (CNN)-based suggestions for anomaly input

US2019108439A1 · US · A1

Patent metadata
Publication number: US-2019108439-A1
Application number: US-201715726268-A
Country: US
Kind code: A1
Filing date: Oct 5, 2017
Priority date: Oct 5, 2017
Publication date: Apr 11, 2019
Grant date: not listed

Abstract

Official abstract text for this publication.

The technology disclosed determines one or more field values in a set of field values for a particular field in a fielded dataset that are similar to an input value using six similarity measures. A factor vector is generated per similarity measure and combined to form an input matrix. A convolutional neural network processes the input matrix to generate evaluation vectors. A fully-connected network evaluates the evaluation vectors to generate suggestion scalars for similarity to a particular input value. Thresholding is applied to the suggestion scalars to determine one or more suggestion candidates for the particular input value.
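
The flow in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`build_input_matrix`, `suggestion_scalars`, `suggest`), the four stand-in similarity measures (the semantic and soundex measures are omitted), the hand-set filter and fully-connected weights, the per-row application of the dense layer, and the 0.5 threshold are all hypothetical. In a real system the weights would be learned rather than hand-set.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def syntactic(a, b):
    """Stand-in syntactic measure: character-sequence similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def char_format(s):
    """Collapse a string to its format: letters -> 'a', digits -> '9'."""
    return "".join("a" if c.isalpha() else "9" if c.isdigit() else c for c in s)


def format_sim(a, b):
    """Stand-in character-by-character format measure."""
    return SequenceMatcher(None, char_format(a), char_format(b)).ratio()


def length_sim(a, b):
    """Stand-in field-length measure: ratio of shorter to longer length."""
    return min(len(a), len(b)) / max(len(a), len(b), 1)


def build_input_matrix(input_value, column):
    """Return (unique_values, matrix); rows are unique values, columns are measures."""
    counts = Counter(column)
    unique_values = sorted(counts)
    # One factor vector per measure, one scalar per unique field value.
    factor_vectors = [
        [syntactic(input_value, v) for v in unique_values],
        [format_sim(input_value, v) for v in unique_values],
        [length_sim(input_value, v) for v in unique_values],
        # Assumed stand-in for dataset frequency: relative frequency of v.
        [counts[v] / len(column) for v in unique_values],
    ]
    # Column-wise arrangement: transpose so factor vectors become columns.
    matrix = [list(row) for row in zip(*factor_vectors)]
    return unique_values, matrix


def suggestion_scalars(matrix, filters, fc_weights, fc_bias):
    """Row-wise filters (ReLU) followed by a tiny dense layer (sigmoid)."""
    scalars = []
    for row in matrix:
        evaluation = [max(0.0, sum(w * x for w, x in zip(f, row))) for f in filters]
        z = sum(w * e for w, e in zip(fc_weights, evaluation)) + fc_bias
        scalars.append(1.0 / (1.0 + math.exp(-z)))
    return scalars


def suggest(input_value, column, filters, fc_weights, fc_bias, threshold=0.5):
    """Keep the unique values whose suggestion scalar clears the threshold."""
    unique_values, matrix = build_input_matrix(input_value, column)
    scores = suggestion_scalars(matrix, filters, fc_weights, fc_bias)
    return [(v, s) for v, s in zip(unique_values, scores) if s >= threshold]


# Hand-set illustrative weights (a trained model would learn these).
FILTERS = [[1.0, 0.5, 0.5, 0.0], [0.0, 0.0, 0.0, 1.0]]
FC_WEIGHTS, FC_BIAS = [4.0, 0.5], -6.0

column = ["California", "California", "Texas", "Nevada"]
candidates = suggest("Californa", column, FILTERS, FC_WEIGHTS, FC_BIAS)
```

Here each filter spans one row of the input matrix, so every unique field value gets its own small evaluation vector and its own suggestion scalar. That layout is one reading of the column-wise arrangement and row-wise filters described in claims 3 and 4; the full description may differ in detail.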

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network-based suggestion method, including: determining that one or more field values in a set of field values are similar to an input value for a particular field in a fielded dataset, including comparing a particular input value to unique field values for the particular field by applying a plurality of similarity measures and generating a factor vector that has one scalar for each of the unique field values; evaluating the factor vector using convolution filters in a convolutional neural network (CNN) to generate evaluation vectors for similarity to the unique field values; further evaluating the evaluation vectors using a fully-connected (FC) neural network to produce suggestion scalars for similarity to the particular input value; and using the suggestion scalars to determine one or more suggestion candidates for the particular input value.

2. The method of claim 1, wherein the plurality of similarity measures include semantic similarity, syntactic similarity, soundex similarity, character-by-character format similarity, field length similarity, and dataset frequency similarity.

3. The method of claim 1, further including constructing an input to the CNN by column-wise arranging one or more factor vectors in an input matrix.

4. The method of claim 3, wherein the convolution filters apply row-wise on the input matrix.

5. The method of claim 1, further including: automatically constructing positive and negative examples for inclusion in a training dataset by: for a given linguistic similarity measure, determining a first set of similar field values from a vocabulary and determining a second set of dissimilar field values from the vocabulary; and randomly selecting some field values from the first and second sets as positive and negative examples respectively; iterating the determining and the selecting for a plurality of similarity measures; and storing the randomly selected field values for the plurality of similarity measures as the training dataset.

6. The method of claim 5, further including training the CNN and the FC neural network using the positive and negative examples in the training dataset.

7. The method of claim 1, further including: using at least one cost function to evaluate performance of the CNN and the FC neural network during training.

8. The method of claim 1, wherein the CNN is a one-layer CNN.

9. The method of claim 1, wherein the CNN is a two-layer CNN.

10. The method of claim 1, further including: determining which field values for a particular field in the fielded dataset are anomalous, including comparing a particular unique field value to other unique field values for the particular field by applying a plurality of similarity measures and generating a factor vector that has one scalar for each of the unique field values; evaluating the factor vector using the convolution filters in the CNN to generate evaluation vectors; further evaluating the evaluation vectors using the FC neural network to produce an anomaly scalar for the particular unique field value; and thresholding the anomaly scalar to determine whether the particular unique field value is anomalous.

11. A system including one or more processors coupled to memory, the memory loaded with computer instructions to provide neural network-based suggestions, the instructions, when executed on the processors, implement actions comprising: determining that one or more field values in a set of field values are similar to an input value for a particular field in a fielded dataset, including comparing a particular input value to unique field values for the particular field by applying a plurality of similarity measures and generating a factor vector that has one scalar for each of the unique field values; evaluating the factor vector using convolution filters in a convolutional neural network (CNN) to generate evaluation vectors for similarity to the unique field values; further evaluating the evaluation vectors using a fully-connected (FC) neural network to produce suggestion scalars for similarity to the particular input value; and using the suggestion scalars to determine one or more suggestion candidates for the particular input value.

12. The system of claim 11, wherein the plurality of similarity measures include semantic similarity, syntactic similarity, soundex similarity, character-by-character format similarity, field length similarity, and dataset frequency similarity.

13. The system of claim 11, further implementing actions comprising constructing an input to the CNN by column-wise arranging one or more factor vectors in an input matrix.

14. The system of claim 13, wherein the convolution filters apply row-wise on the input matrix.

15. The system of claim 11, further implementing actions comprising: automatically constructing positive and negative examples for inclusion in a training dataset by: for a given linguistic similarity measure, determining a first set of similar field values from a vocabulary and determining a second set of dissimilar field values from the vocabulary; and randomly selecting some field values from the first and second sets as positive and negative examples respectively; iterating the determining and the selecting for a plurality of similarity measures; and storing the randomly selected field values for the plurality of similarity measures as the training dataset.

16. The system of claim 15, further implementing actions comprising training the CNN and the FC neural network using the positive and negative examples in the training dataset.

17. The system of claim 11, further implementing actions comprising: using at least one cost function to evaluate performance of the CNN and the FC neural network during training.

18. The system of claim 11, wherein the CNN is a one-layer CNN.

19. The system of claim 11, further implementing actions comprising: determining which field values for a particular field in the fielded dataset are anomalous, including comparing a particular unique field value to other unique field values for the particular field by applying a plurality of similarity measures and generating a factor vector that has one scalar for each of the unique field values; evaluating the factor vector using the convolution filters in the CNN to generate evaluation vectors; further evaluating the evaluation vectors using the FC neural network to produce an anomaly scalar for the particular unique field value; and thresholding the anomaly scalar to determine whether the particular unique field value is anomalous.

20. A non-transitory computer readable storage medium impressed with computer program instructions to provide neural network-based suggestions, the instructions, when executed on a processor, implement a method comprising: determining that one or more field values in a set of field values are similar to an input value for a particular field in a fielded dataset, including comparing a particular input value to unique field values for the particular field by applying a plurality of similarity measures and generating a factor vector that has one scalar for each of the unique field values; evaluating the factor vector using convolution filters in a convolutional neural network (CNN) to generate evaluation vectors for similarity to the unique field values; further evaluating the evaluation vectors using a fu
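
Claims 5 and 15 describe building the training dataset automatically. The loop they outline might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the `anchor` value that similarity is measured against, the `hi` and `lo` cut-offs that define "similar" and "dissimilar", the `k` examples drawn per set, the two stand-in measures, and the tuple layout of the stored examples.

```python
import random
from difflib import SequenceMatcher


def syntactic(a, b):
    """Stand-in syntactic measure: character-sequence similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def length_sim(a, b):
    """Stand-in field-length measure: ratio of shorter to longer length."""
    return min(len(a), len(b)) / max(len(a), len(b), 1)


def build_training_examples(anchor, vocabulary, measures, k=2, hi=0.8, lo=0.3, seed=0):
    """Return a list of (anchor, value, measure_name, label) tuples.

    label is 1 for a positive (similar) example and 0 for a negative one.
    """
    rng = random.Random(seed)  # seeded so the random selection is repeatable
    others = [v for v in vocabulary if v != anchor]
    dataset = []
    # Iterate the determining and the selecting over every similarity measure.
    for name, measure in measures.items():
        # First set: values the measure scores as similar to the anchor.
        similar = [v for v in others if measure(anchor, v) >= hi]
        # Second set: values the measure scores as dissimilar.
        dissimilar = [v for v in others if measure(anchor, v) <= lo]
        # Randomly select some values from each set.
        for value in rng.sample(similar, min(k, len(similar))):
            dataset.append((anchor, value, name, 1))
        for value in rng.sample(dissimilar, min(k, len(dissimilar))):
            dataset.append((anchor, value, name, 0))
    return dataset


MEASURES = {"syntactic": syntactic, "length": length_sim}
vocabulary = ["California", "Californa", "Kalifornia", "Texas", "NY"]
training = build_training_examples("California", vocabulary, MEASURES, k=1)
```

Values that fall between the two cut-offs are left out of both sets, which keeps ambiguous pairs from being labeled either way. Whether the patent treats mid-range values this way is not stated in the claims.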

Assignees

Salesforce Com Inc

Inventors

Classifications

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • Activation functions · CPC title

  • Combinations of networks · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Sales lead analysis · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019108439A1 cover?
The technology disclosed determines one or more field values in a set of field values for a particular field in a fielded dataset that are similar to an input value using six similarity measures. A factor vector is generated per similarity measure and combined to form an input matrix. A convolutional neural network processes the input matrix to generate evaluation vectors. A fully-connected net…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 11, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).