Anomalous data detection in computer based reasoning and artificial intelligence systems

US11262742B2 · US · B2

Patent metadata
Publication number: US-11262742-B2
Application number: US-202016992842-A
Country: US
Kind code: B2
Filing date: Aug 13, 2020
Priority date: Apr 9, 2018
Publication date: Mar 1, 2022
Grant date: Mar 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided herein for creating well-balanced computer-based reasoning systems and using those to control systems. The techniques include receiving a request to determine whether to use one or more particular data elements, features, cases, etc. in a computer-based reasoning model (e.g., as data elements, cases or features are being added, or as part of pruning existing features or cases). Conviction measures are determined and inclusivity conditions are tested. The result of comparing the conviction measure can be used to determine whether to include or exclude the feature, case, etc. in the model and/or whether there are anomalies in the model. A controllable system may then be controlled using the computer-based reasoning model. Examples controllable systems include self-driving cars, image labeling systems, manufacturing and assembly controls, federated systems, smart voice controls, automated control of experiments, energy transfer systems, health care systems, cybersecurity systems, and the like.
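The include/exclude and pruning decisions described in the abstract can be illustrated with a minimal Python sketch. It assumes the conviction scores have already been computed elsewhere. The function names, the band-style inclusivity condition, and the rule of keeping the highest-conviction cases are assumptions made for illustration; the abstract does not specify them.

```python
def should_include(conviction, lower, upper):
    """Hypothetical inclusivity condition: accept a case whose conviction
    falls inside an assumed [lower, upper] band."""
    return lower <= conviction <= upper


def prune_to_size(cases, convictions, target_size):
    """Reduce a case list to target_size entries.

    Assumed ranking rule: keep the cases with the highest conviction.
    The original order of the surviving cases is preserved.
    """
    if target_size < 0:
        raise ValueError("target_size must be non-negative")
    if len(cases) != len(convictions):
        raise ValueError("cases and convictions must have the same length")
    if target_size >= len(cases):
        return list(cases)
    # Rank case indices from highest to lowest conviction.
    ranked = sorted(range(len(cases)), key=lambda i: convictions[i], reverse=True)
    keep = sorted(ranked[:target_size])
    return [cases[i] for i in keep]


if __name__ == "__main__":
    cases = ["a", "b", "c", "d"]
    convictions = [2.0, 0.1, 1.5, 0.9]
    print(prune_to_size(cases, convictions, 2))
    print(should_include(0.3, lower=0.5, upper=2.0))
```

A different ranking rule (for example, dropping the most redundant cases rather than the least familiar ones) would fit the same interface; only the sort key would change.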

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: training a computer-based reasoning model; receiving a request to determine whether one or more particular data elements in the computer-based reasoning model are anomalous; determining for each of the one or more particular data elements, one or more conviction scores, wherein determining one or more conviction scores for the one or more particular data elements comprises determining a familiarity conviction score for the one or more particular data elements and determining a distance contribution score for the one or more particular data elements; wherein: the familiarity conviction score is a measure of how much the one or more particular data elements distort a model calculated as a function of one or more measures of distribution similarity, and the distance contribution score is a locally weighted expected value of the distance from one point to its nearest neighbors calculated based on a function of similarity between the one or more particular data elements and neighboring data elements of the one or more particular data elements; determining whether the one or more conviction scores meet one or more anomalousness conditions; in response to determining that the one or more conviction scores meet the one or more anomalousness conditions, sending an alert to a second system that the one or more particular data elements in the computer-based reasoning model are anomalous; wherein determining whether the one or more conviction scores meet the anomalousness conditions comprises determining that the one or more particular data elements meet the anomalousness condition when the familiarity conviction score is beyond a first threshold and the distance contribution score is beyond a second threshold, wherein the method is performed on one or more computing devices.

2. The method of claim 1, further comprising: including the one or more particular data elements in the computer-based reasoning model when the one or more anomalousness conditions is not met; causing, with a control system, control of a controllable system with the computer-based reasoning model.

3. The method of claim 1, wherein determining whether the one or more conviction scores meet the anomalousness conditions comprises determining that the one or more particular data elements meet the one or more anomalousness conditions when the familiarity conviction score is below a first threshold and the distance contribution score is below a second threshold.

4. The method of claim 1, further comprising, in response to determining that the one or more conviction scores meet the one or more anomalousness conditions, excluding the one or more particular data elements in the computer-based reasoning model.

5. The method of claim 1, wherein receiving a request to determine whether to include one or more particular data elements comprises receiving a request to reduce the computer-based reasoning model to a particular size; and the method further comprises: determining a number of data elements to include in the computer-based reasoning model to reduce the computer-based reasoning model to a particular size; determining a subset of data elements to include, that includes the number of data elements, to include in the computer-based reasoning model based at least in part on the one or more conviction scores for data elements in the computer-based reasoning model; and including only the subset of data elements to include in the computer-based reasoning model, and excluding data elements from the computer-based reasoning model that are not in the subset of data elements to include.

6. The method of claim 1, further comprising: initially receiving the one or more particular data elements as part of training for the computer-based reasoning model; in response to determining that the one or more conviction scores meet the one or more anomalousness conditions, sending an indication to a trainer associated with the training for the computer-based reasoning model that training related to the one or more particular data elements is anomalous.

7. The method of claim 1, further comprising: receiving a request for an action to take in a current context associated with the one or more particular data elements; when the one or more anomalousness conditions is not met by the one or more conviction scores associated with the one or more particular data elements: determining the action to take based on comparing the current context to contexts associated with cases in the computer-based reasoning model; and responding to the request for the action to take with the determined action.

8. The method of claim 1, further comprising: receiving a request for an action to take in a current context associated with the one or more particular data elements; when the one or more anomalousness conditions is met by the one or more conviction scores associated with the one or more particular data elements: removing the one or more particular data elements associated with the one or more convictions scores that met the one or more anomalousness conditions.

9. A system for executing instructions, wherein said instructions are instructions which, when executed by one or more computing devices, cause performance of a process including: training a computer-based reasoning model; receiving a request to determine whether one or more particular data elements in the computer-based reasoning model are anomalous; determining for each of the one or more particular data elements, one or more conviction scores, wherein determining one or more conviction scores for the one or more particular data elements comprises determining a familiarity conviction score for the one or more particular data elements and determining a distance contribution score for the one or more particular data elements; wherein: the familiarity conviction score is a measure of how much the one or more particular data elements distort a model calculated as a function of one or more measures of distribution similarity, and the distance contribution score is a locally weighted expected value of the distance from one point to its nearest neighbors calculated based on a function of similarity between the one or more particular data elements and neighboring data elements of the one or more particular data elements; determining whether the one or more conviction scores meet one or more anomalousness conditions; in response to determining that the one or more conviction scores meet the one or more anomalousness conditions, sending an alert to a second system that the one or more particular data elements in the computer-based reasoning model are anomalous; wherein determining whether the one or more conviction scores meet the anomalousness conditions comprises determining that the one or more particular data elements meet the anomalousness condition when the familiarity conviction score is beyond a first threshold and the distance contribution score is beyond a second threshold, wherein the process is performed on one or more computing devices.

10. The system of claim 9, the process further comprising: including the one or more particular data elements in the computer-based reasoning model when the one or more anomalousness conditions is not met; causing, with a control system, control of a controllable system with the computer-based reasoning model.

11. The system of claim 9, wherein determining whether the one or more conviction scores meet the anomalousness conditions comprises determining that the one or more particular data elements meet the one or more anomalousness conditions when the familiarity conviction score
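
The two-score test recited in claim 1 can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the patented implementation: the distance contribution is taken as the unweighted mean distance to the k nearest neighbors, and the familiarity conviction is taken as the average KL divergence across cases divided by the KL divergence caused by replacing one case with the expected distance contribution of the rest. Claim 1 only says each score must be "beyond" its threshold; treating low familiarity conviction and large distance contribution as the anomalous side is also an assumption, as are all function and parameter names.

```python
import math


def distance_contributions(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbors.
    Uniform weights are an assumption; needs at least two points."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        result.append(sum(nearest) / len(nearest))
    return result


def normalize(values, eps=1e-12):
    """Turn non-negative values into a strictly positive distribution."""
    total = sum(values) + eps * len(values)
    return [(v + eps) / total for v in values]


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def familiarity_convictions(points, k):
    """Assumed formula: mean KL over all cases divided by the KL for this case.
    A low value means the case distorts the distribution more than average.
    Needs at least three points."""
    full = normalize(distance_contributions(points, k))
    kls = []
    for i in range(len(points)):
        rest = points[:i] + points[i + 1:]
        rest_dc = distance_contributions(rest, k)
        expected = sum(rest_dc) / len(rest_dc)
        # Put the expected contribution in place of the removed case.
        replaced = rest_dc[:i] + [expected] + rest_dc[i:]
        kls.append(kl_divergence(full, normalize(replaced)))
    mean_kl = sum(kls) / len(kls)
    return [mean_kl / max(kl, 1e-12) for kl in kls]


def find_anomalies(points, k, conviction_threshold, distance_threshold):
    """Indices whose familiarity conviction is below conviction_threshold
    AND whose distance contribution is above distance_threshold."""
    dc = distance_contributions(points, k)
    fc = familiarity_convictions(points, k)
    return [
        i for i in range(len(points))
        if fc[i] < conviction_threshold and dc[i] > distance_threshold
    ]


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
    print(find_anomalies(points, k=2, conviction_threshold=0.5, distance_threshold=5.0))
```

Requiring both conditions, as the claim does, means a point that is merely far from its neighbors but does not shift the overall distribution is not flagged. The brute-force loops are cubic in the number of cases and are only meant for small examples.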

Assignees

Diveplane Corp

Inventors

Classifications

  • Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

  • Probabilistic graphical models, e.g. probabilistic networks

  • Matching criteria, e.g. proximity measures

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11262742B2 cover?
Techniques are provided herein for creating well-balanced computer-based reasoning systems and using those to control systems. The techniques include receiving a request to determine whether to use one or more particular data elements, features, cases, etc. in a computer-based reasoning model (e.g., as data elements, cases or features are being added, or as part of pruning existing features or …
Who is the assignee on this patent?
Diveplane Corp
What technology area does this patent fall under?
Primary CPC classification G05B23/0281. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 1, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).