Spurious-data-based detection related to malicious activity

US12306938B2 · US · B2

Patent metadata
Publication number: US-12306938-B2
Application number: US-202318170502-A
Country: US
Kind code: B2
Filing date: Feb 16, 2023
Priority date: Feb 16, 2023
Publication date: May 20, 2025
Grant date: May 20, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some aspects, a computing system may obtain a first dataset including a set of original data samples and a first set of spurious data samples. Based on a time period expiring, the computing system may replace the first set of spurious data samples in the first dataset with a second set of spurious data samples. The computing system may obtain an indication that a second dataset is available via a third-party computing device. Based on a determination that a subset of samples of the second dataset corresponds to the first set of spurious data samples, the computing system may determine a time window in which an incident occurred. As an example, the time window may be determined to correspond to a time before the first set of spurious data samples was replaced with the second set of spurious data samples.
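
The rotation-and-lookup idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Epoch` record, the `rotate` and `incident_window` helpers, and the string samples are all hypothetical names chosen for illustration, and "corresponds to" is simplified here to exact set containment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Epoch:
    """One rotation period: which spurious samples were live, and when."""
    spurious: frozenset
    start: datetime
    end: Optional[datetime]  # None while this set is still in the dataset


def rotate(history, new_spurious, now):
    """Close the current epoch (if any) and open one with fresh spurious samples."""
    if history:
        last = history[-1]
        history[-1] = Epoch(last.spurious, last.start, now)
    history.append(Epoch(frozenset(new_spurious), now, None))


def incident_window(history, leaked_samples):
    """Return (start, end) of the first epoch whose spurious set appears in the leak."""
    leaked = set(leaked_samples)
    for epoch in history:
        # Assumption: a match means every spurious sample of the epoch is present.
        if epoch.spurious and epoch.spurious <= leaked:
            return epoch.start, epoch.end
    return None


# Hypothetical usage: two rotation periods, then a leak containing the first set.
history = []
rotate(history, {"canary-a1", "canary-a2"}, datetime(2023, 1, 1))
rotate(history, {"canary-b1", "canary-b2"}, datetime(2023, 2, 1))
leak = ["alice", "canary-a1", "bob", "canary-a2"]
window = incident_window(history, leak)
```

In this sketch the leak contains the first spurious set, so `window` spans the period before that set was replaced. Shorter rotation periods would narrow the window at the cost of more frequent dataset rewrites.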

First claim

Opening claim text (preview).

What is claimed is:

1. A system for using spurious data samples in a dataset to determine a time window during which a malicious device caused a cybersecurity incident, the system comprising: one or more processors; and a non-transitory, computer readable medium having instructions recorded thereon that, when executed by the one or more processors, cause operations comprising: obtaining a first dataset comprising a set of original data samples and a first set of spurious data samples, wherein spurious data samples of the first set of spurious data samples are stored at locations, identifiable by a key, within the first dataset, wherein the first set of spurious data samples are configured to decrease accuracy of a machine learning model by more than a threshold percentage amount; based on a time period expiring, replacing the first set of spurious data samples in the first dataset with a second set of spurious data samples; obtaining an indication that a second dataset is available via a third-party computing device; determining that a subset of samples of the second dataset match the first set of spurious data samples; based on the subset of samples of the second dataset matching the first set of spurious data samples determining a time window in which a cybersecurity incident occurred, wherein the time window corresponds to a time before the first set of spurious data samples were replaced with the second set of spurious data samples; and outputting an indication of the time window.

2. The system of claim 1, wherein replacing the first set of spurious data samples in the first dataset with the second set of spurious data samples comprises: determining a first percentage corresponding to a number of data samples in the first dataset that belong to the first set of spurious data samples; determining a second percentage of the first dataset; generating the second set of spurious data samples, wherein the number of data samples in the second set of spurious data samples corresponds to the second percentage; and based on generating the second set of spurious data samples, replacing the first set of spurious data samples with the second set of spurious data samples.

3. The system of claim 1, wherein the instructions, when executed, cause operations further comprising: determining a setting of a computing device associated with the cybersecurity incident, wherein the setting was active during the time window; and based on the setting, generating a recommendation for a modified setting, wherein the modified setting is predicted to prevent the cybersecurity incident from repeating.

4. The system of claim 1, wherein the instructions, when executed, cause operations further comprising: determining a software version of software associated with the cybersecurity incident, wherein the software version of the software was installed during the time window; and based on the software version, generating a recommendation.

5. A method for using spurious data samples in a dataset to determine a time window during which a malicious actor caused a cybersecurity incident, the method comprising: obtaining a first dataset comprising a set of original data samples and a first set of spurious data samples; based on a time period expiring, replacing the first set of spurious data samples in the first dataset with a second set of spurious data samples; obtaining an indication that a second dataset is available via a third-party computing device; based on determining that a subset of samples of the second dataset correspond to the first set of spurious data samples, determining a time window in which a cybersecurity incident occurred, wherein the time window corresponds to a time before the first set of spurious data samples were replaced with the second set of spurious data samples; and outputting an indication of the time window.

6. The method of claim 5, wherein replacing the first set of spurious data samples in the first dataset with the second set of spurious data samples comprises: determining a first percentage corresponding to a number of data samples in the first dataset that belong to the first set of spurious data samples; determining a second percentage of the first dataset; generating the second set of spurious data samples, wherein the number of data samples in the second set of spurious data samples corresponds to the second percentage; and based on generating the second set of spurious data samples, replacing the first set of spurious data samples with the second set of spurious data samples.

7. The method of claim 5, further comprising: determining a setting of a computing device associated with the cybersecurity incident, wherein the setting was active during the time window; and based on the setting, generating a recommendation for a modified setting, wherein the modified setting is predicted to prevent the cybersecurity incident from repeating.

8. The method of claim 5, further comprising: determining a software version of software associated with the cybersecurity incident, wherein the software version of the software was installed during the time window; and based on the software version, generating a recommendation.

9. The method of claim 5, wherein determining that the subset of samples of the second dataset corresponds to the first set of spurious data samples comprises: generating a first hash of the subset of samples; and based on the first hash matching a second hash associated with the first set of spurious data samples, determining that the subset of samples corresponds to the first set of spurious data samples.

10. The method of claim 5, further comprising steps for generating a spurious data sample.

11. The method of claim 5, further comprising: based on replacing the first set of spurious data samples, storing an identifier associated with the second set of spurious data samples, wherein the identifier comprises an embedding of the second set of spurious data samples.

12. The method of claim 5, wherein replacing the first set of spurious data samples in the first dataset with the second set of spurious data samples comprises: determining a second key indicative of locations within the first dataset that are different from the locations of samples of the first set of spurious data samples within the first dataset; and based on the locations indicated by the second key, adding the second set of spurious data samples to the first dataset.

13. A non-transitory, computer-readable medium comprising instructions that when executed by one or more processors, cause operations comprising: obtaining a first dataset comprising a set of original data samples and a first set of spurious data samples; based on a time period expiring, replacing the first set of spurious data samples in the first dataset with a second set of spurious data samples; obtaining an indication that a second dataset is available via a third-party computing device; based on determining that a subset of samples of the second dataset correspond to the first set of spurious data samples determining a time window in which a cybersecurity incident occurred; and outputting an indication of the time window.

14. The medium of claim 13, wherein replacing the first set of spurious data samples in the first dataset with the second set of spurious data samples comprises: determining a first percentage corresponding to a number of data samples in the first dataset that belong to the first set of spurious data samples; determining a second percentage of the first dataset; generating the second set of spurious data
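
Claim 9 describes matching by comparing a hash of the candidate subset against a stored hash of the first spurious set. The following Python sketch shows one way that comparison could work. The claim does not specify a hash function or how the candidate subset is selected, so SHA-256, the per-sample digest filter, and the helper names `sample_digest`, `set_digest`, and `matches_spurious_set` are all assumptions made for illustration.

```python
import hashlib


def sample_digest(sample: str) -> str:
    """SHA-256 digest of one sample (hash choice is an assumption)."""
    return hashlib.sha256(sample.encode("utf-8")).hexdigest()


def set_digest(samples) -> str:
    """Order-independent digest of a set: hash the sorted per-sample digests."""
    h = hashlib.sha256()
    for d in sorted(sample_digest(s) for s in set(samples)):
        h.update(d.encode("ascii"))
    return h.hexdigest()


def matches_spurious_set(leaked, known_sample_digests, known_set_digest) -> bool:
    """Pick leaked samples with a known digest, then compare the subset hash."""
    candidates = {s for s in leaked if sample_digest(s) in known_sample_digests}
    # "First hash" = digest of the candidate subset; "second hash" = stored digest.
    return bool(candidates) and set_digest(candidates) == known_set_digest


# Hypothetical usage: digests recorded when the first spurious set was inserted.
first_set = ["canary-a1", "canary-a2"]
stored_sample_digests = {sample_digest(s) for s in first_set}
stored_set_digest = set_digest(first_set)

full_leak = ["alice", "canary-a2", "bob", "canary-a1"]
partial_leak = ["alice", "canary-a1"]
print(matches_spurious_set(full_leak, stored_sample_digests, stored_set_digest))
print(matches_spurious_set(partial_leak, stored_sample_digests, stored_set_digest))
```

With this design only digests need to be retained after a rotation, and a leak that contains just part of the spurious set does not match. A real system might relax that to a threshold on the fraction of matching samples.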

Assignees

Inventors

Classifications

  • G06F21/566 (Primary)

    Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title

  • Test or assess a computer or a system · CPC title

  • G06F21/554 (Primary)

    involving event detection and direct action · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12306938B2 cover?
In some aspects, a computing system may obtain a first dataset including a set of original data samples and a first set of spurious data samples. Based on a time period expiring, the computing system may replace the first set of spurious data samples in the first dataset with a second set of spurious data samples. The computing system may obtain an indication that a second dataset is available via …
Who is the assignee on this patent?
Capital One Services, LLC
What technology area does this patent fall under?
Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?
Publication date May 20, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).