Method and system for synthetic generation of time series data

US10664381B2 · US · B2

Patent metadata
Publication number: US-10664381-B2
Application number: US-201916454041-A
Country: US
Kind code: B2
Filing date: Jun 26, 2019
Priority date: Jul 6, 2018
Publication date: May 26, 2020
Grant date: May 26, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for generating synthetic data are disclosed. For example, a system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving a dataset that includes time series data having a plurality of dimensions and generating a transformed dataset by performing a first data transformation. The first data transformation may include a time-based data processing method. The operations may include generating a synthetic transformed-dataset by implementing a data model using the transformed dataset. The data model may be configured to generate synthetic transformed-data based on a relationship between data of at least two dimensions of the transformed dataset. The operations may include generating a synthetic dataset by performing a second data transformation on the synthetic transformed-dataset. The second data transformation may include an inverse of the first data transformation.
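
The three stages described in the abstract (transform, model, inverse transform) can be illustrated with a small sketch. This is not the patented implementation. It assumes row-to-row differencing as the time-based first transformation, and it swaps in a two-dimensional Gaussian fitted to the differences as a stand-in for the data model, chosen only because it captures a relationship (covariance) between two dimensions. All function names and the toy values are hypothetical.

```python
import math
import random


def difference(series):
    """First transformation (time-based): subtract each row from the next.

    Returns the first row (needed for the inverse) and the row-to-row deltas.
    """
    first = series[0]
    deltas = [
        [b - a for a, b in zip(prev, curr)]
        for prev, curr in zip(series, series[1:])
    ]
    return first, deltas


def undifference(first, deltas):
    """Second transformation: cumulative sum, the inverse of difference()."""
    rows = [list(first)]
    for delta in deltas:
        rows.append([x + d for x, d in zip(rows[-1], delta)])
    return rows


def fit_gaussian_2d(deltas):
    """Stand-in data model (an assumption): mean and covariance of 2-D deltas."""
    n = len(deltas)
    mx = sum(d[0] for d in deltas) / n
    my = sum(d[1] for d in deltas) / n
    sxx = sum((d[0] - mx) ** 2 for d in deltas) / n
    syy = sum((d[1] - my) ** 2 for d in deltas) / n
    sxy = sum((d[0] - mx) * (d[1] - my) for d in deltas) / n
    return mx, my, sxx, syy, sxy


def sample_deltas(params, count, rng):
    """Draw correlated synthetic deltas using a hand-written 2x2 Cholesky factor."""
    mx, my, sxx, syy, sxy = params
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(max(syy - l21 ** 2, 0.0))  # guard against rounding below zero
    out = []
    for _ in range(count):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        out.append([mx + l11 * z1, my + l21 * z1 + l22 * z2])
    return out


# Toy two-dimensional time series (hypothetical values).
original = [[10.0, 5.0], [11.0, 5.6], [13.0, 6.5], [12.5, 6.1], [14.0, 7.2]]

first, deltas = difference(original)            # transformed dataset
params = fit_gaussian_2d(deltas)                # "train" the stand-in model
rng = random.Random(0)                          # fixed seed for repeatability
synthetic_deltas = sample_deltas(params, len(deltas), rng)  # synthetic transformed data
synthetic = undifference(first, synthetic_deltas)           # synthetic dataset
```

The point of the sketch is the ordering: the model only ever sees transformed data, and the inverse transformation maps its output back into the original data space. A real system of the kind described would replace the Gaussian with a trained sequence model.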

First claim

Opening claim text (preview).

What is claimed is:

1. A system for generating synthetic data, comprising: one or more memory units storing instructions; and one or more processors that execute the instructions to perform operations comprising: receiving a dataset comprising time series data having a plurality of dimensions; generating a transformed dataset by performing a first data transformation on the dataset, the first data transformation comprising a time-based data processing operation; generating a first synthetic transformed dataset by implementing a data model using the transformed dataset, the data model being configured to generate synthetic transformed data based on a relationship between data of at least two dimensions of the transformed dataset; generating a second synthetic transformed dataset by performing a second data transformation on the first synthetic transformed dataset, the second data transformation comprising an inverse of the first data transformation; receiving a plurality of sample datasets having a plurality of respective dimensions; generating a plurality of transformed sample datasets corresponding to the sample datasets by performing the first data transformation on the sample datasets; and training the data model to generate synthetic transformed data based on the transformed sample datasets.

2. The system of claim 1, wherein: the operations further comprise generating the data model; and training the data model is based on generating the data model.

3. The system of claim 1, wherein the first data transformation comprises encoding the dataset.

4. The system of claim 3, wherein encoding the dataset comprises encoding a character as a number.

5. The system of claim 3, wherein encoding the dataset comprises implementing a natural language model to encode string data as numeric data.

6. The system of claim 3, wherein encoding the dataset comprises implementing an encoder model to reduce the number of dimensions of the dataset.

7. The system of claim 1, wherein the data model comprises a recurrent neural network-convolutional neural network (RNN-CNN) model.

8. The system of claim 1, wherein the data model comprises a Long Short Term Memory Convolutional Neural Network model.

9. The system of claim 1, wherein the data model comprises an attention network model.

10. The system of claim 1, wherein the relationship comprises a correlation between at least two dimensions of the dataset.

11. The system of claim 1, wherein the first data transformation comprises vector subtraction.

12. The system of claim 1, wherein the first data transformation comprises normalization.

13. The system of claim 1, wherein the first data transformation comprises applying a logarithmic function.

14. The system of claim 1, wherein the first data transformation comprises implementing a pooling operation.

15. The system of claim 1, wherein the data model is configured to generate synthetic transformed data by at least assigning a probability to the synthetic transformed-data.

16. The system of claim 1, wherein: receiving the dataset comprises receiving the dataset from a client device; and the operations further comprise transmitting the second synthetic transformed dataset to the client device.

17. The system of claim 1, wherein receiving the dataset comprises receiving the dataset at a cloud service.

18. A method for generating synthetic data, the method comprising: receiving a dataset comprising time series data having a plurality of dimensions; generating a transformed dataset by performing a first data transformation on the dataset, the first data transformation comprising a time-based data processing operation; generating a first synthetic transformed-dataset by implementing a data model using the transformed dataset, the data model being configured to generate synthetic transformed-data based on a relationship between data of at least two dimensions of the transformed dataset; and generating a second synthetic transformed dataset by performing a second data transformation on the first synthetic transformed-dataset, the second data transformation comprising an inverse of the first data transformation; receiving a plurality of sample datasets having a plurality of respective dimensions; generating a plurality of transformed sample-datasets corresponding to the sample datasets by performing the first data transformation on the sample datasets; and training the data model to generate synthetic transformed-data based on the transformed sample-datasets.

19. A system for generating synthetic data, comprising: one or more memory units storing instructions; and one or more processors that execute the instructions to perform operations comprising: receiving, at a server, from a client device, a dataset comprising numeric time-series data having a plurality of dimensions; generating a transformed dataset by performing a first data transformation on the dataset, the first data transformation comprising subtracting data associated with a first time point from data associated with a second time point; generating a synthetic transformed-dataset by implementing a data model using the transformed dataset, the data model comprising an RNN-CNN model configured to generate synthetic transformed-data based on a relationship between data of at least two dimensions of the transformed dataset; generating a synthetic dataset by performing a second data transformation on the synthetic transformed-dataset, the second data transformation comprising an inverse of the first data transformation; and transmitting, to the client device, the synthetic dataset.
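
Several dependent claims name specific first transformations: encoding a character as a number (claim 4), normalization (claim 12), and applying a logarithmic function (claim 13). Because the independent claims also require a second transformation that is the inverse of the first, each of these needs a matching inverse. The sketch below shows one possible pairing for each. The claims do not specify the details, so the choices here (min-max scaling, natural logarithm, Unicode code points) and all function names are assumptions made for illustration.

```python
import math


def normalize(values):
    """Min-max normalization to [0, 1] (assumed variant).

    Returns the scaled values plus the parameters needed to invert.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot min-max normalize a constant series")
    return [(v - lo) / span for v in values], (lo, span)


def denormalize(scaled, params):
    """Inverse of normalize(): rescale and shift back."""
    lo, span = params
    return [s * span + lo for s in scaled]


def log_transform(values):
    """Natural logarithm; valid only for strictly positive values."""
    return [math.log(v) for v in values]


def exp_transform(values):
    """Inverse of log_transform()."""
    return [math.exp(v) for v in values]


def encode_chars(text):
    """Encode each character as its Unicode code point."""
    return [ord(ch) for ch in text]


def decode_chars(codes):
    """Inverse of encode_chars()."""
    return "".join(chr(c) for c in codes)


# Hypothetical sample values.
prices = [100.0, 102.5, 101.0, 105.0]

scaled, params = normalize(prices)
restored = denormalize(scaled, params)

log_restored = exp_transform(log_transform(prices))

codes = encode_chars("AAPL")
decoded = decode_chars(codes)
```

Note that the normalization inverse depends on parameters captured during the forward pass (`lo` and `span`), so a pipeline of this kind would need to keep them alongside the model output. In the claimed flow the inverse is applied to synthetic data rather than to the original values, so it returns values on the original scale, not the original values themselves.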

Assignees

Inventors

Classifications

  • Ensemble learning · CPC title

  • for test design, e.g. generating new test cases · CPC title

  • using kernel methods, e.g. support vector machines [SVM] · CPC title

  • for test execution, e.g. scheduling of test suites · CPC title

  • Correlation function computation {including computation of convolution operations (arithmetic circuits for sum of products per se, e.g. multiply-accumulators G06F7/5443; digital filters, e.g. FIR, IIR, adaptive filters H03H17/00)} · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10664381B2 cover?
Systems and methods for generating synthetic data are disclosed. For example, a system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving a dataset that includes time series data having a plurality of dimensions and generating a transformed dataset by performing a…
Who is the assignee on this patent?
Capital One Services, LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/541. Mapped technology areas include Physics.
When was this patent published?
Published May 26, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).