Techniques for similarity analysis and data enrichment using knowledge sources
US-2016092557-A1 · Mar 31, 2016 · US
US11822975B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11822975-B2 |
| Application number | US-202017102526-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 24, 2020 |
| Priority date | Jul 6, 2018 |
| Publication date | Nov 21, 2023 |
| Grant date | Nov 21, 2023 |
Official abstract text for this publication.
Systems and methods for generating synthetic data are disclosed. For example, a system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving a dataset including time-series data. The operations may include generating a plurality of data segments based on the dataset, determining respective segment parameters of the data segments, and determining respective distribution measures of the data segments. The operations may include training a parameter model to generate synthetic segment parameters. Training the parameter model may be based on the segment parameters. The operations may include training a distribution model to generate synthetic data segments. Training the distribution model may be based on the distribution measures and the segment parameters. The operations may include generating a synthetic dataset using the parameter model and the distribution model and storing the synthetic dataset.
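The flow described in the abstract (segment the time series, derive segment parameters and distribution measures, fit a parameter model and a distribution model, then generate) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: fixed-size segments, the segment mean as the segment parameter, the population standard deviation as the distribution measure, a Gaussian random walk standing in for the trained parameter model, and Gaussian sampling standing in for the trained distribution model. All function and class names are hypothetical.

```python
import random
import statistics


def split_segments(series, size):
    """Split a series into consecutive fixed-size segments (assumed scheme)."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


def segment_parameter(segment):
    """Assumed segment parameter: the segment mean."""
    return statistics.fmean(segment)


def distribution_measure(segment):
    """Assumed distribution measure: spread of values within the segment."""
    return statistics.pstdev(segment)


class ParameterModel:
    """Stand-in for a trained parameter model: a random walk over segment means."""

    def fit(self, params):
        # Requires at least two segments so that one step can be observed.
        steps = [b - a for a, b in zip(params, params[1:])]
        self.start = params[0]
        self.step_mean = statistics.fmean(steps)
        self.step_std = statistics.pstdev(steps)
        return self

    def sample(self, n, rng):
        out = [self.start]
        for _ in range(n - 1):
            out.append(out[-1] + rng.gauss(self.step_mean, self.step_std))
        return out


class DistributionModel:
    """Stand-in for a trained distribution model: Gaussian noise around a mean."""

    def fit(self, measures, params):
        # params are unused by this simple stand-in; the argument is kept only
        # to mirror the abstract, which trains on both inputs.
        self.spread = statistics.fmean(measures)
        return self

    def sample(self, param, size, rng):
        return [rng.gauss(param, self.spread) for _ in range(size)]


def generate_synthetic(series, size, n_segments, seed=0):
    """Fit both stand-in models on `series` and emit n_segments * size values."""
    rng = random.Random(seed)
    segments = split_segments(series, size)
    params = [segment_parameter(s) for s in segments]
    measures = [distribution_measure(s) for s in segments]
    p_model = ParameterModel().fit(params)
    d_model = DistributionModel().fit(measures, params)
    synthetic = []
    for p in p_model.sample(n_segments, rng):
        synthetic.extend(d_model.sample(p, size, rng))
    return synthetic
```

Calling `generate_synthetic(series, size=10, n_segments=4)` on a series of at least 20 values returns 40 synthetic values. A real system would presumably replace both stand-ins with learned models and check the output against a similarity metric, as the claims require.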
Opening claim text (preview).
What is claimed is:
1. A system for generating synthetic data, comprising: one or more memory units storing instructions; and one or more processors that execute the instructions to perform operations comprising: receiving a request to generate a synthetic time-series dataset, the request including a request dataset; determining a profile of the request dataset; accessing a distribution model based on the determined profile of the request dataset, the distribution model having been trained to generate synthetic data segments based on distribution measures and segment parameters of actual time-series data, wherein the generated synthetic data segments satisfy a similarity metric representing a measure of similarity between the synthetic data segments and the actual time-series data; and generating, according to the distribution model, a synthetic time-series dataset.
2. The system of claim 1, wherein: the operations further comprise generating synthetic segment parameters using a parameter model; and generating the synthetic time-series dataset comprises: generating synthetic data segments according to the distribution model; and combining the synthetic data segments to generate the synthetic time-series dataset.
3. The system of claim 2, wherein combining the synthetic data segments comprises combining the synthetic data segments in two or more dimensions.
4. The system of claim 2, the parameter model having been trained to generate synthetic segment parameters and segment sizes.
5. The system of claim 2, wherein generating synthetic segment parameters using a parameter model comprises generating a sequence of synthetic segment parameters based on at least one of a segment parameter seed or an instruction to generate a random parameter seed.
6. The system of claim 5, wherein the sequence of synthetic segment parameters extends forward or backward in time from the segment parameter seed or the random parameter seed.
7. The system of claim 1, the operations further comprising searching a model index based on the profile of the request dataset to determine the distribution model.
8. The system of claim 7, wherein searching the model index comprises searching the model index based on at least one of a model parameter, a model hyperparameter, or a model type.
9. The system of claim 7, wherein: the request includes at least one of a data schema or a statistical metric; and the distribution model is determined based on the distribution model having at least one of a model data schema overlapping with the data schema or a model statistical metric within a tolerance of the statistical metric.
10. The system of claim 1, the operations further comprising providing the synthetic time-series dataset at least one of a component within the system or a component outside the system.
11. The system of claim 1, wherein determining the profile of the request dataset comprises retrieving the profile from the request.
12. The system of claim 1, wherein determining the profile of the request dataset comprises accessing a storage location identified in the request.
13. The system of claim 1, wherein the profile of the request dataset includes at least one of a number of dataset dimensions or a dataset format.
14. A method for generating synthetic data, the method comprising: receiving a request to generate a synthetic time-series dataset, the request including a request dataset; determining a profile of the request dataset; searching a model index based on the profile of the request dataset to determine a model; accessing a distribution model, the distribution model having been trained to generate synthetic data segments based on distribution measures and segment parameters of actual time-series data, wherein the generated synthetic data segments satisfy a similarity metric representing a measure of similarity between the synthetic data segments and the actual time-series data; generating, using the distribution model, synthetic data segments based on synthetic segment parameters of the request dataset; and generating, using the synthetic data segments, a synthetic time-series dataset.
15. The method of claim 14, further comprising generating synthetic segment parameters using a parameter model, wherein generating the synthetic time-series dataset comprises: generating synthetic data segments according to the distribution model; and combining the synthetic data segments to generate the synthetic time-series dataset.
16. The method of claim 15, wherein combining the synthetic data segments comprises combining the synthetic data segments in two or more dimensions.
17. The method of claim 15, the parameter model having been trained to generate synthetic segment parameters and segment sizes.
18. The method of claim 15, wherein generating synthetic segment parameters using a parameter model comprises generating a sequence of synthetic segment parameters based on at least one of a segment parameter seed or an instruction to generate a random parameter seed.
19. The method of claim 18, wherein the sequence of synthetic segment parameters extends forward or backward in time from the segment parameter seed or the random parameter seed.
20. The method of claim 14, wherein the profile of the request dataset includes at least one of a number of dataset dimensions or a dataset format.
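The model-index lookup recited in claims 7 and 9 can be illustrated with a small sketch. It assumes, purely for illustration, that each index entry stores a set of column names as its data schema and a single mean value as its statistical metric; the `IndexEntry` type and `search_model_index` function are hypothetical names, not taken from the patent. A model matches when its schema overlaps the requested schema or its stored metric lies within the given tolerance, mirroring the "at least one of" wording of claim 9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record for one stored distribution model."""
    name: str
    schema: frozenset  # assumed: column names the model was trained on
    mean: float        # assumed: one statistical metric of the training data


def search_model_index(index, request_schema, request_mean, tolerance):
    """Return names of models that satisfy at least one matching rule."""
    wanted = set(request_schema)
    matches = []
    for entry in index:
        schema_overlaps = bool(entry.schema & wanted)
        metric_close = abs(entry.mean - request_mean) <= tolerance
        if schema_overlaps or metric_close:
            matches.append(entry.name)
    return matches
```

For example, with entries for `{"timestamp", "price"}` (mean 100.0) and `{"user_id"}` (mean 10.2), a request for schema `{"timestamp", "volume"}` with mean 10.0 and tolerance 0.5 matches both: the first by schema overlap, the second by the metric. How an actual index ranks or breaks ties between several matching models is not specified in the claim text.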
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Supervised learning · CPC title
Adversarial learning · CPC title