US2025139500A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025139500-A1 |
| Application number | US-202318496983-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 30, 2023 |
| Priority date | Oct 30, 2023 |
| Publication date | May 1, 2025 |
| Grant date | — |
Official abstract text for this publication.
Determining whether synthetic data is sufficient for utilization in connection with one or more machine learning models. The computing device accesses a protected batch of data associated with a machine learning model. The computing device accesses a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data. The computing device accesses one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value. The computing device performs a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold.
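The gating step described in the abstract can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `aggregate_similarity` and `run_if_sufficient`, the choice to average per-comparison scores into one similarity value, and the assumption that each score lies in [0, 1] are all hypothetical. The publication only states that a similarity value is obtained from one or more comparisons and that the machine learning function is performed if that value exceeds a threshold.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional

# Hypothetical representation: variable name -> column of numeric values.
Batch = Dict[str, List[float]]


def aggregate_similarity(comparison_scores: List[float]) -> float:
    """Collapse per-comparison scores into a single similarity value.

    Assumption: each score is in [0, 1], and a plain average is used.
    The publication does not specify how comparisons are combined.
    """
    if not comparison_scores:
        raise ValueError("at least one comparison score is required")
    return mean(comparison_scores)


def run_if_sufficient(
    simulated: Batch,
    comparison_scores: List[float],
    similarity_threshold: float,
    ml_function: Callable[[Batch], object],
) -> Optional[object]:
    """Perform the ML function on the simulated batch only if the
    similarity value strictly exceeds the threshold; otherwise return None."""
    value = aggregate_similarity(comparison_scores)
    if value > similarity_threshold:
        return ml_function(simulated)
    return None
```

Here `ml_function` stands in for either training or inference, the two machine learning functions named in the claims. The comparison is strict (`>`), mirroring the word "exceeds" in the abstract.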
Opening claim text (preview).
What is claimed is:

1. A method using a computing device to determine whether synthetic data is sufficient for utilization in connection with one or more machine learning models, the method comprising: accessing by a computing device a protected batch of data associated with a machine learning model; accessing by the computing device a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data; access results of one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value; and performing by the computing device a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold.
2. The method of claim 1, wherein the machine learning function is performing by one or more machine learning model an inference utilizing at least in-part the simulated batch of data.
3. The method of claim 1, wherein the machine learning function is training a machine learning model with the simulated batch of data.
4. The method of claim 1, wherein the one or more comparisons include comparison of a distribution of one or more variables associated with the protected batch of data and a distribution of one or more variables associated with the simulated batch of data.
5. The method of claim 1, wherein the one or more comparisons include calculation and comparison of correlation matrices of two or more variables associated with the protected batch of data and the simulated batch of data.
6. The method of claim 1, wherein the one or more comparisons include generation of a hierarchy cluster to compare all variables in the protected batch of data and the simulated batch of data.
7. The method of claim 1, wherein the one or more comparisons include generation of a relationship correlation between one or more traits displayed by variables included in the protected batch of data and the simulated batch of data.
8. The method of claim 1, wherein the computing device displays an output of the one or more comparisons, the output displaying a difference in the protected batch of data and the simulated batch of data.
9. A method using a computing device to determine whether synthetic data is sufficient for utilization in connection with one or more machine learning models, the method comprising: accessing by a computing device a protected batch of data associated with a machine learning model; accessing by the computing device a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data; access results of one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value; and performing by the computing device a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold, the machine learning function performing by one or more machine learning models an inference utilizing at least in part the simulated batch of data.
10. The method of claim 9, wherein the one or more comparisons include comparison of a distribution of one or more variables associated with the protected batch of data and a distribution of one or more variables associated with the simulated batch of data.
11. The method of claim 9, wherein the one or more comparisons include calculation and comparison of correlation matrices of two or more variables associated with the protected batch of data and the simulated batch of data.
12. The method of claim 9, wherein the one or more comparisons include generation of a hierarchy cluster to compare all variables in the protected batch of data and the simulated batch of data.
13. The method of claim 9, wherein the one or more comparisons include generation of a relationship correlation between one or more traits displayed by variables included in the protected batch of data and the simulated batch of data.
14. A method using a computing device to determine whether synthetic data is sufficient for utilization in connection with one or more machine learning models, the method comprising: accessing by a computing device a protected batch of data associated with a machine learning model; accessing by the computing device a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data; access results of one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value; and performing by the computing device a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold, the machine learning function training a machine learning model with the simulated batch of data.
15. The method of claim 14, wherein the one or more comparisons include comparison of a distribution of one or more variables associated with the protected batch of data and a distribution of one or more variables associated with the simulated batch of data.
16. The method of claim 14, wherein the one or more comparisons include calculation and comparison of correlation matrices of two or more variables associated with the protected batch of data and the simulated batch of data.
17. The method of claim 14, wherein the one or more comparisons include generation of a hierarchy cluster to compare all variables in the protected batch of data and the simulated batch of data.
18. The method of claim 14, wherein the one or more comparisons include generation of a relationship correlation between one or more traits displayed by variables included in the protected batch of data and the simulated batch of data.
19. A computer system to determine whether synthetic data is sufficient for utilization in connection with one or more machine learning models, the computer system comprising: one or more computer processors; one or more computer-readable storage media; program instructions stored on the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising: program instructions to access a protected batch of data associated with a machine learning model; program instructions to access a simulated batch of data, the simulated batch of data based upon but anonymizing the protected batch of data; program instructions to access results of one or more comparisons of one or more variables in the protected batch of data and the simulated batch of data to obtain a similarity value; and program instructions to perform a machine learning function utilizing at least in-part the simulated batch of data if the similarity value exceeds a similarity threshold.
20. The computer system of claim 19, wherein the one or more comparisons include comparison of a distribution of one or more variables associated with the protected batch of data and a distribution of one or more variables associated with the simulated batch of data.
21. The computer system of claim 19, wherein the one or more comparisons include calculation and comparison of correlation matrices of two or more variables associated with the protected batch of data and the simulated batch of data.
22. The computer system of claim 19, wherein the one or more comparisons inclu
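
Two of the comparison types named in the dependent claims, the distribution comparison and the correlation-matrix comparison, could be scored along the following lines. This is a sketch under stated assumptions: the claims do not name a statistic, so the two-sample Kolmogorov-Smirnov distance and the mean absolute difference between Pearson correlations are choices made here for illustration, and every function name is hypothetical. Batches are assumed to be dictionaries mapping a variable name to a list of numbers.

```python
from bisect import bisect_right
from math import sqrt
from typing import Dict, List

Batch = Dict[str, List[float]]  # hypothetical: variable name -> values


def distribution_similarity(protected: List[float], simulated: List[float]) -> float:
    """1 minus the two-sample Kolmogorov-Smirnov statistic (assumed metric).

    Returns 1.0 for identical empirical distributions and 0.0 when the
    two samples do not overlap at all.
    """
    a, b = sorted(protected), sorted(simulated)
    points = sorted(set(a) | set(b))
    # Largest gap between the two empirical CDFs, checked at every data point.
    ks = max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )
    return 1.0 - ks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either variable is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_matrix(batch: Batch, names: List[str]) -> List[List[float]]:
    """Pairwise Pearson correlations for the given variables."""
    return [[pearson(batch[r], batch[c]) for c in names] for r in names]


def correlation_similarity(protected: Batch, simulated: Batch) -> float:
    """Compare the correlation matrices of the variables shared by both batches.

    Two correlations differ by at most 2, so the mean absolute difference
    over variable pairs is halved to map the score into [0, 1].
    """
    names = sorted(set(protected) & set(simulated))
    cp = correlation_matrix(protected, names)
    cs = correlation_matrix(simulated, names)
    diffs = [
        abs(cp[i][j] - cs[i][j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    ]
    if not diffs:
        return 1.0  # fewer than two shared variables: nothing to compare
    return 1.0 - (sum(diffs) / len(diffs)) / 2.0
```

Scores like these could be averaged or otherwise combined into the single similarity value that the independent claims compare against the similarity threshold. How the comparisons are combined is not stated in the claim text shown here.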