Artificial data generation for differential privacy

US2025131116A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025131116-A1
Application number: US-202318490914-A
Country: US
Kind code: A1
Filing date: Oct 20, 2023
Priority date: Oct 20, 2023
Publication date: Apr 24, 2025
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An embodiment configures a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data. An embodiment fits a distribution type to a variable of the original data. An embodiment adjusts, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy. An embodiment generates, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
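The four steps in the abstract (configure parameters, fit a distribution, set a noise level from the desired privacy, generate) can be sketched as below. This is a minimal illustration, not the patent's method: the abstract names no specific mechanism, so the normal fit, the Laplace noise, the `epsilon` privacy parameter, the range-over-epsilon scaling rule, and all function names are assumptions made for the example. The sketch makes no claim of meeting a formal differential privacy guarantee.

```python
import random
import statistics


def laplace_sample(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace draw: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noise_scale(epsilon: float, lower: float, upper: float) -> float:
    """Assumed rule: noise grows with the value range and shrinks as epsilon grows."""
    if epsilon <= 0 or upper <= lower:
        raise ValueError("need epsilon > 0 and upper > lower")
    return (upper - lower) / epsilon


def generate_artificial(original, epsilon, lower, upper, n, seed=0):
    """Fit a normal distribution to one variable, then emit n noisy samples."""
    rng = random.Random(seed)
    # Step 1: configured bounds limit the influence of any single record.
    clipped = [min(max(x, lower), upper) for x in original]
    # Step 2: fit the (assumed) distribution type to the variable.
    mu = statistics.fmean(clipped)
    sigma = statistics.stdev(clipped)  # requires at least two values
    # Step 3: derive the noise level from the desired privacy level.
    scale = noise_scale(epsilon, lower, upper)
    # Step 4: sample from the fit, add noise, and keep values within bounds.
    artificial = []
    for _ in range(n):
        value = rng.gauss(mu, sigma) + laplace_sample(scale, rng)
        artificial.append(min(max(value, lower), upper))
    return artificial


ages = [23, 35, 41, 29, 52, 47, 38, 61, 33, 45]  # made-up example data
synthetic = generate_artificial(ages, epsilon=1.0, lower=18, upper=90, n=5)
```

In this sketch a smaller `epsilon` yields a larger noise scale, which mirrors the abstract's statement that the level of noise corresponds to the desired level of privacy. With a wide range and a small `epsilon`, many outputs land on the bounds, so the parameter choice trades utility against privacy.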

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method comprising: configuring a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data; fitting a distribution type to a variable of the original data; adjusting, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy; and generating, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
2. The computer-implemented method of claim 1, wherein configuring the plurality of parameters comprises setting an upper bound parameter of a continuous variable comprising the original data to a first value according to a statistical characteristic of the continuous variable.
3. The computer-implemented method of claim 1, wherein configuring the plurality of parameters comprises setting a lower bound parameter of a continuous variable comprising the original data to a second value according to a statistical characteristic of the continuous variable.
4. The computer-implemented method of claim 1, wherein the variable contributes to a privacy aspect of the original data.
5. The computer-implemented method of claim 1, wherein fitting a distribution type to the variable of the original data further comprises: selecting, from a plurality of distribution type fittings according to a goodness of fit statistic computed on each distribution type fitting, the distribution type.
6. The computer-implemented method of claim 1, wherein the desired level of privacy is higher than a level of privacy in the original data.
7. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by a processor to cause the processor to perform operations comprising: configuring a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data; fitting a distribution type to a variable of the original data; adjusting, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy; and generating, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
8. The computer program product of claim 7, wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.
9. The computer program product of claim 7, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.
10. The computer program product of claim 7, wherein configuring the plurality of parameters comprises setting an upper bound parameter of a continuous variable comprising the original data to a first value according to a statistical characteristic of the continuous variable.
11. The computer program product of claim 7, wherein configuring the plurality of parameters comprises setting a lower bound parameter of a continuous variable comprising the original data to a second value according to a statistical characteristic of the continuous variable.
12. The computer program product of claim 7, wherein the variable contributes to a privacy aspect of the original data.
13. The computer program product of claim 7, wherein fitting a distribution type to the variable of the original data further comprises: selecting, from a plurality of distribution type fittings according to a goodness of fit statistic computed on each distribution type fitting, the distribution type.
14. The computer program product of claim 7, wherein the desired level of privacy is higher than a level of privacy in the original data.
15. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform operations comprising: configuring a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data; fitting a distribution type to a variable of the original data; adjusting, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noise corresponds to the desired level of privacy; and generating, using the distribution type and the level of noise, the artificial data, the artificial data achieving the desired level of privacy by including noise data corresponding to the level of noise.
16. The computer system of claim 15, wherein configuring the plurality of parameters comprises setting an upper bound parameter of a continuous variable comprising the original data to a first value according to a statistical characteristic of the continuous variable.
17. The computer system of claim 15, wherein configuring the plurality of parameters comprises setting a lower bound parameter of a continuous variable comprising the original data to a second value according to a statistical characteristic of the continuous variable.
18. The computer system of claim 15, wherein the variable contributes to a privacy aspect of the original data.
19. The computer system of claim 15, wherein fitting a distribution type to the variable of the original data further comprises: selecting, from a plurality of distribution type fittings according to a goodness of fit statistic computed on each distribution type fitting, the distribution type.
20. The computer system of claim 15, wherein the desired level of privacy is higher than a level of privacy in the original data.
21. A data processing system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform operations comprising.
22. The data processing system of claim 21, wherein configuring the plurality of parameters comprises setting an upper bound parameter of a continuous variable comprising the original data to a first value according to a statistical characteristic of the continuous variable.
23. The data processing system of claim 21, wherein configuring the plurality of parameters comprises setting a lower bound para
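The dependent claims on bound parameters (set "according to a statistical characteristic") and on choosing a distribution type by a goodness of fit statistic can be illustrated with a short sketch. The claims name neither the statistic nor the candidate distributions, so everything specific here is an assumption for illustration: bounds at the mean plus or minus `k` standard deviations, the Kolmogorov-Smirnov distance as the goodness of fit statistic, three candidate distribution types, and all function names.

```python
import math
import statistics


def bounds_from_stats(values, k=3.0):
    """Assumed rule: lower/upper bounds at mean -/+ k sample standard deviations."""
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return mu - k * sigma, mu + k * sigma


def ks_statistic(values, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(values)
    n = len(xs)
    return max(
        max(abs(cdf(x) - i / n), abs(cdf(x) - (i + 1) / n))
        for i, x in enumerate(xs)
    )


def candidate_cdfs(values):
    """Fit each candidate distribution type and return its CDF by name."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be identical")
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    cdfs = {
        "normal": statistics.NormalDist(mu, sigma).cdf,
        "uniform": lambda x: min(max((x - lo) / (hi - lo), 0.0), 1.0),
    }
    if lo >= 0 and mu > 0:
        # An exponential fit only makes sense for non-negative data.
        cdfs["exponential"] = lambda x: 1.0 - math.exp(-x / mu) if x > 0 else 0.0
    return cdfs


def select_distribution(values):
    """Pick the candidate whose fit has the smallest KS distance."""
    scores = {name: ks_statistic(values, cdf)
              for name, cdf in candidate_cdfs(values).items()}
    best = min(scores, key=scores.get)
    return best, scores
```

A smaller Kolmogorov-Smirnov distance indicates a closer fit, so `select_distribution` returns the name with the minimum score. Other statistics (for example a chi-squared or likelihood-based criterion) would slot into the same selection loop.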

Assignees

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Learning methods

  • Probabilistic graphical models, e.g. probabilistic networks

  • Probabilistic or stochastic networks

  • Machine learning

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025131116A1 cover?
An embodiment configures a plurality of parameters, the parameters being usable to generate artificial data from original data, the configuring adjusting a level of privacy in the artificial data. An embodiment fits a distribution type to a variable of the original data. An embodiment adjusts, using a desired level of privacy and the distribution type, a level of noise, wherein the level of noi…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F21/6218. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 24, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).