Synthetic data generation apparatus, method for the same, and program

US12265649B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12265649-B2
Application number: US-201816753037-A
Country: US
Kind code: B2
Filing date: Oct 5, 2018
Priority date: Oct 13, 2017
Publication date: Apr 1, 2025
Grant date: Apr 1, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A synthetic data generation apparatus codes a value of each of category attributes contained in original data into a value of a numerical attribute in accordance with a coding rule; generates first synthetic data from the original data after coding using a synthetic data generation method for numerical attributes; if the value of the numerical attribute which is contained in the first synthetic data and corresponds to the value of one of the category attributes exceeds a range of values that can be assumed by the value of that numerical attribute, converts the value of that numerical attribute to a value included in the range of values that can be assumed by the value of that numerical attribute; and decodes the value of the numerical attribute which is contained in the first synthetic data after conversion and corresponds to the value of one of the category attributes to the value of that category attribute in accordance with the coding rule to obtain synthetic data.
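
The four steps in the abstract (code, generate, convert out-of-range values, decode) can be sketched as follows. This is a minimal illustration, not the patented implementation: the integer coding rule `CODING_RULE`, the Gaussian-noise generator `generate_numeric`, and the round-to-nearest-code decoding are all assumptions made for the example. The abstract fixes neither the coding rule nor the numerical generation method.

```python
import random

# Hypothetical coding rule: category value -> numeric code (assumed for illustration).
CODING_RULE = {"A": 0, "B": 1, "O": 2, "AB": 3}
DECODING_RULE = {code: value for value, code in CODING_RULE.items()}


def encode(values):
    """Coding step: replace each category value with its numeric code."""
    return [CODING_RULE[v] for v in values]


def generate_numeric(coded, rng, noise_sd=0.8):
    """Stand-in for 'a synthetic data generation method for numerical attributes'.

    Assumption: simple additive Gaussian noise. Any numeric-only generator
    could be substituted here; its output may fall outside the code range.
    """
    return [c + rng.gauss(0.0, noise_sd) for c in coded]


def clamp_to_range(x, lo, hi):
    """Conversion step: pull an out-of-range value back into [lo, hi]."""
    return min(max(x, lo), hi)


def decode(values):
    """Conversion plus decoding: clamp, then map back to a category value.

    Assumption: the clamped value is rounded to the nearest code before lookup.
    """
    lo, hi = min(DECODING_RULE), max(DECODING_RULE)
    return [DECODING_RULE[round(clamp_to_range(x, lo, hi))] for x in values]


if __name__ == "__main__":
    rng = random.Random(0)
    original = ["A", "B", "O", "AB", "A"]
    first_synthetic = generate_numeric(encode(original), rng)
    print(decode(first_synthetic))
```

Without the clamp, a generated value such as -3.2 or 7.9 would have no matching entry in the coding rule and could not be decoded, which is the failure the conversion step is there to prevent.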

First claim

Opening claim text (preview).

What is claimed is:

1. A synthetic data generation apparatus comprising: storage that stores a coding rule which indicates correspondence between a code and a value of a category attribute, and processing circuitry configured to: code a value of each of category attributes contained in original data into a value of a numerical attribute in accordance with the coding rule; generate first synthetic data from the original data after coding using a synthetic data generation method for numerical attributes; if the value of the numerical attribute which is contained in the first synthetic data and corresponds to the value of one of the category attributes exceeds a range of values that can be assumed by the value of that numerical attribute, convert the value of that numerical attribute to a value included in the range of values that can be assumed by the value of that numerical attribute; and decode the value of the numerical attribute which is contained in the first synthetic data after conversion and which corresponds to the value of one of the category attributes to the value of that category attribute in accordance with the coding rule to obtain synthetic data, wherein the value of the numerical attribute is a value that can be measured numerically, and the value of the category attribute is a value that cannot be measured numerically, the coding rule is a 1-of-K coding method, and the synthetic data maintains relationships among all the attributes in the original data, wherein (i) the relationships are variance-covariance or (ii) the relationships are correlation.

2. A non-transitory computer-readable recording medium having recorded thereon a program for causing a computer to function as the synthetic data generation apparatus according to claim 1.

3. A synthetic data generation method for execution by a synthetic data generation apparatus that includes storage and processing circuitry, the synthetic data generation method comprising: a coding step in which the processing circuitry codes a value of each of category attributes contained in original data into a value of a numerical attribute in accordance with a coding rule which is stored in the storage and indicates correspondence between a code and a value of a category attribute; a data formatting step in which the processing circuitry generates first synthetic data from the original data after coding using a synthetic data generation method for numerical attributes; a conversion step in which, if the value of the numerical attribute which is contained in the first synthetic data and corresponds to the value of one of the category attributes exceeds a range of values that can be assumed by the value of that numerical attribute, the processing circuitry converts the value of that numerical attribute to a value included in the range of values that can be assumed by the value of that numerical attribute; and a decoding step in which the processing circuitry decodes the value of the numerical attribute which is contained in the first synthetic data after conversion and which corresponds to the value of one of the category attributes to the value of that category attribute in accordance with the coding rule to obtain synthetic data, wherein the value of the numerical attribute is a value that can be measured numerically, and the value of the category attribute is a value that cannot be measured numerically, the coding rule is a 1-of-K coding method, and the synthetic data maintains relationships among all the attributes in the original data, wherein (i) the relationships are variance-covariance or (ii) the relationships are correlation.
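
The claims narrow the coding rule to 1-of-K coding and name variance-covariance (or correlation) as the relationship to be maintained. The sketch below illustrates both ideas under stated assumptions: the valid range of each 1-of-K component is taken to be [0, 1], decoding picks the largest clamped component, and `covariance_matrix` is a plain sample covariance that could be used to compare original and synthetic data. None of these helper names or choices come from the patent text shown here.

```python
def one_of_k_encode(value, categories):
    """1-of-K coding: one component per category, 1.0 for the match, else 0.0."""
    return [1.0 if value == c else 0.0 for c in categories]


def clamp01(vec):
    """Conversion step, assuming each 1-of-K component must lie in [0, 1]."""
    return [min(max(x, 0.0), 1.0) for x in vec]


def one_of_k_decode(vec, categories):
    """Decoding step (assumption: choose the category with the largest
    clamped component; ties resolve to the earliest category)."""
    clamped = clamp01(vec)
    best = max(range(len(categories)), key=lambda i: clamped[i])
    return categories[best]


def covariance_matrix(rows):
    """Sample variance-covariance matrix of equally sized numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


if __name__ == "__main__":
    categories = ["A", "B", "O"]
    # Each record: one numerical attribute followed by a 1-of-K coded category.
    records = [(34.0, "A"), (51.0, "B"), (29.0, "O"), (42.0, "B")]
    coded = [[age] + one_of_k_encode(cat, categories) for age, cat in records]
    print(covariance_matrix(coded))
    # A generated vector with components outside [0, 1] still decodes.
    print(one_of_k_decode([-0.2, 1.3, 0.4], categories))
```

Coding the category as K numeric columns lets a numeric-only generator treat the category like any other attribute, so the covariance between, say, age and each category column is defined and can be compared before and after generation.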

Assignees

Nippon Telegraph & Telephone

Inventors

Classifications

  • Data format conversion from or to a database · CPC title

  • Clustering or classification · CPC title

  • by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12265649B2 cover?
A synthetic data generation apparatus codes a value of each of category attributes contained in original data into a value of a numerical attribute in accordance with a coding rule; generates first synthetic data from the original data after coding using a synthetic data generation method for numerical attributes; if the value of the numerical attribute which is contained in the first synthetic…
Who is the assignee on this patent?
Nippon Telegraph & Telephone
What technology area does this patent fall under?
Primary CPC classification G06F21/6254. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 1, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).