Method and apparatus for transforming data

US2018095933A1 · US · A1

Patent metadata
Publication number: US-2018095933-A1
Application number: US-201715720998-A
Country: US
Kind code: A1
Filing date: Sep 29, 2017
Priority date: Sep 30, 2016
Publication date: Apr 5, 2018
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data transformation apparatus selects items one by one and generates a first weight dataset and a second weight dataset on the basis of similarity between first records in a first dataset and second records in a second dataset. The first records and second records respectively include first item values and second item values that belong to the selected item. Based on the first weight dataset, the data transformation apparatus transforms the first dataset into a first similarity-determining dataset including third records. Each third record includes a numerical value that indicates a relationship between transformed item values belonging to different items. Further, based on the second weight dataset, the data transformation apparatus transforms the second dataset into a second similarity-determining dataset including fourth records. Each fourth record includes a numerical value that indicates a relationship between transformed item values belonging to different items.
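The transformation step in the abstract can be illustrated with a small sketch. It is not the patent's implementation: it assumes each record maps a tuple of item values (one per item) to a number, that each weight dataset is a table `weights[k][original_value][transformed_value]`, and that a transformed record is the weighted sum of the original records. The function name `transform`, the item values ("A", "X", "P", "U", ...) and the transformed labels ("s1", "t1", ...) are all hypothetical.

```python
from itertools import product


def transform(records, weights):
    """Map a dataset onto transformed item values (illustrative assumption).

    records: dict mapping a tuple of item values (one per item) to a number.
    weights: one dict per item; weights[k][original][transformed] is the
             influence of an original item value on a transformed item value.
    Returns a dict mapping tuples of transformed item values to numbers.
    """
    # Collect the transformed item values available for each item.
    targets = [
        sorted({t for row in item_weights.values() for t in row})
        for item_weights in weights
    ]
    transformed = {}
    for combo in product(*targets):
        total = 0.0
        for values, number in records.items():
            term = number
            # Multiply by one weight per item: original value -> transformed value.
            for k, (original, target) in enumerate(zip(values, combo)):
                term *= weights[k][original].get(target, 0.0)
            total += term
        transformed[combo] = total
    return transformed


# Two datasets that describe the same relationships under different item values.
first = {("A", "X"): 2.0, ("B", "Y"): 3.0}
second = {("P", "U"): 3.0, ("Q", "V"): 2.0}

# Hypothetical weight datasets, one table per item, for each dataset.
first_weights = [
    {"A": {"s1": 1.0, "s2": 0.0}, "B": {"s1": 0.0, "s2": 1.0}},
    {"X": {"t1": 1.0, "t2": 0.0}, "Y": {"t1": 0.0, "t2": 1.0}},
]
second_weights = [
    {"P": {"s1": 0.0, "s2": 1.0}, "Q": {"s1": 1.0, "s2": 0.0}},
    {"U": {"t1": 0.0, "t2": 1.0}, "V": {"t1": 1.0, "t2": 0.0}},
]

third = transform(first, first_weights)     # first similarity-determining dataset
fourth = transform(second, second_weights)  # second similarity-determining dataset
```

With these hand-picked weights, `third` and `fourth` contain the same records, which is the sense in which the transformed datasets become directly comparable. How the weights are actually generated from the calculated similarity is described in the claims and is not modeled here.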

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable storage medium storing therein a data transformation program that causes a computer to perform a procedure comprising: obtaining a first dataset and a second dataset, the first dataset being a collection of first records each including a numerical value that indicates a relationship between two or more first item values belonging to a plurality of different items, the second dataset being a collection of second records each including a numerical value that indicates a relationship between two or more second item values belonging to the plurality of different items; selecting one of the plurality of different items so as to divide the first item values into selected first item values and non-selected first item values, as well as the second item values into selected second item values and non-selected second item values; calculating similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset; generating, based on the calculated similarity, a first weight dataset that indicates influence of the selected first item values on a subset of transformed item values that belongs to the selected item, as well as a second weight dataset that indicates influence of the selected second item values on the subset of transformed item values that belongs to the selected item; repeating the calculating of similarity and the generating of a first weight dataset and a second weight dataset, while changing the selected item; transforming the first dataset into a first similarity-determining dataset, based on the first weight datasets generated for the plurality of different items as a result of the repeating, the first similarity-determining dataset being a collection of third records each including a numerical value that indicates a relationship between two or more of the transformed item values belonging to the plurality of different items; and transforming the second dataset into a second similarity-determining dataset, based on the second weight datasets generated for the plurality of different items as a result of the repeating, the second similarity-determining dataset being a collection of fourth records each including a numerical value that indicates a relationship between two or more of the transformed item values belonging to the plurality of different items.

2. The non-transitory computer-readable storage medium according to claim 1, wherein the generating of a first weight dataset and a second weight dataset includes: generating, with respect to each of the plurality of different items, initial first and second weight datasets formed from initial weight values, the initial first weight datasets including non-selected initial first weight datasets generated with respect to items other than the selected item, the initial second weight datasets including non-selected initial second weight datasets generated with respect to items other than the selected item; and calculating, based on the non-selected initial first weight datasets and non-selected initial second weight datasets, similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset.

3. The non-transitory computer-readable storage medium according to claim 1, wherein the generating of a first weight dataset and a second weight dataset includes: repeating a process of selecting the plurality of different items individually and generating new first and second weight datasets for the selected items until a specific end condition is met.

4. The non-transitory computer-readable storage medium according to claim 1, wherein the procedure further includes: calculating similarity between numerical values included in the third records of the first similarity-determining dataset and numerical values included in the fourth records of the second similarity-determining dataset.

5. The non-transitory computer-readable storage medium according to claim 1, wherein: the first weight dataset is a first matrix that satisfies orthonormality conditions, the first matrix being formed from weight values that indicate individual influences of the selected first item values on the transformed item values; and the second weight dataset is a second matrix that satisfies orthonormality conditions, the second matrix being formed from weight values that indicate individual influences of the selected second item values on the transformed item values.

6. A data transformation method comprising: obtaining a first dataset and a second dataset, the first dataset being a collection of first records each including a numerical value that indicates a relationship between two or more first item values belonging to a plurality of different items, the second dataset being a collection of second records each including a numerical value that indicates a relationship between two or more second item values belonging to the plurality of different items; selecting, by a processor, one of the plurality of different items so as to divide the first item values into selected first item values and non-selected first item values, as well as the second item values into selected second item values and non-selected second item values; calculating, by the processor, similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset; generating, by the processor, based on the calculated similarity, a first weight dataset that indicates influence of the selected first item values on a subset of transformed item values that belongs to the selected item, as well as a second weight dataset that indicates influence of the selected second item values on the subset of transformed item values that belongs to the selected item; repeating, by the processor, the calculating of similarity and the generating of a first weight dataset and a second weight dataset, while changing the selected item; transforming, by the processor, the first dataset into a first similarity-determining dataset, based on the first weight datasets generated for the plurality of different items as a result of the repeating, the first similarity-determining dataset being a collection of third records each including a numerical value that indicates a relationship between two or more of the transformed item values belonging to the plurality of different items; and transforming, by the processor, the second dataset into a second similarity-determining dataset, based on the second weight datasets generated for the plurality of different items as a result of the repeating, the second similarity-determining dataset being a collection of fourth records each including a numerical value that indicates a relationship between two or more of the transformed item values belonging to the plurality of different items.

7. A data transformation apparatus comprising: a memory configured to store therein a first dataset and a second dataset, the first dataset being a collection of first records each including a numerical value that indicates a relationship between two or more first item values belonging to a plurality of different items, the second dataset being a collection of second records each including a numerical value that indicates a relationship between two or more second item values belonging to the plurality of dif
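Claims 4 and 5 mention two checks that are easy to picture in code: comparing the numerical values of the transformed records, and weight matrices that satisfy orthonormality conditions. The sketch below is illustrative only. It assumes that "orthonormality" means the matrix columns are mutually orthogonal unit vectors and that similarity is measured as an inner product over matching records; the claims fix neither choice. The names `is_orthonormal` and `similarity` and all sample values are hypothetical.

```python
def is_orthonormal(matrix, tol=1e-9):
    """Return True if the columns of `matrix` (a list of rows) are orthonormal.

    Assumption: orthonormality is checked column-wise, i.e. every pair of
    distinct columns has dot product 0 and every column has unit length.
    """
    n_cols = len(matrix[0])
    for i in range(n_cols):
        for j in range(n_cols):
            dot = sum(row[i] * row[j] for row in matrix)
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


def similarity(third_records, fourth_records):
    """Inner product of two transformed datasets over matching records.

    Assumption: both datasets are dicts keyed by tuples of transformed item
    values; a record missing from one dataset contributes 0.
    """
    return sum(
        value * fourth_records.get(key, 0.0)
        for key, value in third_records.items()
    )


# A rotation matrix is orthonormal; a shear matrix is not.
rotation = [[0.6, -0.8], [0.8, 0.6]]
shear = [[1.0, 1.0], [0.0, 1.0]]

# Hypothetical third and fourth records after transformation.
third = {("s1", "t1"): 2.0, ("s2", "t2"): 3.0}
fourth = {("s1", "t1"): 2.0, ("s2", "t2"): 3.0}

rotation_ok = is_orthonormal(rotation)
shear_ok = is_orthonormal(shear)
score = similarity(third, fourth)
```

An orthonormal weight matrix preserves lengths and angles, so under this assumption the transformation rearranges and mixes item values without rescaling the underlying relationships.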

Assignees

Fujitsu Ltd

Inventors

Classifications

  • Vector processors · CPC title

  • for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

  • G06F15/82 (Primary): data or demand driven · CPC title

  • ASIC · CPC title

  • Multidimensional correlation or convolution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018095933A1 cover?
A data transformation apparatus selects items one by one and generates a first weight dataset and a second weight dataset on the basis of similarity between first records in a first dataset and second records in a second dataset. The first records and second records respectively include first item values and second item values that belong to the selected item. Based on the first weight dataset…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06F15/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 5, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).