Method and apparatus for transforming data

US10769100B2 · US · B2

Patent metadata
Publication number: US-10769100-B2
Application number: US-201715720998-A
Country: US
Kind code: B2
Filing date: Sep 29, 2017
Priority date: Sep 30, 2016
Publication date: Sep 8, 2020
Grant date: Sep 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data transformation apparatus selects items one by one and generates a first weight dataset and a second weight dataset on the basis of similarity between first records in a first dataset and second records in a second dataset. The first records and second records respectively include first item values and second item values that belong to the selected item. Based on the first weight dataset, the data transformation apparatus transforms the first dataset into a first similarity-determining dataset including third records. Each third record includes a numerical value that indicates a relationship between transformed item values belonging to different items. Further, based on the second weight dataset, the data transformation apparatus transforms the second dataset into a second similarity-determining dataset including fourth records. Each fourth record includes a numerical value that indicates a relationship between transformed item values belonging to different items.
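The abstract's loop (select an item, derive a pair of weight datasets from similarity, then transform both datasets) can be sketched in NumPy. This is only an illustrative sketch, not the patented procedure: it assumes each dataset is a dense three-way array, takes the similarity to be the inner product between mode unfoldings, and obtains orthonormal weights from a singular value decomposition. The names `unfold`, `mode_product`, `fit_weights`, `transform`, `rank`, and `sweeps` are hypothetical and do not appear in the patent.

```python
import numpy as np

def unfold(t, mode):
    """Matricize tensor t so that rows index the values of the given item (mode)."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_product(t, w, mode):
    """Apply weights w (shape: item values x transformed values) along one mode."""
    out = np.tensordot(w.T, t, axes=(1, mode))  # the new axis lands first
    return np.moveaxis(out, 0, mode)            # move it back into place

def project_except(t, weights, skip):
    """Transform every non-selected item, leaving the selected one untouched."""
    for m, w in enumerate(weights):
        if m != skip:
            t = mode_product(t, w, m)
    return t

def fit_weights(x1, x2, rank, sweeps=5):
    """Assumed scheme: alternate over items, refitting one weight pair at a time."""
    n_modes = x1.ndim
    # Initial weights: truncated identities, which already have orthonormal columns.
    w1 = [np.eye(x1.shape[m])[:, :rank] for m in range(n_modes)]
    w2 = [np.eye(x2.shape[m])[:, :rank] for m in range(n_modes)]
    for _ in range(sweeps):
        for m in range(n_modes):  # select items one by one
            a = unfold(project_except(x1, w1, m), m)
            b = unfold(project_except(x2, w2, m), m)
            sim = a @ b.T  # similarity between item values of the two datasets
            u, _, vt = np.linalg.svd(sim, full_matrices=False)
            w1[m], w2[m] = u[:, :rank], vt[:rank].T  # orthonormal columns
    return w1, w2

def transform(x, weights):
    """Produce the similarity-determining dataset by transforming every item."""
    for m, w in enumerate(weights):
        x = mode_product(x, w, m)
    return x

# Two datasets over the same three items but with different numbers of item values.
rng = np.random.default_rng(0)
x1 = rng.random((4, 5, 6))
x2 = rng.random((5, 4, 7))
w1, w2 = fit_weights(x1, x2, rank=3)
y1, y2 = transform(x1, w1), transform(x2, w2)
print(y1.shape, y2.shape)
```

In this sketch both transformed arrays share one shape, so their entries can be compared position by position even though the inputs had different sizes. `rank` must not exceed the smallest number of item values in any item of either dataset.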

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable storage medium storing therein a data transformation program that causes a computer to perform a procedure comprising:
obtaining a first dataset having three or more items and a second dataset having the three or more items in a memory, the first dataset being a collection of first records each including a numerical value that indicates a relationship among three or more first item values belonging to the three or more items, respectively, the second dataset being a collection of second records each including a numerical value that indicates a relationship among three or more second item values belonging to the three or more items, respectively;
selecting one of the three or more items so as to divide the first item values into selected first item values belonging to a selected item which is selected and non-selected first item values belonging to two or more non-selected items which are not selected, as well as the second item values into selected second item values belonging to the selected item and non-selected second item values belonging to the two or more non-selected items;
calculating similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset;
generating, based on the calculated similarity, a first weight dataset that includes first weight values to be multiplied by the selected first item values to calculate a subset of transformed item values that belongs to the selected item, as well as a second weight dataset that includes second weight values to be multiplied by the selected second item values to calculate the subset of transformed item values that belongs to the selected item, the first weight dataset being a first matrix that satisfies orthonormality conditions, the first matrix being formed from the first weight values, the second weight dataset being a second matrix that satisfies orthonormality conditions, the second matrix being formed from the second weight values;
repeating the calculating of similarity and the generating of the first weight dataset and the second weight dataset, while changing the selected item;
transforming the first dataset having the three or more items into a first similarity-determining dataset having the three or more items, based on the first weight datasets generated for the three or more items as a result of the repeating, the first similarity-determining dataset being a collection of third records each including a numerical value that indicates a relationship among three or more of the transformed item values belonging to the three or more items, respectively;
transforming the second dataset having the three or more items into a second similarity-determining dataset having the three or more items, based on the second weight datasets generated for the three or more items as a result of the repeating, the second similarity-determining dataset being a collection of fourth records each including a numerical value that indicates a relationship among three or more of the transformed item values belonging to the three or more items, respectively; and
storing the first similarity-determining dataset and the second similarity-determining dataset in the memory.

2. The non-transitory computer-readable storage medium according to claim 1, wherein the generating of a first weight dataset and a second weight dataset includes:
generating, with respect to each of the three or more items, initial first and second weight datasets formed from initial weight values, the initial first weight datasets including non-selected initial first weight datasets generated with respect to the two or more non-selected items, the initial second weight datasets including non-selected initial second weight datasets generated with respect to the two or more non-selected items; and
calculating, based on the non-selected initial first weight datasets and non-selected initial second weight datasets, similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset.

3. The non-transitory computer-readable storage medium according to claim 1, wherein the generating of a first weight dataset and a second weight dataset includes:
repeating a process of selecting the three or more items individually and generating new first and second weight datasets for the selected items until a specific end condition is met.

4. The non-transitory computer-readable storage medium according to claim 1, wherein the procedure further includes:
calculating similarity between numerical values included in the third records of the first similarity-determining dataset and numerical values included in the fourth records of the second similarity-determining dataset.

5. A data transformation method comprising:
obtaining a first dataset having three or more items and a second dataset having the three or more items in a memory, the first dataset being a collection of first records each including a numerical value that indicates a relationship among three or more first item values belonging to the three or more items, respectively, the second dataset being a collection of second records each including a numerical value that indicates a relationship among three or more second item values belonging to the three or more items, respectively;
selecting, by a processor, one of the three or more items so as to divide the first item values into selected first item values belonging to a selected item which is selected and non-selected first item values belonging to two or more non-selected items which are not selected, as well as the second item values into selected second item values belonging to the selected item and non-selected second item values belonging to the two or more non-selected items;
calculating, by the processor, similarity between relationships of the selected first item values with the non-selected first item values in the first dataset and relationships of the selected second item values with the non-selected second item values in the second dataset;
generating, by the processor, based on the calculated similarity, a first weight dataset that includes first weight values to be multiplied by the selected first item values to calculate a subset of transformed item values that belongs to the selected item, as well as a second weight dataset that includes second weight values to be multiplied by the selected second item values to calculate the subset of transformed item values that belongs to the selected item, the first weight dataset being a first matrix that satisfies orthonormality conditions, the first matrix being formed from the first weight values, the second weight dataset being a second matrix that satisfies orthonormality conditions, the second matrix being formed from the second weight values;
repeating, by the processor, the calculating of similarity and the generating of the first weight dataset and the second weight dataset, while changing the selected item;
transforming, by the processor, the first dataset having the three or more items into a first similarity-determining dataset having the three or more items, based on the first weight datasets generated for the three or more items as a result of the repeating, the first similarity-determining dataset being a collection of third records each including a numerical value that indicates a relationship among three or more of the transformed item values belonging to the three or more items, respectively;
transforming, by the processor, the second dataset having
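Two points in the claims lend themselves to a small worked example: the orthonormality condition on each weight matrix, and the similarity calculation of claim 4 on the transformed datasets. The sketch below is an assumption-laden illustration, not the claimed method. It builds a second dataset by relabeling the item values of the first (a permutation along each item), uses permutation matrices as hand-picked orthonormal weights, and measures similarity as a Euclidean distance. The names `transform3`, `distance`, `relabeled_copy`, and `undo_weights` are hypothetical.

```python
import numpy as np

def transform3(x, w0, w1, w2):
    """Apply one weight matrix per item to a three-item dataset."""
    return np.einsum("ijk,ia,jb,kc->abc", x, w0, w1, w2)

def distance(t1, t2):
    """Euclidean distance between two equally shaped arrays (0 means identical)."""
    return float(np.linalg.norm(t1 - t2))

def relabeled_copy(x, perms):
    """Same relationships, but item values listed in a different order."""
    return x[np.ix_(*perms)]

def undo_weights(perms):
    """Permutation matrices that map relabeled item values back (assumed known here)."""
    return [np.eye(len(p))[p] for p in perms]

rng = np.random.default_rng(0)
x1 = rng.random((3, 4, 5))
perms = [rng.permutation(n) for n in x1.shape]
x2 = relabeled_copy(x1, perms)

ident = [np.eye(n) for n in x1.shape]  # weights for the first dataset
w2 = undo_weights(perms)               # weights for the second dataset

# Orthonormality condition: W^T W equals the identity for every weight matrix.
ok = all(np.allclose(w.T @ w, np.eye(w.shape[1])) for w in ident + w2)

y1 = transform3(x1, *ident)
y2 = transform3(x2, *w2)
print(ok, distance(x1, x2), distance(y1, y2))
```

Comparing `x1` and `x2` entry by entry generally reports a difference even though they encode the same relationships, while the transformed arrays coincide. In this toy case the weights are known in advance; the claims instead derive them from calculated similarity.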

Assignees

Fujitsu Ltd

Classifications

  • G06F17/10 · Primary

    Complex mathematical operations {(function generation by table look-up G06F1/03; evaluation of elementary functions by calculation G06F7/544)} · CPC title

  • ASIC · CPC title

  • G06F15/82 · Primary

    data or demand driven · CPC title

  • Vector processors · CPC title

  • Multidimensional correlation or convolution · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10769100B2 cover?
A data transformation apparatus selects items one by one and generates a first weight dataset and a second weight dataset on the basis of similarity between first records in a first dataset and second records in a second dataset. The first records and second records respectively include first item values and second item values that belong to the selected item. Based on the first weight dataset…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06F17/10. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 8, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).