System and method of generating reusable distance measures for data processing

US9058345B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9058345-B2
Application number: US-64095809-A
Country: US
Kind code: B2
Filing date: Dec 17, 2009
Priority date: Dec 17, 2009
Publication date: Jun 16, 2015
Grant date: Jun 16, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment the present invention includes a computer-implemented method of analyzing data. The method includes storing, by a computer system, a column definition that includes metadata that defines a column. The method further includes generating, by the computer system, a distance measure for the column. The method further includes storing, by the computer system, the distance measure for the column as part of the metadata for the column in the column definition. In this manner, improvements may result in the areas of reuse, delegation, usability, and precalculation.
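The idea in the abstract, keeping the distance measure inside the column's own metadata so that any later comparison can reuse it, can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `ColumnDefinition`, `store_distance_measure`, `compare`, the `"distance_measure"` metadata key, and `absolute_difference` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical type alias: a distance measure maps two values to a number.
DistanceFn = Callable[[Any, Any], float]


@dataclass
class ColumnDefinition:
    """Assumed shape of a column definition: a name plus free-form metadata."""
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def store_distance_measure(self, fn: DistanceFn) -> None:
        # The measure is saved as part of the column's metadata, so it
        # travels with the column definition and can be reused later.
        self.metadata["distance_measure"] = fn

    def compare(self, values: List[Any]) -> List[float]:
        # On a comparison request, look the measure up in the metadata
        # and apply it to each pair of neighbouring values.
        fn = self.metadata["distance_measure"]
        return [fn(a, b) for a, b in zip(values, values[1:])]


def absolute_difference(a: float, b: float) -> float:
    """One possible measure for a numeric column (an assumption)."""
    return abs(a - b)


revenue = ColumnDefinition(name="revenue")
revenue.store_distance_measure(absolute_difference)
results = revenue.compare([10.0, 12.5, 9.0])
print(results)
```

Because the caller of `compare` never names a distance function, a different measure can be stored later without changing the code that requests comparisons.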

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of analyzing data, comprising: storing, by a computer system, a column definition that includes metadata that defines a column; generating, by the computer system, a user interface that receives selection of a distance measure function for the column; storing, by the computer system, the distance measure function for the column as part of the metadata for the column in the column definition, wherein the distance measure function stored in the metadata for the column corresponds to rules for comparing a plurality of data values of the column, wherein the distance measure function stored in the metadata for the column corresponds to a function that compares the plurality of data values of the column; receiving, by the computer system, a request to compare the plurality of data values of the column; accessing, by the computer system after receiving the request, the distance measure function stored in the metadata for the column; generating, by the computer system after receiving the request, a plurality of distance measure results by applying the distance measure function stored in the metadata for the column to the plurality of data values; and displaying, by the computer system, a comparison between the plurality of distance measure results, wherein storing the distance measure function includes: receiving a first type for the distance measure function for the column, generating a first set of distance measure results that corresponds to the first type for the distance measure function, saving the distance measure function having the first type in the metadata for the column, receiving a second type for the distance measure function for the column, wherein the first type and the second type are two of a derived type, a distance along a hierarchy type, an axis distortion and automatic binning type, a categorical type, a routine type, and a measurement-specific adoption rule type, generating a second set of distance measure results that corresponds to the second type for the distance measure function, and saving the distance measure function having the second type in the metadata for the column.

2. The computer-implemented method of claim 1, wherein the distance measure function is a norm function stored in the metadata for the column, further comprising: calculating, by the computer system, a norm for the column according to the norm function.

3. The computer-implemented method of claim 1, wherein the distance measure function is a metric function stored in the metadata for the column, further comprising: calculating, by the computer system, a metric for the column according to the metric function.

4. The computer-implemented method of claim 1, wherein the column includes a plurality of component columns that include a plurality of distance measure functions, respectively, and wherein the distance measure function is derived from the plurality of distance measure functions of the plurality of component columns, respectively.

5. The computer-implemented method of claim 1, wherein the distance measure function stored in the metadata for the column is generated according to a distance along a hierarchy.

6. The computer-implemented method of claim 1, wherein the distance measure function stored in the metadata for the column is generated according to axis distortion and automatic binning.

7. The computer-implemented method of claim 1, wherein the distance measure function stored in the metadata for the column is generated according to a categorical metric.

8. The computer-implemented method of claim 1, further comprising: storing representations of the plurality of data values of the column, wherein the representations are suited for comparison of the values.

9. The computer-implemented method of claim 1, further comprising: reading, by the computer system, all the plurality of data values from a database prior to generating the plurality of distance measure results.

10. The computer-implemented method of claim 1, wherein the distance measure function stored in the metadata for the column includes external hierarchy information stored in the column definition.

11. The computer-implemented method of claim 1, wherein the distance measure function stored in the metadata for the column includes aggregated column information stored in the column definition.

12. A computer program, embodied on a tangible non-transitory recording medium, for controlling a computer system to analyze data, the computer program comprising: a database management program that is configured to control the computer system to store, in the computer system, a column definition that includes metadata that defines a column; a distance measurement generator program that is configured to control the computer system to generate a user interface that receives selection of a distance measure function for the column; and a comparison program, wherein the database management program is further configured to control the computer system to store the distance measure function for the column as part of the metadata for the column in the column definition, wherein the distance measure function stored in the metadata for the column corresponds to rules for comparing a plurality of data values of the column, and wherein the distance measure function stored in the metadata for the column corresponds to a function that compares the plurality of data values of the column, wherein the comparison program is configured to control the computer system to receive a request to compare the plurality of data values of the column, wherein the database management program is further configured to control the computer system to access, after receiving the request, the distance measure function stored in the metadata for the column, wherein the comparison program is further configured to control the computer system to generate, after receiving the request, a plurality of distance measure results by applying the distance measure function stored in the metadata for the column to the plurality of data values, and wherein the comparison program is further configured to control the computer system to display a comparison between the plurality of distance measure results, wherein storing the distance measure function includes: the database management program being further configured to control the computer system to receive a first type for the distance measure function for the column, the database management program being further configured to control the computer system to generate a first set of distance measure results that corresponds to the first type for the distance measure function, the database management program being further configured to control the computer system to save the distance measure function having the first type in the metadata for the column, the database management program being further configured to control the computer system to receive a second type for the distance measure function for the column, wherein the first type and the second type are two of a derived type, a distance along a hierarchy type, an axis distortion and automatic binning type, a categorical type, a routine type, and a measurement-specific adoption rule type, the database management program being further configured to control the computer system to generate a second set of distance measure results that corresponds to the second type for the distance measure function, and the database management program being further configured to control the computer system to save the distance measure function having the second type in the metadata for the column.
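Three of the distance measure types named in the claims (distance along a hierarchy, categorical, and derived from component columns) might look like the sketch below. The claims do not define the formulas, so everything here is an assumption made for illustration: the example `PARENT` hierarchy, counting edges through the nearest common ancestor, the 0/1 categorical rule, and the Euclidean combination used for the derived measure.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

# Hypothetical child -> parent map standing in for hierarchy information.
PARENT: Dict[str, Optional[str]] = {
    "World": None,
    "Europe": "World",
    "Asia": "World",
    "Germany": "Europe",
    "France": "Europe",
    "Japan": "Asia",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the node followed by its ancestors up to the root."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def hierarchy_distance(a: str, b: str) -> float:
    """Edges between a and b via their nearest common ancestor (assumed rule)."""
    path_b = path_to_root(b, PARENT)
    steps_in_b = {node: i for i, node in enumerate(path_b)}
    for steps_in_a, node in enumerate(path_to_root(a, PARENT)):
        if node in steps_in_b:
            return float(steps_in_a + steps_in_b[node])
    raise ValueError("values share no common ancestor")


def categorical_distance(a: Any, b: Any) -> float:
    """Assumed categorical rule: 0 for equal values, 1 otherwise."""
    return 0.0 if a == b else 1.0


def derived_distance(
    component_fns: Sequence[Callable[[Any, Any], float]],
) -> Callable[[Sequence[Any], Sequence[Any]], float]:
    """Build a measure for a composite column from its components' measures.

    The Euclidean combination is an assumption; other combinations would
    fit the claim language equally well.
    """
    def distance(row_a: Sequence[Any], row_b: Sequence[Any]) -> float:
        squares = (fn(x, y) ** 2 for fn, x, y in zip(component_fns, row_a, row_b))
        return sum(squares) ** 0.5
    return distance


region_and_colour = derived_distance([hierarchy_distance, categorical_distance])
print(hierarchy_distance("Germany", "Japan"))
print(region_and_colour(("Germany", "red"), ("France", "red")))
```

Each function has the same two-argument shape, so any of them could be saved in a column's metadata and later swapped for another type, which mirrors the first-type and second-type steps in claim 1.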

Assignees

Inventors

Classifications

  • G06F16/217 · Primary

    Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • Data format conversion from or to a database · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9058345B2 cover?
In one embodiment the present invention includes a computer-implemented method of analyzing data. The method includes storing, by a computer system, a column definition that includes metadata that defines a column. The method further includes generating, by the computer system, a distance measure for the column. The method further includes storing, by the computer system, the distance measure f…
Who is the assignee on this patent?
Rinneberg Thomas, SAP SE
What technology area does this patent fall under?
Primary CPC classification G06F16/217. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 16, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).