In-database connectivity components analysis of data

US9792337B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9792337-B2
Application numberUS-201615253745-A
CountryUS
Kind codeB2
Filing dateAug 31, 2016
Priority dateMar 14, 2013
Publication dateOct 17, 2017
Grant dateOct 17, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method determines the connectivity components defined by a set of relations over a set of data elements. For each first data element of a selected subset of data elements, a second data element that is linked to the first data element by a path of relations is selected as its representative, using a randomization process. A new set of relations is created by replacing each first data element of the subset by its representative in at least part of the set of relations.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for determining connectivity components defined by an input set of relations over an input set of data elements, wherein a connectivity component is a subset of the input set data elements that are pair-wise connected such that no other data element of the input set can be added to the subset that is connected to any of the data elements of the subset, and wherein two data elements are connected whenever the two data elements are related by a relation in the set of relations or a path of relations between data elements exists that connects the two data elements, the method comprising: performing the following actions of selecting and forming iteratively beginning with a set of data elements that is the input set of data elements and a set of relations that is the input set of relations, and at each subsequent iteration using the contracted set of data elements as the set of data elements and the contracted set of relations as the set of relations, until the contracted set of relations is empty: selecting a respective representative for each data element, each representative being a data element in the set of data elements, to form a contracted set of data elements, the selecting comprising: assigning a random number to each data element in the set of data elements; identifying for each data element a respective group of data elements that includes the data element and all other data elements that are related to the data element; selecting as the respective representative a particular data element in the respective group that has an assigned random number with a predetermined random number position among the data elements in the respective group; and replacing each data element with the representative of the data element to form a contracted set of data elements; forming a contracted set of relations over the contracted set of data elements, the forming comprising: defining two representatives in the contracted set of data elements as related in the contracted set of relations if the two representatives represent data elements that were connected in the set of relations; and defining all other pairs of data elements in the contracted set of data elements as not related in the contracted set of relations; whereby, when the contracted set of relations is empty, each data element in the contracted set of data elements is an isolated data element, an isolated data element being a data element that is not related to any other data element in the contracted set of data elements; and outputting each isolated data element as a respective representative of a respective connectivity component of the input set of data elements, each connectivity component of the input set of data elements being representing by a distinct isolated data element. 2. The method of claim 1 , wherein the random number is a pseudorandom number between zero and one. 3. The method of claim 1 , wherein the predetermined random number position is the highest random number. 4. The method of claim 1 , wherein each data element has a respective weight and a respective leader, the method further comprising: before performing the actions of selecting and forming: assigning a common weight to each data element in the input set of data elements; and nominating each data element in the input set of data elements as its own leader. 5. The method of claim 4 , wherein selecting a respective representative comprises: assigning as the weight of each respective representative a sum of the weights of the data elements represented by the respective representative; and for each first data element in the set of data elements, identifying the data elements nominating the first data element and replacing the nominations of the first data element with nominations nominating the representative of the first data element. 6. The method of claim 5 , comprising: outputting the weight of each representative as a size of the connectivity component represented by the representative. 7. The method of claim 6 , wherein the common weight is one. 8. The method of claim 6 , comprising: outputting the leader of each data element of the input set of data elements as an identifier of the connectivity component of which the data element is a member. 9. The method of claim 1 , comprising: after each iteration of selecting and forming, outputting each isolated data element in the contracted set of data elements and removing the isolated data element from the set of contracted data elements. 10. Non-transitory computer-readable storage media encoded with computer program instructions for determining connectivity components defined by an input set of relations over an input set of data elements, wherein a connectivity component is a subset of the input set data elements that are pair-wise connected such that no other data element of the input set can be added to the subset that is connected to any of the data elements of the subset, and wherein two data elements are connected whenever the two data elements are related by a relation in the set of relations or a path of relations between data elements exists that connects the two data elements, the instructions when executed by one or more computers causing the one or more computers to perform operations comprising: performing the following actions of selecting and forming iteratively beginning with a set of data elements that is the input set of data elements and a set of relations that is the input set of relations, and at each subsequent iteration using the contracted set of data elements as the set of data elements and the contracted set of relations as the set of relations, until the contracted set of relations is empty: selecting a respective representative for each data element, each representative being a data element in the set of data elements, to form a contracted set of data elements, the selecting comprising: assigning a random number to each data element in the set of data elements; identifying for each data element a respective group of data elements that includes the data element and all other data elements that are related to the data element; selecting as the respective representative a particular data element in the respective group that has an assigned random number with a predetermined random number position among the data elements in the respective group; and replacing each data element with the representative of the data element to form a contracted set of data elements; forming a contracted set of relations over the contracted set of data elements, the forming comprising: defining two representatives in the contracted set of data elements as related in the contracted set of relations if the two representatives represent data elements that were connected in the set of relations; and defining all other pairs of data elements in the contracted set of data elements as not related in the contracted set of relations; whereby, when the contracted set of relations is empty, each data element in the contracted set of data elements is an isolated data element, an isolated data element being a data element that is not related to any other data element in the contracted set of data elements; and outputting each isolated data element as a respective representative of a respective connectivity component of the input set of data elements, each connectivity component of the input set of data elements being representing by a distinct isolated data element. 11. The media of claim 10 , wherein each data element has a respective weight and a respective leader, the operations further comprising: before performing the actions of selec

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9792337B2 cover?
A method determines the connectivity components defined by a set of relations over a set of data elements. For each first data element of a selected subset of data elements, a second data element that is linked to the first data element by a path of relations is selected as its representative, using a randomization process. A new set of relations is created by replacing each first data element …
Who is the assignee on this patent?
Pivotal Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30569. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 17 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).