Random number generator in a parallel processing database

US10922053B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10922053-B2
Application number  US-201916667706-A
Country             US
Kind code           B2
Filing date         Oct 29, 2019
Priority date       Sep 29, 2012
Publication date    Feb 16, 2021
Grant date          Feb 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A random number generation process generates uncorrelated random numbers from identical random number sequences on parallel processing database segments of an MPP database without communications between the segments by establishing a different starting position in the sequence on each segment using an identifier that is unique to each segment, query slice information and the number of segments. A master node dispatches a seed value to initialize the random number sequence generation on all segments, and dispatches the query slice information and information as to the number of segments during a normal query plan dispatch process.
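
The scheme the abstract describes can be illustrated with a minimal Python sketch. It assumes Python's `random.Random` as a stand-in for the database's generator, and the names `master_sequence` and `segment_values` are hypothetical. The starting position is taken to be the segment's index and the stride the number of segments, with query slices left out for simplicity. It is an illustration under those assumptions, not the patented implementation.

```python
import random
from itertools import islice


def master_sequence(seed):
    """Yield the shared master sequence.

    Every segment can recreate the same sequence locally from the same seed,
    so no communication between segments is needed.
    """
    rng = random.Random(seed)
    while True:
        yield rng.random()


def segment_values(seed, segment_index, num_segments, num_rows):
    """Return the values one segment would assign to its rows.

    Assumption: the segment starts at position `segment_index` in the master
    sequence and then takes every `num_segments`-th value.
    """
    strided = islice(master_sequence(seed), segment_index, None, num_segments)
    return list(islice(strided, num_rows))


if __name__ == "__main__":
    seed, num_segments, rows_per_segment = 12345, 4, 5
    per_segment = [
        segment_values(seed, s, num_segments, rows_per_segment)
        for s in range(num_segments)
    ]
    total = num_segments * rows_per_segment
    master = list(islice(master_sequence(seed), total))
    # Interleaving the per-segment outputs should rebuild the master sequence.
    interleaved = [
        per_segment[i % num_segments][i // num_segments] for i in range(total)
    ]
    print(interleaved == master)
```

With four segments, segment 0 takes positions 0, 4, 8 and so on, segment 1 takes positions 1, 5, 9 and so on, which is why no position in the master sequence is used by two segments. The sketch draws and discards the skipped values; whether a real system does that or jumps ahead directly is not stated on this page.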

First claim

Opening claim text (preview).

What is claimed is:
1. A system comprising: a massively parallel processing database system comprising a master node and a plurality of segment nodes, the master node and each of the plurality of segment nodes comprising one or more respective computers; and one or more non-transitory computer storage media encoded with computer program instructions that when executed by computers of the massively parallel processing database system cause the computers to perform operations comprising: receiving, by the master node, a query that specifies a random number operator that, when executed on a database relation, specifies the generation of a respective random number for each row of the database relation; computing by each segment node of the plurality of segment nodes, a respective step value that specifies how many values in a master sequence of random numbers to skip when selecting random numbers for the random number operator; computing, by each segment node of the plurality of segment nodes, a respective offset that specifies a starting position in the master sequence of random numbers; and generating, by each segment node of the plurality of segment nodes for one or more rows of a respective partition of the database relation assigned to the segment node, respective random numbers from the master sequence of random numbers starting from the respective offset computed by the segment node and repeatedly skipping a number of random numbers in the master sequence specified by the step value computed by the segment node.
2. The system of claim 1, wherein computing the respective step value comprises designating the number of the plurality of segments nodes as the step value.
3. The system of claim 1, wherein computing the respective offset comprises determining a position of the respective segment node in an ordered list of the plurality of segment nodes and designating the position as the respective offset.
4. The system of claim 1, further comprising: providing, by the master node to each segment node of the plurality of segment nodes, a starting seed value for a random number generation procedure that is configured to generate a same master sequence of random numbers when started from a same starting seed value, wherein the starting seed value is generated using an identifier associated with the query, and wherein different queries are associated with different identifiers.
5. The system of claim 1, further comprising: providing, by the master node to each segment node of the plurality of segment nodes, a starting seed value for a random number generation procedure that is configured to generate a same master sequence of random numbers when started from a same starting seed value, wherein the starting seed value is generated using an identifier associated with the query, and wherein same queries are associated with same identifiers.
6. The system of claim 1, wherein the query is divided by the master node into two or more query slices, wherein each query slice computes a portion of the output of the random number operator.
7. The system of claim 6, wherein computing the respective step value comprises multiplying the number of the query slices with the number of the segment nodes.
8. A method comprising: maintaining a massively parallel processing database system comprising a master node and a plurality of segment nodes, the master node and each of the plurality of segment nodes comprising one or more respective computers; receiving, by the master node, a query that specifies a random number operator that, when executed on a database relation, specifies the generation of a respective random number for each row of the database relation; computing by each segment node of the plurality of segment nodes, a respective step value that specifies how many values in a master sequence of random numbers to skip when selecting random numbers for the random number operator; computing, by each segment node of the plurality of segment nodes, a respective offset that specifies a starting position in the master sequence of random numbers; and generating, by each segment node of the plurality of segment nodes for one or more rows of a respective partition of the database relation assigned to the segment node, respective random numbers from the master sequence of random numbers starting from the respective offset computed by the segment node and repeatedly skipping a number of random numbers in the master sequence specified by the step value computed by the segment node.
9. The method of claim 8, wherein computing the respective step value comprises designating the number of the plurality of segments nodes as the step value.
10. The method of claim 8, wherein computing the respective offset comprises determining a position of the respective segment node in an ordered list of the plurality of segment nodes and designating the position as the respective offset.
11. The method of claim 8, further comprising: providing, by the master node to each segment node of the plurality of segment nodes, a starting seed value for a random number generation procedure that is configured to generate a same master sequence of random numbers when started from a same starting seed value, wherein the starting seed value is generated using an identifier associated with the query, and wherein different queries are associated with different identifiers.
12. The method of claim 8, further comprising: providing, by the master node to each segment node of the plurality of segment nodes, a starting seed value for a random number generation procedure that is configured to generate a same master sequence of random numbers when started from a same starting seed value, wherein the starting seed value is generated using an identifier associated with the query, and wherein same queries are associated with same identifiers.
13. The method of claim 8, wherein the query is divided by the master node into two or more query slices, wherein each query slice computes a portion of the output of the random number operator.
14. The method of claim 13, wherein computing the respective step value comprises multiplying the number of the query slices with the number of the segment nodes.
15. A non-transitory computer storage medium encoded with a computer program, the computer program storing instructions that when executed by one or more computers causes the one or more computers to perform operations comprising: maintaining a massively parallel processing database system comprising a master node and a plurality of segment nodes, the master node and each of the plurality of segment nodes comprising one or more respective computers; receiving, by the master node, a query that specifies a random number operator that, when executed on a database relation, specifies the generation of a respective random number for each row of the database relation; computing by each segment node of the plurality of segment nodes, a respective step value that specifies how many values in a master sequence of random numbers to skip when selecting random numbers for the random number operator; computing, by each segment node of the plurality of segment nodes, a respective offset that specifies a starting position in the master sequence of random numbers; and generating, by each segment node of the plurality of segment nodes for one or more rows of a respective partition of the database relation assigned to the segment node, respective random numbers from the master sequence of random numbers starting from the respective offset computed by the segment node and repeatedly skipping a num
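
The dependent claims on step value, offset, seed, and query slices can be sketched in Python as follows. The step value follows the claim text (number of query slices multiplied by number of segment nodes). Everything else is an assumption made for illustration: the helper names are hypothetical, the seed is derived by hashing the query identifier with SHA-256, and the starting offset combines slice and segment indexes in slice-major order, which the claim preview on this page does not specify.

```python
import hashlib
import random
from itertools import islice


def seed_from_query_id(query_id):
    """Hypothetical seed derivation: the same identifier gives the same seed."""
    digest = hashlib.sha256(query_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def step_value(num_slices, num_segments):
    """Step value as in the claim text: slices multiplied by segments."""
    return num_slices * num_segments


def start_offset(slice_index, segment_index, num_segments):
    """Assumed slice-major layout; not taken from the claim text."""
    return slice_index * num_segments + segment_index


def positions(slice_index, segment_index, num_slices, num_segments, count):
    """Master-sequence positions used by one (slice, segment) pair."""
    step = step_value(num_slices, num_segments)
    start = start_offset(slice_index, segment_index, num_segments)
    return [start + k * step for k in range(count)]


def values_for(query_id, slice_index, segment_index,
               num_slices, num_segments, count):
    """Random values one (slice, segment) pair would produce for its rows."""
    rng = random.Random(seed_from_query_id(query_id))

    def master():
        # The master sequence is regenerated locally from the shared seed.
        while True:
            yield rng.random()

    step = step_value(num_slices, num_segments)
    start = start_offset(slice_index, segment_index, num_segments)
    return list(islice(islice(master(), start, None, step), count))


if __name__ == "__main__":
    used = []
    for sl in range(2):
        for sg in range(3):
            used += positions(sl, sg, num_slices=2, num_segments=3, count=4)
    # Each position should be claimed by exactly one (slice, segment) pair.
    print(sorted(used) == list(range(24)))
```

Under these assumptions, two slices on three segments give a step of 6 and starting offsets 0 through 5, so the six workers partition the master sequence without overlap. Hashing the identifier means a repeated identifier reproduces the same values, while a different identifier starts a different master sequence.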

Assignees

Pivotal Software Inc

Inventors

Classifications

  • Random or pseudo-random number generators · CPC title

  • Distributed queries · CPC title

  • G06F7/582 · Primary

    Pseudo-random number generators · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10922053B2 cover?
A random number generation process generates uncorrelated random numbers from identical random number sequences on parallel processing database segments of an MPP database without communications between the segments by establishing a different starting position in the sequence on each segment using an identifier that is unique to each segment, query slice information and the number of segments.…
Who is the assignee on this patent?
Pivotal Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F7/582. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 16, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).