Parallelized segment generation via key-based subdivision in database systems

US2022043690A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022043690-A1
Application number	US-202016985957-A
Country	US
Kind code	A1
Filing date	Aug 5, 2020
Priority date	Aug 5, 2020
Publication date	Feb 10, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for execution by a record processing and storage system includes assigning each of a plurality of key space sub-intervals of a cluster key domain to a corresponding one of a plurality of processing core resources, and generating a plurality of segments from the set of records via the plurality of processing core resources. Each processing core resource in the plurality of processing core resources generates a subset of the plurality of segments by identifying a proper subset of the set of records based on having cluster key values included in a corresponding one of the plurality of key space sub-intervals, and by generating the subset of the plurality of segments to include the proper subset of the set of records.
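The subdivision described in the abstract can be illustrated with a minimal sketch. Everything named here is a hypothetical illustration rather than the patent's implementation: the helper names (`split_key_domain`, `build_segments`, `generate_segments`), the `(cluster_key, payload)` record tuples, the fixed `segment_size`, and the use of Python threads as stand-ins for processing core resources.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def split_key_domain(keys, num_intervals):
    """Split the cluster key domain into contiguous half-open sub-intervals.

    Returns (low, high) pairs meaning low <= key < high; the last pair has
    high=None (unbounded above). Each interval aims for an equal share of
    records, so intervals may cover different numbers of key values.
    """
    ordered = sorted(keys)
    if not ordered:
        return []
    target = math.ceil(len(ordered) / num_intervals)  # records per interval
    # The first key of each chunk becomes the lower bound of an interval.
    starts = sorted({ordered[i] for i in range(0, len(ordered), target)})
    return [(low, starts[i + 1] if i + 1 < len(starts) else None)
            for i, low in enumerate(starts)]


def build_segments(records, interval, segment_size):
    """Work done by one worker: keep only records in its sub-interval,
    order them by cluster key, and cut them into fixed-size segments."""
    low, high = interval
    mine = sorted(
        (r for r in records if r[0] >= low and (high is None or r[0] < high)),
        key=lambda r: r[0],
    )
    return [mine[i:i + segment_size] for i in range(0, len(mine), segment_size)]


def generate_segments(records, num_workers, segment_size=2):
    """Assign one sub-interval per worker and build segments in parallel."""
    intervals = split_key_domain([r[0] for r in records], num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_worker = list(
            pool.map(lambda iv: build_segments(records, iv, segment_size),
                     intervals)
        )
    return intervals, per_worker
```

Because the sub-intervals are half-open and contiguous, every record falls into exactly one worker's subset, so the workers need no coordination while they build their segments.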

First claim

Opening claim text (preview).

What is claimed is:

1. A method for execution by a record processing and storage system, comprising: assigning each of a plurality of key space sub-intervals of a cluster key domain spanned by a plurality of cluster key values of a set of records to a corresponding one of a plurality of processing core resources; and generating a plurality of segments from the set of records via the plurality of processing core resources, wherein each processing core resource in the plurality of processing core resources generates a subset of the plurality of segments by: identifying, via each processing core resource, a proper subset of the set of records based on having cluster key values included in a corresponding one of the plurality of key space sub-intervals; and generating, via the each processing core resource, the subset of the plurality of segments to include the proper subset of the set of records.

2. The method of claim 1, further comprising segregating the cluster key domain into the plurality of key space sub-intervals.

3. The method of claim 2, further comprising: determining a selected number of key space sub-intervals to be generated based on a number of processing core resources in the plurality of processing core resources; wherein the cluster key domain is segregated into the selected number of key space sub-intervals.

4. The method of claim 2, further comprising: determining a target number of records to be included in each proper subset of the set of records based on: a total number of records in the set of records, and a selected number of key space sub-intervals to be generated; wherein the cluster key domain is segregated into the selected number of key space sub-intervals based on the target number of records.

5. The method of claim 1, wherein each the plurality of key space sub-intervals includes a corresponding one of a plurality of proper subsets of the plurality of cluster key values of the cluster key domain, wherein each of the plurality of proper subsets of the plurality of cluster key values are mutually exclusive and collectively exhaustive with respect to the plurality of cluster key values, and wherein each of the plurality of proper subsets of the plurality of cluster keys include sequential ones of the plurality of cluster key values in accordance with an ordering of the plurality of cluster key values.

6. The method of claim 5, wherein a first proper subset of the plurality of proper subsets includes a first number of cluster key values, and wherein a second proper subset of the plurality of proper subsets includes a second number of cluster key values that is different from the first number of cluster key values.

7. The method of claim 1, wherein generating the plurality of segments from the set of records via the plurality of processing core resources further comprises: accessing, via the each processing core resource, the proper subset of the set of records from storage in a row-based format; wherein the subset of the plurality of segments are generated to include the proper subset of the set of records in a column-based format.

8. The method of claim 7, wherein generating the plurality of segments from the set of records via the plurality of processing core resources further comprises: generating a plurality of record groups from the proper subset of the set of records based on cluster key values of the proper subset of the set of records; generating a set of column-formatted record data for each of the plurality of record groups; and generating a set of segments from each set of column-formatted record data.

9. The method of claim 8, wherein generating the set of segments from each set of column-formatted record data includes generating segment metadata for each set of segments.

10. The method of claim 8, wherein generating the set of segments from each set of column-formatted record data includes applying a redundancy storage error coding scheme to each set of column-formatted record data to generate a corresponding set of segments.

11. The method of claim 1, wherein the set of records are included in a plurality of pages stored by a page storage system, and wherein each page of the plurality of pages includes a plurality of records in the set of records.

12. The method of claim 11, further comprising: generating the plurality of pages; and determining to convert the plurality of pages into the plurality of records based on storage utilization data.

13. The method of claim 11, wherein identifying the proper subset of the set of records via the each processing core resource includes: accessing, via the each processing core resource, each of the plurality of pages; extracting, via the each processing core resource, ones of the plurality of records in the each of the plurality of pages having cluster key values included in the corresponding one of the plurality of key space sub-intervals.

14. The method of claim 13, wherein identifying the proper subset of the set of records via the each processing core resource further includes: populating a data structure with location data for the ones of the plurality of records in corresponding ones of the plurality of pages, wherein the data structure is organized based on an ordering of cluster key values of the ones of the plurality of records; extracting records from the plurality of pages in accordance with the ordering of cluster key values by utilizing the data structure.

15. The method of claim 14, wherein the data structure implements a min-heap organized by cluster key values.

16. The method of claim 11, wherein one plurality of records of one page of the plurality of pages includes: a first record having a first cluster key value included in a first one of the plurality of key space sub-intervals; and a second record having a second cluster key value included in a second one of the plurality of key space sub-intervals. wherein another plurality of records of another page of the plurality of pages includes: a third record having a third cluster key value included in the first one of the plurality of key space sub-intervals; and a fourth record having a fourth cluster key value included in the second one of the plurality of key space sub-intervals.

17. The method of claim 16, wherein generating the plurality of segments from the set of records via the plurality of processing core resources includes: accessing, via a first processing core resource, the one page and the another page; identifying, via the first processing core resource, a corresponding first proper subset of the set of records to include the first record and the third record, and to not include the second record and the fourth record, by identifying cluster key values included in the first one of the plurality of key space sub-intervals based on the first one of the plurality of key space sub-intervals being assigned to the first processing core resource; accessing, via a second processing core resource, the one page and the another page; and identifying, via the second processing core resource, a corresponding second proper subset of the set of records to include the second record and the fourth record, and to not include the first record and the third record, by identifying cluster key values included in the second one of the plurality of key space sub-intervals based on the second one of the plurality of key space sub-intervals being assigned to the second processing core resource.

18. The method of claim
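The page scan and min-heap extraction recited in claims 13 to 15 can be sketched as follows. This is an illustrative reading only: the function name `extract_in_key_order`, the representation of a page as a list of `(cluster_key, payload)` tuples, integer keys, and the half-open bounds `low <= key < high` are all assumptions made for the example.

```python
import heapq


def extract_in_key_order(pages, low, high):
    """Return the records with low <= key < high, in ascending key order.

    Every page is scanned, but only location data (key, page index, slot)
    for matching records is pushed onto a min-heap ordered by cluster key.
    Records are then read back from the pages in heap order.
    """
    heap = []
    for page_idx, page in enumerate(pages):
        for slot, (key, _payload) in enumerate(page):
            if low <= key < high:
                # Store the record's location rather than the record itself.
                heapq.heappush(heap, (key, page_idx, slot))

    ordered = []
    while heap:
        _key, page_idx, slot = heapq.heappop(heap)
        ordered.append(pages[page_idx][slot])
    return ordered
```

Calling the function once per sub-interval over the same pages mirrors claims 16 and 17: each worker reads every page but keeps only the records whose cluster keys fall in its own sub-interval.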

Assignees

Ocient Holdings LLC

Inventors

Classifications

G06F16/24532 · Primary
of parallel queries · CPC title
G06F16/24557
Efficient disk access during query execution · CPC title
G06F16/9027
Trees · CPC title
G06F16/906
Clustering; Classification · CPC title
G06F9/5066 · Primary
Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mapping at compile time, see G06F8/451) · CPC title

Patent family

Related publications grouped by family.

View patent family 80113820

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022043690A1 cover?: A method for execution by a record processing and storage system includes assigning each of a plurality of key space sub-intervals of a cluster key domain to a corresponding one of a plurality of processing core resources, and generating a plurality of segments from the set of records via the plurality of processing core resources. Each processing core resource in the plurality of processing co…
Who is the assignee on this patent?: Ocient Holdings LLC
What technology area does this patent fall under?: Primary CPC classification G06F16/24532. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 10, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Automated provisioning for database performance

Monitoring of storage units in a dispersed storage network

Database management system cluster node subtasking data query