Query execution in database systems based on disjunction probability

US12468706B2 · US · B2

Patent metadata
Publication number: US-12468706-B2
Application number: US-202418901356-A
Country: US
Kind code: B2
Filing date: Sep 30, 2024
Priority date: Oct 27, 2022
Publication date: Nov 11, 2025
Grant date: Nov 11, 2025

Abstract

Official abstract text for this publication.

A database system operates by: determining a query for execution against a dataset that indicates a filtering predicate denoting a disjunction between a first range-based predicate and a second range-based predicate; accessing distribution data for the dataset indicating a plurality of kernels for a plurality of points in a multi-dimensional space; identifying a first sub-region within the multi-dimensional space corresponding to the first range-based predicate; identifying a second sub-region within the multi-dimensional space corresponding to the second range-based predicate; computing a disjunction probability approximation value based on an average portion summation value across a plurality of portion summation values generated for the plurality of kernels; and executing the query based on the disjunction probability approximation value.
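The computation summarized in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes each kernel is a product of one-dimensional Gaussians with a shared bandwidth (the source does not fix the kernel shape here), that each range-based predicate maps to an axis-aligned box, and that each kernel has total mass 1. The function names (`interval_mass`, `box_mass`, `disjunction_probability`) are hypothetical. The cap at 1.0 follows the dependent claim that limits a portion summation value to the kernel size.

```python
import math

INF = float("inf")

def gaussian_cdf(x, mean, bandwidth):
    """CDF of a 1-D Gaussian kernel (assumed kernel shape) centered at `mean`."""
    return 0.5 * (1.0 + math.erf((x - mean) / (bandwidth * math.sqrt(2.0))))

def interval_mass(lo, hi, mean, bandwidth):
    """Fraction of a 1-D kernel that lies inside [lo, hi]."""
    return gaussian_cdf(hi, mean, bandwidth) - gaussian_cdf(lo, mean, bandwidth)

def box_mass(box, center, bandwidth):
    """Portion of one product kernel inside an axis-aligned sub-region.

    `box` holds one (lo, hi) pair per dimension; (-INF, INF) leaves a
    dimension unconstrained.
    """
    mass = 1.0
    for (lo, hi), c in zip(box, center):
        mass *= interval_mass(lo, hi, c, bandwidth)
    return mass

def disjunction_probability(points, bandwidth, region_a, region_b):
    """Average, over all kernels, of the capped sum of the two portions."""
    total = 0.0
    for p in points:
        portion_sum = box_mass(region_a, p, bandwidth) + box_mass(region_b, p, bandwidth)
        # Cap at the kernel's total mass (1.0 under the assumption above).
        total += min(portion_sum, 1.0)
    return total / len(points)

# Example: "col1 >= 0 OR col2 >= 0" over a single kernel centered at the origin.
region_a = [(0.0, INF), (-INF, INF)]   # first predicate constrains dimension 1
region_b = [(-INF, INF), (0.0, INF)]   # second predicate constrains dimension 2
estimate = disjunction_probability([(0.0, 0.0)], 1.0, region_a, region_b)
```

In this example each sub-region holds half of the kernel, so the capped sum is 1.0, whereas the exact probability under this one kernel would be 0.75. The gap shows why the result is described as an approximation: summing portions counts the overlap of the two sub-regions twice, and the cap only bounds that error per kernel.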

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: determining a query for execution against a dataset that indicates a filtering predicate denoting a disjunction between a first range-based predicate and a second range-based predicate; accessing distribution data for the dataset indicating a plurality of kernels in a multi-dimensional space; identifying a first sub-region within the multi-dimensional space corresponding to the first range-based predicate; identifying a second sub-region within the multi-dimensional space corresponding to the second range-based predicate; computing, for a kernel of the plurality of kernels: a first portion included within the first sub-region; a second portion included within the second sub-region; a portion summation value based on a summation of portions that includes the first portion and the second portion; computing a disjunction probability approximation value based on an average portion summation value across a plurality of portion summation values generated for the plurality of kernels; and executing the query based on the disjunction probability approximation value.

2. The method of claim 1, wherein the first sub-region and the second sub-region have a non-null intersection region, wherein a difference between the non-null intersection region and the first sub-region is non-null, and wherein a difference between the non-null intersection region and the second sub-region is non-null.

3. The method of claim 1, wherein the kernel of the plurality of kernels have a same size as at least one other kernel of the plurality of kernels, and wherein computing the portion summation value is further based on: setting the portion summation value as the summation of portions when the summation of portions is less than or equal to a value corresponding to the same size; and setting the portion summation value as the value corresponding to same size when the summation of portions is greater than the value corresponding to the same size.

4. The method of claim 3, wherein the portion summation value for at least one kernel of the plurality of kernels is greater than the value corresponding to the same size based on the at least one of the plurality of kernels intersecting an intersection region between the first sub-region and the second sub-region.

5. The method of claim 1, wherein the kernel of the plurality of kernels is defined by a kernel function centered at a corresponding point of a plurality of points in the multi-dimensional space, and wherein determining the portion summation value for the kernel of the plurality of kernels is based on: computing an integral of the kernel function centered at the corresponding point of the plurality of points, wherein the integral is computed over a corresponding region of integration of a plurality of regions of integration, wherein the plurality of regions of integration includes a first region of integration defined by the first sub-region and a second region of integration defined by the second sub-region, wherein the first portion of the kernel of the plurality of kernels included within the first sub-region is computed based on a first integral computed over the first region of integration, wherein the second portion of the kernel included within the first sub-region is computed based on a second integral computed over the first region of integration.

6. The method of claim 5, wherein the kernel of the plurality of kernels has a same total integral value as at least one of the other kernels of the plurality of kernels when an unbounded integration is applied, and wherein the portion summation value for the kernel is computed based on: computing a summation of the integral of the at least one other kernel and for the kernel; setting the portion summation value as the summation of the at least one other kernel and the kernel when the summation of the at least one other kernel and the kernel is less than or equal to same total integral value; and setting the portion summation value as the same total integral value when the summation of the at least one other kernel and the kernel is greater than the same total integral value, wherein computing the disjunction probability approximation value is further based on: dividing the portion summation value by the same total integral value.

7. The method of claim 1, wherein the first range-based predicate denotes a first numeric value range for a first column storing a first numeric value for a relational database row of a plurality of relational database rows of the dataset, and wherein the second range-based predicate denotes a second numeric value range for a second column storing a second numeric value for the relational database rows of the plurality of relational database rows of the dataset.

8. The method of claim 7, wherein the multi-dimensional space has a first dimension corresponding to first numeric values of the first column, and wherein the multi-dimensional space has a second dimension corresponding to second numeric values of the second column.

9. The method of claim 8, wherein the distribution data for the dataset indicates multivariate kernel-based probability distribution data, and wherein the multivariate kernel-based probability distribution data has a same dimensionality as the multi-dimensional space.

10. The method of claim 9, wherein the multivariate kernel-based probability distribution data corresponds to a kernel-based estimation of a corresponding multivariate probability density function.

11. The method of claim 9, wherein the same dimensionality is equal to two based on the multivariate kernel-based probability distribution data being generated for only the first column and the second column.

12. The method of claim 9, wherein a kernel function defines each of the kernel of the plurality of kernels as a paraboloid intersecting a two-dimensional plane defining the multi-dimensional space at a circular region surrounding the kernel of the plurality of kernels, wherein computing the first portion of the kernel of the plurality of kernels included within the first sub-region is based on a first volume under first portions of the paraboloid having corresponding portions of the circular region within the first sub-region; and wherein computing the second portion of the kernel of the plurality of kernels included within the second sub-region is based on a second volume under second portions of the paraboloid having corresponding portions of the circular region within the second sub-region.

13. The method of claim 9, wherein the same dimensionality is greater than two based on the distribution data for the dataset indicating multivariate kernel-based probability distribution data for the first column, the second column, and at least one additional column.

14. The method of claim 1, further comprising: receiving the dataset for storage; processing the dataset for storage based on: generating the distribution data for the dataset; and storing the dataset and the distribution data for the dataset in database system memory resources, wherein the distribution data is accessed via the database system memory resources, and wherein executing the query includes accessing at least a portion of the dataset via the database system memory resources.

15. The method of claim 14, wherein generating the distribution data for the dataset includes at least one of: sampling a subset of a plurality of rows of the dataset; determining the plurality of points based on values of the subset of a plurality of rows; or determining a k
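
The first two options listed in claim 15 (sampling a subset of rows, then deriving the kernel points from the sampled values) might look like the following sketch. Everything named here is an assumption made for illustration: rows are modeled as Python dictionaries, the column names and the helper `sample_kernel_points` are hypothetical, and a fixed seed is used only to make the sample repeatable. The third option in the claim is truncated in this preview and is not covered.

```python
import random

def sample_kernel_points(rows, first_col, second_col, sample_size, seed=0):
    """Pick a random subset of rows and turn each into a 2-D kernel center.

    rows        -- list of dict-like rows (assumed representation)
    first_col   -- name of the column for dimension 1 (hypothetical)
    second_col  -- name of the column for dimension 2 (hypothetical)
    sample_size -- number of rows to sample, capped at len(rows)
    """
    rng = random.Random(seed)
    k = min(sample_size, len(rows))
    sampled = rng.sample(rows, k)  # sampling without replacement
    # Each sampled row contributes one point in the two-column space.
    return [(float(r[first_col]), float(r[second_col])) for r in sampled]

# Usage with made-up data: 1,000 rows, 50 kernel points.
rows = [{"price": i * 1.5, "qty": i % 7} for i in range(1000)]
points = sample_kernel_points(rows, "price", "qty", 50)
```

Sampling keeps the stored distribution data small relative to the dataset, at the cost of a coarser density estimate; the claim text itself does not state a sample size.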

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12468706B2 cover?
A database system operates by: determining a query for execution against a dataset that indicates a filtering predicate denoting a disjunction between a first range-based predicate and a second range-based predicate; accessing distribution data for the dataset indicating a plurality of kernels for a plurality of points in a multi-dimensional space; identifying a first sub-region within the mult…
Who is the assignee on this patent?
Ocient Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/24553. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 11, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).