Selecting backing stores based on data request

US11762830B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11762830-B2
Application number: US-202117473320-A
Country: US
Kind code: B2
Filing date: Sep 13, 2021
Priority date: Jul 6, 2017
Publication date: Sep 19, 2023
Grant date: Sep 19, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for improving database searches are described herein. In an embodiment, a server computer system stores one or more first datasets in a first data repository and one or more second datasets in a second data repository. The server computer receives a request to perform an analysis on a particular dataset. The server computer determines that the particular dataset is stored in the first data repository and the second data repository. Based, at least in part, on an attribute of the request, the server computer selects the second data repository and responds to the request with data from the particular dataset stored in the second data repository.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of determining a backing store for responding to a query, comprising: storing, in a columnar datastore, one or more first datasets; storing, in an index data repository, indices of one or more second datasets; receiving a query to perform an analysis on a particular dataset; determining that the particular dataset is stored in the columnar datastore and an index of the particular dataset is stored in the index data repository; determining whether a number of rows in the particular dataset is more than a threshold; in response to determining that the number of rows exceeds the threshold, determining whether the query includes a column aggregation; in response to determining that the query includes a column aggregation, responding to the query with data from the columnar datastore; receiving a second query to perform an analysis on a certain dataset; determining that the certain dataset is stored in the columnar datastore and an index of the certain dataset is stored in the index data repository; determining whether a number of rows in the certain dataset is more than the threshold; in response to determining that the number of rows in the certain dataset exceeds the threshold, determining whether the second query includes a column aggregation; in response to determining that the second query does not include a column aggregation, determining whether the second query includes a row filtering condition; in response to determining that the second query includes a row filtering condition, responding to the second query with second data from the index data repository; receiving a third query to perform an analysis on a specific dataset; determining that the specific dataset is stored in the columnar datastore and an index of the specific dataset is stored in the index data repository; determining whether a number of rows in the specific dataset is more than the threshold; in response to determining that the threshold exceeds the number of rows in the specific dataset, responding to the third query with third data from the columnar datastore, wherein one or more steps is performed by a computer.

2. The computer-implemented method of claim 1, further comprising: determining that the particular dataset is not subject to access controls, the determining of whether the number of rows in the particular dataset is more than the threshold being performed in response to determining that the particular dataset is not subject to access controls.

3. The computer-implemented method of claim 2, further comprising determining that a view of the particular dataset is stored in the index data repository before determining that the particular dataset is not subject to access controls.

4. The computer-implemented method of claim 1, the row filtering condition specifying access controls.

5. The computer-implemented method of claim 1, further comprising: receiving a fourth query to perform an analysis on a distinct dataset; determining that the distinct dataset is stored in the columnar datastore and an index of the distinct dataset is stored in the index data repository; determining, in response to receiving the fourth query, whether the distinct dataset is subject to access control; in response to determining that the distinct dataset is subject to access control; responding to the fourth query using the index of the distinct dataset.

6. The computer-implemented method of claim 1, further comprising: receiving a fourth query to perform an analysis on a distinct dataset; determining that the distinct dataset is stored in the columnar datastore and an index of the distinct dataset is stored in the index data repository; determining, in response to receiving the fourth query, whether the distinct dataset is subject to access control; in response to determining that the particular dataset is not subject to access control, determining whether the number of rows in the particular dataset is more than the threshold.

7. The computer-implemented method of claim 1, further comprising: receiving a fourth query to perform an analysis on a distinct dataset; determining that the distinct dataset is stored in the columnar datastore and an index of the distinct dataset is stored in the index data repository; determining whether a number of rows in the distinct dataset is more than the threshold; in response to determining that the number of rows in the distinct dataset exceeds the threshold, determining whether the fourth query includes a column aggregation; in response to determining that the fourth query does not include a column aggregation, determining whether the fourth query includes a row filtering condition; in response to determining that the fourth query does not include a row filtering condition, responding to the fourth query with fourth data from the columnar datastore.

8. One or more non-transitory computer-readable storage media storing instructions which when executed cause one or more processors to perform a method of determining a backing store for responding to a query, the method comprising: storing, in a columnar datastore, one or more first datasets; storing, in an index data repository, indices of one or more second datasets; receiving a query to perform an analysis on a particular dataset; determining that the particular dataset is stored in the columnar datastore and an index of the particular dataset is stored in the index data repository; determining whether a number of rows in the particular dataset is more than a threshold; in response to determining that the number of rows exceeds the threshold, determining whether the query includes a column aggregation; in response to determining that the query includes a column aggregation, responding to the query with data from the columnar datastore; in response to determining that the query does not include a column aggregation, determining whether the query includes a row filtering condition; in response to determining that the query includes a row filtering condition, responding to the query with data from the index data repository; in response to determining that the threshold exceeds the number of rows, responding to the query with data from the columnar datastore.

9. The one or more non-transitory computer-readable storage media of claim 8, the method further comprising: determining that the particular dataset is not subject to access controls, the determining of whether the number of rows in the particular dataset is more than the threshold being performed in response to determining that the particular dataset is not subject to access controls.

10. The one or more non-transitory computer-readable storage media of claim 9, the method further comprising determining that a view of the particular dataset is stored in the index data repository before determining that the particular dataset is not subject to access controls.

11. The one or more non-transitory computer-readable storage media of claim 8, the row filtering condition specifying access controls.

12. The one or more non-transitory computer-readable storage media of claim 8, the method further comprising: determining, in response to receiving the query, whether the particular dataset is subject to access control; in response to determining that the particular dataset is subject to access control; responding to the query using the index.

13. The one or more non-transitory computer-readable storage media of claim 12, wherein determining whether the number of rows in the particular dataset is more than a threshold is performed in resp
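
The branching recited in claims 1, 5, 7, 8, and 12 can be read as a small decision procedure. The Python sketch below is one illustrative reading of that claim language, not the patented implementation. The names `Store`, `Dataset`, `Query`, and `select_store` are hypothetical, the dataset is assumed to already be present in both the columnar datastore and the index data repository, and a row count exactly equal to the threshold is treated as not exceeding it, a case the claim text leaves open.

```python
from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    COLUMNAR = "columnar datastore"
    INDEX = "index data repository"


@dataclass(frozen=True)
class Dataset:
    # Hypothetical summary of a dataset held in both stores.
    row_count: int
    access_controlled: bool = False


@dataclass(frozen=True)
class Query:
    # Hypothetical summary of the query attributes the claims test.
    has_column_aggregation: bool = False
    has_row_filter: bool = False


def select_store(dataset: Dataset, query: Query, row_threshold: int) -> Store:
    """Pick a backing store following one reading of the claim language."""
    # Claims 5 and 12: an access-controlled dataset is answered from the index.
    if dataset.access_controlled:
        return Store.INDEX
    # Claims 1 and 8: when the threshold exceeds the row count, use columnar.
    # Assumption: equality is handled the same way as "below threshold".
    if dataset.row_count <= row_threshold:
        return Store.COLUMNAR
    # Large dataset with a column aggregation: columnar datastore.
    if query.has_column_aggregation:
        return Store.COLUMNAR
    # Large dataset, no aggregation, but a row filter: index data repository.
    if query.has_row_filter:
        return Store.INDEX
    # Claim 7: large dataset with neither aggregation nor row filter.
    return Store.COLUMNAR
```

The threshold value is not given in the text above, so a caller would have to supply one. With an arbitrary illustrative threshold, `select_store(Dataset(row_count=5_000_000), Query(has_row_filter=True), row_threshold=1_000_000)` yields `Store.INDEX` under this sketch.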

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • G06F16/278 (Primary)

    Data partitioning, e.g. horizontal or vertical partitioning · CPC title

  • Management thereof · CPC title

  • Query processing · CPC title

  • Column-oriented storage; Management thereof · CPC title

  • Query optimisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11762830B2 cover?
It covers techniques for choosing which backing store answers a query when a dataset is held in both a columnar datastore and an index data repository. The choice depends on attributes of the request, such as whether the dataset's row count exceeds a threshold and whether the query includes a column aggregation or a row filtering condition. See the Abstract above for the official text.
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/278. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 19, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).