Selecting backing stores based on data request

US2022012223A1 · US · A1

Patent metadata
Publication number: US-2022012223-A1
Application number: US-202117473320-A
Country: US
Kind code: A1
Filing date: Sep 13, 2021
Priority date: Jul 6, 2017
Publication date: Jan 13, 2022
Grant date: (none listed)

Abstract

Official abstract text for this publication.

Techniques for improving database searches are described herein. In an embodiment, a server computer system stores one or more first datasets in a first data repository and one or more second datasets in a second data repository. The server computer receives a request to perform an analysis on a particular dataset. The server computer determines that the particular dataset is stored in the first data repository and the second data repository. Based, at least in part, on an attribute of the request, the server computer selects the second data repository and responds to the request with data from the particular dataset stored in the second data repository.
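The selection step in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `catalog` that maps each dataset to the repositories holding it, and a hypothetical caller-supplied predicate `prefers_second` that inspects an attribute of the request. None of these names come from the patent, and the fallback for a dataset held in only one repository is an assumption.

```python
from typing import Callable, Dict, Mapping, Set

# Hypothetical labels for the two repositories named in the abstract.
FIRST = "first_repository"
SECOND = "second_repository"


def choose_repository(
    dataset_id: str,
    request: Mapping[str, object],
    catalog: Dict[str, Set[str]],
    prefers_second: Callable[[Mapping[str, object]], bool],
) -> str:
    """Pick the repository that should answer a request for dataset_id."""
    locations = catalog.get(dataset_id, set())
    if not locations:
        raise LookupError(f"dataset {dataset_id!r} is not stored in any repository")
    if locations == {FIRST, SECOND}:
        # The dataset is in both repositories, so an attribute of the
        # request decides which copy is used.
        return SECOND if prefers_second(request) else FIRST
    # Assumption: with a single copy there is nothing to choose between.
    return next(iter(locations))


# Usage: a request attribute ("kind") is an illustrative stand-in.
catalog = {"trips": {FIRST, SECOND}, "logs": {FIRST}}
is_lookup = lambda req: req.get("kind") == "lookup"
print(choose_repository("trips", {"kind": "lookup"}, catalog, is_lookup))
```

Passing the predicate in as a parameter keeps the sketch neutral about which request attribute matters; the claims narrow this to row counts, column aggregations, and row filters.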

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of determining a backing store for responding to a query, comprising: storing, in a columnar datastore, one or more first datasets; storing, in an index data repository, indices of one or more second datasets; receiving a query to perform an analysis on a particular dataset; determining that the particular dataset is stored in the columnar datastore and an index of the particular dataset is stored in the index data repository; determining whether a number of rows in the particular dataset is more than a threshold; in response to determining that the number of rows exceeds the threshold, determining whether the query includes a column aggregation; in response to determining that the threshold exceeds the number of rows, responding to the query with data from the columnar datastore, wherein one or more steps is performed by a computer.

2. The computer-implemented method of claim 1, further comprising: determining that the particular dataset is not subject to access controls, the determining of whether the number of rows in the particular dataset is more than the threshold being performed in response to determining that the particular dataset is not subject to access controls.

3. The computer-implemented method of claim 2, further comprising determining that a view of the particular dataset is stored in the index data repository before determining that the particular dataset is not subject to access controls.

4. The computer-implemented method of claim 1, further comprising, in response to determining that the query includes a column aggregation, responding to the query with data from the columnar datastore.

5. The computer-implemented method of claim 1, further comprising, in response to determining that the query does not include a column aggregation, determining whether the query includes a row filtering condition.

6. The computer-implemented method of claim 5, the row filtering condition specifying access controls.

7. The computer-implemented method of claim 5, further comprising: in response to determining that the query includes a row filtering condition, identifying a second number of rows that match the query based on the index; when the second number of rows is greater than a second threshold, responding to the query with data from the columnar datastore; when the second number of rows is lower than the second threshold, responding to the query using the index.

8. The computer-implemented method of claim 7, the second threshold being expressed as a percentage of the number of rows in the particular dataset.

9. The computer-implemented method of claim 5, further comprising, in response to determining that the query includes a row filtering condition, responding to the query with data from the index data repository.

10. The computer-implemented method of claim 5, further comprising, in response to determining that the query does not include a row filtering condition, responding to the query with data from the columnar datastore.

11. One or more non-transitory computer-readable storage media storing instructions which when executed cause one or more processors to perform a method of determining a backing store for responding to a query, the method comprising: storing, in a columnar datastore, one or more first datasets; storing, in an index data repository, indices of one or more second datasets; receiving a query to perform an analysis on a particular dataset; determining that the particular dataset is stored in the columnar datastore and an index of the particular dataset is stored in the index data repository; determining whether a number of rows in the particular dataset is more than a threshold; in response to determining that the number of rows exceeds the threshold, determining whether the query includes a column aggregation; in response to determining that the threshold exceeds the number of rows, responding to the query with data from the columnar datastore.

12. The one or more non-transitory computer-readable storage media of claim 11, the method further comprising: determining that the particular dataset is not subject to access controls, the determining of whether the number of rows in the particular dataset is more than the threshold being performed in response to determining that the particular dataset is not subject to access controls.

13. The one or more non-transitory computer-readable storage media of claim 12, the method further comprising determining that a view of the particular dataset is stored in the index data repository before determining that the particular dataset is not subject to access controls.

14. The one or more non-transitory computer-readable storage media of claim 11, the method further comprising, in response to determining that the query includes a column aggregation, responding to the query with data from the columnar datastore.

15. The one or more non-transitory computer-readable storage media of claim 11, the method further comprising, in response to determining that the query does not include a column aggregation, determining whether the query includes a row filtering condition.

16. The one or more non-transitory computer-readable storage media of claim 15, the row filtering condition specifying access controls.

17. The one or more non-transitory computer-readable storage media of claim 15, the method further comprising: in response to determining that the query includes a row filtering condition, identifying a second number of rows that match the query based on the index; when the second number of rows is greater than a second threshold, responding to the query with data from the columnar datastore; when the second number of rows is lower than the second threshold, responding to the query using the index.

18. The one or more non-transitory computer-readable storage media of claim 17, the second threshold being expressed as a percentage of the number of rows in the particular dataset.

19. The one or more non-transitory computer-readable storage media of claim 15, the method further comprising, in response to determining that the query includes a row filtering condition, responding to the query with data from the index data repository.

20. The one or more non-transitory computer-readable storage media of claim 15, further comprising, in response to determining that the query does not include a row filtering condition, responding to the query with data from the columnar datastore.
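The decision flow recited in claims 1, 4, 5, 7, 8, and 10 can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the names `Store`, `Query`, `select_store`, `row_threshold`, and `match_fraction` are hypothetical, the access-control checks of claims 2 and 3 and the alternative of claim 9 are not modeled, and the handling of counts exactly equal to a threshold is an assumption because the claims leave it open.

```python
from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    COLUMNAR = "columnar datastore"
    INDEX = "index data repository"


@dataclass
class Query:
    has_column_aggregation: bool
    has_row_filter: bool
    # Hypothetical: number of rows the index reports as matching the filter.
    matching_rows: int = 0


def select_store(query: Query, total_rows: int,
                 row_threshold: int, match_fraction: float) -> Store:
    """Choose a backing store for a dataset held in both stores."""
    # Claim 1: when the threshold exceeds the row count, answer from the
    # columnar datastore. Treating equality the same way is an assumption.
    if total_rows <= row_threshold:
        return Store.COLUMNAR
    # Claim 4: a column aggregation is answered from the columnar datastore.
    if query.has_column_aggregation:
        return Store.COLUMNAR
    # Claim 10: no row filtering condition, so use the columnar datastore.
    if not query.has_row_filter:
        return Store.COLUMNAR
    # Claims 7 and 8: compare the number of matching rows with a second
    # threshold expressed as a fraction of the dataset's rows.
    second_threshold = match_fraction * total_rows
    if query.matching_rows > second_threshold:
        return Store.COLUMNAR
    # Assumption: a count equal to the second threshold also uses the index.
    return Store.INDEX


# Usage: a selective filter on a large dataset is routed to the index.
selective = Query(has_column_aggregation=False, has_row_filter=True,
                  matching_rows=10)
print(select_store(selective, total_rows=1_000_000,
                   row_threshold=100_000, match_fraction=0.05))
```

Under this reading, the index is chosen only for filtered, non-aggregating queries on large datasets where few rows match; every other branch falls back to the columnar datastore.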

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • G06F16/278 (Primary): Data partitioning, e.g. horizontal or vertical partitioning

  • Column-oriented storage; Management thereof

  • Query processing

  • to a system of files or objects, e.g. local or distributed file system or database

  • Query optimisation

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/278. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 13, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).