Columnar Techniques for Big Metadata Management

US2023185816A1 · US · A1

Patent metadata
Field · Value
Publication number · US-2023185816-A1
Application number · US-202318166056-A
Country · US
Kind code · A1
Filing date · Feb 8, 2023
Priority date · Nov 13, 2020
Publication date · Jun 15, 2023
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for managing big metadata using columnar techniques includes receiving a query request requesting data blocks from a data table that match query parameters. The data table is associated with system tables that each includes metadata for a corresponding data block of the data table. The method includes generating, based on the query request, a system query to return a subset of rows that correspond to the data blocks that match the query parameters. The method further includes generating, based on the query request and the system query, a final query to return a subset of data blocks from the data table corresponding to the subset of rows. The method also includes determining whether any of the data blocks in the subset of data blocks match the query parameters, and returning the matching data blocks when one or more data blocks match the query parameters.
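The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the abstract does not say what the per-block metadata contains, so the sketch assumes it is a min/max range for one column. The names `DataBlock`, `SystemRow`, `build_system_table`, `system_query`, and `final_query` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataBlock:
    block_id: int
    rows: list  # each row is a dict such as {"ts": 10}


@dataclass
class SystemRow:
    # Assumed metadata: the min and max of one column within a block.
    block_id: int
    min_value: int
    max_value: int


def build_system_table(blocks, column):
    """Build one metadata row per data block."""
    return [
        SystemRow(
            b.block_id,
            min(r[column] for r in b.rows),
            max(r[column] for r in b.rows),
        )
        for b in blocks
    ]


def system_query(system_table, low, high):
    """Return ids of blocks whose [min, max] range overlaps [low, high]."""
    return {
        s.block_id
        for s in system_table
        if s.max_value >= low and s.min_value <= high
    }


def final_query(blocks, candidate_ids, column, low, high):
    """Scan only candidate blocks and keep those with at least one matching row."""
    matches = []
    for b in blocks:
        if b.block_id not in candidate_ids:
            continue  # pruned by metadata, never scanned
        if any(low <= r[column] <= high for r in b.rows):
            matches.append(b)
    return matches


blocks = [
    DataBlock(0, [{"ts": 1}, {"ts": 5}]),
    DataBlock(1, [{"ts": 10}, {"ts": 20}]),
    DataBlock(2, [{"ts": 11}, {"ts": 40}]),
]
system_table = build_system_table(blocks, "ts")
candidates = system_query(system_table, 12, 25)
result = final_query(blocks, candidates, "ts", 12, 25)
```

Block 0 is pruned using metadata alone. Block 2 survives the metadata check because its range overlaps the filter, but it holds no matching row, which is why the abstract's final step still checks the returned blocks against the query parameters.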

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method executed by data processing hardware that causes the data processing hardware to perform operations comprising: storing, on memory hardware in communication with the data processing hardware, a data table associated with a system table, the system table comprising a plurality of rows, each respective row of the plurality of rows of the system table comprising: a plurality of data blocks; and metadata associated with each data block of the plurality of data blocks of the respective row; receiving, from a client device, a query requesting return of data blocks from the data table that match query parameters; determining that the query parameters satisfy the respective metadata of one or more rows of the plurality of rows; and returning, to the client device, the one or more rows of the plurality of rows.

2. The method of claim 1, wherein the data table comprises a fact table and a dimension table, and wherein the query filters the dimension table.

3. The method of claim 2, wherein the operations further comprise generating, based on the query, a fact query that filters the fact table.

4. The method of claim 1, wherein a portion of the system table is cached in volatile memory.

5. The method of claim 4, wherein the operations further comprise, determining, based on access statistics, the portion of the system table cached in volatile memory.

6. The method of claim 1, wherein the data table is stored in a column-major format.

7. The method of claim 1, wherein the operations further comprise generating, based on the query, a system query to identify, based on the respective metadata, the one or more rows of the plurality of rows that satisfy the query parameters.

8. The method of claim 7, wherein the operations further comprise, generating, based on the query and the system query, a final query to return a subset of data blocks from the data table, each data block of the subset of data blocks corresponding to one of the one or more rows of the plurality of rows that satisfy the query parameters.

9. The method of claim 8, wherein generating the final query comprises generating a semi-join of the query and the system query.

10. The method of claim 7, wherein the system query comprises a falsifiable expression.

11. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: storing, on the memory hardware, a data table associated with a system table, the system table comprising a plurality of rows, each respective row of the plurality of rows of the system table comprising: a plurality of data blocks; and metadata associated with each data block of the plurality of data blocks of the respective row; receiving, from a client device, a query requesting return of data blocks from the data table that match query parameters; determining that the query parameters satisfy the respective metadata of one or more rows of the plurality of rows; and returning, to the client device, the one or more rows of the plurality of rows.

12. The system of claim 11, wherein the data table comprises a fact table and a dimension table, and wherein the query filters the dimension table.

13. The system of claim 12, wherein the operations further comprise generating, based on the query, a fact query that filters the fact table.

14. The system of claim 11, wherein a portion of the system table is cached in volatile memory.

15. The system of claim 14, wherein the operations further comprise, determining, based on access statistics, the portion of the system table cached in volatile memory.

16. The system of claim 11, wherein the data table is stored in a column-major format.

17. The system of claim 11, wherein the operations further comprise generating, based on the query, a system query to identify, based on the respective metadata, the one or more rows of the plurality of rows that satisfy the query parameters.

18. The system of claim 17, wherein the operations further comprise, generating, based on the query and the system query, a final query to return a subset of data blocks from the data table, each data block of the subset of data blocks corresponding to one of the one or more rows of the plurality of rows that satisfy the query parameters.

19. The system of claim 18, wherein generating the final query comprises generating a semi-join of the query and the system query.

20. The system of claim 17, wherein the system query comprises a falsifiable expression.
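Claims 7 to 10 (and 17 to 20) describe a system query, a final query built as a semi-join, and a falsifiable expression. The claim text on this page does not define "falsifiable expression", so the sketch below assumes one plausible reading: an expression over block metadata that, when true, proves the block cannot contain a matching row. The table names, the min/max columns, and the use of SQLite are all assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data_table (block_id INTEGER, ts INTEGER);
    CREATE TABLE system_table (block_id INTEGER, min_ts INTEGER, max_ts INTEGER);
    INSERT INTO data_table VALUES (0, 1), (0, 5), (1, 10), (1, 20), (2, 30), (2, 40);
    INSERT INTO system_table VALUES (0, 1, 5), (1, 10, 20), (2, 30, 40);
    """
)

# User filter (hypothetical): ts = :v
# Assumed falsifiable expression: (:v < min_ts OR :v > max_ts).
# If it is true for a block, no row in that block can satisfy ts = :v.
SYSTEM_QUERY = (
    "SELECT block_id FROM system_table "
    "WHERE NOT (:v < min_ts OR :v > max_ts)"
)

# Final query: a semi-join of the user query with the system query,
# written here as an IN subquery so only surviving blocks are read.
FINAL_QUERY = (
    "SELECT block_id, ts FROM data_table "
    f"WHERE ts = :v AND block_id IN ({SYSTEM_QUERY}) "
    "ORDER BY block_id, ts"
)

surviving_blocks = [row[0] for row in conn.execute(SYSTEM_QUERY, {"v": 20})]
matching_rows = conn.execute(FINAL_QUERY, {"v": 20}).fetchall()
```

Writing the pruning test in its negated, falsifiable form keeps it conservative: a block is skipped only when its metadata rules out a match, so pruning under this reading can produce false positives but not false negatives.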

Assignees

Google LLC

Inventors

Classifications

  • with dedicated cache, e.g. instruction or stack · CPC title

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • Caching of specific data in cache memory · CPC title

  • Query execution · CPC title

  • G06F16/221 · Primary

    Column-oriented storage; Management thereof · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023185816A1 cover?
A method for managing big metadata using columnar techniques includes receiving a query request requesting data blocks from a data table that match query parameters. The data table is associated with system tables that each includes metadata for a corresponding data block of the data table. The method includes generating, based on the query request, a system query to return a subset of rows tha…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/2465. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 15, 2023 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).