Querying Over External Tables In Database Systems

US2020334239A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2020334239-A1
Application numberUS-201916385774-A
CountryUS
Kind codeA1
Filing dateApr 16, 2019
Priority dateApr 16, 2019
Publication dateOct 22, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and devices for querying over an external table are disclosed. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes receiving a query comprising a predicate, the query directed at least to data in the external table. The method includes determining, based on metadata, one or more partitions in the external table comprising data satisfying the predicate. The method includes pruning, based on the metadata, all partitions in the external table that do not comprise any data satisfying the predicate. The method includes generating a query plan comprising a plurality of discrete subtasks. The method includes assigning, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform.

First claim

Opening claim text (preview).

1 . A system comprising: means for defining, a database platform, an external table based on content that is stored in an external data storage platform that is separate from the database platform, the database platform not having write access to the external table; means for generating metadata for the external table, the metadata comprising information about data stored in the external table; means for receiving a notification that a modification has been made to the content; and means for refreshing the metadata for the external table in response to the modification being made to the content. 2 . The system of claim 1 , wherein the means for generating the metadata comprises one or more of: means for defining cumulative table metadata for the external table; means for defining a grouping expression property for a grouping of partitions in the external table; means for defining an expression property for a partition of the external table; means for defining partition statistics for a partition of the external table; and means for defining a column expression property for a column of a partition of the external table. 3 . The system of claim 1 , further comprising means for storing the metadata in a partition storage object on a shared storage platform associated with the database platform, wherein the shared storage platform is separate from the external data storage platform. 4 . The system of claim 1 , further comprising one or more of: means for refreshing the metadata at threshold time periods; means for refreshing the metadata in response to a threshold number of modifications being made to the content; and means for refreshing the metadata in response to a request to refresh the metadata. 5 . The system of claim 1 , further comprising means for reading multiple different file formats in the content in the external data storage platform. 6 . The system of claim 1 , wherein the means for receiving the notification that the modification has been made to the content comprises one or more of: means for querying the external data storage platform to determine whether any modifications have been made to the content; means for receiving the notification from the external data storage platform; and means for receiving the notification from a client associated with the content. 7 . The system of claim 1 , wherein the means for defining the external table based on the content comprises: means for receiving an indication of a hierarchical structure of the content in the external data storage platform, the hierarchical structure defining folders and subfolders; means for receiving an indication of a partitioning structure for the content in the external data storage platform; and means for defining partitions in the external table based on where files are uploaded within the hierarchical structure and further based on the partitioning structure. 8 . The system of claim 7 , wherein the means for defining the partitions in the external table comprises: means for generating metadata for the partitions in the external table; and means for pointing the metadata to the appropriate folders and subfolders of the hierarchical structure. 9 . The system of claim 8 , wherein the means for receiving the notification comprises means for receiving a notification that a certain subfolder within the hierarchical structure has been modified. 10 . The system of claim 9 , further comprising: means for identifying a certain partition in the external table that corresponds with the certain subfolder; and means for generating change tracking metadata for the certain partition, the change tracking metadata indicating how the certain subfolder was modified and when the certain subfolder was modified. 11 . A method comprising: defining, by a database platform, an external table based on content that is stored in an external data storage platform that is separate from the database platform, the database platform not having write access to the external table; generating metadata for the external table, the metadata comprising information about data stored in the external table; receiving a notification that a modification has been made to the content; and refreshing the metadata for the external table in response to the modification being made to the content. 12 . The method of claim 11 , wherein generating the metadata comprises one or more of: defining cumulative table metadata for the external table; defining a grouping expression property for a grouping of partitions in the external table; defining an expression property for a partition of the external table; defining partition statistics for a partition of the external table; and defining a column expression property for a column of a partition of the external able. 13 . The method of claim 11 , further comprising storing the metadata in a partition storage object on a shared storage platform associated with the database platform, wherein the shared storage platform is separate from the external data storage platform. 14 . The method of claim 11 , further comprising one or more of: refreshing the metadata at threshold time periods; refreshing the metadata in response to a threshold number of modifications being made to the content; and refreshing the metadata in response to a request to refresh the metadata. 15 . The method of claim 11 , further comprising reading multiple different file formats in the content in the external data storage platform. 16 . The method of claim 11 , wherein receiving the notification that the modification has been made to the content comprises one or more of: querying the external data storage platform to determine whether any modifications have been made to the content; receiving the notification from the external data storage platform; and receiving the notification from a client associated with the content. 17 . The method of claim 11 , wherein defining the external table based on the content comprises: receiving an indication of a hierarchical structure of the content in the external data storage platform, the hierarchical structure defining folders and subfolders; receiving an indication of a partitioning structure for the content in the external data storage platform; and defining partitions in the external table based on where files are uploaded within the hierarchical structure and further based on the partitioning structure. 18 . The method of claim 17 , wherein defining the partitions in the external table comprises: generating metadata for the partitions in the external table; and pointing the metadata to the appropriate folders and subfolders of the hierarchical structure. 19 . The method of claim 18 , wherein receiving the notification comprises means for receiving a notification that a certain subfolder within the hierarchical structure has been modified. 20 . The method of claim 19 , further comprising: identifying a certain partition in the external table that corresponds with the certain subfolder; and generating change tracking metadata for the certain partition, the change tracking metadata indicating how the certain subfolder was modified and when the certain subfolder was modified. 21 . One or more non-transitory computer readable storage media storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: defining, by a database platform, an external tabl

Assignees

Inventors

Classifications

  • Change logging, detection, and notification (replication G06F16/27) · CPC title

  • Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • Query rewriting; Transformation · CPC title

  • Interactive query statement specification based on a database schema · CPC title

  • Updating materialised views · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020334239A1 cover?
Systems, methods, and devices for querying over an external table are disclosed. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes receiving a query comprising a predicate, the query directed at least to data in the external table.…
Who is the assignee on this patent?
Snowflake Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2358. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Oct 22 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).