Unified table data access in user-specified formats on internal storage and user-managed storage

US12050582B1 · US · B1

Patent metadata
Publication number: US-12050582-B1
Application number: US-202318498463-A
Country: US
Kind code: B1
Filing date: Oct 31, 2023
Priority date: Jun 23, 2023
Publication date: Jul 30, 2024
Grant date: Jul 30, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject technology provides embodiments for supporting a unified table which may be a managed table or an unmanaged table. Managed tables are those where the subject technology manages the metastore/catalog for the table, whereas unmanaged tables are tables where an external catalog controls the table and the subject technology integrates with that catalog to work with the table, but does not assume control of the table.

First claim

Opening claim text (preview).

The invention claimed is:

1. A network-based database system comprising: at least one hardware processor; and a memory storing instructions that cause the at least one hardware processor to perform operations comprising: generating, by a compute service manager, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.

2. The system of claim 1, wherein the operations further comprise: determining a first set of files that were added since a previous refresh of the source table, the previous refresh comprising a prior time when a snapshot of the source table was obtained; and determining a second set of files that were removed since the previous refresh of the source table.

3. The system of claim 1, wherein the operations further comprise: identifying a particular file, the particular file storing a manifest list of a current snapshot of the source table.

4. The system of claim 3, wherein the set of files are generated based on the manifest list, each file comprising a set of files from the manifest list.

5. The system of claim 1, wherein the operations further comprise: determining that changes to the new table has occurred since committing the new table; and performing a refreshing process to update the new table with a first set of files that have been added and a second set of files that have been removed.

6. The system of claim 1, wherein the set of expression property files are stored in a metadata database, the metadata database being a different storage location than a particular storage location of the new table.

7. The system of claim 1, wherein the operations further comprise: processing, by the execution node, the set of files in parallel to generate the set of expression property files.

8. A method comprising: generating, by a compute service manager of a network-based database system, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.

9. The method of claim 8, further comprising: determining a first set of files that were added since a previous refresh of the source table, the previous refresh comprising a prior time when a snapshot of the source table was obtained; and determining a second set of files that were removed since the previous refresh of the source table.

10. The method of claim 8, further comprising: identifying a particular file, the particular file storing a manifest list of a current snapshot of the source table.

11. The method of claim 10, wherein the set of files are generated based on the manifest list, each file comprising a set of files from the manifest list.

12. The method of claim 8, further comprising: determining that changes to the new table has occurred since committing the new table; and performing a refreshing process to update the new table with a first set of files that have been added and a second set of files that have been removed.

13. The method of claim 8, wherein the set of expression property files are stored in a metadata database, the metadata database being a different storage location than a particular storage location of the new table.

14. A non-transitory computer-storage medium comprising instructions that, when executed by one or more processors of a machine, configure the machine to perform operations comprising: generating, by a compute service manager of a network-based database system, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.

15. The system of claim 1, wherein the catalog integration comprises an object that defines a source of metadata and schema for the new table that the network-based database system does not manage, the operations further comprising: creating the catalog integration based on a set of statements, the set of statements comprising at least a statement comprising a catalog source, a catalog namespace, and a table format.
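Illustrative sketch (not part of the patent text). The claims name two pieces of bookkeeping that are easy to picture in code: the per-file statistics held in an expression property file (claim 1) and the added/removed file sets computed on refresh (claims 2 and 5). The Python below is a minimal sketch under stated assumptions, not the patented implementation. The names `ExpressionProperties`, `summarize_column`, and `diff_snapshots` are hypothetical, and the reading that the value count differs from the row count when a row holds a list of values is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ExpressionProperties:
    """Hypothetical per-column summary for one data file (names are illustrative)."""
    num_rows: int
    num_values: Optional[int]  # recorded only when it differs from num_rows
    min_value: Any
    max_value: Any
    num_distinct: int
    num_nulls: int


def summarize_column(rows: List[Any]) -> ExpressionProperties:
    """Compute the statistics listed in claim 1 for one column of one file.

    Assumption: a row may hold a list (a repeated value), which is how the
    number of values can differ from the number of rows in this sketch.
    """
    flat: List[Any] = []
    for row in rows:
        if isinstance(row, list):
            flat.extend(row)
        else:
            flat.append(row)
    non_null = [v for v in flat if v is not None]
    return ExpressionProperties(
        num_rows=len(rows),
        num_values=len(flat) if len(flat) != len(rows) else None,
        min_value=min(non_null, default=None),
        max_value=max(non_null, default=None),
        num_distinct=len(set(non_null)),
        num_nulls=len(flat) - len(non_null),
    )


def diff_snapshots(
    previous: Iterable[str], current: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (added, removed) file names between two snapshots of a table."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)


if __name__ == "__main__":
    print(summarize_column([3, None, [1, 5], 3]))
    print(diff_snapshots(["a.parquet", "b.parquet"], ["b.parquet", "c.parquet"]))
```

Statistics of this kind are commonly used to skip files whose minimum/maximum range cannot match a query predicate; whether the claimed system uses them that way is not stated on this page.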

Assignees

Snowflake Inc

Inventors

Classifications

  • Using data annotations, e.g. user-defined metadata

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

  • Tablespace storage structures; Management thereof

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12050582B1 cover?
The subject technology provides embodiments for supporting a unified table which may be a managed table or an unmanaged table. Managed tables are those where the subject technology manages the metastore/catalog for the table, whereas unmanaged tables are tables where an external catalog controls the table and the subject technology integrates with that catalog to work with the table, but does n…
Who is the assignee on this patent?
Snowflake Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2282. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 30, 2024 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).