Providing table data access in user-specified formats on user-managed storage
US12050582B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12050582-B1 |
| Application number | US-202318498463-A |
| Country | US |
| Kind code | B1 |
| Filing date | Oct 31, 2023 |
| Priority date | Jun 23, 2023 |
| Publication date | Jul 30, 2024 |
| Grant date | Jul 30, 2024 |
Official abstract text for this publication.
The subject technology provides embodiments for supporting a unified table which may be a managed table or an unmanaged table. Managed tables are those where the subject technology manages the metastore/catalog for the table, whereas unmanaged tables are tables where an external catalog controls the table and the subject technology integrates with that catalog to work with the table, but does not assume control of the table.
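The managed/unmanaged distinction in the abstract can be illustrated with a minimal sketch. The names `CatalogOwner`, `UnifiedTable`, and `metadata_authority` are hypothetical and do not come from the patent; the only behavior modeled is the one the abstract states, namely which catalog controls the table.

```python
from dataclasses import dataclass
from enum import Enum


class CatalogOwner(Enum):
    """Who controls the table's catalog entry (hypothetical naming)."""
    INTERNAL = "internal"  # the database system manages the metastore/catalog
    EXTERNAL = "external"  # an external catalog controls the table


@dataclass(frozen=True)
class UnifiedTable:
    """A unified table that is either managed or unmanaged."""
    name: str
    catalog_owner: CatalogOwner

    @property
    def is_managed(self) -> bool:
        # Managed: the system itself manages the catalog for the table.
        return self.catalog_owner is CatalogOwner.INTERNAL


def metadata_authority(table: UnifiedTable) -> str:
    """Return which catalog is treated as controlling the table.

    For an unmanaged table the system only integrates with the external
    catalog; it does not assume control of the table.
    """
    if table.is_managed:
        return "system catalog"
    return "external catalog (accessed through a catalog integration)"


if __name__ == "__main__":
    for t in (UnifiedTable("orders", CatalogOwner.INTERNAL),
              UnifiedTable("events", CatalogOwner.EXTERNAL)):
        print(t.name, "->", metadata_authority(t))
```

Modeling the owner as a single field keeps one table type for both cases, which is one plausible reading of "unified table"; the patent's actual data model may differ.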
Opening claim text (preview).
The invention claimed is:
1. A network-based database system comprising: at least one hardware processor; and a memory storing instructions that cause the at least one hardware processor to perform operations comprising: generating, by a compute service manager, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.
2. The system of claim 1, wherein the operations further comprise: determining a first set of files that were added since a previous refresh of the source table, the previous refresh comprising a prior time when a snapshot of the source table was obtained; and determining a second set of files that were removed since the previous refresh of the source table.
3. The system of claim 1, wherein the operations further comprise: identifying a particular file, the particular file storing a manifest list of a current snapshot of the source table.
4. The system of claim 3, wherein the set of files are generated based on the manifest list, each file comprising a set of files from the manifest list.
5. The system of claim 1, wherein the operations further comprise: determining that changes to the new table has occurred since committing the new table; and performing a refreshing process to update the new table with a first set of files that have been added and a second set of files that have been removed.
6. The system of claim 1, wherein the set of expression property files are stored in a metadata database, the metadata database being a different storage location than a particular storage location of the new table.
7. The system of claim 1, wherein the operations further comprise: processing, by the execution node, the set of files in parallel to generate the set of expression property files.
8. A method comprising: generating, by a compute service manager of a network-based database system, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.
9. The method of claim 8, further comprising: determining a first set of files that were added since a previous refresh of the source table, the previous refresh comprising a prior time when a snapshot of the source table was obtained; and determining a second set of files that were removed since the previous refresh of the source table.
10. The method of claim 8, further comprising: identifying a particular file, the particular file storing a manifest list of a current snapshot of the source table.
11. The method of claim 10, wherein the set of files are generated based on the manifest list, each file comprising a set of files from the manifest list.
12. The method of claim 8, further comprising: determining that changes to the new table has occurred since committing the new table; and performing a refreshing process to update the new table with a first set of files that have been added and a second set of files that have been removed.
13. The method of claim 8, wherein the set of expression property files are stored in a metadata database, the metadata database being a different storage location than a particular storage location of the new table.
14. A non-transitory computer-storage medium comprising instructions that, when executed by one or more processors of a machine, configure the machine to perform operations comprising: generating, by a compute service manager of a network-based database system, multiple jobs based on a source table format of a source table, the network-based database system comprising a catalog including the source table; generating, by an execution node, a set of files based on the multiple jobs; generating, by the execution node, a set of expression property files for a new table based on the set of files, each expression property file from the set of expression property files comprising at least a number of values when different than a number of rows, a minimum value and a maximum value, a number of distinct values, and a number of null values; and committing the new table created based at least in part on the set of files, the new table comprising an unmanaged table, an external catalog controlling the unmanaged table, the external catalog being different from the catalog including the source table, and the network-based database system providing a catalog integration to access the new table that is controlled by the external catalog.
15. The system of claim 1, wherein the catalog integration comprises an object that defines a source of metadata and schema for the new table that the network-based database system does not manage, the operations further comprising: creating the catalog integration based on a set of statements, the set of statements comprising at least a statement comprising a catalog source, a catalog namespace, and a table format.
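The per-file statistics recited in claim 1, the added/removed file sets of claim 2, and the parallel processing of claim 7 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: all names (`ExpressionProperties`, `compute_expression_properties`, `compute_for_files`, `diff_snapshots`) are hypothetical, each file is modeled as a single in-memory column of values, and "number of values" is assumed to mean the count of non-null values, which the claims do not define.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class ExpressionProperties:
    """Per-file column statistics (hypothetical structure)."""
    row_count: int
    value_count: Optional[int]  # stored only when it differs from row_count
    min_value: Any
    max_value: Any
    distinct_count: int
    null_count: int


def compute_expression_properties(column: Sequence[Any]) -> ExpressionProperties:
    """Compute statistics for one column of one file.

    Assumption: "number of values" is the count of non-null values.
    """
    non_null = [v for v in column if v is not None]
    row_count = len(column)
    value_count = len(non_null)
    return ExpressionProperties(
        row_count=row_count,
        value_count=value_count if value_count != row_count else None,
        min_value=min(non_null) if non_null else None,
        max_value=max(non_null) if non_null else None,
        distinct_count=len(set(non_null)),
        null_count=row_count - value_count,
    )


def compute_for_files(files: Dict[str, Sequence[Any]]) -> Dict[str, ExpressionProperties]:
    """Process files in parallel; results are keyed by file name."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so zip pairs names with results.
        results = pool.map(compute_expression_properties, files.values())
        return dict(zip(files.keys(), results))


def diff_snapshots(previous: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) file names since the previous refresh."""
    return current - previous, previous - current


if __name__ == "__main__":
    stats = compute_for_files({"part-0": [3, None, 1, 3], "part-1": [7, 8]})
    print(stats["part-0"])
    print(diff_snapshots({"a.parquet", "b.parquet"}, {"b.parquet", "c.parquet"}))
```

Statistics such as minimum and maximum values per file are commonly used to skip files that cannot match a query predicate; whether the patented system uses them this way is not stated in the claims shown here.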
using data annotations, e.g. user-defined metadata · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Tablespace storage structures; Management thereof · CPC title