What technology area does this patent fall under?

Primary CPC classification G06F16/24568. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 15, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Optimized processing of data in different formats

US11727013B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11727013-B2
Application number	US-202217930150-A
Country	US
Kind code	B2
Filing date	Sep 7, 2022
Priority date	Apr 9, 2021
Publication date	Aug 15, 2023
Grant date	Aug 15, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Hybrid tables can be used in different use-case scenarios. Hybrid tables provide a flexible mechanism to support files and data in different formats while providing access to the different types of data as part of one table. This flexibility can allow the use of hybrid tables in data lake or other similar environments.

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: storing a first set of data in a first format in a first cloud storage location; storing a second set of data in a second format in a second cloud storage location; classifying a first subset of the first set of data in the first format as high-value data and classifying a second subset of the first set of data as low-value data; and ingesting a copy of the high-value data from the first cloud storage location into the second cloud storage location in the second format, wherein the first subset of the first set of data in the first format is maintained and not deleted in the first cloud storage location in response to ingesting the copy of the high-value data; providing an interface for accessing the first and second sets of data; receiving, via the interface, a first query referencing the first and second sets of data; determining that the first query references the first subset of the first data; executing the first query using the first subset of data in the second cloud storage location in the second format and the second set of data; receiving, via the interface, a second query referencing the first and second sets of data, determining that the second query references a second subset of the first set of data not ingested into the second cloud storage location; converting the second subset of the first set of data from the first format into a common format; converting the second set of data from the second format into the common format; joining the second subset of the first set of data in the common format and the second set of data in the common format to generate joined data; and executing the second query based on the joined data.
2. The method of claim 1, wherein the first cloud storage location is in an external cloud storage location and wherein the second cloud storage location is a network-based data warehouse system, wherein the first format is a raw format and the second format is a formatted format used by the network-based data warehouse system.
3. The method of claim 1, wherein the classifying is performed based on query patterns.
4. The method of claim 1, wherein the classifying is performed based on scan statistics.
5. The method of claim 1, wherein the classifying is performed based on metadata received from a client.
6. The method of claim 1, further comprising: re-classifying the first subset from high-value data to low-value data; and in response to re-classifying, deleting the ingested copy of the first subset in the second format.
7. A machine-storage medium embodying instructions that, when executed by a machine, cause the machine to perform operations comprising: storing a first set of data in a first format in a first cloud storage location; storing a second set of data in a second format in a second cloud storage location; classifying a first subset of the first set of data in the first format as high-value data and classifying a second subset of the first set of data as low-value data; and ingesting a copy of the high-value data from the first cloud storage location into the second cloud storage location in the second format, wherein the first subset of the first set of data in the first format is maintained and not deleted in the first cloud storage location in response to ingesting the copy of the high-value data; providing an interface for accessing the first and second sets of data; receiving, via the interface, a first query referencing the first and second sets of data; determining that the first query references the first subset of the first data; executing the first query using the first subset of data in the second cloud storage location in the second format and the second set of data; receiving, via the interface, a second query referencing the first and second sets of data; determining that the second query references a second subset of the first set of data not ingested into the second cloud storage location; converting the second subset of the first set of data from the first format into a common format; converting the second set of data from the second format into the common format; joining the second subset of the first set of data in the common format and the second set of data in the common format to generate joined data; and executing the second query based on the joined data.
8. The machine-storage medium of claim 7, wherein the first cloud storage location is in an external cloud storage location and wherein the second cloud storage location is a network-based data warehouse system, wherein the first format is a raw format and the second format is a formatted format used by the network-based data warehouse system.
9. The machine-storage medium of claim 7, wherein the classifying is performed based on query patterns.
10. The machine-storage medium of claim 7, wherein the classifying is performed based on scan statistics.
11. The machine-storage medium of claim 7, wherein the classifying is performed based on metadata received from a client.
12. The machine-storage medium of claim 7, further comprising: re-classifying the first subset from high-value data to low-value data; and in response to re-classifying, deleting the ingested copy of the first subset in the second format.
13. A system comprising: at least one hardware processor; and at least one memory storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising: storing a first set of data in a first format in a first cloud storage location; storing a second set of data in a second format in a second cloud storage location; classifying a first subset of the first set of data in the first format as high-value data and classifying a second subset of the first set of data as low-value data; and ingesting a copy of the high-value data from the first cloud storage location into the second cloud storage location in the second format, wherein the first subset of the first set of data in the first format is maintained and not deleted in the first cloud storage location in response to ingesting the copy of the high-value data; providing an interface for accessing the first and second sets of data; receiving, via the interface, a first query referencing the first and second sets of data; determining that the first query references the first subset of the first data; executing the first query using the first subset of data in the second cloud storage location in the second format and the second set of data; receiving, via the interface, a second query referencing the first and second sets of data; determining that the second query references a second subset of the first set of data not ingested into the second cloud storage location; converting the second subset of the first set of data from the first format into a common format; converting the second set of data from the second format into the common format; joining the second subset of the first set of data in the common format and the second set of data in the common format to generate joined data; and executing the second query based on the joined data.
14. The system of claim 13, wherein the first cloud storage location is in an external cloud storage location and wherein the second cloud storage location is a network-based data warehouse system, wherein the first format is a raw format and the second format is a formatted format used by the network-based data warehouse system.
15. The system of claim 13, wherein the classifying is performed based on query
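
Illustrative sketch (not part of the patent text). The flow recited in claim 1 can be easier to follow as a toy program. The Python below is a minimal sketch under loud assumptions: every name (external, warehouse, ingested, classify, ingest, run_query), the CSV "raw" format, the tuple "warehouse" format, the dict "common" format, and the scan-count threshold are hypothetical stand-ins chosen for illustration. The claims do not specify any of them, and this is not Snowflake's implementation.

```python
import csv
import io

# Hypothetical stand-ins for the claimed storage locations and formats.
# First set: raw CSV text per partition in the "external" location.
external = {
    "2023-01": "id,amount\n1,10\n2,20\n",
    "2023-02": "id,amount\n3,30\n",
}
# Second set: rows already in the warehouse format, here (id, name) tuples.
warehouse = [(1, "ann"), (2, "bob"), (3, "cy")]
# Ingested copies of high-value partitions, kept in the warehouse format.
ingested = {}


def classify(scan_counts, threshold):
    """Label partitions scanned at least `threshold` times as high-value."""
    return {part for part, count in scan_counts.items() if count >= threshold}


def ingest(high_value):
    """Copy high-value partitions into the warehouse format.

    The raw originals in `external` are left untouched.
    """
    for part in high_value:
        reader = csv.DictReader(io.StringIO(external[part]))
        ingested[part] = [(int(r["id"]), int(r["amount"])) for r in reader]


def run_query(partition):
    """Join one partition of the first set with the second set on `id`."""
    if partition in ingested:
        # First-query path: both sides are already in the warehouse format.
        names = dict(warehouse)
        rows = [
            {"id": i, "name": names[i], "amount": amount}
            for i, amount in ingested[partition]
            if i in names
        ]
        return "ingested", rows
    # Second-query path: convert both sides to a common format (dicts),
    # then join the converted rows.
    reader = csv.DictReader(io.StringIO(external[partition]))
    left = [{"id": int(r["id"]), "amount": int(r["amount"])} for r in reader]
    right = {i: {"id": i, "name": name} for i, name in warehouse}
    rows = [{**right[l["id"]], **l} for l in left if l["id"] in right]
    return "converted", rows


# Classify by (made-up) scan statistics, then ingest the high-value partition.
high_value = classify({"2023-01": 9, "2023-02": 1}, threshold=5)
ingest(high_value)
print(run_query("2023-01"))  # served from the ingested copy
print(run_query("2023-02"))  # converted to the common format, then joined
```

In this sketch the partition "2023-01" is answered from its ingested copy while "2023-02" is converted on the fly, and the raw data for both partitions remains in external throughout, mirroring the "maintained and not deleted" limitation.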

Assignees

Snowflake Inc

Inventors

Classifications

G06F16/24568 · Primary
Data stream processing; Continuous queries · CPC title
G06F16/24544 · Primary
Join order optimisation · CPC title
G06F16/25
Integrating or interfacing systems involving database management systems · CPC title
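
Illustrative sketch (not part of the patent record). The "Physics" technology area shown for this patent follows from the first letter of the CPC symbol: section G is titled Physics in the CPC scheme. The Python below is a minimal sketch of how such a symbol could be split into its parts. The function name parse_cpc and the simplified regular expression are assumptions made for illustration and may not cover every valid CPC symbol.

```python
import re

# CPC section letters and their titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# Simplified pattern (an assumption): section letter, two-digit class,
# subclass letter, main group, slash, subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G06F16/24568' into its components."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognised CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


for code in ("G06F16/24568", "G06F16/24544", "G06F16/25"):
    parts = parse_cpc(code)
    print(code, parts["subclass"], parts["section_title"])
```

All three classifications on this page share subclass G06F and main group 16, which is why they resolve to the same technology area.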

Patent family

Related publications grouped by family.

View patent family 83451105

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11727013B2 cover? Hybrid tables can be used in different use-case scenarios. Hybrid tables provide a flexible mechanism to support files and data in different formats while providing access to the different types of data as part of one table. This flexibility can allow the use of hybrid tables in data lake or other similar environments.
Who is the assignee on this patent? Snowflake Inc