System and method for syncing asynchronously received sequential data from disparate sources
US-2024346043-A1 · Oct 17, 2024 · US
| Field | Value |
|---|---|
| Publication number | US-11809451-B2 |
| Application number | US-201414518971-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 20, 2014 |
| Priority date | Feb 19, 2014 |
| Publication date | Nov 7, 2023 |
| Grant date | Nov 7, 2023 |
Official abstract text for this publication.
Example caching systems and methods are described. In one implementation, a method identifies multiple files used to process a query and distributes each of the multiple files to a particular execution node to execute the query. Each execution node determines whether the distributed file is stored in the execution node's cache. If the execution node determines that the file is stored in the cache, it processes the query using the cached file. If the file is not stored in the cache, the execution node retrieves the file from a remote storage device, stores the file in the execution node's cache, and processes the query using the file.
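The per-node behavior in the abstract (check the local cache, fall back to remote storage on a miss, cache the result, then process) can be illustrated with a minimal sketch. The `ExecutionNode` class, the `fetch_remote` callable, and the sample file names are hypothetical and serve only as illustration; they are not taken from the patent.

```python
from typing import Callable, Dict


class ExecutionNode:
    """Hypothetical execution node with a local file cache (illustrative only)."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        # fetch_remote stands in for whatever reads a file from remote storage.
        self._fetch_remote = fetch_remote
        self.cache: Dict[str, bytes] = {}
        self.remote_reads = 0

    def get_file(self, name: str) -> bytes:
        # Cache miss: retrieve the file from remote storage and keep a local copy.
        if name not in self.cache:
            self.cache[name] = self._fetch_remote(name)
            self.remote_reads += 1
        # Hit or freshly filled: serve the file from the local cache.
        return self.cache[name]

    def process(self, name: str, query: Callable[[bytes], int]) -> int:
        # Process the query against the cached copy of the file.
        return query(self.get_file(name))


# Usage: an in-memory dict plays the role of the remote storage device.
remote_storage = {"part-0": b"a,b,c", "part-1": b"d,e"}
node = ExecutionNode(remote_storage.__getitem__)


def count_fields(data: bytes) -> int:
    return len(data.split(b","))


first = node.process("part-0", count_fields)   # miss: reads remote storage
second = node.process("part-0", count_fields)  # hit: served from the cache
```

In this sketch, the second call for the same file does not touch remote storage, which is the effect the abstract describes.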
Opening claim text (preview).
The invention claimed is:

1. A method comprising:
receiving a query directed to database data stored across a plurality of shared storage devices;
referencing a metadata store to locate a set of files that comprises data that needs to be processed to respond to the query;
referencing the metadata store to determine whether the set of files is cached among execution nodes of an execution platform comprising a plurality of execution nodes, wherein the execution platform is separate from the metadata store and the plurality of shared storage devices;
in response to determining that at least a portion of the set of files is cached among the plurality of execution nodes, assigning by one or more processors, processing of one or more of the set of files to each of one or more execution nodes that have cached at least a portion of the set of files;
for each of the one or more assigned execution nodes:
determining, by the assigned execution node, whether the assigned one or more files is stored at least in part in a cache of the assigned execution node; and
in response to the assigned execution node determining the assigned one or more files is not entirely stored in the cache of the assigned execution node:
retrieving a missing portion of the assigned one or more files from one or more remote storage devices of the plurality of remote storage devices including the missing portion of the assigned one or more files, wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them, and a virtual warehouse including the assigned execution node dynamically establishes a communication link with each of the one or more of the plurality of remote storage devices based at least in part on the query so that the assigned execution node may retrieve the missing portion;
storing, by the assigned execution node, the missing portion of the assigned one or more files in the cache of the assigned execution node so that the entire one or more files is stored in the cache of the assigned execution node, wherein a size and composition of the cache is adjusted to accommodate the missing portion of the assigned one or more files;
processing the query using the assigned one or more files stored in the cache of the assigned execution node; and
updating the metadata store to indicate the entire assigned one or more files is now cached in the cache of the assigned execution node;
wherein any of the set of files stored in the plurality of shared storage devices may be accessed by any of a plurality of execution nodes of the execution platform;
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of any of the plurality of execution nodes of the execution platform; and
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of multiple execution nodes of the plurality of execution nodes of the execution platform at one point in time; and
in response to a determination of a change in the number of execution nodes of the execution platform, wherein the change is creating a new execution node, the new execution node comprising a plurality of processors, wherein the cache varies among the plurality of processors, wherein a first subset of the plurality of processors comprises a minimal cache and a second subset of the plurality of processors comprises a cache providing faster input-output operations, reassign processing, among the changed number of execution nodes of the execution platform, of the set of files comprising data that needs to be processed to respond to the query.

2. The method of claim 1, further comprising: in response to the assigned execution node determining the assigned one or more files is entirely stored in the cache of the assigned execution node, processing, using one or more processors of the assigned execution node, the query using the assigned one or more files stored in the cache of the assigned execution node.

3. The method of claim 1, wherein updating the metadata store to indicate the entire assigned one or more files is now cached in the assigned execution node comprises updating the metadata store to identify all files that are duplicated in the cache of the assigned execution node.

4. The method of claim 1, further comprising determining, by the assigned execution node, whether to store the assigned one or more files in faster or slower memory by implementing a least recently used (LRU) algorithm.

5. The method of claim 4, wherein implementing the LRU algorithm further comprises identifying one or more copies of the assigned one or more files to be removed from the cache.

6. The method of claim 1, wherein the metadata store is separate and independently scalable from each of the resource manager, the plurality of shared storage devices, and the execution platform, and wherein the metadata store comprises a complete metadata listing of the database data stored across the plurality of shared storage devices and a complete listing of files cached in the plurality execution nodes of the execution platform.

7. The method of claim 1, wherein each execution node of the execution platform comprises a cache, wherein the cache includes a first storage portion and a second storage portion, wherein the first storage portion is significantly faster than the second storage portion.

8. The method of claim 1, wherein the query directed to the database data comprises a single instruction that is applied by the execution platform to each of the set of files substantially simultaneously.

9. The method of claim 1, wherein each execution node of the plurality of execution nodes comprises at least one processor and at least one local cache caching a copy of at least a portion of the database data.

10. The method of claim 1, further comprising: in response to the assigned execution node determining the assigned one or more files is not stored in the cache of the assigned execution node, modifying, by the assigned execution node a database data structure of the retrieved copy of the assigned one or more files prior to storing the retrieved copy in the cache.

11. The method of claim 10, wherein modifying the database data structure of the retrieved copy includes decrypting the retrieved copy.

12. The method of claim 10, wherein modifying the database data structure of the retrieved copy includes decompressing the retrieved copy.

13. A system comprising:
a plurality of shared storage devices collectively storing database data;
a metadata store separate from the plurality of shared storage devices, the metadata store comprising metadata for the database data stored across the plurality of shared storage devices; and
one or more processors operatively coupled to the metadata store, the one or more processors to:
receive a query directed to the database data stored across the plurality of shared storage devices reference the metadata store to locate a set of files that comprises data that needs to be processed to respond to the query;
reference the metadata store to determine whether the set of files is cached among execution nodes of an execution platform comprising a plurality of execution nodes, wherein the execution platform is separate from the metadata store and the plurality of shared storage devices; and
in response to determining that at least a portion of the set of files is cached among the plurality of execution nodes, assigning processing of one or more of the set of files to each of one or more execution nodes that have cached at least a portion of the set of files; for
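
The metadata-driven assignment recited in claims 1 and 13 (consult the metadata store for which nodes already cache a file, assign processing to those nodes, record new cache entries, and reassign when the node count changes) might be sketched as follows. The names `assign_files`, `mark_cached`, and `cached_on` are hypothetical, and the least-loaded tie-break is an assumption made for illustration; the claims do not specify a balancing rule.

```python
from typing import Dict, List, Set


def assign_files(
    files: List[str],
    cached_on: Dict[str, Set[str]],
    nodes: List[str],
) -> Dict[str, List[str]]:
    """Assign each file to one node, preferring nodes that already cache it."""
    assignment: Dict[str, List[str]] = {n: [] for n in nodes}
    for f in files:
        # Nodes the (hypothetical) metadata store lists as caching this file.
        holders = [n for n in nodes if n in cached_on.get(f, set())]
        # Fall back to every node when no node caches the file yet.
        candidates = holders or nodes
        # Assumed heuristic: pick the candidate with the fewest files so far.
        target = min(candidates, key=lambda n: len(assignment[n]))
        assignment[target].append(f)
    return assignment


def mark_cached(cached_on: Dict[str, Set[str]], file: str, node: str) -> None:
    """Record in the metadata store that `node` now caches `file`."""
    cached_on.setdefault(file, set()).add(node)


# Usage: f1 is cached on n2, f2 on n1, and f3 is not cached anywhere.
cached_on = {"f1": {"n2"}, "f2": {"n1"}}
plan = assign_files(["f1", "f2", "f3"], cached_on, ["n1", "n2"])

# After n1 fetches and caches f3, the metadata store is updated.
mark_cached(cached_on, "f3", "n1")

# A node is added, so the assignment is recomputed over the new node set.
replan = assign_files(["f1", "f2", "f3"], cached_on, ["n1", "n2", "n3"])
```

Recomputing the plan after a node is added keeps files on the nodes that already cache them, so the new node only receives work for files no existing node holds.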
Asynchronous replication or reconciliation · CPC title
Intra-oral devices · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
the resource being the memory · CPC title
considering hardware capabilities · CPC title