Scalable analysis platform for semi-structured data
US-2017206256-A1 · Jul 20, 2017 · US
US10169433B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10169433-B2 |
| Application number | US-201514810144-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 27, 2015 |
| Priority date | Jul 29, 2014 |
| Publication date | Jan 1, 2019 |
| Grant date | Jan 1, 2019 |
Official abstract text for this publication.
Various embodiments can include systems, methods, and non-transitory computer readable media configured to receive at least one operation to be performed using (i) first data that is managed by a first computing system and (ii) second data that is managed by a second computing system, the operation being received through an interface provided by the computing system, and wherein the operation is based at least in part on a Structured Query Language (SQL). At least one optimization can be performed based at least in part on the operation. The operation can be executed using at least the first data and the second data. A result generated can be provided upon executing the operation through the interface provided by the computing system. The computing system, the first computing system, and the second computing system are each able to concurrently process, access, and create at least a portion of the generated result.
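The abstract describes executing one SQL-based operation over data held by two independently managed systems without first merging them into a central store. The sketch below is only an illustration of that general idea, not the patented method. Everything in it is an assumption made for the example: the in-memory SQLite database standing in for the first system, the list of dictionaries standing in for a document-style second system, and the names `make_sql_source`, `ORDERS`, and `federated_totals`.

```python
import sqlite3

# Hypothetical "first computing system": a relational store (in-memory SQLite).
def make_sql_source():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
    )
    return conn

# Hypothetical "second computing system": a document-style store,
# modeled here as a plain list of dictionaries.
ORDERS = [
    {"customer_id": 1, "total": 30.0},
    {"customer_id": 2, "total": 12.5},
    {"customer_id": 1, "total": 7.5},
]

def federated_totals(conn, documents):
    """Join customer names (SQL side) with order totals (document side)."""
    # Step 1: aggregate on the document side, so that source contributes
    # a partial result instead of shipping every raw record.
    totals = {}
    for doc in documents:
        key = doc["customer_id"]
        totals[key] = totals.get(key, 0.0) + doc["total"]
    if not totals:
        return []

    # Step 2: a simple optimization, push the key filter down to the SQL
    # side so only matching rows are read from that source.
    keys = sorted(totals)
    placeholders = ",".join("?" for _ in keys)
    rows = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
        keys,
    )

    # Step 3: combine the two partial results into the final answer.
    return [(name, totals[cid]) for cid, name in rows]

if __name__ == "__main__":
    print(federated_totals(make_sql_source(), ORDERS))
```

Neither data set is copied wholesale into the other store: each side is read where it lives, and only the aggregated keys and the matching rows meet in the final step.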
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving, by a computing system, at least one operation to be performed against a first data set and a second data set, the operation being received through an interface provided by the computing system, and being based at least in part on a Structured Query Language (SQL), wherein the first data set and the second data set are at least: stored in different database types; stored in independent storage environments; and managed by independent computing systems, including at least a first computing system and a different computing system; performing, by the computing system, at least one optimization with respect to the operation; executing, by the computing system, the operation using at least the first data and the second data based on accessing the first data directly from a first storage environment and accessing the second data directly from a different storage environment; and providing, by the computing system, a generated result upon executing the operation through the interface provided by the computing system, wherein the computing system, the first computing system, and the different computing system are each able to concurrently process, access, and create at least a portion of the generated result without combining the first data set and the second data set into a centralized data set prior to providing the generated result.
2. The computer-implemented method of claim 1, wherein the first data set is associated with a SQL database type, and wherein the second data set is associated with a NoSQL database type.
3. The computer-implemented method of claim 1, wherein the first data set and the second data set correspond to at least one of: a text file, a log file, a document file, an image file, an audio file, a video file, a spreadsheet file, an information source, or an information sink.
4. The computer-implemented method of claim 1, wherein the first data set is a first database table associated with the first database type, and wherein the second data set is a second database table associated with the second database type, and wherein executing, by the computing system, the operation further comprises: joining, by the computing system, the first database table managed by the first computing system and the second database table managed by the different computing system.
5. The computer-implemented method of claim 4, the method further comprising: receiving, by the computing system, at least one second operation to modify the joined first database table and the second database table; and creating, by the computing system, a new database table based at least in part on the modification of the joined first database table and the second database table, wherein the new database table is stored in a local data store managed by the computing system.
6. The computer-implemented method of claim 5, the method further comprising: receiving, by the computing system, at least one third operation to modify the joined first database table and the second database table; and modifying, by the computing system, the new database table stored in the local data store managed by the computing system.
7. The computer-implemented method of claim 1, wherein the first data set is a first database table associated with the first database type, and wherein the second data set is a text file, and wherein executing, by the computing system, the operation further comprises: joining, by the computing system, the first database table and the text file.
8. The computer-implemented method of claim 1, wherein the result is a database table, and wherein providing, by the computing system, the result further comprises: providing, by the computing system, the result through the interface, the result being a tabular representation of the database table.
9. The computer-implemented method of claim 1, wherein the operation performs at least in part a user-defined function (UDF) with at least one of the first computing system and the second computing system, wherein at least one of the first computing system and the second computing system is remote from the computing system.
10. The computer-implemented method of claim 1, the method further comprising: inferring, by the computing system, a table schema based at least in part on the first data set, the second data set, or both.
11. The computer-implemented method of claim 1, the method further comprising: creating, by the computing system, a script that includes the at least one operation in response to a request by a user that provided the operation.
12. The computer-implemented method of claim 1, wherein the computing system, the first computing system, and the different computing system are at least one of: a server, a mobile computing device, a computing system having varying access to network connectivity, a virtualized computing system, a non-virtualized computing system, embedded computing system, or special-purpose computing system.
13. The computer-implemented method of claim 1, the method further comprising: receiving, by the computing system and through the interface, a first operation and a second operation to be performed by the computing system; performing, by the computing system, at least one optimization with respect to the first operation; executing, by the computing system, the first operation; while the first operation is executing, performing, by the computing system, at least one optimization with respect to the second operation; and executing, by the computing system, the second operation.
14. The computer-implemented method of claim 1, wherein performing, by the computing system, at least one optimization further comprises: translating, by the computing system, the operation into a single native code representation.
15. The computer-implemented method of claim 14, the method further comprising: optimizing, by the computing system, the single native code representation of the operation, wherein the optimization includes at least one of: register allocation, loop unrolling, inlining, or constant folding.
16. A computing system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the computing system to perform: receiving at least one operation to be performed on a first data set and a second data set, the operation being received through an interface provided by the computing system, and being based at least in part on a Structured Query Language (SQL), wherein the first data set and the second data set are at least; stored in different database types; stored in independent storage environments; and managed by independent computing systems, including at least a first computing system and a different computing system; performing at least one optimization based at least in part on the operation; executing the operation using at least the first data and the second data based on accessing the first data directly from a first storage environment associated with the first computing system and accessing the second data directly from a second storage environment associated with the different computing system; and providing a generated result upon executing the operation through the interface provided by the computing system, wherein the computing system, the first computing system, and the second computing system are each able to concurrently process, access, and create at least a portion of the generated result without combining the first data set and the second data s
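Claim 10 recites inferring a table schema from the data sets but the preview gives no procedure for doing so. As a rough illustration only, the sketch below shows one common way such inference can work for semi-structured records: scan each record, assign a column type per field, and widen the type when records disagree. The type names, the widening rules, and the functions `infer_type`, `merge_types`, and `infer_schema` are all assumptions made for this example and are not taken from the patent.

```python
def infer_type(value):
    """Map one Python value to a hypothetical SQL-like column type."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "REAL"
    if isinstance(value, str):
        return "TEXT"
    return "JSON"  # nested lists or objects

def merge_types(a, b):
    """Widen two observed types into one column type (assumed rules)."""
    if a == b:
        return a
    if a == "NULL":
        return b
    if b == "NULL":
        return a
    if {a, b} == {"INTEGER", "REAL"}:
        return "REAL"
    return "TEXT"  # fallback: anything can be represented as text

def infer_schema(records):
    """Return a {column: type} mapping covering every field seen."""
    schema = {}
    for record in records:
        for key, value in record.items():
            observed = infer_type(value)
            if key in schema:
                schema[key] = merge_types(schema[key], observed)
            else:
                schema[key] = observed
    return schema

if __name__ == "__main__":
    sample = [
        {"id": 1, "score": 2},
        {"id": 2, "score": 3.5, "tag": "x"},
        {"id": 3, "tag": None},
    ]
    print(infer_schema(sample))
```

Fields missing from some records simply appear as columns whose values would be null for those rows, which is one reason a widening step is useful for log files and document stores.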
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
Relational databases · CPC title
Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title
of operators · CPC title
Physics · mapped topic