US11394794B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11394794-B2 |
| Application number | US-201916656486-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 17, 2019 |
| Priority date | Oct 18, 2018 |
| Publication date | Jul 19, 2022 |
| Grant date | Jul 19, 2022 |
Official abstract text for this publication.
A system, method, and computer program product are described that implement high-volume data ingestion in a relational database system. A middle-tier structure sits between the IoT data producers and the back-end database system. Data records are gathered and organized at the middle tier, and groups of those records are ingested on a group basis into the database in a manner that bypasses standard SQL engine processing.
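The grouping behaviour described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `MiddleTierBuffer`, the `group_size` parameter, and the `bulk_load` callback are hypothetical names chosen for illustration, and the callback only stands in for the database-side insert that bypasses the SQL engine, which is not modeled here.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Record = dict


class MiddleTierBuffer:
    """Hypothetical middle-tier buffer: collects records and forwards them in groups."""

    def __init__(self, bulk_load: Callable[[Sequence[Record]], None], group_size: int = 3):
        self._queue: Deque[Record] = deque()
        self._bulk_load = bulk_load      # assumed stand-in for the group-wise insert
        self._group_size = group_size    # assumed grouping rule: fixed record count

    def submit(self, record: Record) -> None:
        """Queue one record from a data producer; flush once a full group is queued."""
        self._queue.append(record)
        if len(self._queue) >= self._group_size:
            self.flush()

    def flush(self) -> None:
        """Hand every queued record to the loader as a single group."""
        if not self._queue:
            return
        group = list(self._queue)
        self._queue.clear()
        self._bulk_load(group)


# Usage: record each group that would have been sent to the database.
loaded_groups: List[List[Record]] = []
buffer = MiddleTierBuffer(bulk_load=loaded_groups.append, group_size=3)
for i in range(7):
    buffer.submit({"sensor": "s1", "reading": i})
buffer.flush()  # push the final partial group
```

In this sketch the producers never talk to the database directly; they only call `submit`, and the database sees whole groups rather than individual rows.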
Opening claim text (preview).
What is claimed is:
1. A method for implementing high-volume ingestion of data into a relational database system, comprising: implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS); maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues; identifying a set of data within the one or more queues for ingestion into the RDBMS; acquiring a connection from a connection pool to process the set of data; transferring the set of data to the RDBMS with the connection acquired from the connection pool; and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.
2. The method of claim 1, wherein the RDBMS comprises a sharded database, the one or more queues comprises a first queue that corresponds to a first shard and a second queue that corresponds to a second shard, wherein a first data record generated by the data producers that corresponds to the first shard is placed into the first queue, and a second data record generated by the data producers that corresponds to the second shard is placed into the second queue.
3. The method of claim 2, wherein a shard in the sharded database comprises a plurality of partitions, the first queue corresponds to a first partition in the first shard and the second queue corresponds to a second partition in the second shard, wherein the first data record generated by the data producers that corresponds to the first partition in the first shard is placed into the first queue, and the second data record generated by the data producers that corresponds to the second partition in the second shard is placed into the second queue.
4. The method of claim 3, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key identifies the first queue as corresponding to the first data record.
5. The method of claim 3, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key is used as a locality key that identifies an entry in a routing table to acquire the connection from the connection pool appropriate for the first shard.
6. The method of claim 1, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a volume of data within a queue has reached a size threshold.
7. The method of claim 1, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a time threshold has been reached for the data records within a queue.
8. The method of claim 1, wherein the set of data is inserted into the table within the RDBMS by formatting a new data block that includes the set of data, and placing the new data block into a datafile corresponding to the table.
9. The method of claim 1, wherein the bypass of the SQL processing engine to insert the set of data into the table within the RDBMS is performed without generating log records on a row-by-row basis in the RDBMS.
10. A system for implementing high-volume ingestion of data into a relational database system, comprising: a processor; a memory for holding programmable code; and wherein the programmable code includes instructions for implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS), maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues, identifying a set of data within the one or more queues for ingestion into the RDBMS, acquiring a connection from a connection pool to process the set of data, transferring the set of data to the RDBMS with the connection acquired from the connection pool, and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.
11. The system of claim 10, wherein the RDBMS comprises a sharded database, the one or more queues comprises a first queue that corresponds to a first shard and a second queue that corresponds to a second shard, wherein a first data record generated by the data producers that corresponds to the first shard is placed into the first queue, and a second data record generated by the data producers that corresponds to the second shard is placed into the second queue.
12. The system of claim 11, wherein a shard in the sharded database comprises a plurality of partitions, the first queue corresponds to a first partition in the first shard and the second queue corresponds to a second partition in the second shard, wherein the first data record generated by the data producers that corresponds to the first partition in the first shard is placed into the first queue, and the second data record generated by the data producers that corresponds to the second partition in the second shard is placed into the second queue.
13. The system of claim 12, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key identifies the first queue as corresponding to the first data record.
14. The system of claim 12, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key is used as a locality key that identifies an entry in a routing table to acquire the connection from the connection pool appropriate for the first shard.
15. The system of claim 10, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a volume of data within a queue has reached a size threshold.
16. The system of claim 10, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a time threshold has been reached for the data records within a queue.
17. The system of claim 10, wherein the set of data is inserted into the table within the RDBMS by formatting a new data block that includes the set of data, and placing the new data block into a datafile corresponding to the table.
18. The system of claim 10, wherein the bypass of the SQL processing engine to insert the set of data into the table within the RDBMS is performed without generating log records on a row-by-row basis in the RDBMS.
19. A computer program product embodied on a computer readable medium, the computer readable medium having stored thereon a sequence of instructions which, when executed by a processor, executes a method for implementing high-volume ingestion of data into a relational database system, comprising: implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS); maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues; identifying a set of data within the one or more queues for ingestion into the RDBMS; acquiring a connection from a connection pool to process the set of data; transferring the set of data to the RDBMS with the connection acquired from the connection pool; and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.
20. The computer program product of claim 1
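The dependent claims on sharding, keys, and thresholds (claims 2 to 7) can be read together as a routing and batching scheme. The following is only a sketch under stated assumptions, not the claimed implementation: the modulo key-to-partition mapping, the class `ShardedIngestRouter`, the pool names in `routing_table`, and the threshold values are all hypothetical, and connection pools are represented by plain strings rather than real database connections.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

QueueId = Tuple[int, int]  # (shard, partition within that shard)


@dataclass
class PartitionQueue:
    records: List[bytes] = field(default_factory=list)
    size_bytes: int = 0
    oldest_at: Optional[float] = None  # arrival time of the oldest queued record


class ShardedIngestRouter:
    """Hypothetical router: one queue per (shard, partition), drained on thresholds."""

    def __init__(self, num_shards: int, partitions_per_shard: int,
                 max_bytes: int, max_age_s: float, routing_table: Dict[int, str]):
        self.num_shards = num_shards
        self.partitions_per_shard = partitions_per_shard
        self.max_bytes = max_bytes          # size threshold (claim 6)
        self.max_age_s = max_age_s          # time threshold (claim 7)
        self.routing_table = routing_table  # shard -> pool name (assumed layout)
        self.queues: Dict[QueueId, PartitionQueue] = {}

    def queue_for(self, key: int) -> QueueId:
        """Map a record key to its queue; the modulo scheme is an assumption."""
        index = key % (self.num_shards * self.partitions_per_shard)
        return index // self.partitions_per_shard, index % self.partitions_per_shard

    def put(self, key: int, payload: bytes, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        q = self.queues.setdefault(self.queue_for(key), PartitionQueue())
        if q.oldest_at is None:
            q.oldest_at = now
        q.records.append(payload)
        q.size_bytes += len(payload)

    def ready(self, now: Optional[float] = None) -> List[QueueId]:
        """Queues whose size or age threshold has been reached."""
        now = time.monotonic() if now is None else now
        return [qid for qid, q in self.queues.items()
                if q.records and (q.size_bytes >= self.max_bytes
                                  or now - q.oldest_at >= self.max_age_s)]

    def drain(self, qid: QueueId) -> Tuple[str, List[bytes]]:
        """Remove a queue's records and name the pool for its shard."""
        q = self.queues.pop(qid)
        return self.routing_table[qid[0]], q.records


# Usage with explicit timestamps so the behaviour is deterministic.
router = ShardedIngestRouter(2, 2, max_bytes=8, max_age_s=5.0,
                             routing_table={0: "pool-shard-0", 1: "pool-shard-1"})
router.put(0, b"aaaa", now=0.0)
router.put(0, b"bbbb", now=1.0)   # queue (0, 0) reaches the size threshold
router.put(6, b"cc", now=1.0)     # key 6 maps to queue (1, 0)
by_size = router.ready(now=2.0)
first = router.drain(by_size[0])
by_age = router.ready(now=6.0)    # queue (1, 0) is now 5 seconds old
second = router.drain(by_age[0])
```

Keeping one queue per partition means every drained group targets a single shard, so a single connection from that shard's pool can carry the whole group.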
Grouping or aggregating service requests, e.g. for unified processing · CPC title
Message passing systems or structures, e.g. queues · CPC title
Relational databases · CPC title
Storing data temporarily at an intermediate stage, e.g. caching · CPC title
Tablespace storage structures; Management thereof · CPC title