Fast ingestion of records in a database using data locality and queuing

US11394794B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11394794-B2
Application number  US-201916656486-A
Country             US
Kind code           B2
Filing date         Oct 17, 2019
Priority date       Oct 18, 2018
Publication date    Jul 19, 2022
Grant date          Jul 19, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described is a system, method, and computer program product is provided that implements high-volume data ingestion in a relational database system. A middle-tier structure is provided that sits between the IoT data producers and the back-end database system. Data records are gathered together and organized at the middle tier, and groups of those records are ingested on a group-basis into the database in a manner which bypasses standard SQL engine processing.
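As a rough illustration of the group-basis idea in the abstract, the following minimal Python sketch buffers incoming records at a middle tier and hands them on one group at a time. It is not taken from the patent: the class name `MiddleTierBuffer`, the `bulk_load` callable, and the `group_size` parameter are all hypothetical, and `bulk_load` merely stands in for whatever ingest path the database actually exposes.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Record = Dict[str, object]


class MiddleTierBuffer:
    """Collects records from producers and passes them on in groups."""

    def __init__(self, bulk_load: Callable[[List[Record]], None], group_size: int = 3):
        self._pending: Deque[Record] = deque()
        self._bulk_load = bulk_load    # hypothetical stand-in for the group ingest path
        self._group_size = group_size  # assumed grouping rule: fixed record count

    def submit(self, record: Record) -> None:
        """Called by a data producer; the record waits until a group is full."""
        self._pending.append(record)
        if len(self._pending) >= self._group_size:
            self.flush()

    def flush(self) -> None:
        """Hand every pending record to the loader as a single group."""
        if not self._pending:
            return
        group = list(self._pending)
        self._pending.clear()
        self._bulk_load(group)  # one call per group, not one call per record


# Usage: the "database" here is just a list that remembers each group it received.
loaded_groups: List[List[Record]] = []
buffer = MiddleTierBuffer(bulk_load=loaded_groups.append, group_size=3)
for i in range(7):
    buffer.submit({"sensor": "s1", "reading": i})
buffer.flush()  # push out the final, partially filled group
```

Seven submitted records reach the loader in three calls rather than seven, which is the point of grouping at the middle tier.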

First claim

Opening claim text (preview).

What is claimed is:

1. A method for implementing high-volume ingestion of data into a relational database system, comprising: implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS); maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues; identifying a set of data within the one or more queues for ingestion into the RDBMS; acquiring a connection from a connection pool to process the set of data; transferring the set of data to the RDBMS with the connection acquired from the connection pool; and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.

2. The method of claim 1, wherein the RDBMS comprises a sharded database, the one or more queues comprises a first queue that corresponds to a first shard and a second queue that corresponds to a second shard, wherein a first data record generated by the data producers that corresponds to the first shard is placed into the first queue, and a second data record generated by the data producers that corresponds to the second shard is placed into the second queue.

3. The method of claim 2, wherein a shard in the sharded database comprises a plurality of partitions, the first queue corresponds to a first partition in the first shard and the second queue corresponds to a second partition in the second shard, wherein the first data record generated by the data producers that corresponds to the first partition in the first shard is placed into the first queue, and the second data record generated by the data producers that corresponds to the second partition in the second shard is placed into the second queue.

4. The method of claim 3, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key identifies the first queue as corresponding to the first data record.

5. The method of claim 3, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key is used as a locality key that identifies an entry in a routing table to acquire the connection from the connection pool appropriate for the first shard.

6. The method of claim 1, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a volume of data within a queue has reached a size threshold.

7. The method of claim 1, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a time threshold has been reached for the data records within a queue.

8. The method of claim 1, wherein the set of data is inserted into the table within the RDBMS by formatting a new data block that includes the set of data, and placing the new data block into a datafile corresponding to the table.

9. The method of claim 1, wherein the bypass of the SQL processing engine to insert the set of data into the table within the RDBMS is performed without generating log records on a row-by-row basis in the RDBMS.

10. A system for implementing high-volume ingestion of data into a relational database system, comprising: a processor; a memory for holding programmable code; and wherein the programmable code includes instructions for implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS), maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues, identifying a set of data within the one or more queues for ingestion into the RDBMS, acquiring a connection from a connection pool to process the set of data, transferring the set of data to the RDBMS with the connection acquired from the connection pool, and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.

11. The system of claim 10, wherein the RDBMS comprises a sharded database, the one or more queues comprises a first queue that corresponds to a first shard and a second queue that corresponds to a second shard, wherein a first data record generated by the data producers that corresponds to the first shard is placed into the first queue, and a second data record generated by the data producers that corresponds to the second shard is placed into the second queue.

12. The system of claim 11, wherein a shard in the sharded database comprises a plurality of partitions, the first queue corresponds to a first partition in the first shard and the second queue corresponds to a second partition in the second shard, wherein the first data record generated by the data producers that corresponds to the first partition in the first shard is placed into the first queue, and the second data record generated by the data producers that corresponds to the second partition in the second shard is placed into the second queue.

13. The system of claim 12, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key identifies the first queue as corresponding to the first data record.

14. The system of claim 12, wherein the first data record is associated with a key that corresponds to the first partition in the first shard, wherein the key is used as a locality key that identifies an entry in a routing table to acquire the connection from the connection pool appropriate for the first shard.

15. The system of claim 10, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a volume of data within a queue has reached a size threshold.

16. The system of claim 10, wherein the set of data within the one or more queues is identified for ingestion into the RDBMS by determining that a time threshold has been reached for the data records within a queue.

17. The system of claim 10, wherein the set of data is inserted into the table within the RDBMS by formatting a new data block that includes the set of data, and placing the new data block into a datafile corresponding to the table.

18. The system of claim 10, wherein the bypass of the SQL processing engine to insert the set of data into the table within the RDBMS is performed without generating log records on a row-by-row basis in the RDBMS.

19. A computer program product embodied on a computer readable medium, the computer readable medium having stored thereon a sequence of instructions which, when executed by a processor, executes a method for implementing high-volume ingestion of data into a relational database system, comprising: implementing a database architecture having a middle tier server that is located between data producers and a relational database management system (RDBMS); maintaining one or more queues at the middle tier server, wherein data records generated by the data producers are placed into the one or more queues; identifying a set of data within the one or more queues for ingestion into the RDBMS; acquiring a connection from a connection pool to process the set of data; transferring the set of data to the RDBMS with the connection acquired from the connection pool; and inserting the set of data into a table within the RDBMS with a bypass of a SQL processing engine.

20. The computer program product of claim 1
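The dependent claims combine several mechanisms: one queue per shard and partition, a key that selects both the queue and a routing-table entry, size and time thresholds that trigger ingestion, and a per-shard connection pool. The Python sketch below shows one way these pieces could fit together. It is an illustration under stated assumptions, not the patented implementation: `ShardedIngestor`, `FakeConnection`, `direct_load`, the `readings` table name, and the modulo partitioning rule are all hypothetical, and `direct_load` only records the rows it receives instead of performing a real SQL-bypassing insert.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
QueueKey = Tuple[str, int]  # (shard name, partition number)


class FakeConnection:
    """Hypothetical connection that only remembers what it was asked to load."""

    def __init__(self, shard: str):
        self.shard = shard
        self.loaded: List[List[Record]] = []

    def direct_load(self, table: str, rows: List[Record]) -> None:
        # Stand-in for a group insert that skips per-row SQL processing.
        self.loaded.append(rows)


class ShardedIngestor:
    def __init__(self, routing_table: Dict[int, str],
                 pools: Dict[str, List[FakeConnection]],
                 size_threshold: int, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.routing_table = routing_table  # partition number -> shard name
        self.pools = pools                  # shard name -> idle connections
        self.size_threshold = size_threshold
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.queues: Dict[QueueKey, List[Record]] = defaultdict(list)
        self.oldest: Dict[QueueKey, float] = {}  # arrival time of first queued record

    def enqueue(self, key: int, record: Record) -> None:
        partition = key % len(self.routing_table)  # assumed partitioning rule
        queue_key = (self.routing_table[partition], partition)
        queue = self.queues[queue_key]
        if not queue:
            self.oldest[queue_key] = self.clock()
        queue.append(record)
        if len(queue) >= self.size_threshold:  # size threshold reached
            self.flush(queue_key)

    def flush_expired(self) -> None:
        now = self.clock()
        for queue_key, started in list(self.oldest.items()):
            if now - started >= self.max_age_seconds:  # time threshold reached
                self.flush(queue_key)

    def flush(self, queue_key: QueueKey) -> None:
        shard, _partition = queue_key
        rows = self.queues.pop(queue_key, [])
        self.oldest.pop(queue_key, None)
        if not rows:
            return
        conn = self.pools[shard].pop()  # acquire a connection for this shard
        try:
            conn.direct_load("readings", rows)
        finally:
            self.pools[shard].append(conn)  # return it to the pool


# Usage with a controllable clock so the time threshold can be exercised.
now = [0.0]
conn_a, conn_b = FakeConnection("shard_a"), FakeConnection("shard_b")
ingestor = ShardedIngestor(
    routing_table={0: "shard_a", 1: "shard_b"},
    pools={"shard_a": [conn_a], "shard_b": [conn_b]},
    size_threshold=2, max_age_seconds=5.0, clock=lambda: now[0])
ingestor.enqueue(10, {"device": 10, "temp": 20.5})  # partition 0, shard_a
ingestor.enqueue(11, {"device": 11, "temp": 19.0})  # partition 1, shard_b
ingestor.enqueue(12, {"device": 12, "temp": 21.0})  # partition 0 hits the size threshold
now[0] = 6.0
ingestor.flush_expired()  # the shard_b queue is now older than 5 seconds
```

In this run the two records keyed to partition 0 are loaded together over the `shard_a` connection once the size threshold is met, while the single `shard_b` record is loaded only after the time threshold passes. Injecting the clock is a testing convenience and not something the claims describe.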

Assignees

Oracle Int Corp

Inventors

Classifications

  • H04L67/566 (Primary)

    Grouping or aggregating service requests, e.g. for unified processing · CPC title

  • Message passing systems or structures, e.g. queues · CPC title

  • Relational databases · CPC title

  • Storing data temporarily at an intermediate stage, e.g. caching · CPC title

  • Tablespace storage structures; Management thereof · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11394794B2 cover?
Described is a system, method, and computer program product that implements high-volume data ingestion in a relational database system. A middle-tier structure is provided that sits between the IoT data producers and the back-end database system. Data records are gathered together and organized at the middle tier, and groups of those records are ingested on a group-basis into the da…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification H04L67/566. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jul 19, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).