Database Logging Using Storage Class Memory
US-2017004317-A1 · Jan 5, 2017 · US
US11640410B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11640410-B1 |
| Application number | US-201514957421-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 2, 2015 |
| Priority date | Dec 2, 2015 |
| Publication date | May 2, 2023 |
| Grant date | May 2, 2023 |
Official abstract text for this publication.
Data replication groups may be used to store data in a distributed computing environment. The data replication groups may include a set of nodes executing a consensus protocol to maintain data durably. The nodes of the data replication groups may generate logs containing information corresponding to committed operations performed by the nodes. These logs may be collected and processed to obtain useful information corresponding to the operation of the data replication group. Furthermore, this processed information may be provided in the form of a stream to enable event-driven operations corresponding to the logs.
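As a rough illustration of the flow the abstract describes (collect logs, reduce them, publish events to subscribers), the following Python sketch filters a batch of log entries and pushes the remainder onto an in-memory stream. It is a minimal sketch under stated assumptions, not the patented implementation: the `LogEntry` fields, the operation names (`put`, `heartbeat`), the `EventStream` class, and the `key_value_changed` event type are all hypothetical and do not come from the patent text shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class LogEntry:
    """One log record from a node (hypothetical shape, assumed for illustration)."""
    node_id: str    # node that wrote the entry
    sequence: int   # position of the operation in the replicated log
    op: str         # assumed operation names: "put", "delete", "heartbeat"
    key: str = ""
    value: str = ""


def reduce_batch(batch: Iterable[LogEntry]) -> List[LogEntry]:
    """Drop heartbeat entries and entries that duplicate one already seen.

    Two entries count as duplicates when they describe the same operation
    at the same sequence number, even if different nodes logged them.
    """
    seen = set()
    kept = []
    for entry in batch:
        if entry.op == "heartbeat":
            continue  # health/liveness checks carry no data change
        identity = (entry.sequence, entry.op, entry.key, entry.value)
        if identity in seen:
            continue  # same committed operation reported by another node
        seen.add(identity)
        kept.append(entry)
    return sorted(kept, key=lambda e: e.sequence)


class EventStream:
    """In-memory stand-in for a stream service (assumption, not a real API)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Dict], None]] = []

    def subscribe(self, callback: Callable[[Dict], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: Dict) -> None:
        for callback in self._subscribers:
            callback(event)


def process_batch(batch: Iterable[LogEntry], stream: EventStream) -> List[LogEntry]:
    """Reduce the batch, then publish one event per remaining entry."""
    reduced = reduce_batch(batch)
    for entry in reduced:
        stream.publish({
            "type": "key_value_changed",  # hypothetical event type
            "sequence": entry.sequence,
            "op": entry.op,
            "key": entry.key,
            "value": entry.value,
        })
    return reduced
```

A subscriber would register a callback with `subscribe` and then receive one event per retained entry each time `process_batch` runs. Identifying duplicates by sequence number and operation content is a design choice of this sketch; a real system could use any stable identifier for a committed operation.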
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: receiving a subscription request to subscribe to an event stream associated with log events of one or more data replication groups, the one or more data replication groups comprising a plurality of nodes configured to store data replicated across the plurality of nodes, with each of the nodes being a separate virtual computing process running on a different separate physical host and implementing a consensus protocol enabling data replication between the plurality of nodes; obtaining a batch of logs containing committed operations performed by the plurality of nodes defining a data replication group of the one or more data replication groups, the batch of logs having information sufficient to: generate a timeline of a plurality of updates to values of a record in a key-value store of at least one node of the plurality of nodes over a time period such that the values of a plurality of keys of the key-value store can be determined at an arbitrary point in the time period based on the timeline; and reconstruct the data replication group of the one or more data replication groups at the arbitrary point in the time period by replaying a plurality of state-machine transitions that occurred at the plurality of nodes during execution of the data replication group defined by the plurality of nodes, the replaying of the plurality of state-machine transitions of the plurality of nodes including replaying the plurality of updates to values of a record in the key-value store of at least one node of the plurality of nodes; providing the batch of logs to a log collector; modifying the batch of logs to reduce the amount of duplicative information contained in the batch of logs by removing redundant log entries, and by removing entries corresponding to periodic health checks of the node and liveliness of the node; generating one or more events based at least in part on the modified batch of logs; and making available the generated one or more events to the event stream such that a subscriber obtains information corresponding to the generated one or more events.
2. The computer-implemented method of claim 1, further comprising transmitting a command to the node, the command configured to cause the node to delete the batch of logs maintained locally by the node.
3. The computer-implemented method of claim 1, wherein modifying the batch of logs is performed by a log processor provided by the subscriber, where the log processor is a script configured to remove log entries from the batch of logs as indicated by the subscriber.
4. The computer-implemented method of claim 1, wherein the one or more events correspond to a change in key-value information.
5. The computer-implemented method of claim 1, wherein modifying the batch of logs to reduce the amount of duplicative information contained in the batch of logs by removing redundant log entries further comprises retaining operations performed by the node comprising updates to values maintained as a key-value pair in the data replication group.
6. The computer-implemented method of claim 1, wherein the batch of logs contains committed operations performed by a plurality of nodes of the data replication group, and wherein the redundant log entries comprise duplicative entries from different nodes of the plurality of nodes.
7. A system, comprising: one or more processors; and memory that includes instructions that, when executed by the one or more processors, cause the system to: obtain a batch of logs containing operations performed by at least two nodes of a data replication group executed by a computer system, the at least two nodes being separate virtual computing processes running on different physical hosts and implementing a consensus protocol enabling data replication between a plurality of nodes including the at least two nodes, the batch of logs having information sufficient to: generate a timeline of a plurality of updates to values of a record in a key-value store of at least one node of the plurality of nodes over a time period such that the values of a plurality of keys of the key-value store can be determined at a point in the time period based on the timeline; and restore the data replication group at the point in the time period by replaying a plurality of state-machine transitions that occurred during execution of the data replication group including state-machine transitions that occurred at the at least two nodes, the replaying of the plurality of state-machine transitions including replaying a plurality of updates to the key-value store of at least one of the at least two nodes; remove redundant log entries and entries corresponding to heartbeat operations from the batch of logs, wherein the entries corresponding to the heartbeat operations from the batch of logs further comprise entries indicating at least one of a health status or a liveness of the node; provide the batch of logs to a stream service; and cause the stream service to make data available by at least: generating one or more events based at least in part on the provided batch of logs; and transmitting the generated one or more events to one or more subscribers of the stream service.
8. The system of claim 7, wherein the memory further includes instructions that, when executed by the one or more processors, cause the system to process the batch of logs to retain operations performed by the node comprising updates to values maintained in as a key-value pair in the data replication group.
9. The system of claim 7, wherein the memory further includes instructions that, when executed by the one or more processors, cause the system to remove redundant log entries and entries corresponding to heartbeat operations from the batch of logs further comprise executable code provided by a client of the stream service.
10. The system of claim 7, wherein the memory further includes instructions that, when executed by the one or more processors, cause the system to, as a result of successfully obtaining the batch of logs, transmit a command to a log pusher to trim a log maintained by the node.
11. The system of claim 7, wherein the memory further includes instructions that, when executed by the one or more processors, cause the system to store the batch of logs in a remote storage system.
12. The system of claim 11, wherein the remote storage system is configured such that the batch of logs maintained in the remote storage system is queryable.
13. The system of claim 11, wherein obtaining the batch of logs further comprises obtaining the batch of logs from the remote storage system prior to removing redundant log entries and entries corresponding to heartbeat operations from the batch of logs.
14. A non-transitory computer-readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least: receive a request to subscribe to an event stream associated with one or more data replication groups that comprise a plurality of nodes that are separate virtual computing processes running on different physical hosts and enabling data replication between the plurality of nodes; obtain a batch of logs from memory of a node of the one or more data replication groups, where the log contains information associated with operations performed by the node, the batch of logs having information sufficient to: generate a timeline of one or more of updates to one or more values of a record in a key-value store of at least one node of plurality of nodes over a time period su
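The claims require that the logs carry enough information to build a timeline of key-value updates and to determine the store's values at an arbitrary point in time by replaying those updates. The Python sketch below illustrates that idea only. It is a minimal sketch assuming each committed update carries a timestamp; the `Update` type, the use of `None` to model a delete, and the `state_at` function are hypothetical names introduced here, not terms from the patent.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Update(NamedTuple):
    """One committed key-value update (hypothetical shape)."""
    timestamp: float
    key: str
    value: Optional[str]  # None models a delete in this sketch


def state_at(timeline: Iterable[Update], point: float) -> Dict[str, str]:
    """Replay updates in time order and return the store as of `point`.

    Updates with a timestamp later than `point` are ignored, so the same
    timeline can answer queries for any moment in the covered period.
    """
    state: Dict[str, str] = {}
    for update in sorted(timeline, key=lambda u: u.timestamp):
        if update.timestamp > point:
            break  # everything after this is in the future of `point`
        if update.value is None:
            state.pop(update.key, None)
        else:
            state[update.key] = update.value
    return state


timeline = [
    Update(1.0, "a", "1"),
    Update(2.0, "b", "x"),
    Update(3.0, "a", "2"),
    Update(4.0, "b", None),
]
print(state_at(timeline, 2.5))
```

Replaying from the start of the log on every query keeps the sketch short. A practical system would more likely replay from a periodic snapshot, but the patent text shown here does not say which approach is used.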
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
Change logging, detection, and notification (replication G06F16/27) · CPC title