Automatic archiving of data store log data

US11238008B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11238008-B2
Application number | US-201916563676-A
Country | US
Kind code | B2
Filing date | Sep 6, 2019
Priority date | May 14, 2015
Publication date | Feb 1, 2022
Grant date | Feb 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and computer-readable media for automatic archiving of data store log data are disclosed. One or more operation records in a log are selected for archival. The one or more operation records comprise data indicative of operations performed on one or more data objects of a data store. The one or more operation records are selected for archival prior to deletion from the log. The one or more operation records are replicated from the log to an archive. Based at least in part on the replicating, the one or more operation records in the log are marked as archived. Based at least in part on the marking as archived, the deletion of the one or more operation records from the log is permitted.
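
The ordering described in the abstract (replicate first, mark as archived second, permit deletion last) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `OperationRecord`, `Log`, `ArchiveState`, and `archive_records`, the in-memory dictionaries standing in for the log and the archive, and the sequence numbers are not taken from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum


class ArchiveState(Enum):
    UNARCHIVED = "unarchived"   # deletion from the log is postponed
    ARCHIVED = "archived"       # deletion from the log is permitted


@dataclass
class OperationRecord:
    seq: int          # hypothetical sequence number within the log
    key: str          # data object the operation touched
    op: str           # kind of operation, e.g. "put" or "delete"
    state: ArchiveState = ArchiveState.UNARCHIVED


@dataclass
class Log:
    records: dict = field(default_factory=dict)   # seq -> OperationRecord

    def append(self, record):
        self.records[record.seq] = record

    def delete(self, seq):
        """Delete a record only if it has been marked as archived."""
        record = self.records[seq]
        if record.state is not ArchiveState.ARCHIVED:
            return False          # deletion postponed
        del self.records[seq]
        return True


def archive_records(log, archive, seqs):
    """Replicate the selected records to the archive, then mark them archived."""
    for seq in seqs:
        record = log.records[seq]
        archive[seq] = (record.key, record.op)   # replicate first
        record.state = ArchiveState.ARCHIVED     # mark only after the copy exists


log = Log()
log.append(OperationRecord(seq=1, key="user/42", op="put"))
log.append(OperationRecord(seq=2, key="user/42", op="delete"))

archive = {}
blocked = log.delete(1)            # False: record 1 is still unarchived
archive_records(log, archive, [1])
deleted = log.delete(1)            # True: record 1 was replicated and marked
still_blocked = log.delete(2)      # False: record 2 was never archived
```

In this sketch the log refuses deletion until a record carries the archived mark, so a record cannot leave the log before a copy of it exists in the archive.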

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more computing devices configured to implement a data store archiving system, wherein the data store archiving system is configured to: select, for archival, one or more operation records in a log, wherein the one or more operation records comprise data indicative of operations performed on one or more data objects of a data store, and wherein the one or more operation records are selected for archival prior to deletion from the log; assign, to one or more workers, one or more archiving jobs comprising data indicative of the one or more operation records selected for archival; flag the one or more operation records as unarchived; postpone the deletion of the one or more operation records from the log based at least in part on the flagging the one or more operation records as unarchived; replicate, subsequent to flagging the one or more operation records as unarchived, the one or more operation records from the log to an archive, wherein the one or more operation records are replicated from the log to the archive by the one or more workers; generate metadata indicative of a mapping between one or more of the data objects referenced in the one or more operation records and one or more locations of the one or more operation records in the archive; and mark the one or more operation records for deletion from the log after the one or more operation records are replicated to the archive.

2. The system as recited in claim 1, wherein the data store archiving system is further configured to: send, to a client, at least a portion of the metadata, wherein at least a portion of the one or more operation records are retrieved by the client from the archive using the at least a portion of the metadata.

3. The system as recited in claim 1, wherein the one or more operation records are published from the data store to the log using a durable log publisher, and wherein the log comprises a plurality of replication nodes of a directed acyclic graph.

4. The system as recited in claim 1, wherein the one or more operation records replicated to the archive are stored in the archive without respective expiration times.

5. A computer-implemented method, comprising: selecting, for archival, one or more operation records in a log, wherein the one or more operation records comprise data indicative of operations performed on one or more data objects of a data store, and wherein the one or more operation records are selected for archival prior to deletion from the log; marking the one or more operation records as unarchived; postponing the deletion of the one or more operation records from the log based at least in part on the marking the one or more operation records as unarchived; replicating, subsequent to marking the one or more operation records as unarchived, the one or more operation records from the log to an archive; marking the one or more operation records in the log as archived based at least in part on the replicating the one or more operation records from the log to the archive; and permitting deletion of the one or more operation records from the log based at least in part on the marking the one or more operation records in the log as archived.

6. The method as recited in claim 5, further comprising: assigning, to one or more workers, one or more archiving jobs comprising data indicative of the one or more operation records selected for archival, wherein the one or more operation records are replicated from the log to the archive using the one or more workers; and maintaining a centralized record of a respective status of the one or more archiving jobs.

7. The method as recited in claim 5, further comprising: generating metadata indicative of a mapping between one or more of the data objects referenced in the one or more operation records and one or more locations of the one or more operation records in the archive.

8. The method as recited in claim 7, further comprising: sending, to a client, at least a portion of the metadata, wherein at least a portion of the one or more operation records are retrieved by the client from the archive using the at least a portion of the metadata.

9. The method as recited in claim 5, wherein a particular shard comprises the one or more operation records, and wherein the particular shard is replicated from the log to the archive based at least in part on a marking of the particular shard as read-only.

10. The method as recited in claim 5, wherein the one or more operation records are replicated from the log to the archive based at least in part on an imminent deletion from the log.

11. The method as recited in claim 5, wherein the one or more operation records are replicated from the log to the archive based at least in part on the addition of the one or more operation records to the log.

12. The method as recited in claim 5, wherein the one or more operation records in the log are selected for archival based at least in part on membership in a particular key space specified for archival.

13. The method as recited in claim 5, wherein the one or more operation records are published from the data store to the log using a durable log publisher, and wherein the log comprises a plurality of replication nodes of a directed acyclic graph.

14. The method as recited in claim 5, wherein the one or more operation records replicated to the archive are stored in the archive without respective expiration times.

15. One or more non-transitory computer-readable storage media storing program instructions computer-executable on or across one or more processors to perform: selecting, for archival, one or more operation records in a log, wherein the one or more operation records comprise data indicative of operations performed on one or more data objects of a data store, and wherein the one or more operation records are selected for archival prior to deletion from the log; marking the one or more operation records as unarchived; postponing the deletion of the one or more operation records from the log based at least in part on the marking the one or more operation records as unarchived; replicating, subsequent to marking the one or more operation records as unarchived, the one or more operation records from the log to an archive; generating metadata indicative of a mapping between one or more of the data objects referenced in the one or more operation records and one or more locations of the one or more operation records in the archive; and marking the one or more operation records in the log as archived based at least in part on the replicating the one or more operation records from the log to the archive; and causing deletion of the one or more operation records from the log based at least in part on the marking the one or more operation records in the log as archived.

16. The one or more non-transitory computer-readable storage media as recited in claim 15, wherein the program instructions are further computer-executable to perform: assigning, to one or more workers, one or more archiving jobs comprising data indicative of the one or more operation records selected for archival, wherein the one or more operation records are replicated from the log to the archive using the one or more workers; and maintaining a centralized record of a respective status of the one or more archiving jobs.

17. The one or more non-transitory computer-readable storage media as recited in claim 15, wherein the program instructions are further computer-executable to perform: sending, to a client, at l
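
Several claims combine archiving jobs assigned to workers, a centralized record of job status, selection by key space, and metadata that maps data objects to locations in the archive. The Python sketch below shows one way those pieces could fit together. It is a rough illustration under stated assumptions, not the patented implementation: the tuple layout of the log records, the `archive/job-N/seq` location scheme, the batch size, and the use of a thread pool as the workers are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log contents: (sequence number, data object key, operation).
LOG = [
    (1, "orders/1", "put"),
    (2, "users/7", "put"),
    (3, "orders/1", "update"),
    (4, "orders/2", "put"),
]


def select_for_archival(log, key_prefix):
    """Select records whose key falls in the key space chosen for archival."""
    return [rec for rec in log if rec[1].startswith(key_prefix)]


def make_jobs(records, batch_size):
    """Split the selected records into archiving jobs."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def run_job(job_id, records, archive, job_status):
    """Worker body: replicate a job's records and report where each one landed."""
    job_status[job_id] = "running"
    locations = []
    for seq, key, op in records:
        location = f"archive/job-{job_id}/{seq}"   # hypothetical location scheme
        archive[location] = (seq, key, op)
        locations.append((key, location))
    job_status[job_id] = "done"
    return locations


archive = {}
job_status = {}                      # centralized record of each job's status
index = defaultdict(list)            # metadata: data object key -> archive locations

jobs = make_jobs(select_for_archival(LOG, "orders/"), batch_size=2)
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_job, i, job, archive, job_status)
               for i, job in enumerate(jobs)]
    for future in futures:
        for key, location in future.result():
            index[key].append(location)

# A client uses the metadata to fetch the archived history of one data object.
history = [archive[loc] for loc in index["orders/1"]]
```

Keeping the key-to-location index separate from the archived records lets a client look up one data object's history without scanning the whole archive, which is the role the claims assign to the generated metadata.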

Assignees

Inventors

Classifications

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • G06F16/113 · Primary

    Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title

  • Delete operations (erasing in storage systems G06F3/0652) · CPC title

  • Techniques for file synchronisation in file systems · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11238008B2 cover?
It covers methods, systems, and computer-readable media for automatic archiving of data store log data. Operation records in a log, which describe operations performed on data objects of a data store, are selected for archival before they are deleted from the log. The records are replicated from the log to an archive and then marked as archived, and deletion from the log is permitted only on the basis of that marking.
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/113. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 1, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).