Data expiration for stream storages

US11740828B2 · US · B2

Patent metadata
Publication number: US-11740828-B2
Application number: US-202117223263-A
Country: US
Kind code: B2
Filing date: Apr 6, 2021
Priority date: Apr 6, 2021
Publication date: Aug 29, 2023
Grant date: Aug 29, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The described technology is generally directed towards fine-grained data event expiration in a streaming data storage system. An event to append is given an expiration period, and the expiration time for the events in a data stream or segment of a data stream is the largest expiration time among events in the data stream or segment. Different segments can have different expiration times for their events. In a segment comprising a group of events, a subgroup of expired events prior to a stream cut are deleted by an expiration task. For a subgroup of unexpired events prior to a stream cut, the expiration task retains (does not delete) the subgroup of events. If a scaling operation is performed on a segment, the new successor segment or segments inherit the largest expiration time of the predecessor segment or segments.
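
The mechanism in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Segment` class, the `run_expiration_task` function, the list-index offsets, and the representation of a stream cut as a mapping from segment name to offset are all hypothetical, as is the choice to treat a segment as expired once its expiration time is less than or equal to the current time.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical in-memory stand-in for one segment of a data stream."""
    events: list = field(default_factory=list)  # (payload, expires_at), in append order
    expiration_time: float = 0.0                # largest event expiration time seen so far
    head: int = 0                               # offset of the first event not yet deleted

    def append(self, payload, now, expiration_period):
        """Append an event; return the offset just past it."""
        expires_at = now + expiration_period
        self.events.append((payload, expires_at))
        # The segment expiration time only grows: it is the maximum
        # expiration time over every event appended to the segment.
        if expires_at > self.expiration_time:
            self.expiration_time = expires_at
        return len(self.events)

    def retained(self):
        """Payloads that have not been deleted."""
        return [payload for payload, _ in self.events[self.head:]]


def run_expiration_task(segments, stream_cut, now):
    """Delete events before the stream cut, but only in expired segments.

    stream_cut maps a segment name to an offset in that segment (assumed shape).
    """
    for name, cut_offset in stream_cut.items():
        segment = segments[name]
        # Assumption: "expired" means the segment expiration time has passed.
        if segment.expiration_time <= now:
            segment.head = max(segment.head, cut_offset)


segments = {"s1": Segment(), "s2": Segment()}
segments["s1"].append("a", now=0, expiration_period=10)
cut_s1 = segments["s1"].append("b", now=0, expiration_period=5)
cut_s2 = segments["s2"].append("c", now=0, expiration_period=100)

run_expiration_task(segments, {"s1": cut_s1, "s2": cut_s2}, now=20)
print(segments["s1"].retained())  # []
print(segments["s2"].retained())  # ['c']
```

In this sketch the same stream cut covers both segments, yet only the first segment is truncated, because the second segment still holds an event whose expiration time lies in the future.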

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, the operations comprising: associating a stream cut with a first segment of a data stream of events and with a second segment of the data stream of events; maintaining a first expiration time for the first segment, the first expiration time being defined based on a first greatest event expiration time of a first group of events appended to the first segment, the first segment comprising a first subgroup of events that are prior to the stream cut and currently expired based on the first expiration time; maintaining a second expiration time for the second segment, the second expiration time being defined based on a second greatest event expiration time of a second group of events appended to the second segment, the second segment comprising a second subgroup of events that are prior to the stream cut and are currently unexpired based on the second expiration time; and deleting the first subgroup of events and retaining the second subgroup of events.

2. The system of claim 1, wherein the operations further comprise performing a scaling operation that splits the first segment into a third segment and a fourth segment, setting a third expiration time of the third segment to the first expiration time, and setting a fourth expiration time of the fourth segment to the first expiration time.

3. The system of claim 1, wherein the stream cut is a first stream cut, and wherein the operations further comprise associating a second stream cut with the third segment and the fourth segment, determining that a third subgroup of events that are prior to the second stream cut are currently expired, deleting the third subgroup of events, and deleting the second segment.

4. The system of claim 3, wherein the operations further comprise deleting a predecessor segment of the second segment, the predecessor segment having been sealed in conjunction with the creation of the second segment in a scaling operation.

5. The system of claim 3, wherein the operations further comprise detecting an empty epoch having no associated segment, and deleting the empty epoch.

6. The system of claim 1, wherein the operations further comprise performing a scaling operation that merges the first segment and the second segment into a third segment, and setting a third expiration time of the third segment to the greater of the first expiration time or the second expiration time.

7. The system of claim 6, wherein the stream cut is a first stream cut, and wherein the operations further comprise associating a second stream cut with the third segment, determining that a third subgroup of events that are prior to the second stream cut are currently expired, deleting the third subgroup of events, deleting the first segment and deleting the second segment.

8. The system of claim 6, wherein the operations further comprise deleting a predecessor segment of the first segment, the predecessor segment having been sealed in conjunction with the creation of the second segment in a scaling operation.

9. The system of claim 6, wherein the operations further comprise detecting an empty epoch having no associated segment, and deleting the empty epoch.

10. The system of claim 1, wherein the first expiration time for the first segment is maintained in first metadata of a first segment store instance associated with the first segment, and wherein the second expiration time for the second segment is maintained in second metadata of a second segment store instance associated with the second segment.

11. A method, comprising: appending, by a streaming data storage system comprising a processor, an event to a segment of a data stream, the event associated with an event expiration time, and associated with a routing key by which the segment is determined; obtaining, by the streaming data storage system, a stream cut expiration time associated with the segment and a stream cut; determining, by the streaming data storage system, whether the event expiration time is greater than the stream cut expiration time associated with the segment; in response to the event expiration time being determined to be greater than the stream cut expiration time associated with the segment, updating the stream cut expiration time to equal the event expiration time; and deleting, by the streaming data storage system, events from the segment that are prior to the stream cut and that are expired based on the stream cut expiration time.

12. The method of claim 11, wherein the appending of the event to the segment and the updating of the stream cut expiration time to equal the event to equal the event expiration time occur in an atomic operation.

13. The method of claim 11, wherein the event is a first event, wherein the segment is a first segment, wherein the event expiration time is a first expiration time, wherein the routing key is a first routing key, and further comprising appending, by the streaming data storage system, a second event to a second segment of the data stream, the second event associated with a second event expiration time, and associated with a second routing key by which the second segment is selected, obtaining, by the streaming data storage system, a second stream cut expiration time associated with the second segment and the stream cut, determining, by the streaming data storage system, whether the second event expiration time is greater than the second stream cut expiration time associated with the second segment, in response to the second event expiration time being determined to be greater than the second stream cut expiration time associated with the second segment, updating, by the streaming data storage system, the second stream cut expiration time to equal the second event expiration time, and deleting, by the streaming data storage system, events from the second segment that are prior to the stream cut and that are expired based on the second stream cut expiration time.

14. The method of claim 11, wherein the segment is a predecessor segment, and further comprising detecting, by the streaming data storage system, a scaling event that creates successor segments from the predecessor segment and seals the predecessor segment, and initializing, by the streaming data storage system, the stream cut expiration time of each successor segment to the expiration time associated with the predecessor segment.

15. The method of claim 11, wherein the segment is a first predecessor segment, and further comprising detecting, by the streaming data storage system, a scaling event that merges the first predecessor segment and a second predecessor segment into a successor segment, in response to the scaling event, obtaining, by the streaming data storage system, a maximum expiration time of the first predecessor segment and the second predecessor segment, and initializing, by the streaming data storage system, the stream cut expiration time of the successor segment to the maximum expiration time.

16. The method of claim 11, further comprising detecting, by the streaming data storage system, an empty epoch having no associated segment, and deleting the empty epoch.

17. A non-transitory machine-readable medium, comprising executable instructions that, when executed by a processor of a streaming data storage system, facilitate performance of operations, the operations comprising: appending events to active segments of a group of active segments associated with a data stream;
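
The scaling behavior recited in claims 2, 6, 14 and 15 (successors of a split inherit the predecessor's expiration time; the successor of a merge takes the greater of the predecessors' expiration times) can be sketched as follows. This is an illustrative sketch only: `SegmentMeta`, `split`, and `merge` are hypothetical names, and sealing the predecessors on a merge is an assumption made here for symmetry with the split case.

```python
from dataclasses import dataclass


@dataclass
class SegmentMeta:
    """Hypothetical per-segment metadata record."""
    name: str
    expiration_time: float
    sealed: bool = False


def split(predecessor, successor_names):
    """Seal the predecessor; every successor inherits its expiration time."""
    predecessor.sealed = True
    return [SegmentMeta(name, predecessor.expiration_time) for name in successor_names]


def merge(predecessors, successor_name):
    """Seal the predecessors; the successor takes their maximum expiration time."""
    for predecessor in predecessors:
        predecessor.sealed = True  # assumption: merged predecessors are sealed too
    latest = max(p.expiration_time for p in predecessors)
    return SegmentMeta(successor_name, latest)


first = SegmentMeta("first", expiration_time=50.0)
third, fourth = split(first, ["third", "fourth"])
print(third.expiration_time, fourth.expiration_time)  # 50.0 50.0

other = SegmentMeta("other", expiration_time=80.0)
merged = merge([third, other], "merged")
print(merged.expiration_time)  # 80.0
```

Carrying the largest expiration time forward means a successor segment is never considered expired earlier than the events of its predecessors would have been.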

Assignees

Inventors

Classifications

  • G06F3/0652 · Primary

    Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket · CPC title

  • Saving storage space on storage systems · CPC title

  • Single storage device · CPC title

  • G06F3/0671 · Primary

    In-line storage system · CPC title

  • Lifecycle management · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11740828B2 cover?
Fine-grained data event expiration in a streaming data storage system. An event to append is given an expiration period, and the expiration time for the events in a data stream or segment of a data stream is the largest expiration time among events in the data stream or segment.
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0652. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 29, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).