Record deduplication in database systems
US-2022043787-A1 · Feb 10, 2022 · US
US12499110B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12499110-B2 |
| Application number | US-202418770839-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 12, 2024 |
| Priority date | Apr 28, 2023 |
| Publication date | Dec 16, 2025 |
| Grant date | Dec 16, 2025 |
Official abstract text for this publication.
A database system operates by: determining a set of assigned segments based on data ownership information; generating a plurality of segment handles and a corresponding plurality of segment metadata for the set of assigned segments based on performing a segment activation step for the set of assigned segments; determining a first query for execution requiring access to a first segment of the set of assigned segments; and executing an IO operator of the first query based on loading the first segment by utilizing a corresponding one of the plurality of segment handles.
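The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `assigned_segments`, `activate`, `io_operator`, the `mem://` location string, and the dictionary-based ownership and storage structures are all hypothetical, chosen only to show the order of the steps (ownership lookup, segment activation producing handles and metadata, then an IO operator loading a segment through its handle).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SegmentMetadata:
    segment_id: str
    row_count: int


@dataclass(frozen=True)
class SegmentHandle:
    segment_id: str
    location: str  # hypothetical locator; the abstract does not define handle fields


def assigned_segments(ownership: Dict[str, str], node_id: str) -> List[str]:
    """Determine the segments assigned to a node from data ownership information."""
    return sorted(seg for seg, owner in ownership.items() if owner == node_id)


def activate(
    segment_ids: List[str], storage: Dict[str, list]
) -> Tuple[Dict[str, SegmentHandle], Dict[str, SegmentMetadata]]:
    """Segment activation step: build one handle and one metadata record per segment."""
    handles: Dict[str, SegmentHandle] = {}
    metadata: Dict[str, SegmentMetadata] = {}
    for seg in segment_ids:
        handles[seg] = SegmentHandle(seg, f"mem://{seg}")
        metadata[seg] = SegmentMetadata(seg, len(storage[seg]))
    return handles, metadata


def io_operator(handle: SegmentHandle, storage: Dict[str, list]) -> list:
    """Load a segment's rows by way of its handle (assumed in-memory storage)."""
    return list(storage[handle.segment_id])
```

In this sketch a query that needs segment `s1` would call `io_operator(handles["s1"], storage)` after activation, so the loading path never touches segments the node does not own.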
Opening claim text (preview).
What is claimed is:
1. A database system comprises: a first set of computing devices operable to: store incoming data of a dataset as data segments in short-term storage memory of the first set of computing devices; generate segment handles for the data segments, wherein a first segment handle of the segment handles is for a first data segment of the data segments, wherein the first segment handle includes information regarding contents of the first data segment; and store the segment handles in the short-term storage memory; a second set of computing devices operable to: receive a query regarding the dataset; and process the query into local level query operators, inner level query operations, and root level query operations; a third set of computing devices operable to: receive the local level query operators, which identify the dataset; request a group of segment handles for a group of data segments of the dataset that are not locally stored in memory of the third set of computing devices; interpret the group of segment handles to identify a set of the data segments of the group of data segments that are needed for at least one of the local level query operators, wherein the set of data segments includes at least some of the data segments stored in the short-term memory; and for the set of the data segments that is needed for the at least one of the local level query operators: schedule retrieval of the set of the data segments from the first set of computing devices; and upon receiving the set of the data segments, execute the at least one of the local level query operators on the set of data segments to produce a partial query result.
2. The database system of claim 1, wherein the first segment handle comprises: metadata regarding the first data segment, wherein the metadata includes one or more of a time range, a row count, a time filter, a column count, and a row filter.
3. The database system of claim 1 further comprises: the first set of computing devices is further operable to: error encode a data segment of the data segments stored in the short-term storage memory to produce an error encoded data segment; transmit the error encoded data segment to a computing device of the third set of computing devices: the computing device of the third set of computing devices is operable to: facilitate storage of the error encoded data segment is long-term storage memory of the third set of computing devices; and the first set of computing devices is further operable to: delete the data segment and the corresponding segment handle from the short-term storage memory upon storage of the error encoded data segment in the long-term storage memory.
4. The database system of claim 1, wherein the third set of computing devices is further operable to request the group of segments by: performing a read-ahead operation to identify the group of data segments; accessing long-term storage memory of the third set of computing devices to determine whether the group of data segments is stored as a group of error encoded data segments; when the group of data segments is not stored as the group of error encoded data segments in the long-term storage memory, sending the request for the group of segments to a computing device of the first set of computing devices.
5. The database system of claim 1, wherein the third set of computing devices is further operable to: store the group of segments in cache memory until a group of error encoded data segments corresponding the group of segments is stored in long-term storage memory of the third set of computing devices.
6. The database system of claim 5 further comprises: the second set of computing devices is further operable to: receive a second query regarding the dataset; and process the second query into second local level query operators, second inner level query operations, and second root level query operations; the third set of computing devices is further operable to: receive the second local level query operators, which identify the dataset; retrieve, from the cache memory, a second set of data segments of the group of data segments, wherein the second set of data segments are needed for at least one of the second local level query operators; and for the second set of the data segments, execute the at least one of the second local level query operators on the second set of data segments to produce a second partial query result.
7. The database system of claim 1 further comprises: an operator of the local level query operators includes one or more operational instructions regarding a query operation.
8. The database system of claim 1 further comprises one or more of: the first set of computing devices includes one or more computing devices; the second set of computing devices includes one or more computing devices; the third set of computing devices includes one or more computing devices; and the set of the data segments includes one or more data segments.
9. A computer readable memory system comprises: a first memory that stores operational instructions that, when executed by a first set of computing devices of a database system, causes the first set of computing devices to: store incoming data of a dataset as data segments in short-term storage memory of the first set of computing devices; generate segment handles for the data segments, wherein a first segment handle of the segment handles is for a first data segment of the data segments, wherein the first segment handle includes information regarding contents of the first data segment; and store the segment handles in the short-term storage memory; a second memory that stores operational instructions that, when executed by a second set of computing devices of the database system, causes the second set of computing devices to: receive a query regarding the dataset; and process the query into local level query operators, inner level query operations, and root level query operations; a third memory that stores operational instructions that, when executed by a third set of computing devices of the database system, causes the third set of computing devices to: receive the local level query operators, which identify the dataset; request a group of segment handles for a group of data segments of the dataset that are not locally stored in memory of the third set of computing devices; interpret the group of segment handles to identify a set of the data segments of the group of data segments that are needed for at least one of the local level query operators, wherein the set of data segments includes at least some of the data segments stored in the short-term memory; and for the set of the data segments that is needed for the at least one of the local level query operators: schedule retrieval of the set of the data segments from the first set of computing devices; and upon receiving the set of the data segments, execute the at least one of the local level query operators on the set of data segments to produce a partial query result.
10. The computer readable memory system of claim 9, wherein the first segment handle comprises: metadata regarding the first data segment, wherein the metadata includes one or more of a time range, a row count, a time filter, a column count, and a row filter.
11. The computer readable memory system of claim 9 further comprises: the first memory further stores operational instructions that, when executed by the first set of computing devices, causes the first set of computing devices to: error encode a data segment of the data segments stored in the short-term
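As an illustration of the handle-interpretation step recited in claims 1 and 2, the following Python sketch shows one way a node could use handle metadata to decide which segments a local level operator needs before retrieving any segment data. It is a sketch under stated assumptions, not the claimed system: the `(timestamp, value)` row layout, the sum operator, the `fetch` callback standing in for scheduled retrieval, and every function name are hypothetical. Only the time range and row count fields come from the claim text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Row = Tuple[int, float]  # hypothetical row layout: (timestamp, value)


@dataclass(frozen=True)
class SegmentHandle:
    segment_id: str
    time_range: Tuple[int, int]  # inclusive (min_ts, max_ts), per claim 2 metadata
    row_count: int


def needed_segments(
    handles: List[SegmentHandle], query_range: Tuple[int, int]
) -> List[str]:
    """Interpret handles: keep segments whose time range overlaps the query range."""
    lo, hi = query_range
    return [
        h.segment_id
        for h in handles
        if h.time_range[0] <= hi and h.time_range[1] >= lo
    ]


def local_sum(rows: List[Row], query_range: Tuple[int, int]) -> float:
    """Hypothetical local level operator: sum values with timestamps in range."""
    lo, hi = query_range
    return sum(value for ts, value in rows if lo <= ts <= hi)


def run_local_level(
    handles: List[SegmentHandle],
    fetch: Callable[[str], List[Row]],
    query_range: Tuple[int, int],
) -> List[float]:
    """Retrieve only the needed segments and produce one partial result per segment."""
    partials = []
    for seg_id in needed_segments(handles, query_range):
        rows = fetch(seg_id)  # stands in for scheduled retrieval from the first set
        partials.append(local_sum(rows, query_range))
    return partials
```

The partial results returned by `run_local_level` would then be combined by higher-level operations (for a sum, simply adding them), which loosely mirrors the split into local, inner, and root levels. Segments whose handle metadata rules them out are never fetched.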
Updates performed during online database operations; commit processing · CPC title
Column-oriented storage; Management thereof · CPC title
of operators · CPC title