Key-value based data storage device and operation method thereof
US-2024370363-A1 · Nov 7, 2024 · US
US2025190417A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025190417-A1 |
| Application number | US-202418790545-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 31, 2024 |
| Priority date | Dec 6, 2023 |
| Publication date | Jun 12, 2025 |
| Grant date | — |
Official abstract text for this publication.
The present disclosure provides a data collection method and apparatus, a computer device, and a storage medium. The method includes: determining, in response to a garbage collection request, respective first index entries from a first data table to be processed, wherein first key data, and storage location information in the first data table of the first key-value pair data corresponding to the first key data, are stored in the first index entries; selecting valid target key data from the first key data according to the current respective second data tables in the log-structured merge tree; reading target value data corresponding to the respective target key data in the first data table according to the storage location information in the first index entries; and constructing a new first data table according to the target key data and the target value data, and collecting the first data table to be processed.
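The four steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `ValueTable` class, the `collect` function, and the representation of each second data table as a plain set of keys are all hypothetical, and liveness is reduced to the simplest criterion (the key still appears in some current table of the tree).

```python
from dataclasses import dataclass, field


@dataclass
class ValueTable:
    """Hypothetical stand-in for a 'first data table': values plus index entries."""
    data: bytearray = field(default_factory=bytearray)
    # Index entries: key -> (offset, length) of the value inside `data`.
    index: dict[str, tuple[int, int]] = field(default_factory=dict)

    def append(self, key: str, value: bytes) -> None:
        self.index[key] = (len(self.data), len(value))
        self.data += value

    def read(self, key: str) -> bytes:
        offset, length = self.index[key]
        return bytes(self.data[offset:offset + length])


def collect(old: ValueTable, lsm_tables: list[set[str]]) -> ValueTable:
    """Copy the live records of `old` into a fresh table (assumed GC flow)."""
    # Step 1: scan only the index entries, not the values.
    candidate_keys = list(old.index)
    # Step 2: a key is live if any current LSM-tree table still holds it.
    live_keys = [k for k in candidate_keys if any(k in t for t in lsm_tables)]
    # Steps 3 and 4: read only the live values and build the new table,
    # which receives its own index entries with fresh offsets.
    new = ValueTable()
    for key in live_keys:
        new.append(key, old.read(key))
    return new


old = ValueTable()
old.append("a", b"apple")
old.append("b", b"banana")
old.append("c", b"cherry")
lsm = [{"a"}, {"c", "z"}]  # "b" is no longer referenced by the tree
new = collect(old, lsm)
print(sorted(new.index))   # ['a', 'c']
print(new.read("c"))       # b'cherry'
```

Because validity is decided from the index entries alone, values of dead keys are never read; only the surviving values are copied, after which the old table can be reclaimed.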
Opening claim text (preview).
1. A data collection method, comprising: determining, in response to a garbage collection request, respective first index entries from a first data table to be processed, wherein first key-value pair data and the first index entries are stored in the first data table, the first key-value pair data is derived from a key-value separated log-structured merge tree, and first key data and storage location information in the first data table of the first key-value pair data corresponding to the first key data are stored in the first index entries; selecting valid target key data from the first key data stored in the respective first index entries according to current respective second data tables in the log-structured merge tree; reading target value data corresponding to the respective target key data in the first data table according to storage location information in the first index entries where the respective target key data are located; and constructing a new first data table according to the target key data and the target value data, and collecting the first data table to be processed, wherein target key-value pair data consisting of the target key data and the target value data, as well as new first index entries, are stored in the new first data table, and the target key data and storage location information in the new first data table of the target key-value pair data corresponding to the target key data are comprised in the new first index entries.
2. The method according to claim 1, wherein selecting the valid target key data from the first key data stored in the respective first index entries according to the current respective second data tables in the log-structured merge tree, comprises: selecting first key data identical to any one of the second key data, as the valid target key data, from the first key data stored in the respective first index entries according to respective second key data stored in the current respective second data tables in the log-structured merge tree.
3. The method according to claim 2, wherein selecting the first key data identical to any one of the second key data as the valid target key data from the first key data stored in the respective first index entries according to the respective second key data stored in current respective second data tables in the log-structured merge tree, comprises: traversing, with respect to any one of the first key data, the respective second data tables sequentially in accordance with a hierarchy to which the respective second data table belongs in the log-structured merge tree to search a second data table associated with the first key data; and determining that the first key data is the valid target key data in response to that the second data table associated with the first key data is searched and the second key data in the second data table which is identical to the first key data has the same version as the first key data.
4. The method according to claim 2, wherein selecting the first key data identical to any one of the second key data as the valid target key data, from the first key data stored in the respective first index entries according to the respective second key data stored in current respective second data tables in the log-structured merge tree, comprises: traversing, with respect to any one of the first key data, the respective second data tables sequentially in accordance with a hierarchy to which the respective second data table belongs in the log-structured merge tree to determine a second data table associated with the first key data; and reading a first target index data block from the second data table associated with the first key data, wherein the first target index data block comprises multiple second index entries, the second index entries are first type of index entries and/or second type of index entries, the first type of index entries are indexes associated with the second key data and table indexes of the first data table in which the first value data corresponding to the second key data is located, the second type of index entries are index entries associated with third key data and storage location information of key-value pair data corresponding to the third key data in the second data table, the key-value pair data corresponding to the third key data has a data volume less than a preset data volume, and the key-value pair data corresponding to the second key data has a data volume greater than or equal to the preset data volume; determining, in response to the multiple second index entries with the first type of index entries, whether second key data matching the first key data exists according to the second key data in the first type of index entries; and taking the first key data as the target key data in response to determining that second key data matching the first key data exists.
5. The method according to claim 3, wherein the second data table associated with the first key data contains second key data identical to the first key data, and in response to that there exists multiple second data tables containing the second key data identical to the first key data, a second data table with the highest hierarchical level in the multiple second data tables is taken as the second data table associated with the first key data.
6. The method according to claim 1, wherein before selecting the first key data identical to any one of the second key data as the valid target key data from the first key data stored in the respective first index entries according to the respective second key data stored in current respective second data tables in the log-structured merge tree, the method comprises: determining, with respect to any one of the first key data, whether matched key data matching the first key data exists from buffered key data with a hot data feature that is contained in a set of key data buffered in a memory, wherein the hot data feature is used for indicating that the buffered key data is key data repeatedly written multiple times; determining whether the matched key data and the first key data are the same version in response to that there exists the matched key data matched with the first key data; and determining that the first key data is invalid key data in response to that the matched key data and the first key data are not the same version; or taking the first key data directly as the target key data in response to that the matched key data and the first key data are the same version.
7. The method according to claim 6, wherein selecting the first key data identical to any one of the second key data as the valid target key data, from the first key data stored in the respective first index entries according to the respective second key data stored in current respective second data tables in the log-structured merge tree, comprises: selecting the target key data from the first key data without the matched key data according to the respective second key data stored in the respective second data tables.
8. The method according to claim 7, wherein constructing the new first data table according to the target key data and the target value data, comprises: determining the target key data with the matched key data as hot key data with the hot data feature, and/or, determining the target key data without the matched key data as cold key data with a cold data feature, wherein the cold data feature is used for indicating that the cold key data is key data written once; and constructing the new first data table with the hot data feature according to the respective hot key data and target value data corresponding to the hot key data, and/or, constructing the new first data table with the cold data feature according to the re
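
The validity checks in claims 3 and 5 to 8 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: `classify`, `find_associated_version`, `hot_cache`, and the modelling of each level as a `dict` from key to version are hypothetical, and index 0 of `levels` is assumed to be the "highest hierarchical level", so the first table that holds the key is treated as the associated one.

```python
from typing import Optional

# Hypothetical model: each LSM level maps key -> version. Index 0 is
# assumed to be the highest level in the hierarchy.
Level = dict[str, int]


def find_associated_version(key: str, levels: list[Level]) -> Optional[int]:
    """Walk the levels in order; the first table holding the key wins."""
    for level in levels:
        if key in level:
            return level[key]
    return None


def classify(key: str, version: int, hot_cache: dict[str, int],
             levels: list[Level]) -> str:
    """Return 'hot', 'cold', or 'invalid' for one index entry."""
    # In-memory check first: keys in the hot cache are settled by a
    # version comparison alone, without searching the tree.
    if key in hot_cache:
        return "hot" if hot_cache[key] == version else "invalid"
    # Keys without a cache match fall back to the level-by-level search.
    if find_associated_version(key, levels) == version:
        return "cold"
    return "invalid"


hot_cache = {"user:1": 7}
levels = [{"user:2": 3}, {"user:2": 1, "user:3": 4}]
entries = [("user:1", 7), ("user:1", 5), ("user:2", 1),
           ("user:2", 3), ("user:3", 4), ("user:4", 1)]
results = [classify(k, v, hot_cache, levels) for k, v in entries]
print(results)
# ['hot', 'invalid', 'invalid', 'cold', 'cold', 'invalid']
```

Entries classified as hot and cold would then be written to separate new data tables, so that frequently rewritten keys do not force repeated collection of tables that hold mostly stable data.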
Tablespace storage structures; Management thereof · CPC title
Trees, e.g. B+trees · CPC title
Garbage collection, i.e. reclamation of unreferenced memory · CPC title
Journaling file systems · CPC title
Indexing structures · CPC title