US12301700B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12301700-B2 |
| Application number | US-202218061086-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 2, 2022 |
| Priority date | Dec 3, 2021 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
Provided is a method and system for preventing transfer errors of big data, including dividing entire data to be transmitted by a first device into a plurality of pieces of entity data and allocating a resource to first entity data among the plurality of pieces of entity data, applying, by the first device, the first entity data to a filter to query for the first entity data, transmitting, by the first device, the first entity data to a second device based on a result of the query for the first entity data, and upon receiving a sync request for the first entity data from the second device, releasing, by the first device, the resource allocated to the first entity data.
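The flow in the abstract (divide, allocate, query, transmit, release on sync) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Sender` and `Receiver` classes, the `file_id:index` key format, the `"sync"` return value, and the use of a plain Python set in place of the filter are all assumptions made for the example.

```python
from dataclasses import dataclass, field


def split_into_entities(data: bytes, piece_size: int) -> list[bytes]:
    """Divide the entire payload into fixed-size pieces of entity data."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


@dataclass
class Receiver:
    """Hypothetical second device: stores pieces and replies with a sync request."""
    received: dict[str, bytes] = field(default_factory=dict)

    def accept(self, key: str, piece: bytes) -> str:
        self.received[key] = piece
        return "sync"  # stands in for the sync request


@dataclass
class Sender:
    """Hypothetical first device."""
    transferred: set[str] = field(default_factory=set)  # stand-in for the filter
    allocated: set[str] = field(default_factory=set)    # pieces holding a resource

    def send_all(self, file_id: str, data: bytes, piece_size: int, peer: Receiver) -> int:
        """Send every piece the filter has not seen; return how many were sent."""
        sent = 0
        for index, piece in enumerate(split_into_entities(data, piece_size)):
            key = f"{file_id}:{index}"             # assumed per-piece identifier
            self.allocated.add(key)                # allocate a resource to the piece
            if key in self.transferred:            # query the filter
                self.allocated.discard(key)        # assumption: skipped pieces free the resource
                continue
            if peer.accept(key, piece) == "sync":  # transmit, then receive the sync request
                self.transferred.add(key)          # record the piece in the filter
                self.allocated.discard(key)        # release the resource on sync
            sent += 1
        return sent
```

Calling `send_all` twice with the same `file_id` and payload transmits each piece only once, because the second pass finds every key already recorded in the filter.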
Opening claim text (preview).
What is claimed is:
1. A method of preventing a transmission error of big data, the method comprising: dividing all data to be transmitted by a first device into a plurality of pieces of entity data and allocating a resource to first entity data among the plurality of pieces of entity data; querying, by the first device, the first entity data in a Bloom filter area included in a Bloom filter array according to a plurality of locations of a hashed first entity data by applying the first entity data to a plurality of hash functions; transmitting, by the first device, the first entity data to a second device based on a result of the query for the first entity data; setting, by the first device, a layout bit value for the first entity data in a layout area which has the same number of layout bit values as the plurality of pieces of entity data; and comparing, by the first device, bit values set in the Bloom filter area with the layout bit value to identify a false-positive error for the first entity data.
2. The method of claim 1, wherein the allocating the resource comprises: transmitting, by the first device, a file request for all data to the second device; transmitting, by the second device, a file ID of all data according to the file request to the first device; and allocating, by the first device, the file ID to all data and dividing all data into the plurality of pieces of entity data.
3. The method of claim 2, wherein the querying comprises: identifying, by the first device, a plurality of locations of the hashed first entity data in the Bloom filter area; and identifying, by the first device, bit values of the identified plurality of locations in the Bloom filter area.
4. The method of claim 3, wherein the transmitting of the first entity data to the second device comprises, based on the identified bit values being 0, transmitting the first entity data to the second device.
5. The method of claim 4, further comprising, after the identifying for the false-positive error, releasing, by the first device, the resource allocated to the first entity data for transmission of at least one second entity data different from the first entity data according to the request from the second device.
6. The method of claim 5, wherein the setting the layout bit value comprises: setting, by the first device, bit values to 1 of the identified plurality of locations to update the Bloom filter according to a sync request from the second device.
7. The method of claim 6, further comprising, after setting bit values to 1, synchronizing, by the first device, the set layout bit value to 1 set as the bit values corresponding to the identified plurality of locations.
8. The method of claim 7, wherein the identifying for the false-positive error comprises: returning, by the first device, false-positive, positive, or negative according to a result of comparison with the layout bit value related to the first entity data and the bit values of the plurality of locations related to the first entity data.
9. A system for preventing a transmission error of big data, the system comprising: a first device configured to allocate a resource to first entity data among a plurality of pieces of entity data formed by dividing all data, query the first entity data in a Bloom filter area included in a Bloom filter array according to a plurality of locations of a hashed first entity data by applying the first entity data to a plurality of hash functions, transmit the first entity data based on a result of the query for the first entity data, set a layout bit value for the first entity data in a layout area which has the same number of layout bit values as the plurality of pieces of entity data, and compare bit values set in the Bloom filter area with the layout bit value to identify a false-positive error for the first entity data; and a second device configured to receive the first entity data from the first device.
10. The system of claim 9, wherein the first device is configured to: transmit a file request for all data to the second device; allocate a file ID received from the second device to all data; and divide all data into the plurality of pieces of entity data.
11. The system of claim 10, wherein the first device is configured to: identify a plurality of locations of the hashed first entity data in the Bloom filter area; and identify bit values of the identified plurality of locations in the Bloom filter area.
12. The system of claim 11, wherein the first device is configured to, based on the identified bit value being 0, transmit the first entity data to the second device.
13. The system of claim 12, wherein the first device is configured to set the bit values to 1 of the identified plurality of locations to update the Bloom filter according to a sync request from the second device, and release the resource allocated to the first entity data for transmission of at least one second entity data different from the first entity data.
14. The system of claim 13, wherein the first device is configured to synchronize the set layout bit value to 1 set as the bit values corresponding to the identified plurality of locations.
15. The system of claim 15 is not recited; claim 15 reads: The system of claim 14, wherein the first device is configured to return false-positive, positive or negative according to a result of comparison with the layout bit value related to the first entity data and the bit values of the plurality of locations related to the first entity data.
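One possible reading of the Bloom filter area and layout area recited in the claims is sketched below. It is an interpretation under stated assumptions, not the claimed implementation: the class name `BloomWithLayout`, the salted SHA-256 hash family, the default sizes, and the rule that a Bloom hit without a set layout bit counts as a false positive are all choices made for this example.

```python
import hashlib


class BloomWithLayout:
    """Bloom filter area plus one layout bit per piece of entity data."""

    def __init__(self, num_entities: int, num_bits: int = 64, num_hashes: int = 3):
        self.bloom = [0] * num_bits        # Bloom filter area
        self.layout = [0] * num_entities   # layout area, one bit per piece
        self.num_hashes = num_hashes

    def locations(self, key: bytes) -> list[int]:
        """Map a piece to several bit positions using salted SHA-256 (assumed hashes)."""
        return [
            int.from_bytes(hashlib.sha256(bytes([seed]) + key).digest()[:8], "big")
            % len(self.bloom)
            for seed in range(self.num_hashes)
        ]

    def mark_synced(self, index: int, key: bytes) -> None:
        """After a sync request, set the hashed Bloom bits and the layout bit to 1."""
        for pos in self.locations(key):
            self.bloom[pos] = 1
        self.layout[index] = 1

    def classify(self, index: int, key: bytes) -> str:
        """Compare the Bloom bits with the layout bit for one piece."""
        bloom_hit = all(self.bloom[pos] for pos in self.locations(key))
        if not bloom_hit:
            return "negative"              # at least one bit is 0: safe to transmit
        # All Bloom bits are 1; the layout bit says whether this piece really was synced.
        return "positive" if self.layout[index] else "false-positive"
```

A filter built with `num_bits=1` forces every key onto the same bit, which makes the false-positive case easy to reproduce: after `mark_synced(0, b"a")`, `classify(1, b"b")` sees a Bloom hit but an unset layout bit.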
Techniques for file synchronisation in file systems · CPC title
using file content signatures, e.g. hash values · CPC title
Hash functions, e.g. MD5, SHA, HMAC or f9 MAC · CPC title
specially adapted for file transfer, e.g. file transfer protocol [FTP] · CPC title
Details of migration of file systems (migration mechanisms in storage systems G06F3/0647) · CPC title