Method and system for optimizing data replication for large scale archives

US11514074B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11514074-B2
Application number  US-201916597036-A
Country             US
Kind code           B2
Filing date         Oct 9, 2019
Priority date       Sep 30, 2015
Publication date    Nov 29, 2022
Grant date          Nov 29, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for processing query requests, including receiving, at a secondary site, a query request from a client and executing the query request to obtain an archive replica package (ARP). The method further includes making a determination that a record associated with the ARP is not stored at the secondary site and based on the determination, transmitting a request to a primary site. The method further includes, in response to the request to the primary site, receiving an archive package and a record where the archive package is associated with the record, and providing the first record to the client.
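The secondary-site flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the class names `PrimarySite` and `SecondarySite`, the dictionary-based storage, and the choice to cache the fetched archive package and record locally are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PrimarySite:
    """Hypothetical primary site holding full archive packages and records."""
    packages: dict  # archive unit ID -> archive package (AP)
    records: dict   # archive unit ID -> record

    def fetch(self, unit_id):
        # Return the archive package together with its associated record.
        return self.packages[unit_id], self.records[unit_id]


@dataclass
class SecondarySite:
    """Hypothetical secondary site holding ARPs plus whatever was fetched so far."""
    primary: PrimarySite
    replicas: dict                                # archive unit ID -> ARP
    packages: dict = field(default_factory=dict)  # APs received from the primary
    records: dict = field(default_factory=dict)   # records received from the primary

    def query(self, unit_id):
        # Execute the query against the local replicas; raises KeyError if no ARP matches.
        _arp = self.replicas[unit_id]
        # Determine whether the record associated with the ARP is stored locally.
        if unit_id not in self.records:
            # Not stored here: request the archive package and record from the primary.
            package, record = self.primary.fetch(unit_id)
            self.packages[unit_id] = package  # local caching is an assumption
            self.records[unit_id] = record
        # Provide the record to the client.
        return self.records[unit_id]
```

In this sketch a second call to `query` with the same ID is answered from the secondary site alone, which is the traffic saving the replication scheme appears to aim for.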

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing query requests, comprising: selecting, by a computer processor of a first query node at a primary site, a first archive package (AP) from a plurality of APs; making a determination that the first AP requires a transformation; generating, based on the determination, an archive replica package (ARP) from a first transformed AP, wherein the first transformed AP is obtained by processing the first AP; transmitting the ARP to a second query node at a first secondary site; receiving a first query request from the second query node at the first secondary site; executing the first query request to obtain the first AP; processing the first AP to obtain a first record; and transmitting, in response to the first query request, the first AP and the first record to the second query node at the first secondary site.

2. The method of claim 1, wherein the first query request comprises an archive unit identifier (ID), wherein the first AP comprises an archive unit comprising the archive unit ID.

3. The method of claim 2, wherein the archive unit further comprises a record reference referring to a location of the first record, wherein processing the first AP to obtain the first record, comprises: examining the archive unit to identify the record reference; and retrieving, based on the record reference, the first record from a repository on the first query node.

4. The method of claim 3, wherein processing the first AP to obtain the first record, further comprises: obtaining a compliance rule targeting the first record; and applying the compliance rule to the first record to obtain a first processed record, wherein the first processed record is transmitted to the second query node at the first secondary site in place of the first record.

5. The method of claim 4, wherein the compliance rule relates to one selected from a group consisting of a geographical location of the second query node and a sensitivity level of the first record.

6. The method of claim 4, wherein when applied to the first record to obtain the first processed record, the compliance rule enforces at least one from a group consisting of an elimination of a first field from the first record, a masking of a second field in the first record, and a return of a count of a number of entries for a third field, instead of the third field, in the first record.

7. The method of claim 1, further comprising: prior to generating the ARP: selecting a second AP from the plurality of APs; making a second determination that the second AP does not require another transformation, after generating the ARP: generating, based on the second determination, a second ARP from the second AP; and transmitting the second ARP to the second query node at the first secondary site.

8. The method of claim 1, wherein processing the first AP to obtain the first transformed AP, comprises at least one selected from a group consisting of removing a field from the first AP, removing a record reference from the first AP, and retaining fields in the first AP required for indexing the first ARP.

9. The method of claim 1, further comprising: transmitting the ARP to a third query node at a second secondary site.

10. The method of claim 1, further comprising: processing, based on the determination, the first AP to obtain a second transformed AP; generating a second ARP from the second transformed AP; and transmitting the second ARP to a third query node at a second secondary site.

11. The method of claim 10, wherein a first compliance rule is enforced on the first AP to obtain the first transformed AP, wherein a second compliance rule is enforced on the first AP to obtain the second transformed AP.

12. The method of claim 11, wherein the first compliance rule relates to a first geographical location of the second query node, wherein the second compliance rule relates to a second geographical location of the third query node.

13. The method of claim 1, further comprising: identifying a second AP based on analytics information associated with the first secondary site; processing the second AP to obtain a second record; and transmitting, irrespective of the first query request, the second AP and the second record to the second query node at the first secondary site.

14. The method of claim 1, wherein a second record is further obtained from processing the first AP, wherein the second record is further transmitted to the second query node at the first secondary site in response to the first query request.

15. The method of claim 1, further comprising: receiving, by the first query node at the primary site, a second query request from a third query node at a second secondary site; executing the second query request to obtain a second AP; processing the second AP to obtain a second record; and transmitting, in response to the second query request, the second AP and the second record to the third query node at the second secondary site.

16. A system, comprising: a plurality of query nodes operatively connected to one another and comprising: a first query node at a primary site and comprising a first computer processor; and a second query node at a first secondary site and comprising a second computer processor, wherein the first query node is configured to: select a first archive package (AP) from a plurality of APs; make a determination that the first AP requires a transformation; generate, based on the determination, an archive replica package (ARP) from a first transformed AP, wherein the first transformed AP is obtained by processing the first AP; transmit the ARP to the second query node; receive a first query request from the second query node; execute the first query request to obtain the first AP; process the first AP to obtain a first record; and transmit, in response to the first query request, the first AP and the first record to the second query node.

17. The system of claim 16, further comprising: a plurality of clients operatively connected to the plurality of query nodes, wherein the second query node submits the first query request to the first query node in response to receiving a second query request for the first record from a client of the plurality of clients, and based on a determination that the first record is not stored on the second query node.

18. The system of claim 16, further comprising: a third query node of the plurality of query nodes, at a second secondary site and comprising a third computer processor, wherein the first query node is further configured to: receive a second query request from the third query node; execute the second query request to obtain a second AP; process the second AP to obtain a second record; and transmit, in response to the second query request, the second AP and the second record to the third query node.

19. A non-transitory computer readable medium (CRM) comprising computer readable program code, which when executed by a computer processor, enables the computer processor to: select, by a first query node at a primary site, a first archive package (AP) from a plurality of APs; make a determination that the first AP requires a transformation; generate, based on the determination, an archive replica package (ARP) from a first transformed AP, wherein the first transformed AP is obtained by processing the first AP; transmit the ARP to a second query node at a first secondary site; receive a first query request from the s…
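Two of the primary-site steps in the claims lend themselves to a small illustration: building an ARP that retains only the fields needed for indexing (claim 8), and applying a compliance rule that eliminates, masks, or counts fields in a record (claim 6). The sketch below is a rough interpretation under stated assumptions: the function names `make_arp` and `apply_compliance`, the use of plain dictionaries for APs and records, the asterisk masking style, and the sample field names are all hypothetical and do not come from the patent.

```python
def make_arp(package, index_fields):
    """Build a replica that keeps only the fields named in index_fields.

    Dropping everything else also removes the record reference, so the
    replica can be searched but cannot be used to locate the full record.
    """
    return {key: value for key, value in package.items() if key in index_fields}


def apply_compliance(record, drop=(), mask=(), count=()):
    """Return a processed copy of the record; the original is left untouched.

    drop  -> fields eliminated from the result
    mask  -> fields replaced by asterisks of the same length
    count -> fields replaced by the number of entries they hold
    """
    processed = {}
    for key, value in record.items():
        if key in drop:
            continue                                  # eliminate the field
        if key in mask:
            processed[key] = "*" * len(str(value))    # mask the field
        elif key in count:
            processed[key] = len(value)               # return a count instead
        else:
            processed[key] = value
    return processed


# Hypothetical sample data for illustration only.
archive_package = {
    "unit_id": "u1",
    "title": "Q3 invoices",
    "record_ref": "repo://u1",
    "notes": "internal",
}
arp = make_arp(archive_package, index_fields={"unit_id", "title"})

record = {"name": "Ada", "ssn": "123456789", "orders": [1, 2, 3]}
processed = apply_compliance(record, drop={"name"}, mask={"ssn"}, count={"orders"})
```

Per claims 11 and 12, different secondary sites may be subject to different compliance rules, which in this sketch would simply mean calling the functions with different arguments for each destination.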

Assignees

Inventors

Classifications

  • Managing data history or versioning (querying versioned data G06F16/2474; querying temporal data G06F16/2477) · CPC title

  • Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title

  • Backup restoration techniques · CPC title

  • G06F16/27 · Primary

    Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Query processing · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11514074B2 cover?
A method and system for processing query requests, including receiving, at a secondary site, a query request from a client and executing the query request to obtain an archive replica package (ARP). The method further includes making a determination that a record associated with the ARP is not stored at the secondary site and based on the determination, transmitting a request to a primary site.…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/27. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 29, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).