Chunk aware image locality scores for container images in multi-node clusters

US2024134878A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2024134878-A1
Application number  US-202218049548-A
Country             US
Kind code           A1
Filing date         Oct 25, 2022
Priority date       Oct 25, 2022
Publication date    Apr 25, 2024
Grant date

Abstract

Official abstract text for this publication.

The present disclosure provides methods, systems, and techniques for managing replication in a deployable object, such as a pod (e.g., a group of one or more containers). For example, when a pod is started in a cluster, the deployable object may start one or more virtual computer systems (e.g., containers), which may pull (e.g., initiate and run) container images from a registry server. The processing device may thus identify, such as on the file level, which container image should be pulled first. A scheduler of the one or more virtual computer systems may prioritize, based on the computed scores, a subsequent replication of archived data of the one or more virtual computer systems to be performed. The processing device may then execute, based on the prioritization by the scheduler, the subsequent replication of the archived data of the one or more virtual computer systems in the deployable object.
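
As a rough illustration of the prioritization step the abstract describes, the sketch below orders candidate container images by a precomputed score. It assumes each score is an estimated number of bytes still to be transferred and that the smallest transfer is handled first. The abstract does not state the ordering rule, and the function name `prioritize` and the image names are hypothetical.

```python
def prioritize(scores):
    """Order image names by estimated transfer size, smallest first.

    `scores` maps a hypothetical image name to the number of bytes that
    would still need to be pulled. "Smallest first" is an assumption made
    for this sketch, not a rule taken from the patent text. Ties are
    broken by name so the order is deterministic.
    """
    return sorted(scores, key=lambda name: (scores[name], name))


# Hypothetical scores: bytes still missing for each image.
example_scores = {"web": 130, "db": 0, "cache": 45}
pull_order = prioritize(example_scores)
print(pull_order)
```

An image whose chunks are already fully present scores zero and sorts to the front under this assumed rule.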

First claim

Opening claim text (preview).

What is claimed is:

1. A method of managing replication in a deployable object, the method comprising: starting one or more virtual computer systems in the deployable object; computing, by a processing device, a plurality of scores, each indicative of a respective amount of data transfer measured based on a chunk size difference between an existing file in the one or more virtual computer systems and a reference file of a target device to which the one or more virtual computer systems replicate; prioritizing, by a scheduler of the one or more virtual computer systems, based on the computed plurality of scores, a subsequent replication of archived data of the one or more virtual computer systems to be performed; and executing the subsequent replication of the archived data of the one or more virtual computer systems in the deployable object.

2. The method of claim 1, wherein starting the one or more virtual computer systems in the deployable object comprises: initiating one or more containers in a pod running on a node of a cluster.

3. The method of claim 2, wherein the archived data comprises at least one container image to be replicated in the one or more containers, wherein the at least one container image comprises binary data representing an application and software dependent thereon.

4. The method of claim 3, wherein executing the subsequent replication of the archived data comprises pulling the at least one container image for initiating the one or more containers in the pod.

5. The method of claim 4, wherein the pod is running on a node of a Kubernetes cluster.

6. The method of claim 2, wherein computing the plurality of scores comprises: computing, at the node of the cluster, the plurality of scores based on a chunk size corresponding to missing information in the one or more virtual computer systems, wherein the node of the cluster receives a list of chunks from the scheduler of the one or more virtual computer systems; and sending, from the node of the cluster, the plurality of scores to the scheduler of the one or more virtual computer systems for prioritizing the subsequent replication.

7. The method of claim 6, further comprising: retrieving, by the scheduler, metadata associated with the archived data, wherein the metadata comprises a plurality of layers, each including a list of files and associated information, wherein part of the list of files is split into two or more chunks; merging, by the scheduler, the plurality of layers of the metadata into a continuous list for the node of the cluster; and computing, at the node of the cluster, the plurality of scores based on the merged continuous list.

8. An apparatus for managing replication in a deployable object, the apparatus comprising: a memory; and a processing device coupled to the memory, the processing device and the memory to: start one or more virtual computer systems in the deployable object; compute, by a processing device, a plurality of scores, each indicative of a respective amount of data transfer measured based on a chunk size difference between an existing file in the one or more virtual computer systems and a reference file of a target device to which the one or more virtual computer systems replicate; prioritize, by a scheduler of the one or more virtual computer systems, based on the computed plurality of scores, a subsequent replication of archived data of the one or more virtual computer systems to be performed; and execute the subsequent replication of the archived data of the one or more virtual computer systems in the deployable object.

9. The apparatus of claim 8, wherein the processing device and the memory are to start the one or more virtual computer systems in the deployable object by: initiating one or more containers in a pod running on a node of a cluster.

10. The apparatus of claim 9, wherein the archived data comprises at least one container image to be replicated in the one or more containers, wherein the at least one container image comprises binary data representing an application and software dependent thereon.

11. The apparatus of claim 10, wherein the processing device and the memory are to execute the subsequent replication of the archived data by pulling the at least one container image for initiating the one or more containers in the pod.

12. The apparatus of claim 11, wherein the pod is running on a node of a Kubernetes cluster.

13. The apparatus of claim 9, wherein the processing device and the memory are to compute the plurality of scores by: computing, at the node of the cluster, the plurality of scores based on a chunk size corresponding to missing information in the one or more virtual computer systems, wherein the node of the cluster receives a list of chunks from the scheduler of the one or more virtual computer systems; and sending, from the node of the cluster, the plurality of scores to the scheduler of the one or more virtual computer systems for prioritizing the subsequent replication.

14. The apparatus of claim 13, wherein the processing device and the memory are further to: retrieve, by the scheduler, metadata associated with the archived data, wherein the metadata comprises a plurality of layers, each including a list of files and associated information, wherein part of the list of files is split into two or more chunks; merge, by the scheduler, the plurality of layers of the metadata into a continuous list for the node of the cluster; and compute, at the node of the cluster, the plurality of scores based on the merged continuous list.

15. A non-transitory computer-readable storage medium having instructions stored thereon that, when executed by a processing device for managing replication in a deployable object, cause the processing device to: start one or more virtual computer systems in the deployable object; compute, by a processing device, a plurality of scores, each indicative of a respective amount of data transfer measured based on a chunk size difference between an existing file in the one or more virtual computer systems and a reference file of a target device to which the one or more virtual computer systems replicate; prioritize, by a scheduler of the one or more virtual computer systems, based on the computed plurality of scores, a subsequent replication of archived data of the one or more virtual computer systems to be performed; and execute the subsequent replication of the archived data of the one or more virtual computer systems in the deployable object.

16. The non-transitory computer-readable storage medium of claim 15, wherein to start the one or more virtual computer systems in the deployable object is to: initiate one or more containers in a pod running on a node of a cluster.

17. The non-transitory computer-readable storage medium of claim 16, wherein the archived data comprises at least one container image to be replicated in the one or more containers, wherein the at least one container image comprises binary data representing an application and software dependent thereon.

18. The non-transitory computer-readable storage medium of claim 17, wherein to execute the subsequent replication of the archived data is to: pull the at least one container image for initiating the one or more containers in the pod.

19. The non-transitory computer-readable storage medium of claim 18, wherein the pod is running on a node of a Kubernetes cluster.

20. The non-transitory comp
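
Claims 6 and 7 describe a scheduler that merges per-layer file lists into one continuous chunk list, and a node that scores the image by the size of the chunks it is missing. The following is a minimal sketch of that flow under stated assumptions, not the claimed implementation. The `Chunk` type, the dictionary layout of a layer, the use of content digests to detect chunks already present, and counting a repeated chunk only once are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    digest: str  # hypothetical content hash identifying the chunk
    size: int    # chunk size in bytes


def merge_layers(layers):
    """Scheduler side: flatten layer metadata into one continuous list.

    Each layer is assumed to be a dict mapping a file path to the list of
    chunks that file is split into.
    """
    merged = []
    for layer in layers:
        for file_chunks in layer.values():
            merged.extend(file_chunks)
    return merged


def missing_bytes(chunks, local_digests):
    """Node side: total size of listed chunks not already held locally.

    A chunk that appears more than once in the list is counted once,
    on the assumption that it only needs to be transferred once.
    """
    counted = set()
    total = 0
    for chunk in chunks:
        if chunk.digest in local_digests or chunk.digest in counted:
            continue
        counted.add(chunk.digest)
        total += chunk.size
    return total


# Hypothetical image metadata: two layers, one chunk shared between files.
layers = [
    {"/bin/app": [Chunk("a", 100), Chunk("b", 50)]},
    {"/lib/x.so": [Chunk("c", 30)], "/etc/conf": [Chunk("a", 100)]},
]
merged = merge_layers(layers)
score = missing_bytes(merged, local_digests={"b"})
print(score)
```

In this sketch the node would send `score` back to the scheduler, which could then compare scores from several nodes or images when deciding what to replicate next.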

Assignees

Red Hat Inc

Inventors

Classifications

  • G06F16/27 (Primary)

    Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

  • Hypervisor-specific management and integration aspects

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

  • Memory management, e.g. access or allocation

  • Creating, deleting, cloning virtual machine instances

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024134878A1 cover?
The present disclosure provides methods, systems, and techniques for managing replication in a deployable object, such as a pod (e.g., a group of one or more containers). For example, when a pod is started in a cluster, the deployable object may start one or more virtual computer systems (e.g., containers), which may pull (e.g., initiate and run) container images from a registry server. The pro…
Who is the assignee on this patent?
Red Hat Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/27. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 25, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).