Version control system using content-based datasets and dataset snapshots

US2024143815A1 · US · A1

Patent metadata
Publication number: US-2024143815-A1
Application number: US-202217975919-A
Country: US
Kind code: A1
Filing date: Oct 28, 2022
Priority date: Oct 28, 2022
Publication date: May 2, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Managing versioning of data objects for a project revised from a first version to a revised version by producing a dataset representing the data objects as a group by scanning the data objects to identify metadata of the grouped data to be processed similarly within a current version of the lifecycle, and storing the identified metadata in the dataset. Data objects changed from the first version to the revised version are identified, and the corresponding metadata for changed data objects in the dataset is updated. A version control operation is then performed on the dataset to update all data objects referenced by the dataset from the first version to the revised version. A commit-map and commit-tree are stored in a repository, and version control operations including commit, checkout, merge, branch and merge-branch are performed on the dataset snapshot.
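
As a rough illustration of the scan, identify, and update steps in the abstract, the following Python sketch treats a dataset as a mapping from object name to metadata. Everything in it is an assumption made for illustration and is not taken from the patent: the names `ObjectMeta`, `scan`, `changed_objects`, and `update_dataset` are hypothetical, and file size plus a SHA-256 content digest stand in for whatever metadata a real system would record.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical per-object metadata record; the fields are assumptions."""
    size: int
    digest: str


def scan(root: Path) -> dict[str, ObjectMeta]:
    """Scan the files under root and return a dataset: relative path -> metadata."""
    dataset = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            dataset[path.relative_to(root).as_posix()] = ObjectMeta(
                size=len(data), digest=hashlib.sha256(data).hexdigest()
            )
    return dataset


def changed_objects(old: dict, new: dict) -> set:
    """Names whose metadata differs between two versions (added, removed, modified)."""
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


def update_dataset(dataset: dict, root: Path) -> set:
    """Refresh metadata in place for changed objects only; return the changed names."""
    current = scan(root)
    changed = changed_objects(dataset, current)
    for name in changed:
        if name in current:
            dataset[name] = current[name]
        else:
            del dataset[name]
    return changed
```

In this sketch, a later version control operation would act on the dataset mapping as a single unit rather than on individual files, which is the grouping idea the abstract describes.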

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of managing different versions of data objects for a version control system (VCS) during a lifecycle of the data objects, comprising: producing a dataset representing the data objects as a group by scanning the data objects to identify metadata of the grouped data to be processed similarly within a current version of the lifecycle, and storing the identified metadata in the dataset; identifying data objects that themselves are subject to a change from the current version to a next version during the lifecycle; updating corresponding metadata for changed data objects in the dataset; and applying a version control operation on the dataset to update all data objects referenced by the dataset from the current version to the next version.

2. The method of claim 1 wherein the dataset is distributed across the plurality of storage devices comprise network attached storage (NAS), object storage, local storage, or cloud networks, the method further comprising; generating, by each provider of a storage device of the plurality of storage devices, a dataset snapshot as a read-only dataset component stored in memory local to the provider, wherein the dataset snapshot comprises a list of snapshot copies provided by each provider; and copying the dataset to a remote storage location using a dataset backup, wherein the remote storage location is different from the local storage location.

3. The method of claim 2 wherein the lifecycle of the data objects in the VCS comprises checking out data objects of a project to be modified, modifying the data objects to generate a revised version of the project from a first version, and committing the data objects of the revised version to a repository as a VCS datastore.

4. The method of claim 3 further comprising storing, in the VCS datastore, a commit-map and commit-tree of the next version of the project, wherein the commit map stores commit records for the data objects from the first version to the revised version, and wherein the commit-tree stores a timeline of the commit operations generating the commit records.

5. The method of claim 4 further comprising: assigning a snapshot-ID to each dataset snapshot for tracking a corresponding snapshot through the commit map and commit-tree; and performing one or more VCS operations on an identified dataset snapshot including at least one of a commit, checkout, merge, branch, or merge-branch operation.

6. The method of claim 5 further comprising defining a HEAD index that points to a commit operation that the dataset snapshot is based on, and wherein the HEAD index is null at a beginning of a commit-tree for the delete snapshot.

7. The method of claim 6 wherein, for the commit operation, the method further comprises: creating the dataset snapshot for storage on either the remote or local storage; creating a commit record; and adding a commit identifier in the commit-tree after a position of the HEAD index; and setting the HEAD index to be the commit identifier.

8. The method of claim 7 wherein, for the checkout operation, the method further comprises: retrieving snapshot-ID from the commit record; copying content of the dataset snapshot to the original dataset; and setting the HEAD index to be a checkout commit identifier.

9. The method of claim 8 wherein, for the merge operation, the method further comprises: retrieving the snapshot-ID from the commit record merging the content of the dataset snapshot with the original dataset; and performing the commit operation.

10. The method of claim 9 wherein, for the branch operation, the method further comprises: creating a new dataset snapshot from the original dataset; and creating a new checkout commit-ID to be stored in a new datastore.

11. The method of claim 10 wherein, for the merge-branch operation, the method further comprises: merging the original dataset into a target datastore; and committing the merge in the target datastore.

12. The method of claim 11 wherein the VCS manages changes to software programs, documents, web sites, and other content data embodying the data objects, and wherein the first version and revised version are each denoted by successive alphanumeric version character, and wherein each identifier of the snapshot-ID and commit-ID reference the version character.

13. The method of claim 3 wherein the data objects within each version of the project are encompassed by a respective dataset and are subject to same control rules in each stage of a lifecycle of the project as grouped data, wherein the control rules provide access only to authorized users or perform only authorized operations including data storage operations on the dataset referenced data objects based on a current stage of the lifecycle, and wherein the dataset is processed in the system as a single unit based on data content rather than data location.

14. The method of claim 13 wherein the dataset is produced by: gathering the identified metadata for storage in a data catalog; and executing a user entered query comprising metadata selectors as dataset tags for matching against the cataloged metadata to generate the dataset, wherein the metadata selectors comprise tags consisting of alphanumeric strings applied to respective data objects based on user-defined rules, and wherein the tags define at least one of a file type, name, location, creation time, or characteristic.

15. A computer-implemented method of managing different versions of data objects for a version control system (VCS) during a lifecycle of the data objects, comprising: identifying data objects that evolve through the different versions during the lifecycle; producing a dataset for the data objects data as a group by scanning the data objects to identify metadata of the grouped data to be re-versioned together throughout the lifecycle, and storing the identified metadata in the dataset; generating dataset snapshots as read-only dataset components for the dataset as it progresses along the lifecycle; copying the dataset to a remote storage location using a dataset backup; assigning a snapshot-ID to each dataset snapshot for tracking a corresponding snapshot through the commit map and commit-tree; and performing one or more VCS operations on an identified dataset snapshot including at least one of a commit, checkout, merge, branch, or merge-branch operation.

16. The method of claim 15 further comprising storing, in the VCS datastore, a commit-map and commit-tree of the next version of the project, wherein the commit map stores commit records for the data objects from the first version to the revised version, and wherein the commit-tree stores a timeline of the commit operations generating the commit records.

17. The method of claim 16 wherein the dataset is distributed across the plurality of storage devices comprise network attached storage (NAS), object storage, local storage, or cloud networks, the method further comprising generating by each provider of a storage device of the plurality of storage devices, a dataset snapshot as a read-only dataset component stored in memory local to the provider, wherein the dataset snapshot comprises a list of snapshot copies provided by each provider.

18. The method of claim 17 wherein the VCS manages changes to software programs, documents, web sites, and other content data embodying the data objects, and wherein the first version and revised version are each denoted by successive alphanumer
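
The commit and checkout steps recited in claims 4 through 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the class names `Datastore` and `CommitRecord`, the `s1`/`c1` identifier scheme, and the use of in-memory dictionaries for the commit-map and commit-tree are all hypothetical. Merge, branch, and merge-branch are left out.

```python
import copy
import itertools
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Optional


@dataclass
class CommitRecord:
    commit_id: str
    snapshot_id: str
    parent: Optional[str]  # commit that HEAD pointed to when this record was created


@dataclass
class Datastore:
    dataset: dict                                     # working dataset: name -> metadata
    snapshots: dict = field(default_factory=dict)     # snapshot-ID -> read-only snapshot
    commit_map: dict = field(default_factory=dict)    # commit-ID -> CommitRecord
    commit_tree: dict = field(default_factory=dict)   # commit-ID -> child commit-IDs
    head: Optional[str] = None                        # HEAD is null before the first commit
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def commit(self) -> str:
        """Snapshot the dataset, record the commit, link it after HEAD, move HEAD."""
        n = next(self._ids)
        snapshot_id, commit_id = f"s{n}", f"c{n}"  # assumed ID scheme
        # MappingProxyType gives a read-only view of the copied dataset.
        self.snapshots[snapshot_id] = MappingProxyType(copy.deepcopy(self.dataset))
        self.commit_map[commit_id] = CommitRecord(commit_id, snapshot_id, self.head)
        self.commit_tree[commit_id] = []
        if self.head is not None:
            self.commit_tree[self.head].append(commit_id)
        self.head = commit_id
        return commit_id

    def checkout(self, commit_id: str) -> None:
        """Look up the snapshot-ID in the commit record, restore it, move HEAD."""
        record = self.commit_map[commit_id]
        restored = copy.deepcopy(dict(self.snapshots[record.snapshot_id]))
        self.dataset.clear()
        self.dataset.update(restored)
        self.head = commit_id
```

A short usage example: committing after checking out an older commit adds a second child under that commit, so the commit-tree forks.

```python
store = Datastore(dataset={"a.txt": "v1"})
first = store.commit()
store.dataset["a.txt"] = "v2"
second = store.commit()
store.checkout(first)    # the working dataset is back to {"a.txt": "v1"}
third = store.commit()   # commit_tree[first] now lists both second and third
```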

Assignees

Inventors

Classifications

  • to a system of files or objects, e.g. local or distributed file system or database

  • characterised by the use of retention policies (retention policies for HSM systems G06F16/185)

  • Versioning file systems, temporal file systems, e.g. file system supporting different historic versions of files

  • Access rights, e.g. capability lists, access control lists, access tables, access matrices

  • Managing data history or versioning (querying versioned data G06F16/2474; querying temporal data G06F16/2477)

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024143815A1 cover?
Managing versioning of data objects for a project revised from a first version to a revised version by producing a dataset representing the data objects as a group by scanning the data objects to identify metadata of the grouped data to be processed similarly within a current version of the lifecycle, and storing the identified metadata in the dataset. Data objects changed from the first version…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F21/6218. Mapped technology areas include Physics.
When was this patent published?
Publication date May 2, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).