What technology area does this patent fall under?

Primary CPC classification G06F16/164. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 19, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Storing variations of data across different replication sites

US11762816B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11762816-B2
Application number	US-202117341202-A
Country	US
Kind code	B2
Filing date	Jun 7, 2021
Priority date	Jun 7, 2021
Publication date	Sep 19, 2023
Grant date	Sep 19, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method according to one embodiment includes determining patterns of an application that utilizes a filesystem and/or properties of queries of the application. Data of the filesystem is stored across a plurality of replication sites of a data storage system. Based on the determined patterns of the application and/or the determined properties of the queries of the application, a utility of storing at least some of the data of the filesystem in different variations at more than one of the replication sites is estimated. The estimated utility is compared against a predetermined utility threshold, and in response to a determination that the estimated utility is greater than the predetermined utility threshold, a write system call offered by the filesystem is modified to store the data in different variations at more than one of the replication sites.
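The decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not give a utility formula, so the `WorkloadEstimate` fields, the linear cost model in `estimate_utility`, and the `choose_write_mode` helper are all hypothetical. The weighing of read/write time against storage cost loosely follows claim 4, and the fallback to plain copies follows claim 7.

```python
from dataclasses import dataclass


@dataclass
class WorkloadEstimate:
    """Hypothetical inputs; the patent text does not define these fields."""
    read_seconds_saved_per_query: float   # gain from reading a better-suited variation
    queries_per_day: float
    extra_write_seconds_per_write: float  # cost of writing several variations
    writes_per_day: float
    extra_storage_cost: float             # assumed to be in the same units as time


def estimate_utility(w: WorkloadEstimate) -> float:
    """Assumed linear model: read benefit minus write and storage penalties."""
    benefit = w.read_seconds_saved_per_query * w.queries_per_day
    write_penalty = w.extra_write_seconds_per_write * w.writes_per_day
    return benefit - write_penalty - w.extra_storage_cost


def choose_write_mode(w: WorkloadEstimate, utility_threshold: float) -> str:
    """Return 'variations' only when utility strictly exceeds the threshold."""
    if estimate_utility(w) > utility_threshold:
        return "variations"
    # Utility less than or equal to the threshold: store identical copies.
    return "copies"
```

The strict comparison matters: a utility exactly equal to the threshold leaves the write path unchanged, so sites keep receiving identical copies.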

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: determining patterns of an application that utilizes a filesystem and/or properties of queries of the application, wherein data of the filesystem is stored across a plurality of replication sites of a data storage system; based on the determined patterns of the application and/or the determined properties of the queries of the application, estimating a utility of storing at least some of the data of the filesystem, not already stored in different variations at more than one of the replication sites, in the different variations at more than one of the replication sites; comparing the estimated utility against a predetermined utility threshold; and in response to a determination that the estimated utility is greater than the predetermined utility threshold, modifying a write system call offered by the filesystem to store the at least some of the data in at least some of the different variations at more than one of the replication sites.

2. The computer-implemented method of claim 1, comprising: storing the at least some of the data of the filesystem in the at least some of the different variations at the more than one of the replication sites, wherein each of the at least some of the different variations of the data is stored with differentiating metadata to be used for reconstructing the variation of the data stored at another one of the replication sites, wherein the differentiating metadata stored with a first variation of the data stored on a first of the replication sites identifies differences between the first variation of the data and a second variation of the data stored on a second of the replication sites; outputting a scheduler interface to the application, wherein the scheduler interface is configured to hint a location to read data for a set of the queries of the application; and instructing the scheduler interface to be used for sending a read system call to one of the replication sites having a variation of the data stored thereon that fulfills the read system call.

3. The computer-implemented method of claim 1, wherein estimating the utility of storing at least some of the data of the filesystem in the different variations at more than one of the replication sites includes: determining whether the data fulfills a first set of the queries while the data is in a first variation stored on a first of the replication sites and fulfills a second set of the queries while the data is in a second variation stored on a second of the replication sites, wherein the estimated utility is determined to be greater than the predetermined utility threshold in response to a determination that the data fulfills the first set of the queries while the data is in the first variation and fulfills the second set of the queries while the data is in the second variation, wherein the data in the first variation is stored in a different parseable format than a parseable format that the data in the second variation is stored in.

4. The computer-implemented method of claim 1, wherein the estimated utility is based on time complexities associated with reading and writing the at least some of the data of the filesystem in the different variations at more than one of the replication sites and storage complexities associated with storing the at least some of the data of the filesystem in the different variations at more than one of the replication sites.

5. The computer-implemented method of claim 1, wherein the determined patterns of the application are selected from the group consisting of: determined characteristics of the application, a determined input/output (I/O) nature of the application and determined replication properties of the filesystem, and estimated times that it takes for the application to access the data from the different replication sites.

6. The computer-implemented method of claim 1, wherein determining the properties of the queries of the application includes: determining a minimum amount of the data an answer to at least one of the queries includes, and determining an average time that responding to the queries consumes.

7. The computer-implemented method of claim 1, comprising: in response to a determination that the estimated utility is less than or equal to the predetermined utility threshold, storing copies of the at least some data at more than one of the replication sites.

8. A computer program product, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions readable and/or executable by a computer to cause the computer to: determine, by the computer, patterns of an application that utilizes a filesystem and/or properties of queries of the application, wherein data of the filesystem is stored across a plurality of replication sites of a data storage system; based on the determined patterns of the application and/or the determined properties of the queries of the application, estimate, by the computer, a utility of storing at least some of the data of the filesystem, not already stored in different variations at more than one of the replication sites, in the different variations at more than one of the replication sites; compare, by the computer, the estimated utility against a predetermined utility threshold to determine whether to modify a write system call offered by the filesystem to store the at least some of the data of the filesystem in different variations at more than one of the replication sites; and in response to a determination that the estimated utility is greater than the predetermined utility threshold, modify, by the computer, the write system call offered by the filesystem to store the at least some of the data in the at least some of the different variations at more than one of the replication sites.

9. The computer program product of claim 8, the program instructions readable and/or executable by the computer to cause the computer to: store, by the computer, the at least some of the data of the filesystem in the at least some of the different variations at the more than one of the replication sites, wherein each of the at least some of the different variations of the data is stored with differentiating metadata to be used for reconstructing the variation of the data stored at another one of the replication sites; output, by the computer, a scheduler interface to the application, wherein the scheduler interface is configured to hint a location to read data for a set of the queries of the application; and instruct, by the computer, the scheduler interface to be used for sending a read system call to one of the replication sites having a variation of the data stored thereon that fulfills the read system call.

10. The computer program product of claim 8, wherein estimating the utility of storing at least some of the data of the filesystem in the different variations at more than one of the replication sites includes: determining whether the data fulfills a first set of the queries while the data is in a first variation stored on a first of the replication sites and fulfills a second set of the queries while the data is in a second variation stored on a second of the replication sites, wherein the estimated utility is determined to be greater than the predetermined utility threshold in response to a determination that the data fulfills the first set of the queries while the data is in the first variation and fulfills the second set of the queries while the data is in the second variation.

11. The computer program product of claim 8, wherein the estimated utility is based on time complexities associated wit
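
The scheduler interface of claims 2 and 9 hints which replication site a read should go to. A minimal Python sketch of that routing idea follows. The claims do not specify an API, so the `hint_read_site` function, the site names, and the format labels ("row", "columnar") are hypothetical placeholders.

```python
from typing import Dict, Optional


def hint_read_site(required_format: str,
                   site_formats: Dict[str, str]) -> Optional[str]:
    """Return the first site whose stored variation matches the query's needs.

    site_formats maps a site name to the format of the variation it stores.
    Both the mapping and the format labels are assumptions for illustration.
    """
    for site, stored_format in site_formats.items():
        if stored_format == required_format:
            return site
    # No site stores a matching variation; the caller must fall back,
    # for example by reconstructing it from the differentiating metadata.
    return None


# Hypothetical layout: one variation per replication site.
site_formats = {"site_a": "row", "site_b": "columnar"}
preferred_site = hint_read_site("columnar", site_formats)
```

Under this layout, a query that needs the columnar variation is hinted to `site_b`, and a request for a format no site holds returns `None`.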

Assignees

Inventors

Classifications

CPC code	CPC title
G06F16/164 (Primary)	File meta data generation
G06F16/1844 (Primary)	Management specifically adapted to replicated file systems
G06F3/065	Replication mechanisms
G06F3/067	Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
G06F3/0604	Improving or facilitating administration, e.g. storage management

Patent family

Related publications grouped by family.

View patent family 84285205

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11762816B2 cover?: A computer-implemented method according to one embodiment includes determining patterns of an application that utilizes a filesystem and/or properties of queries of the application. Data of the filesystem is stored across a plurality of replication sites of a data storage system. Based on the determined patterns of the application and/or the determined properties of the queries of the applicat…
Who is the assignee on this patent?: IBM

Related patents

Intelligent Backup Engine for Data Freshness

System and method for automatically managing storage resources of a big data platform

Data compaction in distributed storage system

Utilizing data access patterns to determine compression block size in data storage systems

Replication-based federation of scalable data across multiple sites

Predicting application response time based on metrics

Apparatus and method to reduce a response time for writing data to redundant storage devices
