What technology area does this patent fall under?

Primary CPC classification G06F16/113. Mapped technology areas include Physics.

When was this patent published?

Publication date May 27, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for data classification using machine learning during archiving

US12314219B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12314219-B2
Application number	US-201816018172-A
Country	US
Kind code	B2
Filing date	Jun 26, 2018
Priority date	Jun 26, 2017
Publication date	May 27, 2025
Grant date	May 27, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are systems and methods for data archiving using machine learning techniques. The system collects statistical information and event data and processes them using machine learning techniques to classify data and/or predict data access demands. The system receives statistical information related to user access of a plurality of files, which can effectively “train” the system to archive data that is not needed at a certain moment and extract it at other moments. The system identifies, using a machine learning module, a pattern of access in the plurality of files based on the received statistical information. The system modifies, using the identified pattern of access, a threshold value related to file access, and assigns a set of files from the plurality of files an access classification based on the modified threshold value. The system migrates the set of files between hot and cold data areas based on the assigned access classification.
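The abstract's flow (modify a threshold from an observed access pattern, classify files against it, then migrate between hot and cold areas) can be illustrated with a minimal sketch. It is not the patented implementation. The names `FileRecord`, `modify_threshold`, `classify`, and `migrate` are hypothetical, as are the choice of "idle days" as the access statistic and a simple multiplicative trend factor standing in for the machine learning module.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    idle_days: int       # days since last access (assumed statistic)
    tier: str = "cold"   # current storage area: "hot" or "cold"


def modify_threshold(base_idle_days: float, access_trend: float) -> float:
    """Scale the idle-day threshold by a trend factor.

    access_trend > 1.0 stands in for an identified pattern of rising
    demand; a real system would derive it from a learned model.
    """
    return base_idle_days * access_trend


def classify(record: FileRecord, threshold: float) -> str:
    """Assign an access classification from the (modified) threshold."""
    return "hot" if record.idle_days <= threshold else "cold"


def migrate(records: list[FileRecord], threshold: float) -> list[tuple[str, str]]:
    """Move each file whose classification differs from its current tier."""
    moved = []
    for record in records:
        target = classify(record, threshold)
        if target != record.tier:
            record.tier = target
            moved.append((record.path, target))
    return moved


files = [
    FileRecord("report.docx", idle_days=3, tier="cold"),
    FileRecord("backup.tar", idle_days=40, tier="hot"),
    FileRecord("notes.txt", idle_days=12, tier="cold"),
]
threshold = modify_threshold(base_idle_days=7, access_trend=2.0)
moves = migrate(files, threshold)
print(threshold, moves)
```

With the threshold doubled to 14 idle days, the two recently used files move to the hot area and the long-idle archive moves to the cold area.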

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for data archiving using machine learning, comprising: receiving statistical information related to user access of a plurality of files stored in a cold storage device; generating a sequence-to-sequence model that is configured to determine, for an initial sequence of prior file requests, a resulting sequence of subsequent file requests that have a greatest likelihood of being received, wherein the sequence-to-sequence model determines the resulting sequence by identifying patterns of access in the statistical information, and wherein a first pattern of access comprises an execution of an application compatible with a given file type and a subsequent request to access a file of the given file type; executing, by a hardware processor, the sequence-to-sequence model on an input sequence comprising a first ordered set of file requests that includes an execution of the application compatible with the given file type; receiving, from the sequence-to-sequence model, an output sequence comprising a second ordered set of file requests for at least one of the plurality of files having the given file type; modifying a threshold value indicative of whether to store the at least one of the plurality of files in a hot storage device responsive to the resulting sequence of subsequent file requests that have the greatest likelihood of being received, wherein the hot storage device has quicker data retrieval than the cold storage device, wherein the threshold value is a file group specific threshold value calculated for a particular group of files from the plurality of files responsive to the resulting sequence of subsequent file requests that have the greatest likelihood of being received; wherein the particular group of files is determined based on having at least one same file type from among a plurality of file types; and migrating the at least one of the plurality of files from the cold storage device to the hot storage device based on the modified threshold value.

2. The method of claim 1, wherein modifying the threshold value comprises: increasing the threshold value related to file access in response to the sequence-to-sequence model indicating an increase in likelihood that the at least one of the plurality of files will be accessed.

3. The method of claim 1, wherein the sequence-to-sequence model identifies the initial sequence and the resulting sequence based on at least one of: a number of access requests, time of access requests, and file size of the plurality of files.

4. The method of claim 1, migrating the at least one of the plurality of files back to the cold storage device from the hot storage device subsequent to the at least one of the plurality of files being accessed.

5. The method of claim 1, wherein the sequence-to-sequence model is an encoder-decoder recurrent neural network (RNN), wherein an encoder of the encoder-decoder RNN is configured to receive the initial sequence and a decoder of the encoder-decoder RNN is configured to output the resulting sequence.

6. A system for data archiving using machine learning, comprising: a cold storage device; a hot storage device; and a hardware processor configured to: receive statistical information related to user access of a plurality of files stored in the cold storage device; generate a sequence-to-sequence model that is configured to determine, for an initial sequence of prior file requests, a resulting sequence of subsequent file requests that have a greatest likelihood of being received, wherein the sequence-to-sequence model determines the resulting sequence by identifying patterns of access in the statistical information, and wherein a first pattern of access comprises an execution of an application compatible with a given file type and a subsequent request to access a file of the given file type; execute the sequence-to-sequence model on an input sequence comprising a first ordered set of file requests that includes an execution of the application compatible with the given file type; receive, from the sequence-to-sequence model, an output sequence comprising a second ordered set of file requests for at least one of the plurality of files having the given file type; modify a threshold value indicative of whether to store the at least one of the plurality of files in a hot storage device responsive to the resulting sequence of subsequent file requests that have the greatest likelihood of being received, wherein the hot storage device has quicker data retrieval than the cold storage device, wherein the threshold value is a file group specific threshold value calculated for a particular group of files from the plurality of files responsive to the resulting sequence of subsequent file requests that have the greatest likelihood of being received; wherein the particular group of files is determined based on having at least one same file type from among a plurality of file types; and migrate the at least one of the plurality of files from the cold storage device to the hot storage device based on the modified threshold value.

7. The system of claim 6, wherein the hardware processor is configured to modify the threshold value by: increasing the threshold value related to file access in response to the sequence-to-sequence model indicating an increase in likelihood that the at least one of the plurality of files will be accessed.

8. The system of claim 6, wherein the sequence-to-sequence model identifies the initial sequence and the resulting sequence based on at least one of: a number of access requests, time of access requests, and file size of the plurality of files.

9. The system of claim 6, wherein the hardware processor is further configured to migrate the at least one of plurality of files back to the cold storage device from the hot storage device subsequent to the at least one of the plurality of files being accessed.

10. The system of claim 6, wherein the sequence-to-sequence model is an encoder-decoder recurrent neural network (RNN), wherein an encoder of the encoder-decoder RNN is configured to receive the initial sequence and a decoder of the encoder-decoder RNN is configured to output the resulting sequence.

11. A non-transitory computer readable medium comprising computer executable instructions for data archiving using machine learning, including instructions for: receiving statistical information related to user access of a plurality of files stored in a cold storage device; generating a sequence-to-sequence model that is configured to determine, for an initial sequence of prior file requests, a resulting sequence of subsequent file requests that have a greatest likelihood of being received, wherein the sequence-to-sequence model determines the resulting sequence by identifying patterns of access in the statistical information, and wherein a first pattern of access comprises an execution of an application compatible with a given file type and a subsequent request to access a file of the given file type; executing, by a hardware processor, the sequence-to-sequence model on an input sequence comprising a first ordered set of file requests that includes an execution of the application compatible with the given file type; receiving, from the sequence-to-sequence model, an output sequence comprising a second ordered set of file requests for at least one of the plurality of files having the given file type; modifying a threshold value indicative of whether to store the at least one of the plurality of files in a hot storage device responsive to the resulting sequence of subsequent file requests that have the greatest likelihood of being received, wherein …
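The steps recited in the independent claims (predict a follow-on sequence of file requests from an input sequence that includes an application launch, raise a file-type-specific threshold, then migrate matching files from cold to hot storage) can be sketched as follows. This is an illustration under loud assumptions, not the claimed implementation: a successor-frequency table is swapped in for the sequence-to-sequence model (the claims recite an encoder-decoder RNN in claims 5 and 10), and the event strings, the `+14` idle-day boost, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def train_successors(histories):
    """Count which event follows which (stand-in for the seq2seq model)."""
    table = defaultdict(Counter)
    for seq in histories:
        for prev, nxt in zip(seq, seq[1:]):
            table[prev][nxt] += 1
    return table


def predict_requests(table, input_seq, length=2):
    """Greedily extend the input sequence with the most frequent successor."""
    out, current = [], input_seq[-1]
    for _ in range(length):
        successors = table.get(current)
        if not successors:
            break
        current = successors.most_common(1)[0][0]
        out.append(current)
    return out


def file_type(event):
    """Extract the extension from an 'open:<name>' event (assumed format)."""
    return PurePosixPath(event.split(":", 1)[1]).suffix


# Assumed access statistics: ordered event logs, one list per session.
histories = [
    ["exec:editor", "open:a.doc", "open:b.doc"],
    ["exec:editor", "open:a.doc", "open:b.doc"],
    ["exec:viewer", "open:c.png"],
    ["exec:editor", "open:b.doc"],
]
table = train_successors(histories)

# Input sequence includes execution of an application for a given file type.
predicted = predict_requests(table, ["exec:editor"])

# File-group-specific thresholds: maximum idle days to qualify for hot storage.
thresholds = {".doc": 7, ".png": 7}
for ftype in {file_type(e) for e in predicted if e.startswith("open:")}:
    thresholds[ftype] += 14  # raise the threshold for the predicted group only

# Files currently in cold storage, with days since last access.
cold_idle_days = {"a.doc": 10, "b.doc": 15, "c.png": 10}
to_hot = sorted(
    name for name, idle in cold_idle_days.items()
    if idle <= thresholds[PurePosixPath(name).suffix]
)
print(predicted, thresholds, to_hot)
```

Because only the `.doc` group's threshold is raised, `c.png` stays in cold storage even though it has been idle for the same number of days as `a.doc`.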

Assignees

Acronis Int Gmbh

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N5/025
Extracting rules from data · CPC title
G06F16/285
Clustering or classification · CPC title

Patent family

Related publications grouped by family.

View patent family 64692602

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12314219B2 cover?: Disclosed are systems and methods for data archiving using machine learning techniques. The system collects statistical information and event data and processes them using machine learning techniques to classify data and/or predict data access demands. The system receives statistical information related to user access of a plurality of files, which can effectively “train” the system to archive …
Who is the assignee on this patent?: Acronis Int Gmbh