Data management, reduction and sampling schemes for storage device failure

US11669754B2 · US · B2

Patent metadata

Publication number: US-11669754-B2
Application number: US-202016872194-A
Country: US
Kind code: B2
Filing date: May 11, 2020
Priority date: Feb 25, 2020
Publication date: Jun 6, 2023
Grant date: Jun 6, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In a method for training a machine learning model, the method includes: segmenting, by a processor, a dataset from a database into one or more datasets based on time period windows; assigning, by the processor, one or more weighted values to the one or more datasets according to the time period windows of the one or more datasets; generating, by the processor, a training dataset from the one or more datasets according to the one or more weighted values; and training, by the processor, the machine learning model using the training dataset.
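
The steps in the abstract (segment by time window, weight each window, draw an amount of data proportional to the weight) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the one-day window, the linear weight schedule with a step of 0.25, and the synthetic records are all assumptions chosen for the example.

```python
import random
from datetime import datetime, timedelta


def segment_by_window(records, start, window):
    """Group (timestamp, payload) records into consecutive time windows."""
    segments = {}
    for ts, payload in records:
        index = (ts - start) // window  # timedelta // timedelta gives an int
        segments.setdefault(index, []).append((ts, payload))
    return [segments[i] for i in sorted(segments)]  # oldest window first


def linear_weights(n, newest=1.0, step=0.25):
    """Newest window gets `newest`; each older window drops by `step` (assumed schedule)."""
    return [max(newest - step * (n - 1 - i), 0.0) for i in range(n)]


def build_training_set(segments, weights, rng):
    """Sample from each window an amount of data proportional to its weight."""
    training = []
    for segment, weight in zip(segments, weights):
        count = int(weight * len(segment))
        training.extend(rng.sample(segment, count))
    return training


# Hypothetical telemetry: one record every 3 hours for 4 days (8 per day).
start = datetime(2020, 1, 1)
records = [(start + timedelta(hours=3 * i), {"reading": i}) for i in range(32)]

segments = segment_by_window(records, start, timedelta(days=1))
weights = linear_weights(len(segments))
training = build_training_set(segments, weights, random.Random(0))
```

With these assumed settings the oldest day contributes a quarter of its records and the most recent day contributes all of them, so the training set is biased toward recent drive behavior while still retaining some history.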

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a machine learning model, the method comprising: segmenting, by a processor, a dataset from a database into one or more datasets based on time period windows; assigning, by the processor, one or more weighted values to the one or more datasets according to the time period windows of the one or more datasets; generating, by the processor, a training dataset from the one or more datasets, wherein an amount of data generated from the one or more datasets is based on the one or more weighted values; and training, by the processor, the machine learning model using the training dataset.

2. The method according to claim 1, wherein the machine learning model comprises a solid-state drive (SSD) failure prediction model.

3. The method according to claim 1, wherein a most recent dataset from the one or more datasets is assigned a first weighted value and a least recent dataset from the one or more datasets is assigned a second weighted value, wherein the first weighted value is greater than the second weighted value.

4. The method according to claim 3, wherein the one or more weighted values decrease by a set amount from the first weighted value to the second weighted value.

5. The method according to claim 1, the method further comprising: identifying, by the processor, anomaly data in the dataset; retrieving, by the processor, the anomaly data in the dataset; and adding, by the processor, the anomaly data to the training dataset.

6. The method according to claim 5, wherein the anomaly data comprises SSD failure data.

7. The method according to claim 5, wherein the anomaly data is identified using a rule based method.

8. The method according to claim 5, wherein the anomaly data is identified using a cluster based method.

9. The method according to claim 1, the method further comprising generating, by the processor, anomaly data; and adding, by the processor, the generated anomaly data to the training dataset.

10. A data system comprising: a database; a processor coupled to the database; and a memory coupled to the processor, wherein the memory stores instructions that, when executed by the processor, cause the processor to: segment a dataset from the database into one or more datasets based on time period windows; assign one or more weighted values to the one or more datasets according to the time period windows of the one or more datasets; generate a training dataset from the one or more datasets, wherein an amount of data generated from the one or more datasets is based on the one or more weighted values; and train a machine learning model using the training dataset.

11. The data system according to claim 10, wherein the machine learning model comprises a solid-state drive (SSD) failure prediction model.

12. The data system according to claim 10, wherein a most recent dataset from the one or more datasets is assigned a first weighted value and a least recent dataset from the one or more datasets is assigned a second weighted value, wherein the first weighted value is greater than the second weighted value.

13. The data system according to claim 12, wherein the one or more weighted values decrease by a set amount from the first weighted value to the second weighted value.

14. The data system according to claim 10, wherein the processor is further configured to: identify anomaly data in the dataset; retrieve the anomaly data in the dataset; and add the anomaly data to the training dataset.

15. The data system according to claim 14, wherein the anomaly data comprises SSD failure data.

16. The data system according to claim 14, wherein the anomaly data is identified using a rule based method.

17. The data system according to claim 14, wherein the anomaly data is identified using a cluster based method.

18. The data system according to claim 10, wherein the processor is further configured to: generate anomaly data; and add the generated anomaly data to the training dataset.

19. A method for training a machine learning model, the method comprising: identifying, by a processor, anomaly data in a dataset from a database; generating, by the processor, additional anomaly data; adding, by the processor, the generated anomaly data to the dataset; identifying, by the processor, a training dataset from the dataset; retrieving, by the processor, the training dataset from dataset; and training, by the processor, the machine learning model using the training dataset.

20. The method according to claim 19, wherein the machine learning model comprises a solid-state (SSD) failure prediction model.
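
The anomaly-handling claims (identifying anomaly data with a rule based method, adding it to the training dataset, and generating additional anomaly data) might look roughly like the sketch below. The claims shown here do not specify the rule or the generation technique, so everything concrete is an assumption made for illustration: the record fields `failed` and `reallocated_sectors`, the threshold of 100, and the jitter-based generator are hypothetical.

```python
import random


def is_anomaly(record, max_reallocated=100):
    """Rule-based check; field names and threshold are illustrative assumptions."""
    return record["failed"] or record["reallocated_sectors"] > max_reallocated


def jitter(record, rng, scale=0.05):
    """Create a synthetic anomaly by perturbing a real one (assumed technique)."""
    copy = dict(record)
    factor = 1 + rng.uniform(-scale, scale)
    copy["reallocated_sectors"] = max(0, round(record["reallocated_sectors"] * factor))
    copy["synthetic"] = True  # mark generated rows so they can be audited later
    return copy


def augment(training, dataset, rng, synthetic_per_anomaly=2):
    """Add every identified anomaly, plus generated variants, to the training set."""
    anomalies = [r for r in dataset if is_anomaly(r)]
    augmented = list(training)
    augmented.extend(a for a in anomalies if a not in augmented)
    for anomaly in anomalies:
        augmented.extend(jitter(anomaly, rng) for _ in range(synthetic_per_anomaly))
    return augmented


# Hypothetical drive telemetry: three healthy drives and two anomalous ones.
dataset = [
    {"drive": "a", "failed": False, "reallocated_sectors": 3},
    {"drive": "b", "failed": False, "reallocated_sectors": 0},
    {"drive": "c", "failed": False, "reallocated_sectors": 12},
    {"drive": "d", "failed": True, "reallocated_sectors": 40},
    {"drive": "e", "failed": False, "reallocated_sectors": 250},
]
training = dataset[:2]  # stand-in for a previously sampled training set
anomalies = [r for r in dataset if is_anomaly(r)]
augmented = augment(training, dataset, random.Random(0))
```

Failure records are typically rare compared with healthy telemetry, which is why a sketch like this adds all identified anomalies rather than sampling them at the same rate as normal data.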

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Reliability or availability analysis

  • where the computing system component is a memory, e.g. virtual memory, cache (accessing, addressing or allocating within memory systems or architectures G06F12/00; checking stores for correct operation G11C29/00)

  • based on approximation criteria, e.g. principal component analysis

  • Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

  • in relation to data integrity, e.g. data losses, bit errors

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11669754B2 cover?
In a method for training a machine learning model, the method includes: segmenting, by a processor, a dataset from a database into one or more datasets based on time period windows; assigning, by the processor, one or more weighted values to the one or more datasets according to the time period windows of the one or more datasets; generating, by the processor, a training dataset from the one or…

Who is the assignee on this patent?
Samsung Electronics Co Ltd

What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.

When was this patent published?
Publication date Jun 6, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).