Systems, methods, and machine-readable media to perform state data collection

US9772894B2 · US · B2

Patent metadata
Publication number: US-9772894-B2
Application number: US-201615010813-A
Country: US
Kind code: B2
Filing date: Jan 29, 2016
Priority date: Jan 29, 2016
Publication date: Sep 26, 2017
Grant date: Sep 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, computing device, and system for performing a core dump is provided that aggregates core dump data from storage controller components. In some embodiments, the method includes detecting corrupted data corresponding to a data sector included in a storage volume. After detecting the corrupted data, the storage volume is quiesced. Data is collected from a controller processor, I/O controller, controller cache, storage volume, interrupted write recovery portion, trace log, and backup device.
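The flow in the abstract (detect, quiesce, then collect from each component) can be sketched roughly as below. This is only an illustration under stated assumptions: the component names are taken from the abstract's list, while `collect_core_dump`, the per-component collector callables, and the choice of a gzip-compressed tar archive as the output file are hypothetical and do not come from the patent.

```python
import io
import tarfile
from typing import Callable, Dict

# Component order follows the abstract; the identifiers themselves are hypothetical.
COMPONENTS = [
    "controller_processor",
    "io_controller",
    "controller_cache",
    "storage_volume",
    "interrupted_write_recovery",
    "trace_log",
    "backup_device",
]


def collect_core_dump(
    collectors: Dict[str, Callable[[], bytes]],
    quiesce: Callable[[], None],
) -> bytes:
    """Quiesce the volume, then aggregate each component's data into one archive."""
    # Quiesce first so the state being captured does not change during collection.
    quiesce()

    buf = io.BytesIO()
    # Assumption: a gzip-compressed tar stands in for the "one or more files".
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in COMPONENTS:
            payload = collectors[name]()  # raw bytes dumped by this component
            info = tarfile.TarInfo(name=f"{name}.bin")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()
```

The returned bytes could then be written to disk or copied to a second machine for analysis; how that transfer happens is outside this sketch.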

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: detecting, by a storage server, that a storage volume includes corrupted data, wherein the storage volume is accessible to a storage controller; after detecting the corrupted data, quiescing the storage volume; collecting, from the storage controller, a first portion of data from a controller processor of the storage controller and an input/output controller (IOC) of the storage controller; after collecting the first portion of data, collecting a second portion of data from a controller cache of the storage controller and the storage volume, wherein the second portion of data includes data corresponding to the corrupted data; storing the first portion of data and the second portion of data in one or more files; and providing data from the one or more files to a second machine for analysis.

2. The method of claim 1, wherein providing data to the second machine includes copying the one or more files to the second machine via a network.

3. The method of claim 1, further comprising: compressing the one or more files.

4. The method of claim 1, wherein the detecting includes receiving an error message from a host, and wherein the error message identifies the storage volume and a data sector that stores the corrupted data.

5. The method of claim 1, wherein the detecting includes determining, by the storage controller, that a protection information error threshold is exceeded.

6. The method of claim 1, wherein quiescing the storage volume includes pausing an activity corresponding to the storage volume.

7. The method of claim 1, the method further comprising: increasing a detail level of a trace log; re-booting the storage controller; reading the corrupted data from the storage volume; and creating a log entry in the trace log corresponding to the reading from the storage volume.

8. The method of claim 1, the method further comprising: collecting a third portion of data from an interrupted write recovery (IWR) memory of the storage controller, trace log memory of the storage controller, and backup device memory of the storage controller; and storing the third portion of data in the one or more files.

9. The method of claim 8, wherein the backup device memory includes a shared read-only cache partition of the controller cache.

10. The method of claim 1, the method further comprising: storing data collected from the controller processor, the IOC, and the storage volume in a file that is formatted according to an Executable and Linkable Format (ELF).

11. The method of claim 1, wherein the corrupted data is stored in a data sector of the storage volume, wherein the data sector is associated with a storage stripe, wherein the second portion of data collected from the storage volume includes the corrupted data stored in the data sector and data stored in at least one immediately adjacent data sector of the storage stripe.

12. A non-transitory machine readable medium having stored thereon instructions for performing a method comprising machine executable code which when executed by at least one machine, causes the machine to: detect an error corresponding to a data sector of a storage volume; after the error is detected, quiesce the storage volume; collect controller processor data and input/output controller (IOC) data from a storage controller that accesses the storage volume; after the controller processor data and the IOC data is collected, re-boot the storage controller into a limited operating environment mode; while in the limited operating environment mode, collect controller cache data from the storage controller and storage volume data, wherein the storage volume data includes data corresponding to the data sector of the storage volume; and store the controller processor data, IOC data, controller cache data, and storage volume data in one or more files.

13. The non-transitory machine readable medium of claim 12, wherein quiescing the storage volume comprises placing incoming requests corresponding to a controller cache of the storage controller in a queue and/or disabling flushing of the controller cache.

14. The non-transitory machine readable medium of claim 12, wherein the storage volume data that is collected includes data from the data sector and one or more data sectors that are immediately adjacent to the data sector in a storage stripe.

15. The non-transitory machine readable medium of claim 12, the machine further to: collect interrupted write recovery (IWR) data, trace log data, and backup device data; and store the IWR data, trace log data, and backup device data in the one or more files.

16. A computing device comprising: a memory containing machine readable medium comprising machine executable code having stored thereon instructions for performing a data collection method; a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to: detect an error corresponding to a data sector of a storage volume; collect a first data portion from a controller processor and an input/output controller (IOC); after the first data portion is collected, re-boot the storage controller; after the storage controller is re-booted, collect a second data portion from a controller cache and the storage volume, wherein the second data portion includes data corresponding to the data sector of the storage volume; store the first data portion and the second data portion in one or more files; and communicate the one or more files to a second machine for error analysis.

17. The computing device of claim 16, the processor further to: quiesce the storage volume by placing incoming requests corresponding to the controller cache in a queue and/or disabling flushing of the controller cache.

18. The computing device of claim 16, wherein the second data portion includes data collected from the data sector and one or more data sectors that are adjacent to the data sector in a storage stripe of the storage volume.

19. The computing device of claim 16, the processor further to: collect interrupted write recovery (IWR) data, trace log data, and backup device data; and store the IWR data, trace log data, and backup device data in the one or more files.

20. The computing device of claim 16, wherein the backup device data include data that is read from a shared read-only cache partition of the controller cache.
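Two mechanisms recited in the dependent claims lend themselves to a small illustration: quiescing by queueing incoming requests and disabling cache flushing, and collecting the affected data sector together with its immediately adjacent sectors in the storage stripe. The sketch below is a minimal model under assumptions: the `Volume` class, its fields, and `sectors_to_collect` are hypothetical names, and sectors are modelled as consecutive integers within a stripe, which the claims do not specify.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical stand-in for a storage volume fronted by a controller cache."""

    quiesced: bool = False
    flush_enabled: bool = True
    pending: deque = field(default_factory=deque)
    served: list = field(default_factory=list)

    def quiesce(self) -> None:
        # Hold new requests and stop cache flushes so captured state stays frozen.
        self.quiesced = True
        self.flush_enabled = False

    def submit(self, request: str) -> None:
        # While quiesced, requests are queued instead of being served.
        if self.quiesced:
            self.pending.append(request)
        else:
            self.served.append(request)


def sectors_to_collect(bad_sector: int, stripe_start: int, stripe_len: int) -> list[int]:
    """Return the bad sector plus its immediate neighbours, clipped to the stripe.

    Assumption: a stripe covers sectors stripe_start .. stripe_start + stripe_len - 1.
    """
    stripe_end = stripe_start + stripe_len  # exclusive upper bound
    if not stripe_start <= bad_sector < stripe_end:
        raise ValueError("bad_sector lies outside the stripe")
    candidates = (bad_sector - 1, bad_sector, bad_sector + 1)
    return [s for s in candidates if stripe_start <= s < stripe_end]
```

Clipping to the stripe bounds is a design choice of this sketch: a corrupted sector at the edge of a stripe yields only one neighbour rather than reaching into a different stripe.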

Assignees

Netapp Inc

Inventors

Classifications

  • Boot up procedures · CPC title

  • Parity data used in redundant arrays of independent storages, e.g. in RAID systems · CPC title

  • to protect a block of data words, e.g. CRC or checksum (G06F11/1076 takes precedence; security arrangements for protecting computers or computer systems against unauthorized activity G06F21/00) · CPC title

  • for I/O devices · CPC title

  • in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9772894B2 cover?
A method, computing device, and system for performing a core dump is provided that aggregates core dump data from storage controller components. In some embodiments, the method includes detecting corrupted data corresponding to a data sector included in a storage volume. After detecting the corrupted data, the storage volume is quiesced. Data is collected from a controller processor, I/O contro…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/0778. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 26, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).