Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A system, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, the operations comprising: in response to detection of a failing storage device, performing a proactive recovery operation, the performing the proactive recovery operation comprising: in response to determining, by a first process of the system, that the data portion on the failing storage device is not to be processed by a second process of the system, logically moving the data portion to a non-failing storage device; and in response to determining, by the first process, that the data portion on the failing storage device is to be processed by the second process based on the data portion being in a backlog of the second process, delegating recovery of the data portion to the second process.
2. The system of claim 1, wherein the logically moving the data portion to the non-failing storage device comprises: determining that the data portion is protected by mirroring a copy of the data portion on a second storage device; reading the copy of the data portion from the second storage device; and storing the copy of the data portion that was read from the second storage device to the non-failing storage device.
3. The system of claim 1, wherein the logically moving the data portion to the non-failing storage device comprises: determining that the data portion is protected using erasure coding; reading a fragment of the data portion; validating the fragment; in response to the fragment being consistent based on the validating, storing the fragment to a non-failing storage device; and in response to the fragment not being consistent based on the validating, recovering the fragment into a recovered fragment; and storing the recovered fragment on the non-failing storage device.
4. The system of claim 3, wherein the recovering the fragment into the recovered fragment and the storing the recovered fragment on the non-failing storage device comprises: enqueuing a recovery task for the data portion.
5. The system of claim 1, wherein the data portion on the failing storage device is to be processed by the second process, and wherein the operations further comprise: tracking the data portion as a tracked data portion having recovery delegated to the second process; determining whether the tracked data portion has been processed by the second process within a time duration; in response to determining that the tracked data portion has been processed by the second process within the time duration, denoting the tracked data portion as recovered; and in response to determining that the tracked data portion has not been processed by the second process within the time duration, logically moving the data portion to the non-failing storage device.
6. The system of claim 1, wherein the operations further comprise: setting a state of the failing storage device to a value that indicates that the storage device is failing.
7. The system of claim 1, wherein the operations further comprise: in response to determining that the proactive recovery operation is complete, setting a state of the failing storage device to a value that indicates that the storage device is dead.
8. A method, comprising: running, by a system comprising a processor, a proactive recovery process that logically moves data portions from a failing storage device to one or more non-failing storage devices, comprising: selecting a selected data portion on a failing storage device; determining whether the selected data portion is to be directly recovered by the proactive recovery process; in response to determining that the selected data portion is to be directly recovered by the proactive recovery process, determining whether the selected data portion is protected with a stored copy of the selected data portion that is mirrored on a second storage device; and in response to determining that the selected data portion is protected with the stored copy of the selected data portion that is mirrored on a second storage device, copying the stored copy of the selected data portion from the second storage device and to a non-failing storage device of the one or more non-failing storage devices.
9. The method of claim 8, wherein the determining whether the selected data portion is to be directly recovered by the proactive recovery process comprises: in response to establishing that the selected data portion is not to be directly recovered by the proactive recovery process, delegating proactive recovery of the data portion to a second process.
10. The method of claim 9, wherein the establishing that the selected data portion is not to be directly recovered by the proactive recovery process comprises: determining that the selected data portion is to be accessed by the second process.
11. The method of claim 9, further comprising: tracking the selected data portion in response to the establishing that the selected data portion is not to be directly recovered by the proactive recovery process.
12. The method of claim 11, further comprising: determining that the selected data portion has not been recovered by the system process within a time duration; and, in response to the determining that the selected data portion has not been recovered by the second process within the time duration, logically moving the data portion to the non-failing storage device.
13. The method of claim 8, wherein the determining whether the selected data portion is protected with the stored copy of the selected data portion that is mirrored on a second storage device comprises: in response to determining that the selected data portion is protected by erasure coding, recovering the selected data portion to the non-failing storage device, comprising: reading a fragment of the selected data portion; validating the fragment; and in response to the fragment being consistent based on the validating, storing the fragment to the non-failing storage device.
14. The method of claim 13, wherein the recovering the selected data portion to the non-failing storage device further comprises: in response to the fragment not being consistent based on the validating, taking action to recover the selected data portion, comprising enqueuing a recovery task corresponding to the selected data portion.
15. The method of claim 8, further comprising: detecting the failing storage device; and setting a state of the failing storage device to a value that indicates that the storage device is failing.
16. The method of claim 8, further comprising: in response to detecting that the selected data portion is the last data portion to be recovered from the failing storage device, setting a state of the failing storage device to a value that indicates that the storage device is dead.
17. A non-transitory machine-readable medium, comprising executable instructions that, when executed by a processor, facilitate performance of operations, the operations comprising: obtaining, using a system process configured for data portion processing, a data portion having data portion components stored to a failing storage device; determining whether the data portion is protected by mirroring or protected using erasure coding; in response to determining that the data portion is protected by mirroring, wherein mirroring comprises storing a copy of the data portion on a second storage device, reading the copy of the data portion from the second storage device and processing the copy of the data portion via the system process; in response to determining that the data portion is protected using erasure coding, reading a fragment of the data portion, and validating the fragment; in response to the fragment being consistent based on the validating, processing the data portion via the system process; and in response to the fragment not being consistent based on the validating, recovering the fragment into a recovered fragment and processing the recovered fragment via the system process.
This invention relates to data recovery in storage systems, specifically to retrieving and processing data from a failing storage device when that data is protected by either mirroring or erasure coding. A system process obtains a data portion that has components stored on the failing device and first determines which protection scheme applies. If the data portion is mirrored, meaning a copy exists on a second storage device, the system process reads and processes that copy instead of the original. If the data portion is protected by erasure coding, the system process reads a fragment of the data portion and validates it. If the fragment is consistent, the data portion is processed directly. If the fragment is not consistent, the fragment is first recovered, and the recovered fragment is processed. In both cases the system process relies on redundancy the storage system already maintains (the mirrored copy or the erasure-coded fragments) rather than trusting unvalidated data from the failing device. The operations are implemented as executable instructions stored on a non-transitory machine-readable medium and executed by a processor.
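The branching described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the names DataPortion, fragment_is_consistent and obtain_for_processing are hypothetical, and the CRC32 comparison is only an assumed stand-in for the validation step, which the claim does not specify.

```python
import zlib
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Protection(Enum):
    MIRRORING = "mirroring"
    ERASURE_CODING = "erasure_coding"


@dataclass
class DataPortion:
    """Hypothetical model of a data portion with components on a failing device."""
    protection: Protection
    mirror_copy: Optional[bytes] = None       # copy held on a second storage device
    fragment: Optional[bytes] = None          # fragment read from the failing device
    fragment_checksum: Optional[int] = None   # expected CRC32 (assumed validation scheme)


def fragment_is_consistent(portion: DataPortion) -> bool:
    # Assumption: "validating" means comparing a stored CRC32 with a fresh one.
    return zlib.crc32(portion.fragment) == portion.fragment_checksum


def obtain_for_processing(
    portion: DataPortion,
    recover_fragment: Callable[[DataPortion], bytes],
) -> bytes:
    """Return the bytes the system process should work on."""
    if portion.protection is Protection.MIRRORING:
        # Mirrored: read the copy from the second device, not the failing one.
        return portion.mirror_copy
    if fragment_is_consistent(portion):
        # Erasure coded and the fragment validated: use it as read.
        return portion.fragment
    # Erasure coded and inconsistent: recover first, then process the result.
    return recover_fragment(portion)
```

The recover_fragment callback is left to the caller because the claims describe recovery only abstractly (claim 18, for example, has it enqueue a recovery task).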
18. The non-transitory machine-readable medium of claim 17, wherein the recovering the fragment comprises: enqueuing a recovery task for the data portion.
19. The non-transitory machine-readable medium of claim 17, wherein the operations further comprise: in response to determining that the data portion is protected by mirroring, copying the copy of the data portion as read from the second storage device to a non-failing storage device.
20. The non-transitory machine-readable medium of claim 17, wherein the operations further comprise: in response to determining that the data portion is protected using erasure coding, and in response to the fragment being consistent based on the validating, storing the data portion copy to a non-failing storage device.
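As an illustration of the delegation logic in claims 1 and 5, the following Python sketch shows one way a first process might either move a data portion itself or hand it to a second process and fall back after a time duration. Everything here is an assumption made for the example: the ProactiveRecovery class, its field names, the use of string identifiers for data portions, and the explicit now argument are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProactiveRecovery:
    """Hypothetical first process that recovers portions from a failing device."""
    backlog: Set[str]        # portion ids already queued for the second process
    timeout_s: float         # the "time duration" of claim 5
    moved: List[str] = field(default_factory=list)       # moved to a non-failing device
    recovered: List[str] = field(default_factory=list)   # handled by the second process
    tracked: Dict[str, float] = field(default_factory=dict)  # id -> time of delegation

    def handle(self, portion_id: str, now: float) -> None:
        if portion_id in self.backlog:
            # The second process will touch this portion anyway: delegate and track.
            self.tracked[portion_id] = now
        else:
            # Nobody else will process it: move it directly.
            self.moved.append(portion_id)

    def check_tracked(self, processed: Set[str], now: float) -> None:
        # Iterate over a snapshot so entries can be deleted while looping.
        for portion_id, since in list(self.tracked.items()):
            if portion_id in processed:
                # The second process got to it: denote it as recovered.
                self.recovered.append(portion_id)
                del self.tracked[portion_id]
            elif now - since >= self.timeout_s:
                # Not processed within the time duration: fall back to moving it.
                self.moved.append(portion_id)
                del self.tracked[portion_id]
```

Passing now explicitly keeps the sketch deterministic. One simplification to note: a portion reported as processed is counted as recovered even if the check happens after the timeout has elapsed.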
February 23, 2021