High Availability White Papers

Recovery of Memory and Process in DSM Systems

Overview

In this report, we discuss the recovery of memory and processes in a distributed shared-memory (DSM) system. We divide the problem into recovery of unaffected memory (RUM) and recovery of affected processes (RAP). We point out that specially designed fault-tolerant, non-volatile memory is neither sufficient nor necessary to solve the RUM problem. It is not sufficient because the system can still go down when one node is lost, which can result from many types of faults; power failure is only one of them. It is not necessary because the system is distributed by nature, so information redundancy across fault units can be achieved without special memory. We discuss several ways of implementing a fault-tolerant memory system using plain memory by modifying the write-back protocols in DSM systems. The proposed techniques include mirroring and RAIM (Redundant Array of Independent Memory).
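The overview names RAIM but does not describe its layout. As a rough illustration only, the sketch below assumes a RAID-style XOR parity scheme in which each node holds one memory block and a parity block is kept on a separate fault unit. The function names `xor_blocks` and `recover_lost_block` are hypothetical and do not come from the paper.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_lost_block(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus the parity block.

    Assumption: exactly one block is lost and parity was computed over all
    blocks, so XOR-ing everything that remains yields the missing block.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical memory contents held by three nodes (equal-length blocks).
node_blocks = [b"node0mem", b"node1mem", b"node2mem"]

# Parity would be refreshed on write-back and stored on a separate fault unit.
parity = xor_blocks(node_blocks)

# Simulate losing node 1 and rebuilding its block from the others.
rebuilt = recover_lost_block([node_blocks[0], node_blocks[2]], parity)
```

Under the same assumptions, mirroring is the simpler case: each write-back is sent to two fault units, and recovery is a copy from the surviving one, at the cost of doubling the memory used.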

Further White Paper Details
Publisher: HP Labs
File Format: PDF, requires Acrobat Reader 5
Date Published: March 2001
Downloads: 2
Format: White Papers