PACS is often installed at an institution to resolve data retention problems that exist in the analog world, specifically film loss. It is important for institutions to understand, however, that the installation of PACS is no guarantee that images cannot be lost. In fact, with such digital solutions come far greater expectations, as well as legal scrutiny, for secure data retention. This article compares the operational and technical options available for legal archiving of medical images that can help an institution recover from a disaster.

DISASTER RECOVERY (DR)

PACS disasters can, and do, appear in a variety of forms. To a system administrator, a PACS disaster may be thought of as the result of technical failures that impact overall performance. To the radiologist or clinician, a disaster may be perceived as a single examination that cannot be immediately viewed, the technical reasons being irrelevant; in fact, it may simply be due to a poor PACS architecture or implementation strategy. The PACS could simply suffer from insufficient clinical storage, poor archival policies, or limited RIS integration. This article will focus on the prevention of PACS disasters caused by the actual loss of data.

Disk to Disk. Disk to Disk simply refers to the replication of examination data to other disk storage devices ideally located at remote locations. Because of the significant price decreases in disk-based storage devices over the past few years, many disk-only disaster recovery solutions have penetrated the PACS market. Currently, there are three general approaches to applying Disk to Disk disaster recovery:

Figure 1. A schematic representing a PACS medical/legal archive designed with appropriate disaster recovery considerations.

SAN (storage area network). In this approach, a storage device is installed at all remote sites where copies of the examinations are desired. This also necessitates reliable (usually dedicated) network fiber connectivity between the storage sites (not the same network as the institution would use for web/e-mail and other day-to-day business activities). Physical document storage vendors are now providing SAN storage devices as part of their offerings. The replication/storage management is provided by the storage devices: external computers are not required to manage the data replication process (see Figure 1).

NAS (network attached storage). This approach is similar to the SAN approach. The primary difference is that the replication process is managed by software running on computers connected to the storage, and the network connecting the devices is not required to be dedicated. This one layer of abstraction, replicating the data at the operating system level, makes a multi-vendor solution more attainable. For example, the primary PACS might run Windows 2003 on a Dell server connected to 2 TB of disk storage, while the remote location has an IBM server running Linux connected to 2 TB of storage. The examination data could be replicated between the two locations via standard file sharing protocols available to both operating systems. The replication software would leverage these standard protocols to synchronize the two locations. Nearly all storage companies provide NAS solutions.
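The operating-system-level replication described above can be illustrated with a minimal sketch. It assumes both sites are visible to one computer as ordinary directories (for example, the remote share mounted over a standard file sharing protocol). The function names and the folder layout are hypothetical and are not taken from any particular replication product.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def replicate(primary: Path, remote: Path) -> list:
    """Copy files that are new or changed under `primary` into `remote`.

    Returns the relative paths that were copied. Nothing is ever deleted
    from `remote` (an assumption of this sketch), so a deletion at the
    primary site is not propagated to the replica.
    """
    copied = []
    for src in sorted(primary.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = remote / rel
        # Skip files whose content is already identical at the remote site.
        if dst.exists() and file_digest(src) == file_digest(dst):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(rel)
    return copied
```

Running `replicate` on a schedule would keep the remote directory in step with the primary one. A production tool would also need to handle files that are still being written, network interruptions, and logging, all of which this sketch omits.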

CAS (content addressable storage). Content addressable storage is a scalable storage/computing grid. The idea is to provide scalability and performance by adding self-contained units that act together as a hive. The individual units can be defined as a small server usually containing four disks (~1 TB raw). All the units run the same software, which allows them to act as one. The aggregate storage is the sum of all the units. Some manufacturers use elaborate algorithms to provide a unique key that allows for the verification of file integrity and retrieval. This key is often called a token, and the key/integrity generation algorithm commonly employed is MD5. Some manufacturers hide the token complexities behind a file system, effectively making the grid device look like a NAS.
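The token idea can be sketched in a few lines. The example below is a toy, single-directory store rather than a grid, and the class and method names are hypothetical. It shows only how a digest of the content can serve both as the retrieval key and as the integrity check.

```python
import hashlib
from pathlib import Path


class ContentStore:
    """Toy content-addressable store: the token is the MD5 of the content."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        """Store the data under its own digest and return that token."""
        token = hashlib.md5(data).hexdigest()
        (self.root / token).write_bytes(data)
        return token

    def get(self, token: str) -> bytes:
        """Retrieve data by token, verifying that it has not been altered."""
        data = (self.root / token).read_bytes()
        if hashlib.md5(data).hexdigest() != token:
            raise ValueError(f"integrity check failed for token {token}")
        return data
```

MD5 is used here only because it is the algorithm named above. A stronger digest such as SHA-256 could be substituted by changing the two `hashlib` calls.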

Removable Media. In this approach, the primary PACS site hosts a computer running software that manages the process of moving examinations from disk to removable media and removable media back to disk. The solution usually contains a robotic library that allows for near-line access to large numbers of media. The genre of software that provides this service is called HSM (hierarchical storage management). The concept is evolving to better encapsulate storage practices and usage (outside the scope of this article) and the term is being replaced with “life cycle management.” The storage policies available allow for the creation of two or more copies of the media. One copy can stay local to the institution; another could be sent to a remote site; a third copy could be sent to a document storage company. Removable media solutions are the least expensive and easiest to implement.

Table 1 below provides a quick summary of key considerations and how they relate to common archival strategies.

Table 1. A summary of alternate storage solution features.

TECHNIQUE CONSIDERATIONS

The above defines and describes basic strategies for archiving, but how does one select the best approach? This is a difficult question to answer in an article. Institutional experience, IT resources, politics, and vendor relationships all play a factor in selecting an archival solution. Below are some general guidelines to consider that are independent of the internal considerations and address two different sets of needs.

Solution Requirement for Institution A

  • Technically simple
  • Requires copies of examinations in multiple locations
  • Low cost
  • Some manual processes acceptable
  • Slow retrieval from off-site locations (physical transportation of media)
  • Adequate storage exists on the clinical RAID. (If insufficient storage exists on the clinical archive, significant/continuous retrieves of prior examinations from the DR archive can cause media failures and affect the radiologist workflow.)

In this scenario, a removable media solution would probably be the leading candidate. Once it is installed and configured, a human would need only to insert media as required and export filled media for transport to one (or more) safe locations. The cost for multisite storage is the price of media and transport.

Solution Requirement for Institution B

  • No manual interaction
  • Large budget
  • Require copies of examinations in multiple locations
  • Fast retrieval from any location (electronic transportation)

This strategy requires deploying a RAID in each location where a copy of each examination is to be stored. All copies occur automatically with no human involvement once the solution is configured. To sustain this approach, disk must be purchased each year in equal increments at all locations. Any budget cuts could seriously jeopardize the integrity of the archive: there may be only enough funds to store examinations at one site instead of the planned two or three. Also, there is an increased risk that an engineer performing maintenance on a storage device could cause accidental data loss that could be replicated to all other sites; even worse, it may be possible for a virus attack to affect all copies of the examinations because all copies are always online.

Using DR to switch vendors. Although the current relationship with a PACS provider may be satisfactory, what happens if that were to change? What if a new PACS were released to the market with desired capabilities that the current vendor would not be able to match for years? This topic could consume an entire article, perhaps even an entire book. What is important to consider during the development of an archive strategy is how to prepare for a possible switch to another PACS provider. The answer may be to store all examination data in a nonproprietary format (DICOM Part 10) on open standard storage systems. Many PACS/storage products store data in a format that only their software can retrieve, so consider products that are open. This will help ensure that a new PACS vendor can read the stored examinations.
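One quick way to spot-check whether archived files are stored in the DICOM Part 10 file format is to look for its header: a 128-byte preamble followed by the four characters "DICM". The sketch below performs only that check. The function name is hypothetical, and passing the check does not prove that the rest of the file is valid or free of proprietary elements.

```python
from pathlib import Path


def looks_like_dicom_part10(path: Path) -> bool:
    """Return True if the file has a 128-byte preamble followed by b"DICM"."""
    with path.open("rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"
```

Running this over a sample of files taken directly from the archive media, rather than through the PACS software, gives an early indication of whether another vendor's system could read them.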

FINAL THOUGHTS

In the end, whatever strategy is selected must provide a means to recover quickly (business continuance) from a local disaster. For DR, it is important to consider the pattern of usage in radiology: data (examinations) stored in PACS are less likely to be retrieved (reviewed) once dictated. This is an important consideration when a disaster hits. For example, assume a water pipe broke near the computer room and all RAID devices, containing 4 years’ worth of examinations, were lost. The conventional approach would be to repopulate the RAID devices from the archive file by file, usually in reverse chronological order. The problem with this approach is that the usage of the examination data is random, not sequential. Therefore, if a radiologist is waiting to display an examination that is 2 years old, the wait could be days or longer. A better approach is to employ a system that can automatically repopulate the PACS on demand. In this system, a request to display a missing examination would trigger automatic processes that reload the examination back into PACS. This method would import the most relevant (requested) examinations as a priority and populate the unrequested examinations as resources permit.
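The demand-driven repopulation described above amounts to a priority queue in which requested examinations jump ahead of the background reload. The following is a minimal sketch of that ordering only. The class name, the exam IDs, and the choice of newest-first for the background reload are assumptions made for illustration, and the actual reload from the archive is left out.

```python
import heapq
from datetime import date
from typing import Dict, Optional


class RestoreQueue:
    """Orders examinations for reload: requested first, then newest first."""

    REQUESTED, BACKGROUND = 0, 1

    def __init__(self, exams: Dict[str, date]):
        # exams maps a (hypothetical) exam ID to its study date.
        self._dates = dict(exams)
        self._restored = set()
        # Heap entries: (priority, negated date ordinal, exam ID).
        # Negating the ordinal makes newer studies sort first.
        self._heap = [
            (self.BACKGROUND, -d.toordinal(), e) for e, d in exams.items()
        ]
        heapq.heapify(self._heap)

    def request(self, exam_id: str) -> None:
        """A viewer asked for an exam that is not yet back on the RAID."""
        if exam_id in self._dates and exam_id not in self._restored:
            entry = (self.REQUESTED, -self._dates[exam_id].toordinal(), exam_id)
            heapq.heappush(self._heap, entry)

    def next_exam(self) -> Optional[str]:
        """Return the next exam to reload, or None when all are restored."""
        while self._heap:
            _, _, exam_id = heapq.heappop(self._heap)
            if exam_id not in self._restored:  # skip stale duplicates
                self._restored.add(exam_id)
                return exam_id
        return None
```

A reload worker would call `next_exam` in a loop while the PACS front end calls `request` whenever a user opens an examination that is still missing. With this ordering, a 2-year-old study that a radiologist asks for is reloaded next, instead of waiting behind every newer study.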

Thomas Schultz is Chief Engineer in the Department of Radiology, Massachusetts General Hospital, Boston.

Keith Dreyer is vice chairman of radiology, computing and information sciences in the Department of Radiology, Massachusetts General Hospital, Boston.

David Hirschorn, MD, is a computing and information sciences fellow in the Department of Radiology, Massachusetts General Hospital, Boston. This article has been adapted from: Heckman K, Schultz T. PACS architecture. In: Dreyer KJ, ed. PACS: The Digital Revolution. 2nd ed. New York: Springer Verlag.