By Tim Kaschinske
“I’ve looked at clouds from both sides now
From up and down, and still somehow
It’s cloud illusions I recall
I really don’t know clouds at all” – Joni Mitchell, “Both Sides Now”
Most hospital IT directors can relate to Joni Mitchell’s lyrics from “Both Sides Now.” They’ve heard a lot about “clouds”; they know they should be thinking about the cloud, but they really don’t know clouds at all.
First and foremost, hospital IT staff needs to understand that a cloud can be any remote site that is set up either by the hospital itself or by a third-party service provider to act as a primary or secondary data center to receive archive data. In some instances, a hospital may have a remote data center site already established that can be configured to serve as a managed cloud site for the hospital. Alternatively, a hospital may require a third-party cloud service provider that is already established and able to act as their remote site.
Most hospitals do not want to send their data to the public cloud. Because healthcare data is so sensitive, it cannot be transmitted out of the country and, in some cases, out of state. This effectively rules out many public cloud providers that have secondary and/or tertiary data centers in different countries. Hospitals also have great concerns about patient privacy, data security, and data loss and, as of today, the cloud service industry has not convinced healthcare providers that it has these concerns sufficiently covered.
Having said that, there are now a number of “private cloud” providers springing up who, either by their association with particular healthcare providers or through their established presence in the healthcare industry, are trusted to provide these services.
Cloud Archive Architecture
Archive software can sit either in the hospital data center and throw data over the wall to the cloud (storage in the cloud), or it can sit in the cloud and catch data that the hospital throws over the wall (archive in the cloud). Which model is chosen depends on the requirements of the hospital and the trade-offs it is prepared to make between security, performance, and cost.
The Storage Connector (aka Backup Node) provides the conduit and connectivity to the target storage. This is important when considering working with cloud service providers, whether internal or external. If the Storage Connector cannot connect to the target service or device, then data cannot be passed to it.
The design of an archive deployment for a hospital must consider the recovery processes required by the hospital’s disaster recovery (DR) plan. An archive may contain the only copy of a hospital’s data, and, as such, if the local site is down, then the archive must become operational in the cloud. This will require that the necessary servers “stand up” remotely in the cloud. These may be up and running with preloaded software (hot standby), or they may be brought up in the event of a disaster and have software loaded onto them (cold standby). In any event, these servers are likely to be virtual, running on a VMware ESX server or similar in the cloud. In any solution where data goes off-premises, encryption is essential both “in flight” as the data crosses the network and “at rest” as the data resides at the secondary data center.
Both Sides to Cloud-Enabled Archive
There are a number of ways that an archive can be deployed in a cloud environment. Using a baseball analogy, the hospital can be the pitcher or the catcher for the data, or it can be both. The model chosen depends on whether the hospital wants local copies of data and/or single or multiple copies in the cloud, and what their recovery/failover requirements are in the event of a disaster.
The primary archive engine for data movement to the cloud will be situated on the hospital site. This engine pushes the data over the network to an archive-enabled storage device at the remote site. The archive engine can be configured either with an enterprise repository so that multiple copies can be made from the hospital (one or more of which can be sent to the cloud), or with a local repository that has a single feed to the cloud site where multiple copies will be taken by the receiving software.
Data being archived will be in one or more dedicated on-site file systems, which operate as a “local cache” for data that will pass to the archive. The data in the local cache will be archived and then stubbed, as required, to manage the disk space. The exact stubbing policies can be tuned to specific site requirements, from simple first-in-first-out space management to more sophisticated rules for particular types of data.
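The simple first-in-first-out space management mentioned above can be sketched as follows. This is only an illustration, not any product's actual policy engine: the `CacheFile` record, the `stub_fifo` function, and the `high_water` threshold are hypothetical names, and the sketch assumes a stub occupies negligible space once its file has been archived.

```python
from dataclasses import dataclass


@dataclass
class CacheFile:
    name: str
    size: int          # bytes currently occupied in the local cache
    cached_at: int     # arrival order or timestamp; smaller means older
    archived: bool     # True once a copy is safely in the archive
    stubbed: bool = False


def stub_fifo(files, capacity, high_water=0.8):
    """Stub the oldest archived files until usage drops to the high-water mark.

    Returns the names of the files that were stubbed, oldest first.
    Assumption: a stub takes up no meaningful space in the cache.
    """
    limit = capacity * high_water
    used = sum(f.size for f in files if not f.stubbed)
    stubbed = []
    for f in sorted(files, key=lambda f: f.cached_at):
        if used <= limit:
            break
        # Never stub a file that has not been archived yet: the cache copy
        # would be the only copy.
        if f.archived and not f.stubbed:
            f.stubbed = True
            used -= f.size
            stubbed.append(f.name)
    return stubbed


files = [
    CacheFile("scan-001.dcm", 400, cached_at=1, archived=True),
    CacheFile("scan-002.dcm", 300, cached_at=2, archived=False),
    CacheFile("scan-003.dcm", 200, cached_at=3, archived=True),
    CacheFile("scan-004.dcm", 100, cached_at=4, archived=True),
]
print(stub_fifo(files, capacity=1000, high_water=0.5))
# ['scan-001.dcm', 'scan-003.dcm']
```

A more sophisticated rule for a particular type of data would replace the sort key or add a per-type filter; the guard that only archived files may be stubbed would stay the same.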
Storage in the Cloud
An archive-enabled storage device is an appliance that receives data and catalogs it in some way so that it can be found and retrieved thereafter. In relation to cloud, the Storage Connector (aka Backup Node) writes to archive-enabled storage devices. Key features required for an archive-enabled store include:
- Multiple copies—Each copy should be independent so that if one becomes corrupted, that corruption is not mirrored to other copies;
- Long-term archive—Provides resilience over time, an ability to migrate data to new devices (data outlives hardware), and support for many storage device types (device agnostic);
- Efficient storage of small files—With compression and containerization;
- Massive scalability;
- Individual file retention;
- Housekeeping—Storage consolidation (like defragmentation), repairing damage (recreate one copy/volume from an alternative copy);
- Security—Including encryption, access control, and digital signature verification.
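Two of the features listed above, independent copies and repairing one copy from an alternative copy, can be illustrated with a short sketch. It assumes a SHA-256 digest is recorded for each item at archive time; the function names and the in-memory `copies` mapping are hypothetical stand-ins for real storage locations.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_and_repair(copies: dict, expected_digest: str) -> list:
    """Check every copy against the digest recorded at archive time.

    `copies` maps a location name to the bytes stored there. Damaged copies
    are overwritten from the first copy that still verifies. Returns the
    locations that were repaired; raises if no intact copy remains.
    """
    good = [loc for loc, data in copies.items() if sha256(data) == expected_digest]
    if not good:
        raise ValueError("no intact copy left to repair from")
    source = copies[good[0]]
    repaired = []
    for loc in copies:
        if loc not in good:
            copies[loc] = source
            repaired.append(loc)
    return repaired


original = b"patient study 42"
digest = sha256(original)
copies = {"dc-east": original, "dc-west": b"patient stXdy 42", "tape": original}
print(verify_and_repair(copies, digest))  # ['dc-west']
```

Because each copy is checked on its own against the recorded digest, corruption in one location is detected rather than mirrored to the others.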
Some of these attributes may be offered by an object storage system such as OpenStack (SWIFT), ATMOS, DX6000/Caringo, and DCCA, where the multiple-copy and resilience features are covered. However, the Data Repository features are additionally required to provide such things as long-term migration between devices, retention management back to the Data Repository, and individual item encryption.
Ideally, the storage should be efficiently accessible over a network using a secure link; this could be a virtual private network (VPN) or native support for HTTPS. Where this isn’t available, a Storage Connector (Backup Node) can be deployed at the cloud site to interface with the storage and provide a secure connection.
Archive in the Cloud
In this use case, the Data Connector is fully installed in the cloud and makes its multiple copies from that point, probably to different data centers. It presents a cache in the cloud that can receive data from multiple hospitals. Any data sent to this cache will be archived in the normal way in accordance with the policy for the data type. The Data Connector is also installed on each hospital site, but with a local repository. In this form, the Data Connector acts as a pure data mover that takes, in accordance with its local policies, a copy of the file from the hospital cache and places an identical copy on the cache in the cloud. The Data Connector in the cloud then moves that data into the archive based on the policies for that file type.
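The two-stage flow just described (a hospital-side mover feeding a cloud cache, then a cloud-side component archiving by file-type policy) might look roughly like this. It is a sketch using local directories in place of real sites; the `POLICIES` table, the data center names, and both function names are assumptions made for illustration only.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical policy table: file suffix -> data centers that each get a copy.
POLICIES = {
    ".dcm": ["dc-a", "dc-b"],
    ".pdf": ["dc-a"],
}
DEFAULT_TARGETS = ["dc-a"]


def move_to_cloud_cache(hospital_cache: Path, cloud_cache: Path, name: str) -> Path:
    """Hospital-side mover: place an identical copy on the cloud cache."""
    dest = cloud_cache / name
    shutil.copy2(hospital_cache / name, dest)
    return dest


def archive_from_cloud_cache(cached: Path, archive_root: Path) -> list:
    """Cloud-side step: write one copy per target named by the file-type policy."""
    written = []
    for target in POLICIES.get(cached.suffix, DEFAULT_TARGETS):
        target_dir = archive_root / target
        target_dir.mkdir(parents=True, exist_ok=True)
        written.append(Path(shutil.copy2(cached, target_dir / cached.name)))
    return written


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    hospital, cloud, archive = root / "hospital", root / "cloud", root / "archive"
    hospital.mkdir()
    cloud.mkdir()
    (hospital / "study.dcm").write_bytes(b"image data")
    cached = move_to_cloud_cache(hospital, cloud, "study.dcm")
    copies = archive_from_cloud_cache(cached, archive)
    print([p.parent.name for p in copies])  # ['dc-a', 'dc-b']
```

The hospital side knows nothing about how many copies are made; that decision sits entirely with the policy applied in the cloud.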
Because the cache in the cloud operates in just the same way as the cache in the hospital, any application in the hospital can switch quickly to using the cache in the cloud to access its files in the event that it is unable to access the local Data Connector cache. However, there is no provision for a local (hospital) copy of the archive. In addition, the Data Connector’s own encryption cannot be used in flight or in the cloud cache, as it would invalidate the failover mechanism. In-flight encryption can, however, be done by the network connection.
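From an application's point of view, the failover described here amounts to trying the local cache first and reading the same file name from the cloud cache if that fails. A minimal sketch, assuming both caches are reachable as file system paths (the function name and paths are hypothetical):

```python
import tempfile
from pathlib import Path


def read_with_failover(name: str, local_cache: Path, cloud_cache: Path) -> bytes:
    """Read from the hospital cache, falling back to the cloud cache."""
    try:
        return (local_cache / name).read_bytes()
    except OSError:
        # Local cache unreachable or file missing: use the cloud cache,
        # which holds an identical copy under the same name.
        return (cloud_cache / name).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "local-cache"   # deliberately never created: "site down"
    cloud = Path(tmp) / "cloud-cache"
    cloud.mkdir()
    (cloud / "study.dcm").write_bytes(b"image data")
    print(read_with_failover("study.dcm", local, cloud))  # b'image data'
```

This only works because the cloud copy is byte-for-byte identical to the local one, which is why connector-level encryption of the cloud cache would break the mechanism.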
Whether the target in the cloud is presented as storage in the cloud or archive in the cloud, each element of storage (albeit taken from a pool) is dedicated to the data stream coming from a single hospital, so the streams are kept separate. That means one hospital cannot cross over into another’s data and multitenancy is achieved. In addition, the fact that each hospital controls its own encryption keys through the Data Connector that is local to it means that even if data crossover into another’s storage were possible, the resultant view would be indecipherable.
However the backend is configured (storage in the cloud or archive in the cloud), and however the “Gateway” moves data (enterprise or local repository), the setup of the archive process in the local hospital is the same. The Data Connector moves a single instance of the data from the “local” cache into the archive in accordance with policies that the user can define.
Archiving in this way to the cloud is useful in a number of circumstances. It can provide alternative storage capability that is cheaper, on demand, accommodates growth, and alleviates the local real estate footprint. It provides data protection via independently written copies, multiple locations, and a recovery site. It also assists in application retirement and migration by providing a vendor neutral archive in the full sense, a regulated place to “park” data, and a central point from which to view data for secondary use. Cloud provides added flexibility and usability that can greatly enhance the solution.
Tim Kaschinske is a Consultant, Healthcare Solutions, at BridgeHead Software.