By Josh Gluck 
Sustainable and measurable progress in precision medicine represents a major milestone in a journey that has been as arduous as it has been important for the healthcare community. In cancer care, for example, it is becoming routine to analyze tumors for known gene mutations or expressions to select treatments likely to be most effective.

Yet despite its tangible benefits, the application of precision medicine, specifically in cancer care, still presents a multitude of challenges. This approach can be expensive, invasive, and time-intensive as tumors are analyzed through traditional pathology and genomics. However, AI-enabled imaging presents a powerful opportunity to accelerate the identification and application of personalized treatments in ways that are often less invasive, faster, and potentially more cost-effective.

In the last few years, we’ve started to see several promising applications of AI in imaging to support precision medicine initiatives. A study¹ published in the Journal of Neuro-Oncology in April 2019 shows strong potential for using machine-learning algorithms to reveal multimodal MRI patterns that accurately and rapidly predict the presence of genotypes and mutations in glioma (specifically isocitrate dehydrogenase [IDH] and 1p/19q codeletion status), which are good predictors of treatment efficacy. This is just one example of many possible applications. But how can we in the healthcare community accelerate the use of AI in imaging to advance personalized medicine?

The key lies in understanding the value of imaging data as an asset and better democratizing it across organizations and the broader healthcare ecosystem. This entails data sharing as well as local training for algorithms. The equation for improved model inference accuracy, which will drive expanded application of AI in precision medicine, combines vast amounts of annotated data with multiple training runs.

AI and algorithm training must not be relegated to large research institutions alone; healthcare organizations of diverse types and sizes can and must play a role if we are to move faster. It is increasingly clear that models do not always perform well out of the box on data from vastly different populations, so data from local patients and protocols is essential for fine-tuning.
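To make the fine-tuning point concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than a real clinical model: the "pretrained" weights, the single hypothetical image feature, and the six local samples are invented so that the local decision threshold sits higher than the one the pretrained weights imply.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Classify as positive (1) when the predicted probability is at least 0.5.
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

def accuracy(w, b, xs, ys):
    hits = sum(predict(w, b, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

def fine_tune(w, b, xs, ys, lr=0.5, steps=5000):
    # Full-batch gradient descent on the mean logistic loss,
    # starting from the pretrained weights instead of from scratch.
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical weights standing in for a model trained elsewhere (assumption).
PRETRAINED_W, PRETRAINED_B = 1.0, 0.0

# Hypothetical local data: one feature per scan, with a local protocol
# that shifts the boundary between classes to roughly x = 2 (assumption).
LOCAL_X = [0.5, 1.0, 1.5, 2.5, 3.0, 3.5]
LOCAL_Y = [0, 0, 0, 1, 1, 1]

before = accuracy(PRETRAINED_W, PRETRAINED_B, LOCAL_X, LOCAL_Y)
tuned_w, tuned_b = fine_tune(PRETRAINED_W, PRETRAINED_B, LOCAL_X, LOCAL_Y)
after = accuracy(tuned_w, tuned_b, LOCAL_X, LOCAL_Y)

print(f"accuracy on local data before fine-tuning: {before:.2f}")
print(f"accuracy on local data after fine-tuning:  {after:.2f}")
```

In this toy setup the untuned weights label every local sample as positive, so half the cases are wrong; after fine-tuning on the local samples the model separates them correctly. Real imaging models involve far more parameters and data, but the workflow has the same shape: start from existing weights and continue training on local data.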

From the Ground Up

When beginning the journey to mainstream AI in imaging from the lab to the bedside and integrating it into existing workflows, organizations must start with their infrastructure—and an understanding that what may have worked yesterday may not (and likely will not) work tomorrow.

Several factors are at play. As noted previously, local training of algorithms is essential to leveraging AI at scale in clinical settings. As such, healthcare organizations require an infrastructure that can support this training. When introducing AI models into clinical practice, organizations have to integrate them into existing workflows, and they need to do so without adding latency to those workflows or to clinical applications.

Most organizations, however, encounter challenges on these fronts as they continue to operate in a highly siloed environment, with separate systems for clinical data (such as PACS and vendor-neutral archives) and research initiatives. Moving data between these silos is slow, expensive, and tedious. Organizations repeatedly moving data must deal with multiple copies of data, greater complexity, higher risk, elevated costs … and less accurate and usable data at the enterprise level.

In “Better Medicine Through Machine Learning: What’s Real, and What’s Artificial?” (PLOS Medicine, December 2018),² the authors reinforce the notion of training models locally for the most precise results. To truly democratize AI and effectively train models locally to avoid irrational extrapolation, healthcare organizations need to start with a new foundation, one that focuses on data and places it at the center of everything to enable a modern data experience.

Creating this new type of data experience requires an architectural strategy that consolidates islands and silos of data infrastructure and ultimately simplifies the data foundation. This data-centric architecture is defined by five key attributes:

  1. Real-time. It supports the capability to find the right insight at the right time to drive improved clinical and operational outcomes.
  2. On-demand and self-driving. It prioritizes automation at its core and leverages machine learning to provide high levels of availability and proactive support. It should be easy to provision and evolve with your needs.
  3. Exceptionally reliable and secure. This is a must, especially when it comes to critical patient data and protected health information.
  4. Support for multi-cloud environments. It should easily allow storage volumes to be moved to and from the cloud and between cloud providers, making application and data migration simple and enabling hybrid use cases for application development, deployment, and protection. A data-centric architecture should support the flexibility to take advantage of the cloud when and how an organization chooses.
  5. Constantly evolving and improving. Users should expect their IT infrastructure to continuously get better, without downtime, delivering more value every year for the same or lower cost. Healthcare organizations should expect the same for their storage infrastructure. They must design systems so that storage services can be constantly and seamlessly improved, without ever bringing applications or users offline.

Bringing It All Together

The modern data experience requires a new type of data hub—one that allows organizations to consolidate all applications on a single storage platform to unify and share data across the applications that need them for better insight. Rather than being merely a repository, it must be intended to share and deliver data within an organization for modern analytics and AI so patients and clinicians can benefit from the insights the data hold.

A data hub must have four qualities, which are essential to unifying data: high throughput, both file and object; native scale-out; multidimensional performance; and massively parallel architecture that mimics the structure of GPUs to deliver performance to tens of thousands of cores accessing billions of objects.

A data hub may have other features, such as snapshots and replication, but if any of the four qualities is missing from a storage platform, it is not a data hub. For example, if a storage system delivers high-throughput file access and natively scales out but needs another system to provide S3 object support for cloud-native workloads, then the unification of data is broken and the velocity of data is crippled.
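The all-or-nothing rule above can be written down as a short checklist. The sketch below is only an illustration in Python: the class name, field names, and the two example platforms are hypothetical, and the first quality (high throughput for both file and object) is split into two flags so the file-only example from the text can be expressed.

```python
from dataclasses import dataclass

# The four required qualities; "high throughput, both file and object"
# is tracked as two flags (a modeling assumption for this sketch).
REQUIRED_QUALITIES = (
    "high_throughput_file",
    "high_throughput_object",
    "native_scale_out",
    "multidimensional_performance",
    "massively_parallel",
)

@dataclass(frozen=True)
class StoragePlatform:
    name: str
    high_throughput_file: bool
    high_throughput_object: bool
    native_scale_out: bool
    multidimensional_performance: bool
    massively_parallel: bool

def missing_qualities(platform):
    # Return the names of required qualities the platform lacks.
    return [q for q in REQUIRED_QUALITIES if not getattr(platform, q)]

def is_data_hub(platform):
    # A platform qualifies only if nothing is missing.
    return not missing_qualities(platform)

# Hypothetical platform matching the example in the text: fast file access
# and native scale-out, but object support lives on a separate system.
# The last two flags are assumed True purely for illustration.
file_only = StoragePlatform(
    name="file-only system",
    high_throughput_file=True,
    high_throughput_object=False,
    native_scale_out=True,
    multidimensional_performance=True,
    massively_parallel=True,
)

# Hypothetical platform that meets every requirement.
unified = StoragePlatform("unified system", True, True, True, True, True)

for platform in (file_only, unified):
    print(platform.name, is_data_hub(platform), missing_qualities(platform))
```

Extras such as snapshots and replication are deliberately left out of the check, since the text treats them as optional features that do not affect whether a platform counts as a data hub.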

A modern data experience—powered by a data-centric architecture and a data hub that possesses all four of the necessary qualities—is integral for healthcare organizations, large and small, that are looking to optimize the use of AI-enabled imaging in personalized medicine. This approach ensures that the data at the heart of this opportunity is truly democratized and enables the effective training of local models that can be used to improve algorithm accuracy. While AI-enabled imaging will be a driving force behind precision medicine that is less invasive and potentially more cost-effective, the data is the engine that will continue to power these advances, with the proper architecture being crucial.

Josh Gluck is vice president for global healthcare technology strategy at Pure Storage in Mountain View, Calif. Questions and comments can be directed to AXIS Imaging News chief editor Keri Forsythe-Stephens at


  1. Zhou H, Chang K, Bai HX, et al. Machine learning reveals multimodal MRI patterns predictive of isocitrate dehydrogenase and 1p/19q status in diffuse low- and high-grade gliomas. J Neurooncol. 2019;142(2):299-307. doi:10.1007/s11060-019-03096-0.
  2. Saria S, Butte A, Sheikh A. Better medicine through machine learning: what’s real, and what’s artificial? PLoS Med. 2018;15(12):e1002721. doi:10.1371/journal.pmed.1002721.