Predictive Modeling isn't Really that Hard. Analytics is.

By Chad Konchak, Senior Director-Analytics, NorthShore University HealthSystem

A lot of attention has been focused on predictive modeling in healthcare lately. I think this is great. Predictive modeling has the capacity to transform healthcare delivery and allow us to identify catastrophic events, better risk-adjust our quality metrics, and keep patients out of the healthcare system through population health monitoring systems. The latter is especially important as the revenue paradigm shifts to an increasingly risk-based model that focuses our attention on managing patients outside our four walls. New data sources are also beginning to emerge, such as genomics and social determinants data, in addition to the valuable clinical data from our EMRs that we have all now begun to leverage. All these data sources provide the raw materials for predictive modeling, but they present new challenges in how we will store and manage these data. At NorthShore University HealthSystem, we have 21 predictive models in production workflows. They rely on a range of clinical, administrative, and external data sources, and they have led to improved care and lower costs for the system. We have a team of data scientists and a 200-page clinical analytics standard operating procedure manual that standardizes the complex math and artful considerations required to build a model in healthcare. I by no means intend to minimize the incredibly complicated work this entails. I do, however, need to point out the much larger teams of data warehousing architects, ETL developers, and informaticists who allow us to feed the model, run it on all our populations every day, and deliver it to front-line decision makers. Like the offensive line in a great passing attack, they are the reason the ball even gets out to the receiver in the first place.

The fact is, predictive modeling gets a lot of the attention but represents a small fraction of the overall effort required to use data to enable better patient care, with higher satisfaction, at a lower cost. Getting good data represents a large part of the modeling effort and requires an often-overlooked and underappreciated organizational competency: data acquisition and enrichment. In my experience here at NorthShore, the process of cultivating, enriching, and organizing data to build a predictive model can sometimes represent 80 percent of the total effort and is critical to the success of the model. A bad predictive model on good data will always win out over a great model on bad data (in fact, the latter could be outright dangerous). Any vendor that claims to have a successful model but cannot run and test the performance of that model on your data has not fully understood these complexities or the uniqueness of every organization’s workflows and data infrastructure. Unfortunately, data hygiene just isn’t that fun to talk about, so the blocking and tackling of this critical element gets lost amid the glamour of data science. Furthermore, even when you are able to collect, normalize, and enrich all the right data in a highly accurate and reliable way, you need to process that data on every patient, every day, or even every second, in order to operationalize the model. Even then, you still do not have anything that can impact patient care. You have a beautiful model that nobody is using.

When you have a reliably built model on sound data that is consistently flowing through the production life cycle, you need to put it in a place that decision makers can easily access in the context of their existing workflows. Too often, I have seen wonderful applications with incredible predictive power inserted into clunky workflows. We’ve even built them. No clinician is going to be willing to leave the cockpit of their EMR, go to a different system, and enter more data in order to find out whether a patient is at high risk of an outcome: the four worst words you can say to any clinician in informatics are “you just go to.” The model needs to be fed directly into the EMR and presented cleanly in the cockpit’s instrument panel, with the results feeding back into the EMR for continuous improvement. The knowledge required to understand the clinical workflows and the various places where you can insert a new piece of information (like a predictive model) is not trivial. You need to either have in-house staff who know the EMR or rely on the vendor to help you optimize its functionality. This becomes increasingly complex when you have a multitude of different workflows and EMRs scattered across an organization.

Ultimately, predictive modeling is a differentiator in healthcare. However, we need to conduct proper data hygiene to ensure that the model is built and operationalized on high-quality, reliable data. We need to monitor the data flow, alert on data anomalies, and be able to retest the performance of the model as populations change and data evolves. Then, we need to integrate the model into the context of existing workflows so that this highly enriched information can best inform a clinician’s decision-making process. Finally, we need to measure how that model results in better outcomes and continually improve the workflow and the data behind the model. That’s hard. That’s analytics.