Big Data NLP for improved healthcare outcomes

Big Data NLP for improved healthcare outcomes A white paper

Executive summary

Shifting payment models based on quality and value are fueling the demand for comprehensive insights from Big Data into the health of patient populations. However, because 80% of clinical data is unstructured, natural language processing (NLP) is becoming a vital part of any analytics department, to support risk stratification, quality measures, and care coordination applications.

Both providers and payers are focused on improving health at the population level through better care coordination, especially for chronic conditions like diabetes and obesity. To impact outcomes at the individual level, stakeholders need an in-depth understanding of outcomes within populations that incorporates a 360-degree view of all available data.

Historically, healthcare groups have relied heavily on electronic health records (EHRs) and claims data when trying to make sense of the health of their patient populations, and when making clinical treatment decisions. Unfortunately, neither EHRs nor insurance claims alone are ideal for analyzing the health of populations, or for providing a comprehensive view of an individual's health.

Providers and payers are now combining data from EHRs, claims, genetic sequencing, wearables, and other sources into data lakes, to gain a comprehensive population view. This in turn has led to technology innovations and the rapid adoption of Hadoop-type environments to store and analyze structured and unstructured healthcare data. These Big Data systems are supplementing data warehousing solutions for population health, and platforms for care coordination.

Current Big Data warehousing and care coordination solutions mainly use structured data taken from the structured sections of EHRs and claims. While structured data is undoubtedly valuable, an estimated 80% of the clinical data stored in EHRs is in an unstructured format and thus difficult to analyze on a large scale.
Typically, the unstructured data contains a wealth of clinical information based on physician narratives; pathology, radiology, and discharge reports; and, increasingly, patient-reported information. In order to thoroughly assess population risk and identify the care needs of individuals from Big Data, providers and payers need to extract insights from the data stored in unstructured text. Social determinants of health and lifestyle choices, for example, play a major part in the clinical risk for populations and individuals, and yet they are trapped in text-based form.

Incorporating unstructured data into a 360-degree view of patients and populations provides the analytical material for improved population health. Use of an artificial intelligence (AI) technique like NLP is key to transforming text into structured forms, and can be aligned with Hadoop and data warehousing approaches to support existing investments. NLP can be used to support multiple application areas including predictive risk models, population stratification, quality measures, and care coordination, using a variety of technology enhancements such as semantic search, data discovery, and information extraction.

As a result of the demand for population-level insights and real-time patient surveillance, the adoption of NLP technologies will accelerate as healthcare organizations seek to unlock value from Big Data.

"NLP is an important tool to ensure that we can still use the narrative to perform the analytics, develop the predictive models, and generate the population health insights needed to support a learning health system."
Doug Fridsma MD PhD FACP FACMI, President and CEO, AMIA

New payment models

Healthcare has traditionally relied on fee-for-service compensation models to pay providers. Today the government and private payers are shifting to alternative pay-for-value models that offer providers financial incentives for proactively monitoring the health of their patients, achieving quality clinical outcomes, and controlling the cost of care. To meet their quality and performance objectives, providers must analyze vast amounts of Big Data on the health of their patient populations.

With more access to EHR data, payers are also offering care coordination services to improve member wellness. As with providers, payers are targeting high-risk populations and extending additional opportunities for education and support.

Before healthcare organizations can implement pre-emptive care programs, they must identify the relative risk of their patient population based on a variety of clinical, financial, and lifestyle factors. This is where Big Data technologies such as Hadoop are now found, allowing large, aggregated data to be managed and analyzed.

As illustrated in Figure 1, a healthcare population typically includes a relatively small percentage of the highest-risk patients, though these least-healthy patients usually account for the biggest percentage of overall healthcare costs. Once healthcare organizations have stratified their populations based on their relative health, they can then initiate evidence-based care plans to improve outcomes for the at-risk individuals, and implement preventive programs for the healthier patients. By following best-practice protocols, payers and providers can help patients avoid costly complications and hospitalization. Analysis of population health data also helps organizations to assess how a particular value-based care plan will impact their bottom line.

For both providers and payers, population health analysis is often difficult because of the heterogeneous nature of Big Data in healthcare.
Much of the critical information is stored as unstructured text that cannot be easily accessed or analyzed across a population without advanced technologies such as NLP. The ability to automatically extract precise data from unstructured text is invaluable for organizations participating in value-based payment models, for both risk stratification and quality measures extraction. By leveraging NLP, providers can look at both the structured and unstructured data for a complete, 360-degree population view, and identify and extract the specific details to assess risk or improve population health. Similarly, they can assess critical details on individual patients related to lifestyle choices such as smoking behavior and alcohol consumption, as well as insights into a patient's social determinants such as living arrangements, access to care, and mobility status. This insight from unstructured data will be vital in identifying and supporting the rising-risk population shown in Figure 1.

Figure 1: Level of patient risk associated with population segments and their cost implications; a relatively small segment of the population accounts for a disproportionate percentage of healthcare costs. The top 5% of patients account for 50% of all medical costs. Population segments marked: 0.25%, 1%, 5%, rising risk (6-20%), and 21-100%. Segment descriptions: Advanced illness (requiring ongoing and complex case management); At-risk, multiple chronic conditions (requiring ongoing care management); Healthiest (utilizing preventive and wellness services, some acute care). Source: http://amatihealth.com/blog/

"No longer will healthcare be about how many patients you can see, how many tests and procedures you can order, or how much you can charge for these things. Instead, it will be about costs and patient outcomes: quicker recoveries, fewer readmissions, lower infection rates, and fewer medical errors, to name a few. In other words, it will be about value."
Toby Cosgrove, Harvard Business Review (https://hbr.org/2013/09/value-based-health-care-is-inevitable-and-thats-good/)
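The stratification idea described in this section can be illustrated with a minimal Python sketch. It ranks patients by annual cost, measures the share of total cost carried by the costliest segment, and assigns a coarse tier. The `Patient` type, the `annual_cost` field, the tier names, and the 5% and 20% cut-offs are assumptions made for illustration, loosely following the segments in Figure 1; a real risk model would draw on clinical, financial, and lifestyle factors rather than cost alone.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    annual_cost: float  # hypothetical total cost of care for one year


def cost_share_of_top(patients, top_fraction):
    """Return the fraction of total cost incurred by the costliest patients."""
    if not patients:
        return 0.0
    ranked = sorted(patients, key=lambda p: p.annual_cost, reverse=True)
    # Always include at least one patient in the top segment.
    top_n = max(1, round(len(ranked) * top_fraction))
    total = sum(p.annual_cost for p in ranked)
    if total == 0:
        return 0.0
    return sum(p.annual_cost for p in ranked[:top_n]) / total


def assign_tier(rank, population_size):
    """Map a cost rank (0 = costliest) to a coarse, illustrative tier."""
    percentile = (rank + 1) / population_size
    if percentile <= 0.05:   # assumed cut-off for the costliest segment
        return "high risk"
    if percentile <= 0.20:   # assumed cut-off for the rising-risk segment
        return "rising risk"
    return "lower risk"
```

Calling `cost_share_of_top(patients, 0.05)` on an organization's own data is one quick way to check how concentrated its costs are before designing care programs for each tier.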

NLP and population health management

NLP technologies are used to extract structured information from unstructured patient-related documentation. For example, providers can leverage NLP to extract discrete values of left ventricular ejection fraction from progress notes to support problem list reconciliation and ACO quality measures, or extract a patient's cancer stage from a pathology report. While EHRs commonly have fields for such clinically relevant details, records may include major gaps because data is not consistently entered, impacting clinical care and outcomes analysis.

Applying NLP in a focused manner enables the capture of information from unstructured patient data in a timely manner, and facilitates its use for analytical purposes in the distributed data warehouse capabilities of Hive. Unlike early NLP systems, the latest NLP tools enable open and flexible development of queries, and are not as reliant on expensive data sets manually annotated by clinicians. Interest in this field is expanding rapidly, as noted in the KLAS report, "Natural language processing: Glimpses into the future of unstructured data mining" (April 2016).

Population health is about people and the ways in which they are both unique and the same. To fully understand the health of each person requires an assessment of a much wider set of data than that needed to analyze a patient's current clinical status, as shown in Figure 2. Such broad data sets are well suited to Big Data environments such as Hadoop. Only 20% of a person's health status is associated with their clinical care; other major factors that contribute to their status include health behaviors, social and economic factors, and physical environment. Lifestyle choices such as tobacco, alcohol, and drug use can all be extracted from unstructured text using NLP, as can sexual activity, diet, and exercise.
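To make the ejection fraction example above concrete, here is a deliberately simple Python sketch that pulls percentage values from a free-text note with a regular expression. It is not how a full NLP system works: the pattern and the function name `extract_ejection_fractions` are assumptions for illustration, and the sketch ignores negation, sentence boundaries, and other context that production tools handle.

```python
import re

# Hypothetical pattern: "LVEF", "EF", or "ejection fraction", followed within a
# few non-digit characters by a percentage or a range such as "35-40%".
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9%]{0,30}?"
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)


def extract_ejection_fractions(note):
    """Return (low, high) percentage pairs found in a free-text note."""
    results = []
    for match in EF_PATTERN.finditer(note):
        low = int(match.group(1))
        # A single value is reported as a range with identical bounds.
        high = int(match.group(2)) if match.group(2) else low
        results.append((low, high))
    return results


note = "Echo today. LVEF estimated at 35-40%. Previous ejection fraction was 55%."
print(extract_ejection_fractions(note))  # [(35, 40), (55, 55)]
```

Returning each value as a (low, high) pair keeps ranges and single readings in one structured shape, which is convenient when the results are later loaded into a table.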
NLP can also filter additional, clinically relevant information, such as a patient's environmental, housing, and mobility status. Because of NLP's ability to unlock critical details from unstructured text, it is a powerful tool for organizations as they manage the health of their patient populations.

Figure 2: Patient wellness is impacted by many factors, not just clinical care. Health outcomes: length of life (50%), quality of life (50%). Health factors: health behaviors, 30% (tobacco use, diet and exercise, alcohol and drug use, sexual activity); clinical care, 20% (access to care, quality of care); social and economic factors, 40% (education, employment, income, family and social support, community safety); physical environment, 10% (air and water quality, housing and transit). Policies and programs. Source: Robert Wood Johnson Foundation (2014)

Consider an ACO that wants to assess the risk of Type 2 diabetes in its patient population. An analysis of structured data can reveal risk factors associated with weight, race, and age, but might miss risk factors that are typically noted in the physician narrative. Using NLP, the ACO could identify the prevalence of other known risk factors, such as limited access to healthy foods, barriers to physical activity, high stress levels, and social isolation.

Alternatively, think about how NLP can help population-level cohort selection. A good example is the CMS code for lung cancer screening, which targets 55-77 year-olds who are current or past smokers, have no lung cancer diagnosis, and have more than 30 pack-years of smoking. While some of these details may be captured in modern EHRs, certain critical risk factors such as smoking pack-years are typically stored only in unstructured text. NLP can extract these factors to derive a much deeper understanding of clinical risk. People who meet the criteria are invited for CT screening to identify early signs of lung cancer.

Data sources are evolving and so is analysis

Traditionally, healthcare organizations have relied on medical records and claims data to analyze patient populations and the health of individuals. While claims and EHRs have been an adequate source of data for healthcare analytics in the past, the demand for detailed, actionable information has escalated. Today, patient engagement portals and consumer health monitoring devices provide a wealth of additional information. As patients engage and interact more with their own electronic medical records, there will be more electronic communication about progress, lifestyle changes, medication adherence, and adverse events, for example.
New sources of patient insights are growing rapidly, but patient-reported information is often in an unstructured format. Providers can use NLP to review this new patient-reported data and gain insights on everything from mental state to fall risk, and even access to firearms. Similarly, payers can leverage NLP to analyze member-supplied data, including online chat and notes between members and nurses. NLP can even be used to review social media posts and provide relevant insights about exercise routines, diet, and social behaviors.

With such a wide range of data types, the increasing use of Hadoop, data lakes, and other Big Data technologies is no surprise. Clinical information in documents and notes is stored in the Hadoop Distributed File System (HDFS) and, as mentioned previously, needs to be converted into a structured form to support statistical and machine learning techniques. Consequently, NLP systems need to be able to mine HDFS to process bulk document sets, as well as mine streams of data as they are received.

It is good practice to have a bulk document set to run discovery queries against (to assess how data is represented), develop queries against, and evaluate performance. The resulting queries can then be used to mine insights from streams of new documents as part of an NLP Extract, Transform, and Load (ETL) process. Once transformed into structured data by NLP, these varying concepts can be loaded into an analytical environment like Hive, or brought into data warehousing or analytical tools such as SAS.

It is vital that NLP technologies complement investments in systems like Hadoop and SAS. This can be achieved using NLP pipelines that run automatically to extract data from new clinical documents or from bulk data sets. As mentioned previously, performing cohort selection is a common requirement that can be supported by direct querying in NLP tools.
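As a rough illustration of cohort selection over NLP output, the Python sketch below applies the lung cancer screening criteria mentioned earlier (ages 55-77, current or past smoker, no lung cancer diagnosis, more than 30 pack-years) to records that are assumed to have already been structured by an NLP pipeline. The `PatientFacts` type, its field names, and the function names are hypothetical, and the thresholds simply mirror the figures quoted in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientFacts:
    patient_id: str
    age: int
    smoking_status: Optional[str]  # "current", "former", "never", or None if unknown
    pack_years: Optional[float]    # assumed to be extracted from notes by NLP
    has_lung_cancer_dx: bool


def eligible_for_screening(p, min_age=55, max_age=77, pack_year_threshold=30):
    """Apply the screening criteria quoted in the text to one patient."""
    if p.has_lung_cancer_dx:
        return False
    if not (min_age <= p.age <= max_age):
        return False
    if p.smoking_status not in ("current", "former"):
        return False
    # Missing pack-years means eligibility cannot be confirmed.
    return p.pack_years is not None and p.pack_years > pack_year_threshold


def select_cohort(patients):
    """Return the IDs of patients who meet every criterion."""
    return [p.patient_id for p in patients if eligible_for_screening(p)]
```

Treating a missing pack-year value as "not eligible" is a conservative design choice; an alternative would be to route such patients to a review queue so that gaps in documentation can be filled.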
To enable broader access to NLP insights, it is also possible to incorporate NLP results into search tools such as Solr. Use of Solr to search data in Hadoop allows any user to search for specific terms, and filter the results to find relevant patients or members. To support Solr, NLP is used to pre-process documents and create additional metadata that enhances search in two ways: providing synonym expansion and additional concepts for filtering.

In the first enhancement, NLP identifies synonyms in the text using drug and disease ontologies, and normalizes these concepts together: for example, finding rosuvastatin calcium and tagging that as CRESTOR and as a drug class of statin. The second enhancement is to provide additional factors to use in filters; this includes extracting complex linguistic patterns such as smoking status, or ejection fraction or BMI values. This additional metadata improves search in engines such as Solr, and users can easily identify clinical characteristics for their target population.

Conclusion

Changing reimbursement models are forcing a fundamental shift in how providers and payers utilize patient data. New clinical risk models based on population-level Big Data will rely on both structured and unstructured data, and create a more comprehensive view of the patient's health and wellness status over time. Population-based models and insights will power patient-level care coordination, and spur more precisely focused education and support efforts on the part of both providers and payers. In addition, the use of these data sets to support automated extraction of quality measures provides significant reuse and additional value.

As healthcare continues to shift to value-based care, Big Data and NLP will be essential technologies for providers and payers as they work to make sense of the growing volume of unstructured data from EHRs, claims, notes, patient-reported outcomes, and social media.

If you are interested in learning more about NLP and Big Data, please email enquiries@linguamatics.com.

About Linguamatics

Linguamatics transforms unstructured Big Data into big insights to advance human health and well-being. A world leader in deploying innovative NLP-based text mining for high-value knowledge discovery and decision support, Linguamatics solutions are used by top commercial, academic, and government organizations, including 18 of the top 20 global pharmaceutical companies, the US Food and Drug Administration (FDA) and US National Cancer Institute, Cancer Research UK, and leading US healthcare organizations.

© 2017 Linguamatics Ltd. The Linguamatics logo is a trademark of Linguamatics Ltd. All rights reserved. All other trademarks mentioned in this document are the property of their respective owners.