Epic System Failure: Pain at Plainsboro Medical Center

Zaiyong Tang, Ph.D., Salem State University
Dennis Comeau, Salem State University

Authors: Zaiyong Tang is an MIS Professor in the Department of Marketing and Decision Sciences at Salem State University. He obtained his Ph.D. in MIS from the University of Florida; an M.S. from Washington State University; an M.E. from Chengdu University of Science and Technology; and a B.E. from Chongqing University, P.R. China. Dr. Tang has over 40 refereed journal and conference publications.

Dennis Comeau is an MIS senior in the Department of Marketing and Decision Sciences at Salem State University. He works full time as an IT manager at Cambridge Health Alliance. His previous work experience includes being an IT manager at Digital Equipment and later Hewlett-Packard.

This case is kindly sponsored by:

A concise case from the International Journal of Instructional Cases
www.ijicases.com

Copyright 2017: International Journal of Instructional Cases

This case is only intended for use by the purchaser within a pedagogic setting and sharing with other third parties, or republication, is expressly forbidden.
It was 5:00 am and Jenny, the head nurse at Plainsboro Hospital, was doing her routine rounds in the intensive care unit. She noticed a commotion in ICU Room 3, where a patient was experiencing severe respiratory difficulty. Jenny immediately pressed a code button to summon the doctor on duty. The doctor, suspecting that the patient was suffering from a drug interaction, instructed one of the nurses to get on the computer and provide him with all the meds the patient had been administered on the last shift. The nurse was frenetically hitting the keyboard, but the system was not responding. Jenny pushed the nurse aside and tried to log in herself while several other nurses watched nervously. There was still no response. The doctor shouted, "Jenny, I need the patient's records and last blood test report, STAT!"

Jenny quickly made a call to Beth Brown, Director of Clinical Systems, and was told that the connection to the Epic system, which provided real-time patient data to the hospital, had been down since 2:00 am. Jenny was aware of the system downtime procedures, so she rushed down the hallway to the Business Continuity Access (BCA) station, the backup system designed to provide patient data when the network was down, and started the process of printing out the patient's record. Within seconds, her face turned pale: the BCA device was not working.

Plainsboro Medical Center

Plainsboro Medical Center (PMC) was a well-respected health system in eastern Massachusetts, providing a full range of services to the surrounding communities. Serving nearly 150,000 patients, PMC offered emergency services, various types of specialty care, primary care, and other types of health services. It delivered these programs and services at three locations, and many of its programs were nationally renowned.
Plainsboro Hospital was the flagship facility for the Medical Center. It was a teaching site for all the major medical schools in the New England area and served the community as a primary provider of maternity care.

The Epic System

Epic Systems was an industry leader in the medical software market; its software was used for more than half of the electronic medical records (EMR) in the US. Epic offered an integrated suite of modules that supported all functions related to patient care, including registration and scheduling; clinical systems for doctors, nurses, emergency personnel, and other care providers; systems for lab technicians, pharmacists, and radiologists; and billing systems for insurers.

Note: Individuals' names and the hospital name have been disguised.
The Epic application servers were housed in a secure third-party datacenter in Boston. The Epic application and database at Landford were built with multiple redundancies for every conceivable event. A third-party network vendor provided the backbone network connection from the datacenter to the hospital. The PMC hospital network ran a shadow Epic server that provided real-time clinical data as well as a data feed to the BCA system. Hospital Epic stations were connected to the server via network switches and the Epic station router. Mobile devices were supported via a Wi-Fi network. All the network devices and computer stations on each hospital floor were equipped with emergency power, so that a local power outage would not interrupt data access. There were multiple redundancies in servers, power, and network connectivity to cover every anticipated type of system failure that the hospital could control. See Exhibit 1 for a summary diagram of the Epic system at Plainsboro.

Business Continuity Plan

Because uninterrupted access to patient data was critical, Plainsboro had a three-layered business continuity system. Plan A was to provide continuous data access from the centralized Epic system. If Plan A failed, Plan B, the BCA system, would automatically kick in. Plan C would be activated if Plan B failed. Plan C was the manual data retrieval system based on hard copy patient files.

Five years earlier, to ensure the availability of clinical information, Beth had led the IT department in implementing a system disruption prevention project aimed at supporting normal care even during a large-scale system failure. The outcome of that project was the Business Continuity Access system. The BCA process was based on best practices provided by Epic Corporation, the vendor of the Epic system. The BCA hardware consisted of computers that were networked to retrieve patient electronic medical record reports.
The BCA server downloaded patient data from the Epic shadow server every fifteen minutes. Then, individual reports were created for each floor, department, and clinic. These reports were encrypted and pushed to the BCA stations. In the clinics there was only one BCA station per clinic. In the hospital there was one for each patient floor and for some departments, such as pharmacy and lab. There was also a master BCA station in an administrative area of each hospital that contained all the reports for that hospital. Any unit that could not print from its own BCA station could get the reports from the master station. Local laser printers were used to print these reports as needed. The reports included the most recent details about all medications, allergies, notes, and other pertinent information on each patient's care and status, for all patients in the inpatient units and for all patients scheduled for visits in the ambulatory sites. The stations were connected to UPS (uninterruptible power supply) appliances to protect them in the event of a power failure. See Exhibit 2 for a diagram showing the components of a BCA station setup.

The BCA stations were located away from the Nursing Stations. When the system was first implemented, the nursing staff were trained on its use. Documentation and user manuals were kept at the Central Nurse Stations. Initially, there was a lot of pushback from the nurses against practicing with the BCA system because it was not in their job description. Eventually, the nursing leadership and the nursing union agreed that they would have to do it because it was a patient care issue.
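The fifteen-minute refresh cycle described above suggests one simple monitoring idea: while the network is healthy, flag any BCA station whose latest report is missing or older than expected. The following is a minimal sketch in Python, not a description of how Epic or PMC actually monitored the stations. The station names, the `find_stale_stations` function, and the two-cycle tolerance are hypothetical and chosen only for illustration; only the fifteen-minute interval comes from the case.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# From the case: the BCA server refreshed reports every fifteen minutes.
REFRESH_INTERVAL = timedelta(minutes=15)
# Assumption (not from the case): tolerate up to two missed-or-late cycles.
MAX_REPORT_AGE = 2 * REFRESH_INTERVAL


def find_stale_stations(
    last_push: Dict[str, Optional[datetime]],
    now: datetime,
) -> List[str]:
    """Return the names of stations with no report or an outdated report.

    `last_push` maps a (hypothetical) station name to the time its most
    recent report arrived, or None if the station holds no report at all.
    """
    stale = []
    for station, pushed_at in last_push.items():
        if pushed_at is None or now - pushed_at > MAX_REPORT_AGE:
            stale.append(station)
    return sorted(stale)
```

A check like this is only meaningful while the data feed is up. During an outage every report ages by design, and the last snapshot on the station is exactly what Plan B is meant to serve.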
The Crisis

Jenny called the Helpdesk as the patient's condition deteriorated rapidly. The person on the Helpdesk recognized the gravity of the situation, paged the IT emergency response team, and called Beth to report the problems with the BCA system. Some systems and network personnel had been onsite and working on the issue since the first report at 2:00 am. They had determined that a construction crew had severed the main data cable to the hospital and that there was nothing they could do to fix the problem. They were counting on the BCA process to sustain patient care during the event.

Beth listened to the Helpdesk technician and called into the ICU to try to talk Jenny through her problem. She could not fix the problem remotely, so she put out a wider page to all Desktop technicians to report to their sites for the emergency. The technicians responded and fixed the issues that were fixable, such as a nurse not remembering the downtime procedure. But many BCA stations did not even have the patient information on them, so it could not be recovered. In these cases, the units had to go back to their paper charts to provide patient care. Beth didn't realize at this point that almost 50% of the BCA stations were failing and that this was causing life-threatening problems around the entire hospital. The patient in the ICU survived, but this episode destroyed everyone's confidence in the BCA process.

The Responses

For four days, Beth and her entire IT crew were on high alert around the clock, responding to calls from medical professionals in need of patient records. The hospital survived the system failure without a single patient death. Beth realized that the outcome could have been much worse. However, the large-scale system breakdown had shattered many nerves. Immediately after the Epic connection was restored, Beth called a post-mortem meeting with all units of the IT department to re-examine their IT system design.
They decided to conduct a detailed investigation to determine where the systems and processes had failed. The results were disturbing.

The Problems

Systems audits and investigation reports indicated that failure occurred at all three layers of EMR access. Despite all the planning, this catastrophic network failure left the entire hospital without Epic system access for four days. The BCA stations had been set up years before and were supposed to be checked regularly to ensure that they were operational. When the failure occurred, many of the devices failed and the nursing personnel floundered terribly.

There were also problems with hospital units that admitted patients after Plans A and B failed, such as the emergency room and the maternity ward. The BCA system only reported and downloaded information on patients who had already been admitted. The BCA process had been designed to handle Epic system downtimes lasting hours, not days.

Many of the nurse managers who had been trained earlier had forgotten how to implement the BCA system downtime procedures, and in some cases could not find the machines that were set up as BCA devices or the documentation describing how to use them. IT and the nurse managers shared responsibility for keeping the BCA system running at satisfactory levels. IT personnel were responsible for carrying out regular system verification tests that reviewed technologies, procedures, documentation, and user training. The nurse managers were responsible for staying up to speed with BCA downtime procedures through planned downtime rehearsals.

It was obvious to Beth that the entire process had to be reviewed and re-engineered to ensure that the system would work flawlessly in the event of another disaster.
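As a discussion aid, the regular verification tests mentioned above could be recorded and summarized along the following lines. This is a rough sketch under stated assumptions rather than PMC's actual audit procedure. The `StationCheck` fields, station names, and functions are hypothetical, loosely mirroring the items the case lists (reports on the station, printing, backup power, documentation).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StationCheck:
    """Result of one verification visit to a BCA station (hypothetical fields)."""
    station: str
    has_current_report: bool   # patient reports are present on the station
    printer_works: bool        # the local laser printer produced a test page
    ups_works: bool            # the UPS held power during the test
    manual_on_site: bool       # downtime documentation was found nearby

    def passed(self) -> bool:
        # A station passes only if every individual check passed.
        return all((self.has_current_report, self.printer_works,
                    self.ups_works, self.manual_on_site))


def failing_stations(checks: List[StationCheck]) -> List[str]:
    """Names of stations that failed at least one check, in input order."""
    return [c.station for c in checks if not c.passed()]


def failure_rate(checks: List[StationCheck]) -> float:
    """Fraction of checked stations that failed; 0.0 when nothing was checked."""
    if not checks:
        return 0.0
    return len(failing_stations(checks)) / len(checks)
```

Tracking a figure like `failure_rate` after each planned downtime rehearsal would surface a problem such as the roughly 50% station failure described in the case before a real outage does.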