Using Secondary Datasets for Research. Learning Objectives. What Do We Mean By Secondary Data?

Using Secondary Datasets for Research
José J. Escarce
January 26, 2015

Learning Objectives
- Understand what secondary datasets are and why they are useful for health services research
- Become familiar with specific issues related to the research use of secondary data
  - Planning the study
  - Importance of a conceptual framework
  - Evaluating the data
  - Conducting the analyses and interpreting the results
  - Reporting the research
- Gain familiarity with several frequently used secondary datasets

What Do We Mean By Secondary Data?
- Data collected for purposes other than the particular research project that you are planning
- Two types are most commonly used:
  - General purpose health and health care surveys
  - Administrative data
    - Public health data
    - Data from public programs
    - Private sector data

Advantages and Disadvantages of Secondary Data
Advantages
- Can address a wide range of research questions
- Relatively inexpensive
- Nationally representative data (often)
- Wide variation in contexts
- Large sample sizes
- Rich set of variables (often)
- Can be linked to other data sources
Disadvantages
- Cross-sectional (usually)
- May not have variables you need/want
- Very limited clinical information

Using Secondary Data for Research
Planning the study
- Ideally, the research questions and study hypotheses come first, followed by looking for secondary datasets that can be used to address the questions
- In practice, there is almost always some iteration, and both questions and hypotheses are refined based on available data elements and measures
- Key implication: A sound conceptual framework is essential, i.e., you need to know how the phenomenon that interests you works before you start the research
  - Both disciplinary and institutional knowledge matter
  - Biggest risk is designing a study in a theory-free environment after you have run a bunch of associations

What Is A Conceptual Framework?
- A conceptual framework, or model, provides a relatively simple description of a phenomenon of interest, usually the relationships between particular outcomes of interest and their determinants
  - Tries to get inside the "black box"
- A conceptual framework often breaks down a complex system into its component parts
- Conceptual frameworks are usually concerned with specific types of behavior in specific contexts

Are Conceptual Models Important?
"...but as important as a good data infrastructure is having sophisticated conceptual models that direct data collection and analysis, and that represent the true complexity of health care provision. ...much of the value of health services research comes through the conceptual work that guides how researchers understand and pose research and policy issues."
--David Mechanic, Milbank Q 2001; 79: 459-477.

Why Is A Conceptual Framework Essential to Good Research?
- A conceptual framework imposes discipline and rigor in your thinking about the phenomenon you are planning to study
- A good model acts like a map that gives coherence to empirical inquiry by:
  - Forcing you to consider all the factors that might affect the outcomes of interest
  - Providing an understanding of the causal linkages among these factors

Why Is A Conceptual Framework Essential to Good Research? (cont.)
- A conceptual model, therefore, is indispensable in:
  - Generating plausible and justifiable research hypotheses
  - Designing appropriate and defensible empirical analyses to test them
  - Interpreting empirical findings and assessing their generalizability

Using Secondary Data for Research (cont.)
Evaluating the data: Surveys
- Read the documentation
  - Who sponsored the survey and why was it done?
  - What is the sample design? Who is in or out?
  - Have the data been cleaned and edited? Were imputations done?
- Read the questionnaire
  - Become familiar with items related to the dimensions and constructs that interest you
  - Why were these items chosen? How do they compare with related items in other surveys?
  - Is there information on their validity and other measurement properties?
  - Are there items with high non-response?
  - What are the skip patterns in the questionnaire?

Using Secondary Data for Research (cont.)
Evaluating the data: Administrative data
- Read the documentation
  - Who collects the data and why?
  - How are different data elements collected, i.e., what is their source?
  - Who (what) is in the data and who (what) is out? Under what circumstances?
- Learn as much as you can about the data
  - What types of studies have the data been used for?
  - Is there information on the validity of different data elements?
  - How have different data elements been used in research by others? Which data elements should not be used?
  - Are there established and well-accepted methods for measuring constructs of interest using the data? Do these work in your application? How well do the measures capture what you want?

Using Secondary Data for Research (cont.)
Evaluating the data: General principles
- Run frequencies and distributions on all variables of interest
- Check every number and ask yourself whether it makes sense in light of what you know about:
  - The dataset
  - The phenomena and institutions you are studying
  - The population group, area, state, or country
- When necessary, benchmark your data against other sources
- Assume that you've made a mistake and that your job is to find it
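The general principles above (run frequencies, check every number, benchmark against other sources) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable, its values, the 10% tolerance, and the helper names `frequency_table` and `outside_benchmark` are all hypothetical and do not come from any particular dataset or package.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, share) rows, most common first.

    None is reported as "MISSING" so that item non-response stays visible.
    """
    counts = Counter("MISSING" if v is None else v for v in values)
    total = sum(counts.values())
    return [(value, n, n / total) for value, n in counts.most_common()]

def outside_benchmark(estimate, benchmark, tolerance=0.10):
    """Flag an estimate whose relative gap from an external benchmark exceeds `tolerance`."""
    return abs(estimate - benchmark) > tolerance * abs(benchmark)

# Hypothetical insurance-status variable from a survey extract.
insurance = ["private", "medicaid", "private", None,
             "uninsured", "private", "medicaid", None]

table = frequency_table(insurance)
for value, n, share in table:
    print(f"{str(value):10s} {n:3d} {share:6.1%}")

# Hypothetical check: our uninsured share versus a published figure.
our_share = dict((v, s) for v, _, s in table)["uninsured"]
print("needs a closer look:", outside_benchmark(our_share, benchmark=0.20))
```

A flagged value is not necessarily an error; it is a prompt to reread the documentation and confirm the sample definition, coding, and skip patterns before going further.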

Using Secondary Data for Research (cont.)
Conducting the analyses and interpreting the results
- Understand potential sources of bias in your analyses and consider whether there are ways to mitigate them or at least to assess whether bias could overturn your findings
- Plan and conduct meaningful sensitivity analyses
  - The best sensitivity analyses assess whether potential sources of bias could overturn your results
  - Feel free to be creative in your sensitivity analyses
- Distinguish statistical from clinical or policy significance
- Stay humble: Don't over-interpret or over-conclude

Using Secondary Data for Research (cont.)
Reporting the research
- Describe the dataset
  - Level of detail should be inversely proportional to how familiar your audience will be with the dataset and how much information about the dataset is readily accessible
  - Include response rate and comparison of respondents and nonrespondents; mention weighting scheme
- Describe how you selected study sample
- Describe key measures and provide references
  - Note advantages/disadvantages relative to alternatives
  - Describe proxy measures and why you chose them
- Describe approach to missing data
- Describe data aggregation, scale construction, and data linkages
- Be clear about unit of analysis

General Purpose Health and Health Care Surveys
- National Health Interview Survey
- California Health Interview Survey
- Medical Expenditure Panel Survey
- National Ambulatory Medical Care Survey
- National Health and Nutrition Examination Survey
- National Longitudinal Study of Adolescent Health
- Medicare Current Beneficiary Survey
- Health and Retirement Study

National Health Interview Survey (NHIS)
- Principal source of information on health of noninstitutionalized population; face-to-face interviews; administered annually since 1957
- Nationally representative; oversamples blacks and Hispanics; response rate > 90%
- One child and one adult from each sampled household; sample size can range up to 100,000, but often less
- Core questions include household composition, sociodemographics, insurance, basic health status indicators, health behaviors, access and utilization (limited), preventive care
- Supplementary questions of interest to co-sponsors or to respond to new public health data needs

California Health Interview Survey (CHIS)
- Information on health of California population
- Biennial survey 2001-2011; starting in 2012, continuous survey model
- Random digit dial (RDD) telephone survey; one child, one adolescent, and one adult from each sampled household; sample size around 50,000; response rate about 35%
- Administered in English, Spanish, Mandarin, Cantonese, Korean, and Vietnamese
- Household composition, socio-demographics, insurance, health status, health conditions, health behaviors, access to and use of services, health and development of children
- Many questions taken from NHIS

Medical Expenditure Panel Survey (MEPS)
- Household component is principal source of information on health care utilization and expenditures for non-institutionalized population; overlapping panel design (2 years of data for each respondent); launched in 1996
- Nationally representative; oversamples blacks and Hispanics
- Face-to-face interviews; all members of each sampled household; sample size 15,000-20,000; response rate 65-70%
- Household data supplemented by Medical Provider Component (hospitals, physicians, home health agencies, pharmacies)
- Health care use and expenditures (office and ED visits, hospitalizations, prescription drugs, other), with diagnoses
- Household composition, socio-demographics, insurance, health status, health conditions, health behaviors, access barriers, satisfaction, health care ratings
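Because surveys such as NHIS and MEPS oversample some groups, an unweighted share from the raw file generally does not estimate the population share; the survey weights correct for this. The sketch below shows the arithmetic only. The data, the weights, and the function name `weighted_proportion` are hypothetical, and real analyses should use the weight variables and variance-estimation guidance in each survey's documentation (standard errors also depend on the sample design, which is not handled here).

```python
def weighted_proportion(indicator, weights):
    """Share of the weighted sample for which `indicator` is truthy."""
    if len(indicator) != len(weights):
        raise ValueError("indicator and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    flagged_weight = sum(w for flag, w in zip(indicator, weights) if flag)
    return flagged_weight / total_weight

# Hypothetical extract: the first three respondents belong to an oversampled
# group, so each represents fewer people (smaller weight) than the others.
uninsured = [1, 1, 0, 0, 0, 0]
weights = [500, 500, 500, 2000, 2000, 2000]

unweighted = sum(uninsured) / len(uninsured)
weighted = weighted_proportion(uninsured, weights)
print(f"unweighted: {unweighted:.3f}  weighted: {weighted:.3f}")
```

In this made-up extract the unweighted share overstates the population share because the oversampled group has a higher uninsured rate; the direction and size of the gap will differ in real data.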

National Health and Nutrition Examination Survey (NHANES)
- NHANES III (1988-1994); 1999-2012 NHANES
- Designed to assess health and nutritional status of population
- Combines interviews, physical examinations, and laboratory tests
- Nationally representative; oversamples blacks and Hispanics
- Face-to-face interview; one adult and one child in each sampled household
- Examinations and lab tests in Mobile Examination Centers
- Wave sample size 5,000; response rate > 80%
- Socio-demographics, insurance, health status, health conditions, health behaviors; detailed dietary history
- Medical and dental examinations, anthropometrics, vision and hearing, fitness tests, bone mineral density
- Blood and urine tests

National Ambulatory Medical Care Survey (NAMCS)
- Designed to provide information about the provision and use of ambulatory medical care services; annual survey since 1973
- Nationally representative
- Sample of visits to nonfederal office-based physicians (all office-based specialties); physicians report data on special forms
- Each physician reports on sample of visits during randomly chosen one-week reporting period
- Patient socio-demographics
- Patients' symptoms, physicians' diagnoses, medications
- Services provided, diagnostic procedures, planned treatment
- Selected lab values (new in 2011)

National Longitudinal Study of Adolescent Health (Add Health)
- Designed to provide information on the health habits and behaviors of adolescents as they make the transition to adulthood, as well as on their outcomes in young adulthood
- 20,000 subjects in grades 7-12 at the start of data collection
- Five waves: 1994, 1995, 1996, 2001-02, 2008 (age 24-32)
- Designed to allow analyses of influence of social contexts (families, friends, schools, neighborhoods)
- Nationally representative
- In-school questionnaires in Wave I; home interviews in subsequent waves; questionnaires for parents, siblings, friends, school administrators
- Diet, physical activity, health service use, injury, violence, sexual behavior, contraception, sexually transmitted infections, pregnancy, suicidal intentions/thoughts, substance use/abuse, runaway behavior, height and weight, chronic conditions, mental health

Medicare Current Beneficiary Survey (MCBS)
- Designed to assist CMS in administering, monitoring, and evaluating the Medicare program; began in 1991; linked to Medicare administrative data
- Nationally representative
- Uses rotating panel design in which subjects are interviewed every four months for up to four years
- Samples obtained from Medicare enrollment files and data collected through personal interviews; oversamples oldest old and disabled
- Design permits both cross-sectional and longitudinal analyses
- Socio-demographics, insurance, health services utilization and expenditures, sources of payment including out-of-pocket costs, health status and functioning, health behaviors

Frequently Used Administrative Data
- State hospital discharge data
- Healthcare Cost and Utilization Project (HCUP)
- Medicare administrative data
  - Enrollment files
  - Hospital discharge files
  - Physician services files
  - Inpatient rehabilitation and skilled nursing facility files
  - Hospital outpatient department files
- Medicaid administrative data

Healthcare Cost and Utilization Project
- HCUP is a family of databases and related software tools and products developed through a Federal-State-Industry partnership and sponsored by AHRQ
- HCUP includes the largest collection of hospital care data in the U.S., with all-payer, encounter-level information beginning in 1988

Healthcare Cost and Utilization Project (cont.)
Database components of HCUP:
- Nationwide Inpatient Sample (NIS): Inpatient data from a national sample of over 1,000 hospitals (starting in 1988)
- Kids' Inpatient Database (KID): Nationwide sample of pediatric (age < 21) inpatient discharges
- State Inpatient Databases (SID): Universe of inpatient discharge abstracts from participating states (starting in 1995)
- State Ambulatory Surgery Databases (SASD): Data from ambulatory care encounters from hospital-affiliated and sometimes freestanding ambulatory surgery sites (starting in 1997)
- State Emergency Department Databases (SEDD): Data from hospital-affiliated emergency departments for visits that do not result in hospitalizations (starting in 1999)

Healthcare Cost and Utilization Project (cont.)
Software components of HCUP:
- AHRQ Quality Indicators (QIs): Measures of health care quality that make use of hospital inpatient administrative data
  - Consist of three modules measuring various aspects of quality: ACSCs, inpatient QIs, patient safety indicators
  - Software and user guides for modules are available to assist users in applying the Quality Indicators to their own data
- Clinical Classifications Software (CCS): Provides a method for classifying diagnoses or procedures into clinically meaningful categories

Other Secondary Datasets
- Behavioral Risk Factor Surveillance System (BRFSS)
- Surveillance, Epidemiology and End Results (SEER) Program
- Community Tracking Study Surveys
- Claims data from private health plans or employers
- American Hospital Association Annual Survey of Hospitals
- AND MANY, MANY OTHERS!
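The idea behind a classification tool like CCS, collapsing many detailed diagnosis codes into a smaller set of clinically meaningful categories, can be illustrated with a simple lookup. The sketch below is not the CCS software and does not use its real mapping: the codes, category labels, record layout, and the `classify` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical code-to-category lookup. These are placeholder codes and
# labels, NOT the actual CCS mapping distributed by AHRQ.
CATEGORY_BY_CODE = {
    "A100": "Diabetes",
    "A101": "Diabetes",
    "B200": "Heart disease",
    "C300": "Asthma",
}

def classify(code, lookup=CATEGORY_BY_CODE):
    """Map one diagnosis code to its category, or "Unclassified" if unknown."""
    return lookup.get(code.strip().upper(), "Unclassified")

# Hypothetical discharge abstracts with a principal diagnosis field.
discharges = [
    {"id": 1, "principal_dx": "A100"},
    {"id": 2, "principal_dx": "a101 "},
    {"id": 3, "principal_dx": "B200"},
    {"id": 4, "principal_dx": "Z999"},
]

counts = Counter(classify(d["principal_dx"]) for d in discharges)
for category, n in counts.most_common():
    print(f"{category:15s} {n}")
```

Keeping an explicit "Unclassified" bucket makes unmapped codes visible, which is in the spirit of checking every number before trusting the results.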