Institutionalizing a Culture of Statistical Thinking in DoD Testing

Institutionalizing a Culture of Statistical Thinking in DoD Testing
Dr. Catherine Warner, Science Advisor
Statistical Engineering Leadership Webinar, 25 September 2017

Outline
- Overview of DoD Testing
- Improving Operational Testing
- Statistical Analysis Methods for Improving Mission Characterization
- Continuing the Path Forward
  - Bayesian Methods for Maximizing Information
  - Defensible Surveys: Capturing Human Interactions
  - Improving Modeling and Simulation
- Looking to the Future

Goal of Operational Test: Evaluate Operational Effectiveness and Suitability
- Operational environment
- Representative users
- Real threats
- Conducting missions

DoD Test Paradigm
Test timeline: Contractor Testing → Developmental Testing → Operational Testing
These phases tend to be requirements driven.

Requirements documents are often missing important mission considerations

OT characterizes mission capability (at the operational-testing end of the contractor/developmental/operational test timeline)
[Figure: submarine ASW operational tests (Virginia vs. Georgia ASW-3; ARCI APB-03 OT, 688I vs. Gotland; ARCI APB-06 OT, 688I vs. Todaro; Virginia vs. Albany ASW-2) arranged by difficulty of environment, from favorable acoustic propagation with low traffic and low ambient noise to poor acoustic propagation with high ambient noise and high-density traffic, and by decreasing target source level, from fast/noisy SSNs, through slow SSNs or SSBNs and snorkeling diesels (including most older SSK threats), down to Gotland- and Todaro-class boats representing the quietest SSK threats.]

By the early 1980s, Congress's concerns were growing

Congress established DOT&E separate from the Services' operational testing agencies
[Org chart: Congress → Department of Defense → Office of the Secretary of Defense → Director, Operational Test and Evaluation (DOT&E); the Army, Navy & Marine Corps, and Air Force each retain their own Service operational testing agencies.]

Operational testing provides critical information to warfighters about new systems before warfighters' lives and missions depend on them:
- Time to correct problems
- Time to restrict missions

Improving Operational Testing

Why did we need to improve test methods?
[Figure from the DOT&E EA-18G BLRIP report: percent success]

DOT&E Sets Policy and Guidance for Conducting Operational Testing
- The goal of the experiment. This should reflect evaluation of end-to-end mission effectiveness in an operationally realistic environment.
- Quantitative mission-oriented response variables for effectiveness and suitability. (These could be Key Performance Parameters, but most likely there will be others.)
- Factors that affect those measures of effectiveness and suitability. Systematically, in a rigorous and structured way, develop a test plan that provides good breadth of coverage of those factors across their applicable levels, taking into account known information in order to concentrate on the factors of most interest.
- A method for strategically varying factors across both developmental and operational testing with respect to responses of interest.
- Statistical measures of merit (power and confidence) on the relevant response variables for which it makes sense. These statistical measures are important to understanding "how much testing is enough?" and can be evaluated by decision makers on a quantitative basis so they can trade off test resources for desired confidence in results.
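The power-and-confidence trade-off described above can be sketched numerically. The following is a minimal illustration, not DOT&E's actual method: a normal-approximation power calculation for detecting a hypothetical shift between the two levels of a single test factor, showing how power grows with runs per level (all numbers are notional).

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution

def power_two_level_factor(delta, sigma, n_per_level, alpha=0.20):
    """Approximate power of a one-sided z-test to detect a shift `delta`
    between the two levels of a factor, with `n_per_level` runs at each
    level, run-to-run noise `sigma`, and significance level `alpha`."""
    z_crit = _N.inv_cdf(1.0 - alpha)              # critical value
    se = sigma * math.sqrt(2.0 / n_per_level)     # std. error of the difference
    return _N.cdf(delta / se - z_crit)

# Trade study: how many runs per level are needed for ~80% power?
for n in (4, 8, 16, 32):
    print(n, round(power_two_level_factor(delta=1.0, sigma=1.5, n_per_level=n), 2))
# → 0.54, 0.69, 0.85, 0.97: roughly 16 runs per level suffice here
```

Decision makers can read such a table directly as "how much testing is enough" for a desired confidence in the result.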

Laying the foundations for statistical methods in T&E
- Research Consortium
- Offsite meeting
- Charter
- Statistical Engineering with NASA
- AO training, OTA training

Puzzled??

Sharing lessons learned advanced our mutual understanding

Without a destination, any path will do: Institutionalizing Statistical Thinking in Test and Evaluation
- National Research Council study: Design of Experiments endorsed as a sound methodology for OT&E
- OTA MOA on DOE; DOT&E initiatives; guidance on DOE in TEMPs
- DOT&E policy issued; OTA test design processes updated
- DOT&E Science Advisor established; Test Science Roadmap effort
- DOT&E/TRMC-funded Science of Test Research Consortium
- DOT&E TEMP Guide published; DASD(DT&E) STAT Implementation Plan; STAT COE
- DOT&E Roadmap Report; two additional DOT&E guidance memos on application of DOE to OT&E
- Survey best practices memo; cybersecurity procedures
- Additional survey and cyber work; modeling and simulation validation guidance; cyber priorities
- Updated TEMP guidance; M&S guidance

Lessons Learned from Implementing DOE
- Strong leadership
- Communicate, communicate, communicate
- Find partners
- Compromise
- Be open to new ideas
- Create quick successes and highlight them
- Support the workforce

Statistical Analysis Methods for Improving Mission Characterization

Statistical analyses maximize information
[Figure: response variable across conditions A through D plus a rollup; the non-DOE approach segregates the data into bins, while the DOE method uses the data in all bins to construct a model.]
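The binned-versus-modeled contrast can be shown with a toy example (notional data, not the analysis behind the figure): in a small 2x2 design, a main-effects model fit to all runs estimates each condition using the whole data set, whereas the binned approach uses only that condition's own runs.

```python
import random
import statistics

random.seed(1)

# Notional 2x2 design: factors A and B coded -1/+1, 5 runs per cell (20 runs).
# Hypothetical truth: response = 5 + 1.0*A + 0.5*B plus unit-variance noise.
data = [(a, b, 5 + 1.0 * a + 0.5 * b + random.gauss(0, 1))
        for a in (-1, 1) for b in (-1, 1) for _ in range(5)]

# "Binned" approach: estimate each condition only from its own 5 runs.
cells = {}
for a, b, y in data:
    cells.setdefault((a, b), []).append(y)
bin_means = {k: statistics.mean(v) for k, v in cells.items()}

# DOE approach: fit a main-effects model using all 20 runs. Because the
# design is orthogonal, the least-squares effects are simple contrasts.
n = len(data)
mu = statistics.mean(y for _, _, y in data)
eff_a = sum(a * y for a, _, y in data) / n
eff_b = sum(b * y for _, b, y in data) / n
model_means = {(a, b): mu + eff_a * a + eff_b * b
               for a in (-1, 1) for b in (-1, 1)}

# Each model-based estimate borrows strength from all 20 runs (variance
# 3*sigma^2/20) instead of only its own 5 (variance sigma^2/5); the same
# fit can also carry an A*B interaction term when one is suspected.
for k in sorted(bin_means):
    print(k, round(bin_means[k], 2), round(model_means[k], 2))
```

The advantage grows as the number of factors increases, since the binned approach fragments the data further while the model keeps using all of it.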

Statistical models capture important interactions
[Figure: Apache FOT&E results; 80% confidence intervals shown]

Continuing the Path Forward

Bayesian methods provide flexibility in combining information
[Figure: Stryker Family of Vehicles reliability, miles between system abort, comparing a traditional analysis with a Bayesian analysis that combines information.]
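The combining idea can be sketched with a conjugate gamma-exponential model, using hypothetical numbers rather than the actual Stryker data: earlier developmental-test mileage informs a prior on the abort rate, and operational-test mileage updates it.

```python
# Miles between system aborts modeled as exponential with abort rate lam;
# lam ~ Gamma(shape=a, rate=b), with the rate parameter b in miles.
# All figures below are hypothetical, for illustration only.

# Prior built from earlier developmental testing: 3 aborts in 12,000 miles.
a0, b0 = 3.0, 12000.0

# Operational-test data: 2 aborts in 9,000 miles.
failures, miles = 2, 9000.0

# Conjugacy: the posterior is Gamma(a0 + failures, b0 + miles).
a1, b1 = a0 + failures, b0 + miles

ot_only = miles / failures      # traditional point estimate from OT alone
combined = b1 / (a1 - 1.0)      # posterior mean of 1/lam (requires a1 > 1)

print(f"OT-only mean miles between abort: {ot_only:.0f}")
print(f"Bayesian combined estimate:       {combined:.0f}")
# → 4500 vs. 5250: the prior mileage sharpens and shifts the OT estimate
```

The practical appeal is exactly what the slide claims: limited operational-test miles need not stand alone when relevant developmental-test information exists.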

Sometimes mission outcome is subjective
Survey regarding improved situational awareness:
- Strongly agree
- Agree
- Slightly agree
- Slightly disagree
- Disagree
- Strongly disagree

Guidance highlighted key concepts for improving surveys
- Surveys are appropriate for quantitatively measuring operator and maintainer thoughts and opinions
- Have an administration plan for surveys, and only use surveys when appropriate
- Use the right survey: empirically vetted surveys should be used to measure known constructs (e.g., workload, usability, trust); custom surveys should be used appropriately
- Follow best practices for writing questions; always pre-test
- Avoid asking questions without a clear analysis plan
- Use interviews and focus groups for problem identification and general context
- Do not develop lengthy, exhaustive surveys about every problem that could occur
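A small sketch of a defensible analysis of Likert responses (all responses below are hypothetical): because the scale is ordinal, report the full distribution and the median category rather than averaging the numeric codes.

```python
from collections import Counter
import statistics

SCALE = ["Strongly disagree", "Disagree", "Slightly disagree",
         "Slightly agree", "Agree", "Strongly agree"]

# Hypothetical responses from 12 operators to "The system improved my
# situational awareness."
responses = ["Agree", "Slightly agree", "Agree", "Strongly agree",
             "Slightly disagree", "Agree", "Slightly agree", "Agree",
             "Disagree", "Strongly agree", "Agree", "Slightly agree"]

codes = [SCALE.index(r) + 1 for r in responses]   # ordinal codes 1..6
counts = Counter(responses)

# The median respects the ordering of the scale without pretending the
# gaps between categories are equal, as a mean of the codes would.
median_code = int(statistics.median(codes))

print({s: counts.get(s, 0) for s in SCALE})       # full distribution
print("Median response:", SCALE[median_code - 1])
# → Median response: Agree
```

Tying each question to a planned summary like this is one way to satisfy the "no question without an analysis plan" guidance.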

Live-Virtual-Constructive simulation can help us learn more, but it needs better validation
[Figure: operational space spanned by range of threat types and threat lethality; live testing covers part of the space, with modeling and simulation extending coverage.]

We continue to increase the statistical defensibility of DoD Test and Evaluation
[Timeline shown again: from the National Research Council study endorsing Design of Experiments for OT&E through updated TEMP guidance and M&S guidance.]

Future Test Challenges

We need to think carefully about the challenges we face in the future
- Space
- Cyber
- Autonomy
- Big data
- Workforce