Statistical Thinking in DoD Test & Evaluation: F-35 Case Study. Dr. Laura Freeman


Improving Operational Testing: A case study from my past 8 years

Goal of Operational Test: Evaluate Operational Effectiveness, Suitability, and Survivability
  Operational Environment
  Representative Users
  Real Threats
  Conducting Missions

DoD Test Paradigm in Terms of Your New Corolla
  Test timeline: Contractor Testing, Developmental Testing, Operational Testing
  Tend to be requirements driven

Requirements documents are often missing important mission considerations


Congress established DOT&E separate from the Services' operational testing agencies
  [Organization chart: Congress; Department of Defense; Office of the Secretary of Defense; Army; Navy & Marines; Air Force; Director, Operational Test and Evaluation; Service Operational Testing Agencies]

DOT&E Sets Policy and Guidance for Conducting Operational Testing
  The goal of the experiment. This should reflect evaluation of end-to-end mission effectiveness in an operationally realistic environment.
  Quantitative mission-oriented response variables for effectiveness and suitability. (These could be Key Performance Parameters but most likely there will be others.)
  Factors that affect those measures of effectiveness and suitability. Systematically, in a rigorous and structured way, develop a test plan that provides good breadth of coverage of those factors across the applicable levels of the factors, taking into account known information in order to concentrate on the factors of most interest.
  A method for strategically varying factors across both developmental and operational testing with respect to responses of interest.
  Statistical measures of merit (power and confidence) on the relevant response variables for which it makes sense. These statistical measures are important to understanding "how much testing is enough?" and can be evaluated by decision makers on a quantitative basis so they can trade off test resources for desired confidence in results.
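
The trade between test resources and confidence can be illustrated with a small calculation. The sketch below is not taken from the DOT&E guidance: it assumes a single continuous response with a known noise standard deviation and uses a normal approximation, and the function name, the 0.5 half-width, and the confidence levels are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist


def runs_for_half_width(sigma, half_width, confidence=0.80):
    """Smallest number of runs n such that the confidence interval on a
    mean has half-width z * sigma / sqrt(n) <= half_width.

    Normal approximation with known sigma (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / half_width) ** 2)


# Hypothetical sweep: how the required number of runs grows as the
# decision maker asks for more confidence at a fixed interval half-width.
for level in (0.80, 0.90, 0.95):
    n = runs_for_half_width(sigma=1.0, half_width=0.5, confidence=level)
    print(f"confidence {level:.2f}: at least {n} runs")
```

Because the required run count scales with the square of z * sigma / half_width, halving the acceptable interval width roughly quadruples the test size, which is the kind of quantitative trade the guidance asks decision makers to weigh.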

Kotter's Process for Leading Change
  1. Establish a sense of urgency
  2. Form a powerful coalition
  3. Create a vision
  4. Communicate the vision
  5. Empower others to act
  6. Create short term wins
  7. Consolidate improvements and produce more change
  8. Institutionalize new approaches

Project Champions

Strategic Plan

Design of Experiments for Test Planning: F-35 Case Study

The F-35 Program is Complex even by DoD Standards
  Variants: Conventional; Short takeoff/vertical landing; Carrier variant

And Required to Accomplish Many Diverse Missions
  Variants: Conventional; Short takeoff/vertical landing; Carrier variant
  Mission Areas (Air Threat, Ground Threat): Air-Surface Strike; Destruction/Suppression of Enemy Air Defenses; Defensive counter air; Offensive counter air; Close air support; Search and rescue

Problem Identification: How do you evaluate the F-35's ability to accomplish a diverse set of operational missions with limited test resources?

Characterization across operational envelope: Strike, Offensive Counter Air, and Destruction/Suppression of Enemy Air Defense
  [Scenario graphic elements: Weapons Production Facility; Surface to Air Missile sites]

Characterization across operational envelope: Response Variables
  Lots of measures to capture:
    Mission outcomes
    Air to Air Performance
    Air to Surface Performance
    System sensor capabilities
    Targeting Accuracy
  Striker:
    Striker First Track Range
    Striker First Hostile Declaration Range
    Striker First Shot Range
    Red Air First Detection Range
    Red Air First Shot Range
    Striker SAM Track Time
    Proportion of Valid Weapon Releases to Number of Valid Weapon Releases Required to Meet Mission Tasking
    Proportion of Assigned Air to Surface Targets Removed
    Proportion of Striker Kill Removed
    Striker to Red Air Exchange Ratio
    Geolocation
    Find Time
    Fix Time
    DEAD Time
    Targeting Accuracy
  Escort:
    Escort SAM Track Time
    Proportion of Assigned SAM Elements Removed
    Proportion of Assigned SAM Elements Engaged
    Exchange Ratio
    Closest Red Air Range to Strike Package
    Blue Striker Encroachment Range
    Escort First Track Range
    Escort First Hostile Declaration Range
    Escort First Shot Range
    Red Air First Detection Range
    Red Air First Shot Range
    Proportion of Escort Blue Strikers that reach their Weapons Release Point
    Proportion of Protected Aircraft (Strikers) Not Kill Removed
    Proportion of Escort F-35 Kill Removed
    Escort to Red Fighter Exchange Ratio

Experimental designs determine test adequacy
  24-run, D-optimal 2nd-order design
  Disallowed combinations
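
To make the idea of a D-optimal design with disallowed combinations concrete, here is a minimal sketch. It is not the design or the software used for the F-35 test: the three factors coded -1/0/+1, the rule that excludes runs with x1 = +1 and x3 = +1, and the simple greedy point-exchange search are all assumptions made for illustration. The search picks 24 runs from the allowed candidates so as to increase det(X'X) for a full second-order model.

```python
import itertools


def model_row(x1, x2, x3):
    """Expand one run into full second-order (quadratic) model terms."""
    return [1.0, x1, x2, x3, x1 * x2, x1 * x3, x2 * x3,
            x1 * x1, x2 * x2, x3 * x3]


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[p][k]) < 1e-12:
            return 0.0  # numerically singular
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return d


def info_matrix(design):
    """Information matrix X'X for the second-order model."""
    rows = [model_row(*pt) for pt in design]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)]
            for i in range(p)]


def d_optimal(candidates, n_runs, max_passes=10):
    """Greedy point exchange: swap one run at a time for any candidate
    that strictly increases det(X'X); stop when a full pass finds no swap."""
    design = [candidates[i % len(candidates)] for i in range(n_runs)]
    best = det(info_matrix(design))
    for _ in range(max_passes):
        improved = False
        for i in range(n_runs):
            for cand in candidates:
                trial = design[:i] + [cand] + design[i + 1:]
                value = det(info_matrix(trial))
                if value > best * (1 + 1e-9):
                    design, best, improved = trial, value, True
        if not improved:
            break
    return design, best


# Hypothetical candidate set: 3 factors at 3 levels, minus disallowed runs.
levels = (-1.0, 0.0, 1.0)
candidates = [pt for pt in itertools.product(levels, repeat=3)
              if not (pt[0] == 1.0 and pt[2] == 1.0)]  # assumed disallowed rule

design, best = d_optimal(candidates, n_runs=24)
print(f"{len(design)} runs chosen, det(X'X) = {best:.3e}")
```

A greedy exchange like this can stop at a local optimum, so practical tools typically restart from several starting designs. The point of the sketch is only that the disallowed combinations are handled by removing them from the candidate set before the search begins.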

Two mission designs, executed in a 5th-generation scenario

Power calculations provided justification for number of trials
  [Figure: power versus signal-to-noise ratio (0 to 2), with curves for target location, variant, and environment (in/out of band)]
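
The shape of a power-versus-signal-to-noise curve can be reproduced with a rough calculation. The sketch below is an approximation, not the calculation behind the slide: it assumes a balanced two-level factor, a two-sided test, and a normal approximation, and the function name and the 0.05 significance level are hypothetical. The curves on the slide presumably came from design-specific calculations for the 24-run design.

```python
import math
from statistics import NormalDist


def approx_power(snr, n_runs, alpha=0.05):
    """Approximate two-sided power to detect a two-level factor effect.

    snr is the difference between the two level means divided by the
    noise standard deviation. For a balanced split of n_runs, the
    difference of means has standard error 2 * sigma / sqrt(n_runs).
    Normal approximation (an assumption of this sketch).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = snr * math.sqrt(n_runs) / 2.0
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


# Trace a power curve over the signal-to-noise range shown on the slide.
for snr in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"SNR {snr:.1f}: power ~ {approx_power(snr, n_runs=24):.2f}")
```

Factors with more levels, or factors that are partly correlated with others in the design, have lower power than this balanced two-level case at the same run count, which is one reason a separate curve is drawn for each factor.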

We took a scientific approach to all operational testing
  Variants: Conventional; Short takeoff/vertical landing; Carrier variant
  Mission Areas (Air Threat, Ground Threat): Air-Surface Strike; Destruction/Suppression of Enemy Air Defenses; Defensive counter air; Offensive counter air; Close air support; Search and rescue

Impact so far: Congressional review of Close Air Support testing

Still to come: Test Execution and Analysis
  Execution considerations: challenges with aircraft availability; confounding variables
  Analysis considerations: demand for quick answers; Big Data, Little Information

Statistical Engineering Shortcomings
  Initial focus was on tools
  Processes are still highly dependent on individuals involved
  Adherence to statistical rules
  Leadership changes & final solution not fully deployed
  Failing to see the big picture

We continue to increase the statistical defensibility of DoD Test and Evaluation
  National Research Council Study
  Design of Experiments endorsed as a sound methodology for OT&E
  OTA MOA on DOE
  DOT&E Initiatives
  Guidance on DOE in TEMPs
  DOT&E Policy Issued
  OTA Test Design Processes Updated
  DOT&E Science Advisor Established
  Test Science Roadmap effort
  DOT&E/TRMC funded Science of Test Research Consortium
  DOT&E TEMP Guide Published
  DASD (DT&E) STAT Implementation Plan
  STAT COE
  DOT&E Roadmap Report
  Two Additional DOT&E Guidance memos on Application of DOE to OT&E
  Survey Best Practices Memo
  Cybersecurity Procedures
  Additional Survey and cyber work
  Modeling and simulation validation guidance
  Cyber priorities
  Updated TEMP Guidance
  M&S Guidance

Needed a larger focus for statistical engineering efforts

Thank you!

Innovation Adoption: Dr. Eric Schmidt, Testimony to House Armed Services Committee, April 17, 2018

Laura's conjecture: Statisticians are uniquely equipped to lead & implement change, especially in data-centric fields!