Explaining Navy Reserve Training Expense Obligations. Emily Franklin Roxana Garcia Mike Hulsey Raj Kanniyappan Daniel Lee



Agenda
- Defining the Problem
- Data Analysis
- Data Cleaning
- Exploration
- Models & Methods
- Model Performance
- Recommendations

Defining the Problem
- Explanation or prediction? Explanation: explain the outstanding travel obligations within the US Navy Reserve.
- What is the analysis going to be used for? To determine whether travel policy changes are needed.
- Who will be the users? Navy Reserve Headquarters staff.
- What is currently implemented? An Access tool implemented by contractors, and the Travel Responsibility Manual.

Data Analysis
- Data source: Navy Reserve Order Writing System (NROWS) database.
- Data quality: data are entered directly by the reservist in NROWS and approved by the appropriate official. Pay disbursements are fed from the Navy Reserve financial system.
- Size of the data: training and travel records for fiscal year 2009; 86,000 records in total (liquidated and unliquidated costs); a 10,000-record sample was used for modeling and visualizations.
- Security and privacy: Social Security numbers and other personal information were removed before the dataset was obtained.

Data Cleaning
- Dataset generation: the expense report was built from three separate NROWS reports, generated on August 28, 2009, with 86,000 total records. A random sample of 10,000 records forms the final dataset.
- Incomplete records: records with missing data elements were removed.
- Dummy variables were created for the following categorical variables:
  - Order Type: two dummy variables, with ADT as the reference value
  - ACRN: two dummy variables, with AA as the reference value
  - Region: five dummy variables, with RCC MA as the reference value
  - Travel System: one dummy variable, with DTS as the reference value
- Data record adjustment: new variables were created (e.g., Log[Reservation Amount]) and insignificant variables were removed.
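The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the team's XLMiner workflow: the field names (`order_type`, `reservation_amount`, `days_outstanding`), the sampling seed, and the use of the natural logarithm are assumptions, and only the Order Type dummies (ADT as reference, giving Order Type_AT and Order Type_IN) are shown.

```python
import math
import random

# Hypothetical field names; the actual NROWS export columns are not given.
ORDER_TYPES = ["ADT", "AT", "IN"]  # ADT is the reference level per the slide
REQUIRED = ("order_type", "reservation_amount", "days_outstanding")


def drop_incomplete(records):
    """Keep only records where every required field is present."""
    return [r for r in records if all(r.get(k) is not None for k in REQUIRED)]


def draw_sample(records, n, seed=0):
    """Simple random sample without replacement (the seed is an assumption)."""
    return random.Random(seed).sample(records, n)


def dummy_code(value, levels, reference):
    """Return one 0/1 indicator per non-reference level."""
    return {lvl: int(value == lvl) for lvl in levels if lvl != reference}


def prepare(record):
    """Build a model row: numeric input, Order Type dummies, log amount."""
    row = {"Days Outstanding": record["days_outstanding"]}
    for lvl, flag in dummy_code(record["order_type"], ORDER_TYPES, "ADT").items():
        row[f"Order Type_{lvl}"] = flag
    # Natural log is assumed; this requires a positive reservation amount.
    row["Log(Reservation Amount)"] = math.log(record["reservation_amount"])
    return row
```

A record on an ADT order gets zeros for both Order Type dummies, which is what makes ADT the baseline against which the other order types are compared.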

Exploration
Treemap chart: number of unliquidated records
- Data: unliquidated records only
- Hierarchy: Document Status, then Order Type
- Order types shown: Active Duty Training, Annual Training, Inactive Duty Training
- Interpretation: among the unliquidated records, most of the outstanding expense records are on Annual Training orders, followed by Active Duty Training.

Exploration
Scatter plot: when and where unliquidated records occur
- Data: liquidated and unliquidated records
- Hierarchy: Document Status, then Region
- Interpretation: after determining when the highest number of unliquidated records occurs, we found that the majority of those records occur in Region RCC SW.

Exploration
Scatter plot and box plot: amount of unliquidated expenses
- Data: unliquidated records only
- Hierarchy: Order Type, sized by Reservation Amount
- Interpretation: among the unliquidated records, the highest reservation amounts are tied to Active Duty Training.

Models & Methods
- Because the goal was explanation, our team ran the following models: logistic regression, discriminant analysis, and a classification tree.
- Our team began with more than 86,000 records. Using XLMiner, we took a random sample of 10,000 records so that the dataset was manageable for XLMiner's explanatory models.
- The output variable "Y" is Document Status, which is either Liquidated (L) or Unliquidated (U).
- The input variables consisted of numerical and non-numerical data. The non-numerical variables, such as ACRN, Region, and Order Type, were converted to dummy variables.

Model Performance

| Model | Significant input variables | Overall error | Error in classifying unliquidated | Multiple R-squared |
|---|---|---|---|---|
| Naïve Rule | Majority rule; predicts Liquidated | 26.25% | 100% | |
| Logistic Regression #1 | Days Outstanding, Number of Days, Order Type, Travel System, Reservation Amount, Advance Amount, Region | 2.59% | 9.83% | 0.08751 |
| Logistic Regression #2 | Days Outstanding, Number of Days, Order Type, Reservation Amount, Advance Amount, Region | 2.59% | 9.83% | 0.87511 |
| Logistic Regression #3 | Days Outstanding, Order Type, Reservation Amount, Advance Amount, Region | 2.59% | 9.83% | 0.87506 |
| Logistic Regression #4 | Days Outstanding, Order Type, Reservation Amount, Advance Amount, ACRN | 2.52% | 9.56% | 0.87484 |
| Logistic Regression #5 | Days Outstanding, Order Type, Reservation Amount, Advance Amount | 2.46% | 9.44% | 0.87409 |
| Logistic Regression #6 | Days Outstanding, Order Type, Log(Reservation Amount) | 2.46% | 9.49% | 0.87344 |
| Discriminant Analysis #1 | Days Outstanding, Number of Days, Order Type, Travel System, Reservation Amount, Advance Amount, Region | 11.56% | 43.85% | |
| Discriminant Analysis #2 | Days Outstanding, Number of Days, Order Type, Reservation Amount, Advance Amount, Region | 11.58% | 43.89% | |
| Discriminant Analysis #3 | Days Outstanding, Order Type, Reservation Amount, Advance Amount | 11.53% | 43.70% | |
| Classification Tree | Number of Days, Reservation Amount, Order Type, Advance Amount | 25.89% | 100% | |
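As a check on how the error columns are derived, the following sketch recomputes the naïve-rule benchmark from class counts. It assumes the 10,000-record sample contains 7,375 Liquidated and 2,625 Unliquidated records, the counts shown in the discriminant analysis error report; the function name `error_report` is illustrative only.

```python
def error_report(cases_l, errors_l, cases_u, errors_u):
    """Return percent error for class L, class U, and overall."""
    total_cases = cases_l + cases_u
    total_errors = errors_l + errors_u
    return {
        "L": 100 * errors_l / cases_l,
        "U": 100 * errors_u / cases_u,
        "Overall": 100 * total_errors / total_cases,
    }


# Naive rule: classify every record as Liquidated, so every
# Unliquidated record is an error and no Liquidated record is.
naive = error_report(cases_l=7375, errors_l=0, cases_u=2625, errors_u=2625)
print(naive)
```

Any model worth deploying has to beat this benchmark on the Unliquidated class, since that class is the one the analysis is trying to explain.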

Model Performance
Logistic Regression Model
Best model input variables: Days Outstanding, Order Type_AT, Order Type_IN

| Input variables | Coefficient | Std. Error | p-value | Odds |
|---|---|---|---|---|
| Constant term | 3.48984122 | 0.13432698 | 0 | * |
| Days Outstanding | -0.64695567 | 0.02618866 | 0 | 0.52363747 |
| Order Type_AT | -0.40805581 | 0.15914348 | 0.01034511 | 0.66494173 |
| Order Type_IN | 0.78317082 | 0.25056556 | 0.00177435 | 2.18840027 |

| Statistic | Value |
|---|---|
| Residual df | 6996 |
| Residual Dev. | 1375.511719 |
| % Success in training data | 74.11428571 |
| # Iterations used | 8 |
| Multiple R-squared | 0.87344289 |

Training: Error Report

| Class | # Cases | # Errors | % Error |
|---|---|---|---|
| L | 5188 | 0 | 0.00 |
| U | 1812 | 172 | 9.49 |
| Overall | 7000 | 172 | 2.46 |

Validation: Error Report

| Class | # Cases | # Errors | % Error |
|---|---|---|---|
| L | 2187 | 0 | 0.00 |
| U | 813 | 80 | 9.84 |
| Overall | 3000 | 80 | 2.67 |
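The Odds column is the exponential of each coefficient, which the short sketch below reproduces. It also shows how the fitted coefficients would turn into a predicted probability. Two things here are assumptions rather than facts from the slides: that the "success" class is Liquidated (suggested by the 74.11% success rate matching 5,188 of 7,000 training records), and that Days Outstanding enters the model in whatever units or scaling XLMiner used, which the slides do not state.

```python
import math

# Coefficients copied from the logistic regression slide.
COEFFICIENTS = {
    "Constant": 3.48984122,
    "Days Outstanding": -0.64695567,
    "Order Type_AT": -0.40805581,
    "Order Type_IN": 0.78317082,
}


def odds_multiplier(coefficient):
    """Multiplicative change in the odds for a one-unit increase."""
    return math.exp(coefficient)


def probability_liquidated(days_outstanding, order_type_at, order_type_in):
    """Predicted P(Liquidated), assuming Liquidated is the success class."""
    logit = (
        COEFFICIENTS["Constant"]
        + COEFFICIENTS["Days Outstanding"] * days_outstanding
        + COEFFICIENTS["Order Type_AT"] * order_type_at
        + COEFFICIENTS["Order Type_IN"] * order_type_in
    )
    return 1.0 / (1.0 + math.exp(-logit))


for name, coef in COEFFICIENTS.items():
    if name != "Constant":
        print(name, round(odds_multiplier(coef), 4))
```

Read this way, each additional unit of Days Outstanding roughly halves the odds that a record is liquidated, holding order type fixed.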

Model Performance
Discriminant Analysis Model
Best model input variables: Days Outstanding, Order Type_AT, Order Type_IN, Reservation Amount, Advance Amount

Classification Function

| Variables | L | U |
|---|---|---|
| Constant | -2.29217935 | -3.93947172 |
| Days Outstanding | 0.0042434 | 0.05690328 |
| Order Type_AT | 3.3345499 | 3.82253385 |
| Order Type_IN | 4.12011194 | 3.94162393 |
| Reservation Amount | 0.00022529 | 0.00024252 |
| Advance Amount | -0.00029611 | 0.00059535 |

Error Report

| Class | # Cases | # Errors | % Error |
|---|---|---|---|
| L | 7375 | 6 | 0.08 |
| U | 2625 | 1147 | 43.70 |
| Overall | 10000 | 1153 | 11.53 |
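Classification functions of this kind are typically applied by computing one score per class and assigning the record to the class with the larger score. The sketch below illustrates that rule with the coefficients from the slide. The pairing of each value with the L or U column is reconstructed from the flattened slide text, and the example records are hypothetical.

```python
# Classification function coefficients, paired (L, U) as read from the slide.
L_FUNCTION = {
    "Constant": -2.29217935,
    "Days Outstanding": 0.0042434,
    "Order Type_AT": 3.3345499,
    "Order Type_IN": 4.12011194,
    "Reservation Amount": 0.00022529,
    "Advance Amount": -0.00029611,
}
U_FUNCTION = {
    "Constant": -3.93947172,
    "Days Outstanding": 0.05690328,
    "Order Type_AT": 3.82253385,
    "Order Type_IN": 3.94162393,
    "Reservation Amount": 0.00024252,
    "Advance Amount": 0.00059535,
}


def score(function, row):
    """Constant plus the weighted sum of the row's input values."""
    return function["Constant"] + sum(function[k] * v for k, v in row.items())


def classify(row):
    """Assign the class whose classification score is larger."""
    return "L" if score(L_FUNCTION, row) >= score(U_FUNCTION, row) else "U"


# Hypothetical record: Annual Training order, 100 days outstanding.
example = {
    "Days Outstanding": 100,
    "Order Type_AT": 1,
    "Order Type_IN": 0,
    "Reservation Amount": 1000.0,
    "Advance Amount": 0.0,
}
print(classify(example))
```

Because the U function weights Days Outstanding far more heavily than the L function does, records that stay open longer tip toward the Unliquidated class.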

Model Performance
Classification and Regression Trees
Input variables: Number of Days, Order Type_AT, Order Type_IN, Reservation Amount, Advance Amount
The pruned tree is equivalent to the naïve rule: it predicts every record as Liquidated.

Training: Error Report

| Class | # Cases | # Errors | % Error |
|---|---|---|---|
| L | 5188 | 0 | 0.00 |
| U | 1812 | 1812 | 100.00 |
| Overall | 7000 | 1812 | 25.89 |

Validation: Error Report

| Class | # Cases | # Errors | % Error |
|---|---|---|---|
| L | 2187 | 0 | 0.00 |
| U | 813 | 813 | 100.00 |
| Overall | 3000 | 813 | 27.10 |

[Figure: full classification tree diagram, with splits on Order Type, Number of Days, and Reservation Amount]

Recommendations
Use & Deployment: Based on our team's data mining analysis project, we encourage the Navy Reserve to focus on the following to reduce unliquidated training instances:
- Review our team's logistic regression model #6 and re-evaluate the training efforts for both Annual Training and Inactive Duty Training, as these are the most significant variables along with Days Outstanding.
- Review the training strategy in Region RCC SE, since this region has the largest number of outstanding unliquidated instances.
- Review the schedule for when expense training is given to reservists, since most of the unliquidated records occurred in August.

Training Emphasis Examples:
- Trainers who can review the status of orders and work with reservists
- Trainers who can be contacted to assist reservists having issues submitting travel claims
- Training on the Travel Claim System
- Escalation channels to officers superior to reservists with outstanding travel claims

Questions / Discussion