Using Crowds to Crack Algorithmic Problems

Rinat Sergeev, NASA Tournament Lab at Harvard University (soon to become the Crowd Innovation Lab at Harvard University)

Lab Structure

Current Staff
- JIN PAIK, Manager
- RINAT SERGEEV, Senior Data Scientist
- MICHAEL MENEITTI, Post Doctoral Fellow
- ANDREA BLASCO, Post Doctoral Fellow

Post Doctoral Alums
- INA GANGULI, U.Mass. Amherst
- PATRICK GAULE, CERGE-EI (Prague)
- CHRIS REIDL, Northeastern University

Crowds Can Be Organized as Contests or Communities (Boudreau & Lakhani 2013; King and Lakhani 2013)

Contests
- Innovation problem requires diversity of approaches and broad experimentation
- Sponsor not sure what combination of skills and approaches might be useful in solution generation
- Clear rules for participation and winning

Communities
- Innovation problem requires cumulative knowledge building and aggregation of diverse inputs
- Contributions range from mix & match to co-production with modular tasks and functions
- Informal, norms-based governance

Innovation Field Experiments Aim to Identify Causal Mechanisms Underlying Innovating with Crowds

Collaborations
- Search costs in finding collaborators: HMS Advanced Imaging Grant Program, ~450 researchers, $800,000
- Self-organization in online teams: NASA/TopCoder Imaging/OCR in Documents, ~432 coders, $50,000

Contests
- Prizes vs signals: NASA/TopCoder Autonomous Robots, ~1200 coders, $30,000
- Incentives for internal public goods: HMS/MGH Idea Competition, ~350 employees, $27,000
- Expert evaluation of scientific ideas: HMS Grant Process, ~150 proposals, 142 evaluators, $25,000-$1M

Comparing Contests & Collaborations
- Incentives & search: HMS/TopCoder Computational Biology, ~700 coders, $6,000
- Selection vs treatment effects: NASA/TopCoder Space Medical Kit Development, ~900 coders, $25,000

Crowds and Development

The Crowd
- Competition balance: Structure
- Self-selection: Optimality
- Incentive diversity: Attractiveness
- Evaluation: Quality

The Problem
- Broad-scope problems: Innovation
- Narrow-scope problems: Extreme quality
- Inconvenient problems: Flexibility
- Large problems: Bandwidth
- Diverse-skill problems: Specialization

Development

The Product
- Software
- Applications
- Algorithms

The Partners
- NASA (multiple departments)
- USAID
- National Geographic
- Department of Energy
- QuakeFinder
- Smithsonian Center for Astrophysics
- Universities
- Commercial companies

Innovation Tournaments Are Historically Important & Currently Popular
- The Duomo, Florence, 1418: up to 2,000 florins
- The Longitude Prize, 1714: up to £20,000
- Invention of food canning, 1800: up to 12,000 francs
- Ansari X-Prize, space travel, 1996: $10,000,000
- Scientific problem solving, 2001: average $30,000
- Local Motors car design, 2008: over 35,000 submissions

Why does it work?

Crowdsourcing gives access to smart people
"No matter who you are, most of the smartest people work for someone else." Bill Joy (Sun Microsystems, BSD Unix, Java)

incentivized to do the task
- Extrinsic: cash, job market signals, community prestige
- Intrinsic: fun, enjoyment, learning, autonomy, taste
- Prosocial: community belonging, identity

with matching skills

especially if you don't set the requirements too strictly

Multiple attempts can produce extreme-value outcomes
[Figure: probability density of the value of innovation outcomes]
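The extreme-value argument can be illustrated with a small simulation. This is a minimal sketch, not the lab's model: it assumes each attempt draws its value independently from a normal distribution, and the mean, standard deviation, trial count, and function names are all illustrative choices that do not come from the slides.

```python
import random
import statistics


def best_of(n_attempts, rng, mean=10.0, sd=10.0):
    """Return the best value among n independent attempts.

    Assumption: attempt values are normally distributed; the
    parameters are illustrative, not taken from the talk.
    """
    return max(rng.gauss(mean, sd) for _ in range(n_attempts))


def expected_best(n_attempts, trials=2000, seed=0):
    """Estimate the average best-of-n value by Monte Carlo simulation."""
    rng = random.Random(seed)
    return statistics.mean(best_of(n_attempts, rng) for _ in range(trials))


if __name__ == "__main__":
    # The sponsor keeps only the top submission, so what matters is
    # the maximum over attempts, not the average attempt.
    for n in (1, 10, 100):
        print(n, round(expected_best(n), 1))
```

Under these assumptions the average attempt stays the same, while the best attempt improves as the number of attempts grows, which is the property a contest sponsor benefits from.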

Broad participation can bring a valuable idea, missed by the experts

Even many ideas!

Comparative Evaluation and Peer Pressure


Is there a lot of cheese for free? Well, there are trade-offs:
- Management overhead
- Knowledge transfer
- Performance variability
- Wasted resources on non-winners
- Legal questions
- Resistance to innovations from problem stakeholders
- Crowd specifics, preferences and limitations

Lab Has Designed & Executed Over 100 Challenges
[Figure: total consultations and challenges by month, Jan 2011 to Aug 2014; series: total consults, total challenges]

TopCoder/Appirio Contest Engine
More than 30 specialized contest types, including:
- Development
- Assembly
- Testing
- Bug Races
- Architecture
- Concepts
- Design
- Wireframes
- Storyboards
- Prototype
- Marathon Matches
- Single-Round Match

Massive Parallel Production of Innovative Assets
[Diagram labels: Copilots, Competitors, Architecture, Client, Assembly, Testing, UX, Idea Gen, Rapid Prototyping, Big Data Challenge, Optimization, Algo, Storyboard, Wireframes, Concept]

We like them because they collect a lot of data and are open to experiments!

Leverage Competition to Optimize Complex Big Data Algorithmic Problems
NTL Algorithmic Projects for Science

Harvard Algorithm Challenges

2012 to 2014
- NASA PDS Cassini Rings
- NASA Asteroid Data Hunter 2
- NASA Asteroid Tracker
- EPA Cyanobacteria Modeling
- EPA Toxcast
- NASA Asteroid Data Hunter 1
- USAID Atrocity Prevention
- NatGeo Collective Minds & Machines
- NASA Robonaut 1
- NASA Robonaut 2
- NASA Longeron
- NASA Robots (Signals v. Prizes)

2011
- USPTO Patent Imaging
- NIH/HMS Megablast

Antibody Sequence Annotation Algorithm
- 122 coders
- 654 submitted solutions
- 89 different approaches to solve the problem identified
- 5 winning countries: Russia, France, Egypt, Belgium & US
Higher accuracy and 120x speedup!

Optimizing Genome-Wide Association Studies (GWAS)
Algorithm implemented in the PLINK package

Genome associations
- Links genetic variants (SNPs) to observed health conditions
- Helps target proteins for future investigation

Speedup, contest by contest
- ~30x speedup in logistic regression
- ~300x speedup over basic use case
- ~1000x speedup with multi-threading

Streamline
- Complete runs reveal all SNP correlations
- From 5 hours per GWAS down to ~20s
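As a quick sanity check, the runtime figures can be compared with the quoted speedup factors. This is a minimal sketch under two assumptions the slide does not state: that "5 hours" and "~20s" are the start and end points of the same workload, and that the ~300x and ~1000x factors are measured against the same baseline. The function name is illustrative.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the optimized run is."""
    return before_seconds / after_seconds


# Runtime figures quoted on the slide: 5 hours per GWAS down to ~20 s.
before = 5 * 3600  # 5 hours expressed in seconds
after = 20         # approximate runtime after the contests

overall = speedup(before, after)
print(overall)  # 900.0, the same order of magnitude as the quoted ~1000x

# Assumption: both factors share one baseline, so their ratio
# approximates the share contributed by multi-threading alone.
multithreading_factor = 1000 / 300
print(round(multithreading_factor, 1))  # 3.3
```

Under those assumptions the runtime reduction and the ~1000x figure agree to within rounding, and multi-threading would account for roughly a threefold gain on top of the algorithmic improvements.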

Maximizing the energy output from International Space Station solar panels

Crowdsourced model of ISS
Contest winners
Energy output from different solutions
- 4,056 registrants
- 459 competitors
- 2,185 submissions
- 4.76 avg. submissions per competitor
- 124,025 views for the Longeron video
Top solutions have been added to the NASA-ISS reserve pool
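The participation figures for this contest form a simple funnel from registrants to competitors to submissions. The sketch below recomputes the derived ratios from the counts on the slide. The class and property names are illustrative, and reading "competitors" as registrants who submitted at least once is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContestStats:
    """Headline participation counts for one contest."""
    registrants: int
    competitors: int
    submissions: int

    @property
    def participation_rate(self):
        """Share of registrants who went on to compete."""
        return self.competitors / self.registrants

    @property
    def submissions_per_competitor(self):
        """Average number of submissions made by each competitor."""
        return self.submissions / self.competitors


# Counts quoted on the slide for the ISS Longeron contest.
longeron = ContestStats(registrants=4056, competitors=459, submissions=2185)

print(round(longeron.submissions_per_competitor, 2))  # 4.76
print(round(longeron.participation_rate, 3))          # 0.113
```

The recomputed average matches the 4.76 quoted on the slide. The participation rate of about 11% is derived here and is not stated in the talk.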

Can we train an algorithm to follow the annotation crowd in the search for Genghis Khan's tomb?
Crowd-Archaeology from Space
[Image: winning algorithm]
[Image: scientist and explorer Albert Lin on horseback, checking the predictions on site]

Can we use open-source data to predict atrocities with sub-country level of resolution?

Winner's Algorithm
- Used GDELT and PITF to forecast PITF
- Implemented method: Random Forest
- Used 23 predictive patterns from PITF
- Used 13 predictive patterns from GDELT

Results
- Data search crowdsourced
- Data preparation crowdsourced
- Ideas crowdsourced
- Main algorithmic contest successful

Contest
- 1,077 registrants
- 93 competitors
- 618 submissions
- 6.65 avg. submissions per competitor

Example of the prediction: growing atrocity risk in Aleppo province, Syria, 2010-12
Top solution outperformed the baseline model by 62%

Sometimes you meet them!
[Image: nhzp339 as a winning green dot]
[Image: nhzp339 alive]

KaBoom: Teach a NASA radar array how to track asteroids!
The radar is real
The contest:
- 174 registrants
- 37 countries
- 43 competitors
- 299 submissions
The plans: to make it bigger and stronger

Asteroid Data Hunter: Find Asteroids on Space Images
- The orbits of known asteroids
- The main way to detect them
- The challenge: to automate it

NASA: Find New Moons of Saturn on Cassini Images
- 62 large moons of Saturn are known
- There is a way to find smaller ones!
- The challenge: to automate the search for propeller perturbations of Saturn's rings

Our Team
Thank You!