Accelerating Science Engagement
Camille Crittenden, PhD
CITRIS and the Banatao Institute, University of California
Internet2 Technology Exchange, September 26, 2016
Outline
- Pacific Research Platform: what is it?
- Science DMZs
- Science engagement: use cases, domain areas
- Socio-technical engineering: multi-stakeholder alliances, professional development
- Broader impacts: access to cyberinfrastructure
Science DMZ
Developed by ESnet, the Science DMZ concept integrates four key components into a unified whole that together serve as a foundation for this model. These include:
- A network architecture explicitly designed for high-performance applications, where the science network is distinct from the general-purpose network
- The use of dedicated systems for data transfer
- Performance measurement and network testing systems that are regularly used to characterize the network and are available for troubleshooting
- Security policies and enforcement mechanisms that are tailored for high-performance science environments
High-Speed Science Networks: Context
- Universities, labs, and other institutions connect using science networks
  - CENIC (California research institutions, including UC and CSU campuses)
  - ESnet (Department of Energy National Labs)
  - Internet2 (interconnects regional networks)
  - GEANT (interconnects European science networks)
- Science networks are much more capable than the commodity Internet
  - Engineered for data-intensive science
  - High-speed data movement
  - Advanced services
- Science networks connect with each other to build a fabric
  - In most cases, no need to traverse the commodity Internet for research data
  - Connect to the global Internet as well for global access
The Pacific Research Platform Creates a Regional End-to-End Science-Driven Big Data Freeway System
NSF CC*DNI Grant, $5M, 10/2015-10/2020
PI: Larry Smarr, UC San Diego, Calit2
Co-PIs:
- Camille Crittenden, UC Berkeley, CITRIS
- Tom DeFanti, UC San Diego, Calit2
- Philip Papadopoulos, UC San Diego, SDSC
- Frank Wuerthwein, UC San Diego Physics and SDSC
What Is Science Engagement?
- Technology specialists working with scientists to improve
  - Data transfer performance
  - Data workflows (e.g., to require less human effort)
  - Experiment operations, and more
- Using experience gained from collaborations to improve
  - Network design
  - Tool design
  - System design
Old Model of Science Engagement: Scientist as Integrator
- Requires scientists to
  - Discover new technologies
  - Become expert in new technologies
  - Assemble distinct technologies into an integrated solution that works for them
- Some scientists do this brilliantly; most do not
New Model of Science Engagement: Scientist as Collaborator
- New team members for team science: technologists who
  - Understand technology
  - Understand enough of the science to see how technology fits
  - Help scientists adopt a useful solution
- Result: much more efficient and productive research outcomes
Science Engagement in the PRP
- Identify science collaborations that would benefit from increased access to high-speed networking
- Help coordinate efforts at different campuses, national labs, supercomputing centers
- Everybody is already doing some of this
- Let's work together, learn from each other
- Everyone benefits when multi-institution collaborations succeed
PRP Science Drivers
Data-intensive domains
- Particle physics
- Astronomy and astrophysics
- Biomedical applications
- Earth sciences
- Virtual reality, high-resolution video, data visualization
- Future areas
PRP Science Drivers: Particle Physics
- LHC at CERN
- Datasets increase 10x after experimental upgrades
- Inclusive data streams shared by all collaborators for analysis
PRP Science Drivers: Particle Physics
- LHC experiments expect 5x data in the next 2 years
- Paradigm shift in data access methods
- Several typical use cases for end-user physicists
  - Downloading large, roughly filtered datasets through Science DMZs on research networks
  - Streaming datasets from Tier 2 sites via the xrootd protocol, possibly with local data caching
  - Accessing extra CPU power in federated analysis clusters, with all data pulled over the WAN
- Main challenges are tuning applications/servers and preparing end users for the new workflow
PRP Science Drivers: Astronomy, Astrophysics
Astronomical telescope survey data, galaxy formation and evolution
- Intermediate Palomar Transient Factory
- LSST (Chile)
- DESI (LBL)
PRP Science Drivers: Biomedical Applications
- Precision medicine
- Genome sequencing
- Microbiome research
- Telehealth
- Telemedicine
- Telesurgery
http://www.practicalpainmanagement.com/meeting-summary/you-are-your-microbiome
From IEEE Spectrum, "Decoding a Baby's Genome in 26 Hours," http://spectrum.ieee.org/biomedical/diagnostics/decoding-a-babys-genome-in-26-hours
Referencing PLOS, "Big Data: Astronomical or Genomical?" http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002195
Network Demands of Collaborative Genomics Research
- Genomics defines a new order of BIG
- The speed of light will not increase, but the number of genomic data repositories or the distance between them will
- Internet routing is inefficient in its use of network resources
- The Internet protocol stack was not designed for BIG DATA transfer over destination-based paths with a large bandwidth-delay product
- Content-centric networking offers at least part of a solution
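To make the bandwidth-delay point concrete, here is a minimal Python sketch of the arithmetic. It is an illustration, not part of the talk. The 10 Gb/s link speed, 50 ms round-trip time, and 64 KiB window are assumed values, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on single-stream throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    link_bps = 10e9          # assumed: 10 Gb/s research network path
    rtt_s = 0.050            # assumed: 50 ms round-trip time between repositories
    window_bytes = 64 * 1024 # assumed: 64 KiB window (no window scaling)

    print(f"BDP: {bdp_bytes(link_bps, rtt_s) / 1e6:.1f} MB")
    print(f"Window-limited rate: {window_limited_bps(window_bytes, rtt_s) / 1e6:.1f} Mb/s")
```

Under these assumptions the path needs about 62.5 MB in flight to stay full, while a single stream limited to a 64 KiB window tops out near 10.5 Mb/s, roughly a thousandth of the link. This gap is one reason untuned hosts underuse long, fast paths.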
Microbiome Research
The US$121 million National Microbiome Initiative will attempt to map and investigate these collections of microorganisms over the next two years with help from multiple federal agencies, the White House Office of Science and Technology Policy said today. [13 May 2016]
National Microbiome Initiative aims to
- Support interdisciplinary research to answer fundamental questions about microbiomes in diverse ecosystems.
- Develop platform technologies that will generate insights and help share knowledge of microbiomes in diverse ecosystems and enhance access to microbiome data.
- Expand the microbiome workforce through citizen science, public engagement, and educational opportunities.
PRP Science Drivers: Earth Sciences
- Earthquake engineering research
- Hybrid simulation and modeling
NEES earthquake engineers at UC Berkeley have developed a novel hybrid simulation method for the seismic testing of high-voltage disconnect switches and other complex structures. The research team was able to reduce computation time by a factor of 3. (Image: Khalid Mosalam, UC Berkeley)
PRP Science Drivers: VR, High-Resolution Video
Thomas E. Levy (at left) interacts with 3D archaeological data in Calit2's Qualcomm Institute StarCAVE.
PRP Science Drivers: VR, High-Resolution Video
- WAVE @ UC San Diego
- MerWAVE @ UC Merced
What's Next? Global Research Platform
PRP Science Drivers: What's Next?
Application areas in
- Health: telemedicine, personalized medicine (genomic research)
- Environmental monitoring and disaster response
- Public open data
Environmental Monitoring
How to Prepare for Next-Stage, Data-Intensive Collaborative Research
What considerations should research institutions keep in mind to prepare for the infrastructure needs of science ahead?
- Budget (hardware, human resources)
- Workforce development, technical and soft skills for facilitating multi-disciplinary research
- Curriculum, degree programs
PRP Broader Impacts
- Democratizing access to data
- Partnerships with CENIC (State of CA, Cities of Sacramento, Los Angeles)
- Open data movements
- Open Science Grid
- Democratizing computation
PRP Broader Impacts
- Reaching Minority-Serving Institutions for access to data-intensive networks and training
- Collaborating with community colleges and with diploma, certification, and non-degree training programs
Summer program 2011, MSI Cyberinfrastructure Empowerment Coalition, Calit2, Qualcomm Institute
Realizing the Vision: What Will It Take?
- Building a new career path for science engagement
- Bit.ly/PRP engage
- Active interdisciplinary cooperation
- Active institutional cooperation
Imagine what is possible. Demonstrate opportunities for innovation.
Acknowledgements
- US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192, CNS-1456638, ACI-1540112, and ACI-1541349
- UC Office of the President CIO
- UCSD Chancellor's Integrated Digital Infrastructure Program and Next Generation Networking initiative
- Calit2 Qualcomm Institute, Calit2 UCSD Division
- CENIC and Pacific Wave
Thank you!
Camille Crittenden, PhD
CITRIS & the Banatao Institute
ccrittenden@citris-uc.org
http://citris-uc.org
This material is based upon work supported by the National Science Foundation under Grant No. ACI-1541349. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.