The SDSC Summer Institute: Big Data Supercomputing will be held Monday – Friday, August 5 – 9, 2013, at the San Diego Supercomputer Center (SDSC) on the University of California, San Diego (UCSD) campus. Light refreshments and lunch will be provided throughout.
August 5 – 9, 2013
SDSC Auditorium at UC San Diego
Agenda
Day 1 | Day 2 | Day 3 | Day 4 | Day 5
Day 1
| 8:30-12:00 | AM: INTRODUCTION |
| 8:30-8:45am: Content | Introduction to Summer Institute |
| 8:45-9:15am: | XSEDE-NSF National Cyberinfrastructure |
| 9:15-10:00am: | Attendee introductions |
| 10-10:15am | Break |
| 10:15am-12pm: Demo/Hands-on | SDSC Computational and Data Resources |
| 12-1:30PM | LUNCH |
| 1:30-5PM | PM: INTRODUCTION (cont'd) |
| 1:30-2:00pm: Content | Technology presentation on Globus Online (GO), with data transfer demo |
| 2:00-3:30pm: Presentation | Use cases: The Compact Muon Solenoid (CMS) high-energy physics project; Computational gateways: Cyberinfrastructure for Phylogenetic Research (CIPRES); OpenTopography |
| 3:30-3:45pm | Break |
| 3:45-5pm: Hands-on/demo | Hands-on session: Introduction to data analytics software packages to be used in the Summer Institute |
| 6pm | RECEPTION |
Day 2
| 8:30-12:00 | AM: DATA MANAGEMENT |
| 8:30-9:30am: Content | Basics of storage technologies and filesystems, including parallel filesystems, distributed filesystems, and cloud storage, e.g. Lustre, HDFS, OpenStack Swift object store. Pros and cons. |
| 9:30-10:30am: Presentation | Use case: IntegromeDB search engine for biomedical, biochemical, drug, disease and health-related data |
| 10:30-10:45am | Break |
| 10:45am-12pm: Demo/Hands-on | Demo: myHadoop on Gordon and Hadoop |
| 12-1:30PM | LUNCH |
| 1:30-5PM | PM: DATA GENRES |
| 1:30-2:30pm: Content | Concepts and technologies for managing structured, semistructured, and unstructured data, including genome sequences and arrays. Row vs. column stores; SQL/NoSQL concepts |
| 2:30-3:15pm: Presentation | Use case: Neuroscience Information Framework |
| 3:15-3:30pm | Break |
| 3:30-5pm: Hands-on/demo | Demos and Hands-on: NoSQL databases, SciDB, text databases. Introduction to using databases on Gordon |
Day 3
| 8:30-12:00 | AM: DATA GENRES (cont'd) |
| 8:30-9:30am: Content | Management of data streams, graph data |
| 9:30-10:15am: Presentation | Use case presentation: R - igraph |
| 10:15-10:30am | Break |
| 10:30am-12pm: Demo/Hands-on | Demos/Hands-on: Exercises with graph data, e.g. DB2 RDF, GraphLab |
| 12-1:30PM | LUNCH |
| 1:30-5PM | PM: DATA ANALYTICS |
| 1:30-2:30pm: Content | Overview of Data Analytics: From small to big data; Introduction to R |
| 2:30-3:15pm: Presentation | Use case Presentation: Smart Energy Grid data |
| 3:15-3:30pm | Break |
| 3:30-5pm: Hands-on/demo | Demos/Hands-on: Exercises with R |
Day 4
| 8:30-12:00 | AM: DATA ANALYTICS (cont'd) |
| 8:30-9:30am: Content | Parallel and Distributed Programming Models, e.g. MPI, OpenMP, MapReduce; Scaling to large data: use of Mahout, Spark/Shark, Revolution Analytics, Presto-R |
| 9:30-10:30am: Presentation | Behavior contagion in social networks |
| 10:30-10:45am | Break |
| 10:45am-12pm: Demo/Hands-on | Demo: Parallel/distributed computing |
| 12-1:30PM | LUNCH |
| 1:30-5PM | PM: HPC PROGRAMMING |
| 1:30-2:30pm: Content | Data and information visualization |
| 2:30-3:15pm: Presentation | Use case presentation: Visualization demos and use cases; financial data use case |
| 3:15-3:30pm | Break |
| 3:30-5pm: Hands-on/demo | Demos/Hands-on: Visualization; Parallel Computing |
Day 5
| 8:30-12:00 | AM: HPC PROGRAMMING (cont'd) |
| 8:30-9:30am: Content | Introduction to MATLAB, parallel MATLAB |
| 9:30-10:30am: Presentation | Use case presentation: MATLAB use case with workflow |
| 10:30-10:45am | Break |
| 10:45am-12pm: Demo/Hands-on | Closing: Lightning Talks by Users |