SDSC’s 2014 Series of Data Mining ‘Boot Camps’ Begins February 26-27

Published 02/04/2014

PACE: Data Mining Boot Camps

The San Diego Supercomputer Center (SDSC) at the University of California, San Diego is kicking off its 2014 series of “Data Mining Boot Camps,” aimed at helping business professionals and academic research scientists understand how to rapidly make sense of burgeoning amounts of data and how to design, build, verify and evaluate predictive models.

Developed and organized by SDSC’s Predictive Analytics Center of Excellence, or PACE, the two-day sessions provide conceptual and hands-on training with critical predictive analytic tools and techniques that non-computer science professionals can use to detect patterns and relationships in what is now referred to as “Big Data.” 

Each day, our society creates 2.5 quintillion bytes of data (that’s 25 followed by 17 zeros). The need among researchers to make sense of all this information has become more acute, and as a result, demand for data scientists has increased steadily. According to the Harvard Business Review and Fortune magazine, a career as a data scientist is “the” job to have in the 21st century. The McKinsey Global Institute’s Big Data Report notes that by 2018, the U.S. alone could face a shortage of 140,000 to 190,000 professionals with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to properly analyze Big Data to make more effective decisions.

“Conventional statistical analysis and business intelligence software, although useful, are not designed to capture, curate, manage and process large quantities of data generated by most enterprises,” according to Natasha Balac, PACE’s director. “Data mining and predictive modeling, now commonly referred to as data science, are capable of automatic extraction of meaningful value hidden in this data, enabling discovery of new insights and providing a competitive edge.” 

Launched in October 2012, the PACE Boot Camps assist organizations by expanding the analytical skills of their own subject matter experts to develop a built-in pool of talented data scientists, as well as preparing managers and analysts to perform in-depth analyses of massive or multiple data sets. The workshops have generated a tremendous amount of interest from both industry and academia, including many UC San Diego colleagues. The sessions have attracted a wide range of industry participants, including some from unanticipated business sectors such as utilities, food services, and gaming.

The comprehensive, hands-on curriculum used during the PACE Data Mining Boot Camps is an outgrowth of the data mining certification courses offered through the UC San Diego Extension. The instructors for those certification courses also lead the training. Specifically, the PACE Boot Camps cover basic data mining, data analysis, pattern recognition concepts, and predictive modeling algorithms so that participants can explore and implement analyses on their data.

Participants also will have access to a comprehensive set of data mining tools available on SDSC’s Gordon, one of the world’s most innovative supercomputers with 300 terabytes of flash memory. Participants will be able to apply data mining algorithms to real data and interpret the results. The classroom setting allows the instructors to work one-on-one with participants during the hands-on training. By area, Boot Camp training includes:

  • Overview: Data Mining, Machine Learning and Statistics
  • Overview of CRISP-DM: Cross Industry Standard Process for Data Mining
  • Introduction to Data Mining Tools
  • Data Preparation
  • Learning Algorithms Implementations
  • Model Evaluation and Validation
  • Data Mining Trends, Applications and Guidelines

Participants can register for the 2014 series of PACE Boot Camps online or ask for more information by calling (858) 534-8321. PACE also conducts onsite training tailored to meet an organization’s specific core research and business objectives.

About PACE
The Predictive Analytics Center of Excellence (PACE) is an SDSC “center of excellence” aimed at leveraging SDSC’s data-intensive expertise and resources to help create the next generation of data researchers by leading a collaborative, nationwide education and training effort among academia, industry, and government. The program’s goal is to develop and deploy a comprehensive suite of integrated, sustainable, and secure cyberinfrastructure (CI) services to accelerate research and education in predictive analytics – or the process of using a variety of statistical techniques from modeling, data mining, and game theory to analyze current and historical facts to make predictions, as well as assess risks and opportunities, about future events. Predictive analytics are now being used in a wide variety of fields such as healthcare, pharmaceuticals, financial services, insurance, and telecommunications. 

About SDSC
As an Organized Research Unit of UC San Diego, SDSC is considered a leader in data-intensive computing and cyberinfrastructure, providing resources, services, and expertise to the national research community, including industry and academia. Cyberinfrastructure refers to an accessible, integrated network of computer-based resources and expertise, focused on accelerating scientific inquiry and discovery. SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from earth sciences and biology to astrophysics, bioinformatics, and health IT. With its two newest supercomputers, Trestles and Gordon, and a new system called Comet to be deployed in early 2015, SDSC is a partner in XSEDE (Extreme Science and Engineering Discovery Environment), the most advanced collection of integrated digital resources and services in the world.

For Comment:
Natasha Balac
(858) 534-5161

Media Contacts:
Jan Zverina, SDSC Communications
(858) 534-5111

Warren R. Froelich, SDSC Communications
(858) 822-3622
