Projects from two “centers of excellence” at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego – the Center for Large-scale Data Systems research (CLDS) and the Predictive Analytics Center of Excellence (PACE) – were highlighted at a White House Office of Science and Technology Policy meeting focused on accelerating research, development, and collaborations in data-enabled science and engineering.
The event, called ‘Data to Knowledge to Action: Building New Partnerships’, took place this week in Washington, D.C. It was hosted by the Obama Administration’s Networking and Information Technology R&D (NITRD) program, which represents the information technology portfolios of 18 federal agencies.
Thomas Kalil, Deputy Director for Policy for the White House Office of Science and Technology Policy, highlighted both SDSC projects that were selected to be part of the event. Kalil is also the Senior Advisor for Science, Technology, and Innovation for the United States National Economic Council.
In addition, SDSC received recognition for the two projects presented by the center and its partners in industry, government, and academia: Benchmarking of Big Data and Sustainable Communities.
“It is indeed gratifying to see that some of our best and brightest are being recognized at such a high level for their work in big data applications,” said SDSC Director Michael Norman. “This reflects SDSC’s focus on addressing both the management and technical aspects of big data and other data-enabled applications such as predictive analytics, which are now becoming pervasive among academia, industry, and government.”
In early 2012, the Administration announced a new initiative focused on research and development in data science and engineering to harness the power of data to advance national goals such as economic growth, education, health, and clean energy, while fostering regional innovation. At that time, six federal departments and agencies made commitments of more than $200 million aimed at developing the new tools, techniques, and expertise needed to “move from data to knowledge to action,” according to NITRD.
This year, the Administration is encouraging multiple stakeholders, such as federal/state/local agencies, private industry, academia, non-profits, and foundations, to forge innovative partnerships, including collaborations that support advanced data management and data analytic techniques.
‘Big Data’ Benchmarking
Chaitan Baru, an SDSC Distinguished Scientist and director of CLDS, was invited to attend the White House event in recognition of coordinating a collaboration among industry, academia, and government to develop industry-standard, application-level benchmarks to evaluate hardware and software systems for big data applications. The BigData Top100 List is a new open, community-based big data benchmarking initiative coordinated by a board of directors that includes representation from SDSC, Pivotal, Cisco, Oracle, Intel, Brocade, Seagate, NetApp, Mellanox, Facebook, IBM, Google, and the University of Toronto.
“This initiative marks the start of the Big Data Benchmark Challenge to seek community input in defining big data benchmarks and metrics,” said Baru. “The creation of objective standards for application-level performance and price/performance fosters competition and innovation in the marketplace. During the past 20 years, for example, the benchmark performance of commercial database software has improved by about a million times, while the price/performance ratio has improved by a factor of a couple of hundred thousand.”
With support from the National Science Foundation (NSF), the initiative has been hosting a series of community workshops, with a fourth workshop held last month. As a part of this effort, the National Institute of Standards and Technology (NIST) recently funded SDSC researchers to study different strategies for synthetic data generation for big data.
Natasha Balac, director of SDSC’s Predictive Analytics Center of Excellence, was recognized for a project she is coordinating with Clean Tech San Diego and OSIsoft to develop a “sustainable communities” infrastructure for downtown San Diego, in part to reduce power consumption.
“We envision deploying a data infrastructure that connects physical systems such as those managing electricity, gas, water, waste, buildings, transportation and traffic,” said Balac, whose SDSC group is a non-profit public educational organization focused on leveraging the potential of predictive data analytics and developing a comprehensive, sustainable, and secure cyberinfrastructure. “This project will enable the city of San Diego to use city-scale applications that will result in reduced electricity consumption and cost, while at the same time anticipating or uncovering grid instabilities, educating the public, and improving both the quality of life and economic development.”
OSIsoft’s software system will connect to and acquire significant volumes of detailed data streams, which will be published in a cyber-secure, private cloud that is accessible only through signed and approved access protocols. Presently, San Diego Gas & Electric (SDGE) and UC San Diego are beta-testing the OSIsoft software, and UC San Diego researchers are using the campus’ Microgrid system to analyze the data on the main SDGE grid and the UC San Diego Smart Grid.
“A key goal of this project and its broad collaboration is to develop a model for the collection and refinement of data that is transportable to other communities and applications,” said Balac. “Processes developed, as well as their results, will be published to help enable other communities on their own path to sustainability.”
As an Organized Research Unit of UC San Diego, SDSC is considered a leader in data-intensive computing and cyberinfrastructure, providing resources, services, and expertise to the national research community, including industry and academia. Cyberinfrastructure refers to an accessible, integrated network of computer-based resources and expertise, focused on accelerating scientific inquiry and discovery. SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from earth sciences and biology to astrophysics, bioinformatics, and health IT. With its two newest supercomputers, Trestles and Gordon, and a new system called Comet to be deployed in early 2015, SDSC is a partner in XSEDE (Extreme Science and Engineering Discovery Environment), the most advanced collection of integrated digital resources and services in the world.
Jan Zverina, SDSC Communications
858 534-5111 or firstname.lastname@example.org
Warren R. Froelich, SDSC Communications
858 822-3622 or email@example.com