
SDSC Helps Professionals ‘Connect the Dots’ in Graph Analytics

Inaugural “Boot Camp” to be Held April 23 at SDSC

Published April 2, 2015

The San Diego Supercomputer Center (SDSC) will begin addressing a rapidly emerging area of interest in data science by holding its first Graph Analytics “boot camp” on Thursday, April 23rd, at its location on the UC San Diego campus in La Jolla, California. 

The recent explosion of interest in analyzing and mining “Big Data” across almost all sectors of science and industry has data scientists looking for new ways to model data so that it is represented accurately and yields insights and knowledge more readily.

Graph representations, while certainly not a new concept, are noteworthy because so many timely and relevant situations and phenomena can be modeled using graph-like data structures. What is new is the development of tools and techniques that operate on graphs at the Big Data scale.

“This development now has organizations and individuals looking for specialized training and education in graph analytic techniques,” said Amarnath Gupta, director of the Advanced Query Processing Laboratory, which is part of SDSC’s Science Research & Development division. “We believe that our popular ‘boot camp’ approach, used in other areas of SDSC’s data science program, will provide participants with a practical introduction to graph analytics in an approachable, short course format.”

Graph analytics is a rapidly developing area in which a combination of graph-theoretic, statistical, and database techniques is applied to model, store, retrieve, and analyze graph-structured data. Such techniques make it possible to understand the structure of a network and how it changes under different conditions, find paths between pairs of entities that satisfy different constraints, identify clusters or closely interacting subgroups inside a graph, and find subgraphs that are similar to a given pattern. Some examples where graph analytics may be employed include:

  • A social scientist trying to find primary influencers over a friendship and message network.
  • An investigator who would like to track suspicious alliances in a network of rogue organizations and individuals.
  • A network biologist who would like to find relationships between genes in a chromosomal region and a family of diseases.
  • An enterprise that needs to correlate measurements across a massive sensor network in the Internet of Things.

For the above scenarios, as well as investigations in many other data-rich areas of research, it is important to view data as a graph (network) of nodes, or vertices, that represent objects, and edges that represent relationships between objects. In application areas such as sensor networks, the graphs may be very large, with a billion nodes and edges. In applications such as situation monitoring, they may represent a thousand types of entities and relationships. In telecommunication applications, the connections may vary with time, and some entities can be very densely connected to each other.
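To make the graph view concrete, here is a minimal sketch in Python of three of the tasks mentioned above: finding a path between two entities, identifying connected subgroups, and ranking nodes by number of connections as a crude stand-in for influence. It is an illustration only, not material from the boot camp. The friendship data, the node names, and the function names are all hypothetical, and a plain in-memory adjacency list like this one would not scale to the billion-node graphs described above.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: return a path with the fewest edges, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def connected_components(graph):
    """Group nodes into sets whose members can all reach one another."""
    seen = set()
    components = []
    for node in sorted(graph):
        if node in seen:
            continue
        component = set()
        queue = deque([node])
        seen.add(node)
        while queue:
            current = queue.popleft()
            component.add(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def degree_ranking(graph):
    """Rank nodes by number of neighbors (ties broken alphabetically)."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))


# Hypothetical friendship network: nodes are people, edges are friendships.
friendships = [
    ("ana", "ben"), ("ben", "cara"), ("cara", "dev"),
    ("ben", "dev"), ("eli", "fay"),
]
network = build_graph(friendships)

print(shortest_path(network, "ana", "dev"))  # ['ana', 'ben', 'dev']
print(connected_components(network))         # two groups of friends
print(degree_ranking(network)[0])            # ben
```

Counting connections is only the simplest possible measure of influence. The clustering and pattern-matching tasks described above generally call for more sophisticated algorithms and purpose-built graph tools.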

The one-day Graph Analytics Boot Camp will offer a broad overview of the field as well as deeper insight into specific analytical techniques. It will introduce how to model a problem in a graph database and perform analytical tasks in a scalable manner. Built around a fully worked-out use case, the course aims to show attendees how to apply graph analytics to their own applications.


About SDSC
As an Organized Research Unit of UC San Diego, SDSC is considered a leader in data-intensive computing and cyberinfrastructure, providing resources, services, and expertise to the national research community, including industry and academia. Cyberinfrastructure refers to an accessible, integrated network of computer-based resources and expertise, focused on accelerating scientific inquiry and discovery. SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from earth sciences and biology to astrophysics, bioinformatics, and health IT. In 2015 SDSC will debut Comet, a new petascale supercomputer that will join its data-intensive Gordon cluster. SDSC is a partner in XSEDE (eXtreme Science and Engineering Discovery Environment), the most advanced collection of integrated digital resources and services in the world.