Amarnath Gupta Named SDSC’s Pi Person of the Year

Published June 05, 2025

By SDSC Communications

Amarnath Gupta. Credit: SDSC Communications

Each year, SDSC recognizes an individual whose research contributions over the years straddle both science and cyberinfrastructure (CI) technology. The Pi Person of the Year Award, represented by the Pi symbol (∏), was first bestowed on an SDSC researcher in 2013. This year’s recipient is Amarnath Gupta, a senior research scientist within the CICORE Division.

Gupta is a computer scientist by training. He specializes in information systems, including heterogeneous information integration, query processing techniques, knowledge engineering, graph-based access control models and, most recently, conversational engines and novel uses of large language models (LLMs) for different information systems tasks. These tasks include the automated construction of knowledge graphs and ontologies and the design of benchmarks for cross-model queries in polystore systems. He has more than 100 publications and four issued patents over his 27 years at SDSC.

“During his career, Amarnath has been a prominent multi-disciplinary researcher with significant publications in computer and domain sciences, as well as software leadership experience in the domains of neuroscience, oceanography, public health, social sciences, biomedical sciences and most recently food security,” SDSC’s CICORE Director and Chief Data Science Officer Ilkay Altintas wrote for the nomination. “In every case, he has seamlessly transformed a domain-science problem into a computer science/information systems problem where he has created innovative solutions that have been practically used by domain scientists.”

In 2008, Gupta co-created a large neuroscience ontology that is still in use today. He was the primary designer of the Neuroscience Information Framework platform (2012), which was recognized as a major “big data” CI by the White House. His polystore-based AWESOME system (2016-2018) is used in quantum materials science, political science, the Tempredict project (Wearable Device for Physiological Data Monitoring), the National Data Platform (NDP) and as part of the NOURISH system’s core CI (2024). He has been awarded the ACM Distinguished Scientist award for his contributions to the application of data management principles and techniques to science disciplines.

Contributions to Research

Gupta has a long history of research in computer science. Over the past 10 years, he has created a novel framework around the “Variety” problem of big data, in which data from widely heterogeneous information sources (relational, document, graph, time-series, etc.) can be integrated virtually to: 1) enable a user to query information across these data sources and 2) create a virtual knowledge graph over them. This work has led not only to several publications and research funding, but also to two new U.S. patents related to optimized data ingestion and joint query processing.

One significant application of this architecture was demonstrated in a project funded by the U.S. Navy, in which a knowledge graph analytics technique was applied to U.S. patent, publication and news data to discover knowledge gaps between the U.S. and other countries in domains of interest to the Navy.

More recently, Gupta’s research interests have included the use of LLMs to automate the tasks of constructing and interrogating knowledge graphs. To this end, he has published on how domain-specific knowledge graphs with a known schema can be constructed from text using LLMs together with deep learning methods like graph neural networks. He also advances information systems research by using AI techniques to plan and validate cross-model queries, perform semantic schema mapping and directly generate polystore query plans from natural language questions.

Use of CI in Research

According to Altintas, all of Gupta’s work relies on scalable data systems. Many of the projects mentioned above offer Representational State Transfer (REST) services that internally use the National Research Platform (NRP), a notable use of SDSC’s cyberinfrastructure.

An example is the Tempredict project, which ingested physiological data daily from rings worn by 60,000 subjects in a UC San Francisco (UCSF)-led study. The data was analyzed to identify which subjects should take a COVID-19 test the following day because their ring data suggested a likely infection. Because the data contained personal information, it was anonymized using the secure facilities of Sherlock, a service provided within the Stack Science Division at SDSC, then tunneled into the data services of AWESOME and sent to the NRP for statistical analysis. Results of the analysis were routed back to UCSF via the Sherlock secure system for de-anonymization and ultimately shared with the at-risk subjects, informing them of their likely COVID-19 infection and suggesting that they take a test.

Gupta’s lab also offers a set of CI services, including the semantic search service used in NDP and several ontology and knowledge graph services for the NOURISH project. The AWESOME platform services are offered to students as well, as part of various classroom and mentorship activities.

“Amarnath’s talent for translating research across domains into problem-solving systems typifies our mission at SDSC to advance the frontiers of science, technology, education and society through innovations in data and computing,” SDSC Director Frank Würthwein said. “His many years of dedicated service to SDSC and the research community is noted and very much appreciated.”
