Building the Biggest Astronomy Lab on Earth


The Virtual Sky project provides not only stunning images of the night sky, but also a seamless composite view of the northern heavens. Anyone can use the new Virtual Sky website, which features an easy-to-use, intuitive interface to high-resolution images from the complete Digital Palomar Observatory Sky Survey, all of the released data from the Sloan Digital Sky Survey, and many other surveys taken in infrared, visible, and radio wavelengths. The Web-based educational resource is free to the public, teachers, and students who want to learn about the universe and explore it.

Figure 1: Browser Window

“This is part of NPACI’s Digital Sky project, originally available to just professional astronomers,” said the leader of the Virtual Sky project, Roy Williams of Caltech’s Center for Advanced Computing Research. “We realized that we also had the opportunity to use astronomical databases and supercomputers to provide a resource the general public could use.”

Virtual Sky is a collaboration with Microsoft Research, and uses the same SQL Server database technology as its Terraserver interface to Earth imagery. Starting with the entire northern sky in the Web browser window (Figure 1), users can zoom in on any section, magnifying the view (Figure 2) by up to 2,000 times to an image scale of 1.5 arcseconds per pixel—the best resolution of ground-based photographs.
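The zoom figures quoted above imply a rough image scale for the fully zoomed-out view. The following is a minimal sketch of that arithmetic, assuming the image scale changes linearly with magnification; the constant and function names are illustrative and are not part of Virtual Sky.

```python
# Figures taken from the article.
FINEST_SCALE_ARCSEC = 1.5   # arcseconds per pixel at full zoom
MAX_MAGNIFICATION = 2000    # maximum zoom relative to the all-sky view


def scale_at_magnification(magnification: float) -> float:
    """Return arcseconds per pixel at a magnification relative to the all-sky view.

    Assumption: image scale is inversely proportional to magnification.
    """
    if not 1 <= magnification <= MAX_MAGNIFICATION:
        raise ValueError("magnification must be between 1 and 2000")
    return FINEST_SCALE_ARCSEC * MAX_MAGNIFICATION / magnification


all_sky = scale_at_magnification(1)
print(f"{all_sky:.0f} arcsec/pixel = {all_sky / 60:.0f} arcmin/pixel")
```

Under this assumption, each pixel of the all-sky view covers roughly 50 arcminutes, a little less than one degree.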

Figure 2: Zoom, Zoom, Zoom

“Virtual Sky is an incredible public asset,” said Jim Gray, manager of Microsoft Research’s Bay Area Research Center and senior researcher in Microsoft Research’s Scalable Servers Research Group. “The images and other data were available through Caltech, but you had to be a professional astronomer to use them and you had to know what you wanted. Roy single-handedly changed that, and the results are really stunning.”

Microsoft contributed software, Caltech provides most of the operating expenses, and an anonymous donor contributed hardware to Virtual Sky worth about $20,000.


The Virtual Sky presents its information in multiple “themes,” each one a representation of the heavens from a different database (Figure 3). Star charts, provided by YourSky, a Web-based interactive sky atlas, show the positions and names of stars, constellations, galaxies, and nebulas on a map grid. But this is just a convenient starting point.

The largest theme is the Digital Palomar Observatory Sky Survey. Photographic plates taken in blue and red light from a 48-inch telescope at the Palomar Observatory have been converted to 3 terabytes of digital-image data, encompassing an estimated 50 million galaxies and 3 billion stars. Another theme is the publicly released portion of the ongoing Sloan Digital Sky Survey (about 0.3 terabyte). Other themes are the Hubble Deep Field, National Optical Astronomy Observatory’s Deep Wide-Field Survey, a radio sky map from the National Radio Astronomy Observatory’s Very Large Array Sky Survey, x-ray data from the ROSAT satellite, and infrared images processed from the Infrared Astronomical Satellite. The project currently has a NASA grant to mosaic infrared imagery from the Two Micron All Sky Survey.

One of the most engaging themes is the Uranometria, a set of engravings by German astronomer Johann Bayer (1572-1625), which contains 51 star charts and was the forerunner of all later star atlases. The book contains both accurate star charts and fanciful drawings of the mythological figures that the constellations are supposed to represent, and is the direct ancestor of the modern system of star and constellation nomenclature.

All the themes are resampled to the same standard projection, so a user of the Virtual Sky service can see a region of the sky in any of its different representations, all perfectly aligned and at the same scale. The database architecture employs a hierarchy of precomputed image tiles, giving quick response.
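A hierarchy of precomputed tiles can be pictured as an image pyramid. The sketch below is one common way to address such a pyramid, not a description of Virtual Sky's actual scheme: the 256-pixel tile size, the factor-of-two step between levels, and the names `TileKey`, `tile_for_pixel`, and `parent` are all assumptions made for illustration. Only the 1.5 arcsecond-per-pixel finest scale comes from the article.

```python
from dataclasses import dataclass

FINEST_SCALE_ARCSEC = 1.5  # from the article: arcseconds per pixel at full zoom
TILE_SIZE = 256            # assumption: tile edge length in pixels


@dataclass(frozen=True)
class TileKey:
    level: int  # 0 = finest resolution; each higher level is 2x coarser (assumption)
    col: int
    row: int


def scale_at_level(level: int) -> float:
    """Arcseconds per pixel at a pyramid level, doubling with each level."""
    return FINEST_SCALE_ARCSEC * 2 ** level


def tile_for_pixel(x: int, y: int, level: int) -> TileKey:
    """Tile at `level` that covers full-resolution pixel (x, y)."""
    span = TILE_SIZE << level  # full-resolution pixels covered by one tile edge
    return TileKey(level, x // span, y // span)


def parent(tile: TileKey) -> TileKey:
    """The coarser tile that contains `tile`."""
    return TileKey(tile.level + 1, tile.col // 2, tile.row // 2)


fine = tile_for_pixel(1000, 600, level=0)
print(fine, parent(fine), scale_at_level(11))
```

With a factor-of-two step, eleven levels give a zoom range of 2,048, close to the 2,000-fold magnification quoted for the service. Because every theme shares one projection, the same tile key would identify the same patch of sky in each theme, which is what lets the views line up.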


Virtual Sky is an unparalleled educational resource. “The natural fascination that we all have with the night sky can be harnessed to teach not only astronomy, but also general science and mathematics,” said Williams. “On one level, children and adults can simply cruise the glory of the heavens with the world’s finest telescopes. Those interested in the history of science might compare the Uranometria representation with the photographic survey. Historical perspectives could provide a far-reaching cultural view, considering the questions of why the star atlases were created, and how mythological stories have been represented from classical times through the Enlightenment.”

More advanced students can compare views of the same regions at different wavelengths and scales to learn about the properties of astronomical objects. Students can classify the galaxies they find on the basis of appearance, and compare their results with the catalogued Hubble classification. Counts of stars and galaxies at different scales can be used to teach statistical analysis techniques. Both amateur and professional astronomers can use Virtual Sky to prepare finder charts, and they can compare their observations to Virtual Sky views to aid in the discovery of variable stars, supernovas, asteroids, and comets.

A novel feature of the Virtual Sky service is the weblog (or “blog” in insider parlance), a bulletin board where people can record comments and direct other users to interesting sights. “I think this is incredible,” wrote a visitor recently. “I have never seen anything like this. I will e-mail this link to my friends and let them have a look.”


Figure 3: Pick a Theme

The Virtual Sky currently has 15 million tiles in a 120-gigabyte Microsoft SQL Server database. Data ingestion for Virtual Sky is a formidable task, with the Palomar and Sloan data sets requiring weeks of computing and data transfer. Virtual Sky is complementary to the SDSC and Caltech focus on data management and data-intensive computing for the TeraGrid. In addition to “big computing,” the astronomy project involves handling multi-terabyte data sets. The data are on both Microsoft and Unix machines, and much of the processing involves long, high-bandwidth data flow. Caltech has performed the high-performance data computing for Virtual Sky, making extensive use of the HPSS and HP Superdome systems.
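To make the tile-database idea concrete, here is a minimal sketch of a tile store keyed by theme, zoom level, and tile position. It uses Python's built-in `sqlite3` module purely as a stand-in for Microsoft SQL Server, and the table layout, column names, and helper functions are hypothetical; the article does not describe the real schema. The final lines work out the average tile size implied by the quoted figures, assuming decimal gigabytes.

```python
import sqlite3
from typing import Optional


def open_tile_store() -> sqlite3.Connection:
    """Create an in-memory tile table (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE tiles (
            theme  TEXT    NOT NULL,
            level  INTEGER NOT NULL,
            tile_x INTEGER NOT NULL,
            tile_y INTEGER NOT NULL,
            image  BLOB    NOT NULL,
            PRIMARY KEY (theme, level, tile_x, tile_y)
        )
        """
    )
    return conn


def put_tile(conn: sqlite3.Connection, theme: str, level: int,
             tile_x: int, tile_y: int, image: bytes) -> None:
    """Store one precomputed tile, replacing any existing tile with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?, ?)",
        (theme, level, tile_x, tile_y, image),
    )


def get_tile(conn: sqlite3.Connection, theme: str, level: int,
             tile_x: int, tile_y: int) -> Optional[bytes]:
    """Fetch one tile by its primary key, or None if it was never ingested."""
    row = conn.execute(
        "SELECT image FROM tiles"
        " WHERE theme = ? AND level = ? AND tile_x = ? AND tile_y = ?",
        (theme, level, tile_x, tile_y),
    ).fetchone()
    return row[0] if row else None


conn = open_tile_store()
put_tile(conn, "palomar", 0, 3, 2, b"placeholder image bytes")
print(get_tile(conn, "palomar", 0, 3, 2))

# Average tile size implied by the article: 120 gigabytes across 15 million tiles.
avg_tile_bytes = 120 * 10**9 // (15 * 10**6)
print(avg_tile_bytes, "bytes per tile on average")
```

Because each request is a single primary-key lookup on a precomputed image, response time does not depend on how much of the sky the underlying survey covers. The arithmetic at the end suggests tiles averaging about 8 kilobytes each.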

The Virtual Sky is connected to other astronomical data services, such as the NASA/IPAC Extragalactic Database—clicking on a galaxy calls up its name, catalog data, and publication references. Virtual Sky also is being developed as an index into the raw survey data. “Accessing the Sloan, Palomar, and other survey data that covers a given astronomical object is currently a tedious task that is idiosyncratic for each survey,” Williams said. “We would like Virtual Sky to provide a unified portal, so from any part of the sky we could automatically retrieve the raw image data.”

Much of the analytical research in astronomy involves access to catalogs of interesting, similar, or peculiar objects. Modern catalogs are created by pattern-matching software working on digitized survey images and may contain billions of objects. Williams sees the Virtual Sky as a way to create catalogs from federations of image surveys rather than single surveys—for example, by finding objects that are simultaneously bright in infrared but faint in the optical.
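The kind of federated query described above can be sketched as a positional cross-match between two catalogs. Everything in this example is hypothetical: the `Source` record, the sample coordinates and magnitudes, the 3-arcsecond match radius, and the brightness thresholds are illustrative choices, and the brute-force nearest-neighbor search is a simple stand-in for whatever indexing a production system would use.

```python
import math
from typing import List, NamedTuple


class Source(NamedTuple):
    name: str
    ra_deg: float
    dec_deg: float
    mag: float  # magnitude: smaller numbers mean brighter objects


def angular_sep_arcsec(a: Source, b: Source) -> float:
    """Great-circle separation between two sources (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(
        math.radians, (a.ra_deg, a.dec_deg, b.ra_deg, b.dec_deg)
    )
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


def infrared_bright_optical_faint(
    infrared: List[Source],
    optical: List[Source],
    match_radius_arcsec: float = 3.0,  # assumption
    ir_bright_mag: float = 12.0,       # assumption
    optical_faint_mag: float = 20.0,   # assumption
) -> List[str]:
    """Names of infrared sources that are bright in infrared but faint or
    undetected in the optical catalog."""
    hits = []
    for src in infrared:
        if src.mag > ir_bright_mag:
            continue  # not bright enough in infrared
        nearest = min(optical, key=lambda o: angular_sep_arcsec(src, o),
                      default=None)
        matched = (nearest is not None
                   and angular_sep_arcsec(src, nearest) <= match_radius_arcsec)
        if not matched or nearest.mag >= optical_faint_mag:
            hits.append(src.name)
    return hits


# Made-up sample catalogs.
ir_catalog = [
    Source("IR-1", 10.0, 20.0, 10.0),  # bright optical counterpart
    Source("IR-2", 10.5, 20.0, 11.0),  # faint optical counterpart
    Source("IR-3", 11.0, 20.0, 15.0),  # too faint in infrared
    Source("IR-4", 12.0, 20.0, 9.0),   # no optical counterpart
]
optical_catalog = [
    Source("OPT-1", 10.0001, 20.0, 14.0),
    Source("OPT-2", 10.5001, 20.0, 21.5),
]
print(infrared_bright_optical_faint(ir_catalog, optical_catalog))
```

Resampling every survey to a common projection makes this sort of comparison far simpler, since a position in one theme maps directly to the same position in another.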

“The NSF-funded National Virtual Observatory, the PACI TeraGrid, IBM, Microsoft, NASA, and other development teams are creating open architectures based on XML and associated protocols for Web services, description, and directory services,” Williams said. “We would like to use Virtual Sky as a demonstrator for this way of doing things, providing not only the services themselves, but also the subsidiary data that allows interoperation between data services and astronomy portals.”





Roy Williams