ADASS XII Conference

VO Interoperability

O10.1 SkyQuery - A Prototype Distributed Query Web Service for the VO

Tamas Budavari, Tanu Malik, Alex Szalay, Ani Thakar (JHU), Jim Gray (Microsoft Research)

We present SkyQuery, a prototype distributed query and cross-matching service for the VO community. SkyQuery enables astronomers to run combined queries on existing heterogeneous astronomy archives. SkyQuery provides a simple, user-friendly interface to run distributed queries over the federation of registered astronomical archives in the VO. SkyQuery not only provides location transparency, but also takes care of vertical fragmentation of the data and runs the query efficiently to minimize query execution costs.

The SkyQuery client connects to the portal, which is an XML Web Service. The portal farms the query out to the individual archives, which are also accessible via Web Services called SkyNodes. The cross-matching algorithm is run recursively on each SkyNode. Each archive is a relational DBMS with an HTM (Hierarchical Triangular Mesh) index built in for fast spatial lookups. The results of the distributed query are returned as an XML DataSet that is automatically rendered by the client. The SkyQuery client web application also displays the image cutout corresponding to the query result.

The importance of a service like SkyQuery for the worldwide astronomical community cannot be overstated: scientific data on the same astronomical objects residing in various archives are mapped in different wavelength ranges and look very different due to the different errors, instrument sensitivities and other peculiarities of the data acquisition and calibration processes used for each archive. Our cross-matching algorithm performs a probabilistic spatial join across multiple catalogs. This is far from a solved problem in astronomy - indeed, this type of cross-matching is currently often done by eye, one object at a time. Even if we built a static cross-identification table for a set of archives, it would become obsolete by the time we finished building it - the exponential rate of growth of astronomical data means that a dynamic cross-identification mechanism like SkyQuery is the only viable option. Finally, it should be noted that finding non-matches (dropouts) between datasets - objects that exist in some of the catalogs but not in others - is often as important as finding matches, and SkyQuery provides that capability.
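
The idea of matching objects across catalogs and reporting dropouts can be illustrated with a minimal Python sketch. It is not the SkyQuery algorithm: the separation-over-combined-error score, the `n_sigma` threshold, the record fields (`id`, `ra`, `dec`, `err`) and the function names are all assumptions made for illustration, and the brute-force loop ignores the spatial indexing a real archive would use.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(cat_a, cat_b, n_sigma=3.0):
    """Match each source of cat_a to its best counterpart in cat_b.

    Assumption: a pair is accepted when its separation is at most n_sigma
    times the combined positional error (errors added in quadrature).
    Returns (matches, dropouts), where dropouts are the cat_a identifiers
    with no acceptable counterpart in cat_b.
    """
    matches, dropouts = [], []
    for a in cat_a:
        best = None
        for b in cat_b:
            sep = angular_separation_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            score = sep / math.hypot(a["err"], b["err"])
            if score <= n_sigma and (best is None or score < best[1]):
                best = (b["id"], score)
        if best is None:
            dropouts.append(a["id"])
        else:
            matches.append((a["id"], best[0]))
    return matches, dropouts
```

Called with two small lists of source records, `cross_match` returns the matched identifier pairs together with the sources of the first catalog that have no counterpart in the second.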

O10.2 Why Indexing the Sky is Desirable

Patricio F. Ortiz (Leicester University)

"Indexing the sky" is a database-oriented term for a partitioning scheme of the celestial sphere designed to achieve better performance in queries that involve finding close neighbors (cone search, cross-correlations amongst catalogues, etc.). Several schemes have been proposed (HTM, HEALPix, "quasi-equal area tiles", cubic projection, etc.), but their use has remained "hidden" from a wider audience. The scientific value of the internal indexation files is much higher, though, as they keep track of the source density of catalogues, making it possible to answer a family of questions not easily handled by a standard DB system and providing an unusual visual aid: a snapshot of the location of sources listed in any catalog. The pros and cons of adopting a VO-oriented indexation scheme are analyzed.

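How a sky index doubles as a source-density record can be sketched in a few lines of Python. The grid below is only a stand-in for schemes such as HTM or HEALPix: it is uniform in right ascension and in sin(dec), which gives cells of equal area, and the resolution constants and function names are hypothetical.

```python
import math
from collections import Counter

N_RA, N_Z = 36, 18  # hypothetical grid resolution (36 x 18 equal-area cells)

def pixel_index(ra_deg, dec_deg, n_ra=N_RA, n_z=N_Z):
    """Map a position to the integer index of an equal-area grid cell.

    Cells are uniform in RA and in sin(dec), so each covers the same
    solid angle on the sphere.
    """
    i_ra = int((ra_deg % 360.0) / 360.0 * n_ra) % n_ra
    z = math.sin(math.radians(dec_deg))
    i_z = min(int((z + 1.0) / 2.0 * n_z), n_z - 1)
    return i_z * n_ra + i_ra

def density_map(positions):
    """Count the sources falling in each cell: a per-catalog density snapshot."""
    return Counter(pixel_index(ra, dec) for ra, dec in positions)
```

Because all cells have the same area, the counts returned by `density_map` are directly proportional to source density and can be compared between catalogs without opening the catalogs themselves.
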
O10.3 Quantum Topic Maps: A Physicist's View of the Information Universe

Nikita Ogievetsky (Cogitech, Inc.)

This is a continuation of work on RDF Topic Maps presented at Extreme 2001 [1] and KT2002 [2]. Quantum Topic Maps provide a very concise and intuitive way to represent experimental data. By experiment we mean here any examination or inquiry in experimental physics or other natural sciences, or more generally any type of investigation in the real world.

It will be shown how Quantum Topic Maps can be validated against a DAML-OIL ontology.

[1] http://www.cogx.com/xtm2rdf/extreme2001
[2] http://www.cogx.com/kt2002/

O10.4 A New Way of Joining Source Catalogs using a Relational DBMS

Clive Page (University of Leicester)

As part of the AstroGrid and AVO projects we have been examining the facilities of a number of free and commercial DBMS for astronomical data processing, especially the handling of large source catalogs.

One particularly important operation is the cross-matching of sources in different catalogs, which is an important precursor to a wide range of data mining operations. This operation, sometimes called the fuzzy join, is difficult because it requires matching spatial coordinates to within the combined error radius. Unfortunately, spatial indexing rarely comes as standard, and even where it does (e.g. in PostgreSQL), R-trees do not cope well with the singularities in spherical-polar coordinate systems. A new algorithm is proposed here which makes use of a pixelation of the sky using, for example, HTM or HEALPix. This allows the use of a simple equi-join on integers, well within the capability of SQL on any relational DBMS. Using PostgreSQL it has been possible to compare the new PCODE method with the traditional approach based on R-tree indexing. A number of other results from our evaluations are also reported.
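
The general idea of replacing a spatial join by an integer equi-join can be sketched as follows. This is an illustration under stated assumptions, not the PCODE method itself: a plain RA/Dec grid stands in for HTM or HEALPix (so it only behaves well away from the poles and for match radii well below the cell size), SQLite stands in for PostgreSQL, and the table names, `CELL_DEG` and all function names are hypothetical. Each source of the second catalog is stored once per grid cell touched by its error box, so that pairs straddling a cell boundary are still found by the join; an exact distance test then removes false candidates.

```python
import math
import sqlite3

CELL_DEG = 0.1  # hypothetical pixel size; should be well above the match radius
N_RA = int(round(360.0 / CELL_DEG))

def pcode(ra, dec):
    """Integer pixel code on a plain RA/Dec grid (stand-in for HTM/HEALPix)."""
    i_ra = int((ra % 360.0) / CELL_DEG) % N_RA
    i_dec = int((dec + 90.0) / CELL_DEG)
    return i_dec * N_RA + i_ra

def covering_pcodes(ra, dec, radius):
    """Codes of the cells touched by the bounding box of the error circle."""
    dra = radius / max(math.cos(math.radians(dec)), 1e-6)
    return {pcode(ra + sx * dra, dec + sy * radius)
            for sx in (-1, 0, 1) for sy in (-1, 0, 1)}

def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def fuzzy_join(cat_a, cat_b, radius_deg):
    """Cross-match two lists of (id, ra, dec) via an equi-join on pixel codes."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("sep_deg", 4, separation_deg)
    conn.execute("CREATE TABLE a (id TEXT, ra REAL, dec REAL, pcode INTEGER)")
    conn.execute("CREATE TABLE b_cover (id TEXT, ra REAL, dec REAL, pcode INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)",
                     [(i, ra, dec, pcode(ra, dec)) for i, ra, dec in cat_a])
    conn.executemany("INSERT INTO b_cover VALUES (?, ?, ?, ?)",
                     [(i, ra, dec, code) for i, ra, dec in cat_b
                      for code in covering_pcodes(ra, dec, radius_deg)])
    conn.execute("CREATE INDEX a_pcode ON a (pcode)")
    # Equi-join on the integer code, then an exact distance filter.
    rows = conn.execute(
        "SELECT a.id, b_cover.id FROM a "
        "JOIN b_cover ON a.pcode = b_cover.pcode "
        "WHERE sep_deg(a.ra, a.dec, b_cover.ra, b_cover.dec) <= ? "
        "ORDER BY a.id, b_cover.id", (radius_deg,)).fetchall()
    conn.close()
    return rows
```

The join condition itself involves only integers, so an ordinary B-tree index on the code column is enough; the price is the extra rows stored for sources near cell boundaries.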

O10.5 A Bit of GLUe for the VO: Aladin Experience

Pierre Fernique, Andre Schaaff, Francois Bonnarel, Thomas Boch (CDS)

Aladin is now widely known as a tool to display and cross-match heterogeneous data and images anticipating future VO portals. It offers transparent access to Simbad, VizieR, NED, SkyView, SuperCosmos, NVSS and FIRST, as well as archive logs such as CFHT, Chandra, HST, HUT, ISO, IUE and Merlin. For each of these servers, Aladin knows how to access them, the required query syntax, the list of query parameters, and the fastest mirror site. This knowledge database is automatically updated by taking advantage of the GLU system on which Aladin is based.

In this article we present how the GLU system allows Aladin to integrate several image and data servers in a single interface. We describe how it works, how it is updated and how it is implemented in this Java applet context.

We also present the developments we foresee for the GLU system so that it can interact with emerging web service technologies such as UDDI, WSDL...

O10.6 Interoperability of the ISO Data Archive and the XMM-NEWTON Science Archive

Christophe Arviset, John Dowson, Jose Hernandez, Pedro Osuna, Aurele Venet (ESA)

The ISO Data Archive (IDA) and the XMM-Newton Science Archive (XSA) have been developed by the Science Operations and Data Systems Division in Villafranca, Spain. They are both built using the same flexible and modular 3-tier architecture (Data Products and Database, Business Logic, User Interface). This open architecture, together with Java and XML technology, has helped in making the IDA and XSA inter-operable with other archives and applications.
Inter-operability from these archives to external archives has been achieved through:

target name resolution with NED and SIMBAD
access to electronic articles through ADS
access to IRAS data through the IRSA server

Moreover, direct access to ISO and XMM-Newton data is provided, bypassing the standard user interface. The observation / exposure log is given to external archives or applications together with a mechanism to access data via a Java Server Page. Later developments will be described, in particular the so-called Postcard and Product Server.
This is currently available from:

the ADS WWW, which then gives access to the data from the articles
the CDS / VizieR catalogue
the IRSA ISO Visualizer
the HEASARC archive

The ISO Data Archive can be accessed at: http://www.iso.vilspa.esa.es/ida

The XMM-NEWTON Science Archive can be accessed at:
http://xmm.vilspa.esa.es/xsa