Research at Uppsala DataBase Laboratory (UDBL)
Tore Risch
Department of Information Technology
751 05 Uppsala
Computing environments have become increasingly distributed through the use of the Internet and mobile networks. We now have ever-increasing access to more or less structured information that is highly dynamic and continuously changing. It is becoming more and more critical to develop methods for building scalable systems that combine and extract relevant data from many different kinds of information sources and present the data in a form that users can comprehend.
Research at UDBL concentrates on developing methods and techniques for managing data from heterogeneous and distributed data sources. Our emphasis is on investigating how modern query processing techniques can be used for efficient search and integration of data from heterogeneous databases, storage managers, and other data sources in a distributed environment.
The current research direction of the group concentrates on developing methods for processing queries over data in terms of i) semantic web based representations and ii) high volume parallel scientific data streams. A challenge in both cases is to provide scalable search as the data volume increases and the queries become complicated. Our approach is to develop smart query transformation techniques and distributed query execution strategies.
As our current research vehicle we have developed a mediator engine, Amos II, based on a functional data model. Using this prototype, we have developed and verified a number of results on query optimization and proven the strength of functional data models for these purposes. The following paper gives an overview of our research prototype and its use for integrating heterogeneous data:
T.Risch, V.Josifovski, and T.Katchaounov: Functional Data Integration in a Distributed Mediator System, in P.Gray, L.Kerschberg, P.King, and A.Poulovassilis (eds.): Functional Approach to Data Management - Modeling, Analyzing and Integrating Heterogeneous Data, Springer, ISBN 3-540-00375-4, 2003.
The following paper gives an introduction to our approach to integrate and query semantic web data sources:
T.Risch: Functional Queries to Wrapped Educational Semantic Web Meta-data, in P.Gray, L.Kerschberg, P.King, and A.Poulovassilis (eds.): Functional Approach to Data Management - Modeling, Analyzing and Integrating Heterogeneous Data, Springer, ISBN 3-540-00375-4, 2003.
The technique was used in the Edutella project for querying Datalog-based distributed learning sources:
W.Nejdl, B.Wolf, C.Qu, S.Decker, M.Sintek, A.Naeve, M.Nilsson, M.Palmér, and T.Risch: EDUTELLA: A P2P Networking Infrastructure Based on RDF. Presented at the 11th International World Wide Web Conference, Honolulu, Hawaii, USA, May 2002.
For space physics applications we have developed methods to scale high volume data stream query processing:
M.Ivanova and T.Risch: Customizable Parallel Execution of Scientific Stream Queries, Proc. 31st International Conference on Very Large Data Bases, VLDB 2005, Trondheim, Norway, 2005, pp. 157-168.
We are currently developing a scalable and parallel stream query processor, SCSQ, to run in a massively parallel and heterogeneous computing environment. SCSQ has been applied to space physics and traffic applications:
G.Gidofalvi, T.B.Pedersen, T.Risch, and E.Zeitler: Highly Scalable Trip Grouping for Large Scale Collective Transportation Systems, Proc. 11th International Conference on Extending Database Technology, EDBT 2008, Nantes, France, March 2008.