Machine Learning & Knowledge Discovery Group (MLKD)

Distributed Data Mining

Introduction

Continuous developments in information and communication technology have led to the emergence of distributed computing environments, which comprise several different sources of large volumes of data together with several computing units. The most prominent example of a distributed environment is the Internet, where a growing number of databases and data streams cover areas such as meteorology, oceanography and economics. In addition, the Internet serves as the communication medium for geographically distributed information systems, such as NASA's Earth Observing System.

Applying the classical knowledge discovery process in distributed environments requires collecting the distributed data in a data warehouse for central processing. However, this is usually either ineffective or infeasible because of the storage, communication and computational costs, as well as the privacy issues, that such an approach involves. Distributed Data Mining offers algorithms, methods and systems that address these issues in order to discover knowledge from distributed data effectively and efficiently.

Publications

Links