Prof. Dr. Tadeusz Morzy

Constraint-Based Data Mining

Data mining, also known as knowledge discovery in databases, is the extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) information from large databases. Its main aim is to give organizations the tools to sift through large databases and find the trends, patterns, and correlations that can guide strategic decision making. Although many data mining methodologies and systems have been developed in recent years, most current mining models lack human involvement, particularly in the form of guidance and user control. By constraint-based data mining we mean a data mining process in which the user provides constraints that guide the mining of databases. Constraint-based data mining techniques can provide a more ad hoc, query-driven process that exploits the semantics of data more efficiently than current stand-alone data mining systems.

A constraint-based data mining system incorporates two capabilities that distinguish it from a stand-alone data mining system as well as from a statistical analysis program or a machine-learning system. First, it should offer a mining query language, which should be a high-level declarative language. Such a language lets users express the part of the database to be mined, the type of rule to be mined, and the properties that the patterns should satisfy. Second, it should support efficient processing and optimization of mining queries by providing a mining-query optimizer.
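The three parts of a mining query named above can be pictured as a small data structure. The following Python sketch is purely illustrative: the class `MiningQuery`, its field names, and the rule dictionaries are hypothetical and do not reflect the syntax of any actual mining query language.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MiningQuery:
    """Hypothetical container for the three parts of a mining query."""
    source: str      # the part of the database to be mined
    rule_type: str   # the type of rule to be mined
    # the properties that the discovered patterns should satisfy
    constraints: List[Callable[[dict], bool]] = field(default_factory=list)

    def accepts(self, rule: dict) -> bool:
        """Return True if the rule satisfies every user-supplied constraint."""
        return all(check(rule) for check in self.constraints)


# Example query (all values are made up for illustration).
query = MiningQuery(
    source="sales records from 1999",
    rule_type="association",
    constraints=[
        lambda r: r["support"] >= 0.05,      # minimum support
        lambda r: r["confidence"] >= 0.80,   # minimum confidence
        lambda r: "bread" in r["body"],      # rule body must mention bread
    ],
)

candidate = {"body": {"bread", "butter"}, "head": {"milk"},
             "support": 0.07, "confidence": 0.85}
rejected = {"body": {"beer"}, "head": {"chips"},
            "support": 0.10, "confidence": 0.90}

print(query.accepts(candidate))
print(query.accepts(rejected))
```

Here the constraints are only checked after a rule has been found. A mining-query optimizer would instead try to use them during the search itself.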

The talk will present the concept of integrated constraint-based on-line data mining. It will introduce MSQL, a high-level declarative data mining query language that allows users to express the data to be mined, the types of rules to be mined, and the properties that the rules should satisfy. The talk will then present a taxonomy of the constraints that can be specified in the language and demonstrate how an optimizer can exploit the constraints stated in a user-specified mining query to improve the performance of the data mining process.
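To give a feel for how an optimizer might exploit constraints, the sketch below uses the well-known Apriori-style level-wise search for frequent itemsets. It is not the optimizer presented in the talk. The function `frequent_itemsets` and its parameters `min_support`, `max_size`, and `allowed` are assumptions made for this example. Each constraint is applied during the search, so that candidates which cannot satisfy it are never counted.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=None, allowed=None):
    """Level-wise frequent itemset search with constraints applied early.

    transactions: list of sets of items
    min_support:  minimum fraction of transactions containing an itemset
    max_size:     optional upper bound on itemset size
    allowed:      optional set of items the user is interested in
    """
    n = len(transactions)
    items = {item for t in transactions for item in t}
    if allowed is not None:
        # Item constraint: shrink the item universe before any counting.
        items &= set(allowed)

    level = [frozenset([item]) for item in sorted(items)]
    result = {}
    size = 1
    # Size constraint: stop generating levels beyond max_size.
    while level and (max_size is None or size <= max_size):
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(frequent)

        # Support constraint: a candidate of the next size is kept only if
        # all of its subsets of the current size are frequent.
        next_level = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == size + 1 and all(
                frozenset(s) in frequent for s in combinations(union, size)
            ):
                next_level.add(union)
        level = list(next_level)
        size += 1
    return result


# Toy data, invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "chips"},
]

unconstrained = frequent_itemsets(transactions, min_support=0.5)
restricted = frequent_itemsets(transactions, min_support=0.5,
                               allowed={"bread", "milk"})
print(len(unconstrained), len(restricted))
```

With the item restriction in place, itemsets containing butter are never generated or counted, which is the kind of saving a mining-query optimizer aims for.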



 
Speaker:   Prof. Dr. Tadeusz Morzy,
           Head of the Database Systems Laboratory,
           Institute of Computer Science,
           Poznan University of Technology,
           Poland

Time:      Friday, 14 January 2000, 14:00 c.t.

Venue:     HS 3 (Lecture Hall 3), Universität Klagenfurt