Browsing by Author "Rantzau, Ralf"
Now showing 1 - 2 of 2
- Results Per Page
- Sort Options
Item Open Access Laws for rewriting queries containing division operators(2005) Rantzau, Ralf; Mangold, ChristophRelational division, also known as small divide, is a derived operator of the relational algebra that realizes a many-to-one set containment test, where a set is represented as a group of tuples: Small divide discovers which sets in a dividend relation contain all elements of the set stored in a divisor relation. The great divide operator extends small divide by realizing many-to-many set containment tests. It is also similar to the set containment join operator for schemas that are not in first normal form. Neither small nor great divide has been implemented in commercial relational database systems although the operators solve important problems and many efficient algorithms for them exist. We present algebraic laws that allow rewriting expressions containing small or great divide, illustrate their importance for query optimization, and discuss the use of great divide for frequent itemset discovery, an important data mining primitive. A recent theoretic result shows that small divide must be implemented by special purpose algorithms and not be simulated by pure relational algebra expressions to achieve efficiency. Consequently, an efficient implementation requires that the optimizer treats small divide as a first-class operator and possesses powerful algebraic laws for query rewriting.Item Open Access Query processing concepts and techniques for set containment tests(2003) Rantzau, Ralf; Mitschang, Bernhard (Prof. Dr.-Ing. habil.)Relational division is an operator of the relational algebra that realizes universal quantifications in queries against a relational database. Expressing a universal quantification problem in SQL is cumbersome. If the division operator would have a counterpart in a query language, a more intuitive formulation of universal quantification problems would be possible. Although division is a derived operator - it can be expressed using other basic algebra operators - the performance of queries involving a division problem can be significantly increased if it is implemented as a separate operator in a query processor. In this work, we study relational division as well as operators related to it like set containment join. We present a classification of division algorithms and define algebraic laws to enable query optimization for queries involving division. A natural generalization of the division operator, called set containment division, is proposed and algorithms that realize it are discussed, including an algorithm that uses a new in-memory index data structure. Furthermore, we study strategies for parallelizing set containment division algorithms. The most popular data mining sub-task called frequent itemset discovery, where one tries to find the number of transactions that contain a given set of items, is an excellent example of a problem that requires efficient set containment tests. In this work, we focus on frequent itemset discovery algorithms that exploit the query processing capabilities of a relational database system. After reviewing the best algorithms based on SQL-92, we suggest a new approach, called Quiver, that can be executed with the help of set containment division. We report on performance results that cover experiments with frequent itemset discovery algorithms running on commercial database systems as well as experiments that highlight the most important properties of set containment test algorithms using a query processor prototype implemented in Java.