de.fuberlin.wiwiss.d2rq.algebra
Class JoinOptimizer
java.lang.Object
de.fuberlin.wiwiss.d2rq.algebra.JoinOptimizer
public class JoinOptimizer
- extends Object
Removes unnecessary joins from an RDFRelation in cases where this is
possible without affecting the result. This is an optimization.
A join J from table T1 to table T2 with join condition
T1.c_1 = T2.c_1 && T1.c_2 = T2.c_2 && ...
can be removed if these conditions hold:
- The only join mentioning T2 is J.
- All columns of T2 that are selected or constrained or used in
an expression occur in J's join condition.
- All values of T1.c_n are guaranteed to occur in T2.c_n, that is,
there is a foreign key constraint on T1.c_n referencing T2.c_n.
In this case, J can be dropped, and all mentions of T2.c_n
can be replaced with T1.c_n.
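The three conditions and the subsequent column replacement can be sketched as follows. This is a minimal illustration, not the actual D2RQ implementation: the Join class, the isRemovable and replaceMentions methods, the representation of columns as "Table.column" strings, and the fkOnT1 flag are all hypothetical stand-ins introduced for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JoinRemovalSketch {

    /** Hypothetical stand-in for a join J from t1 to t2; not a D2RQ class. */
    static final class Join {
        final String t1;
        final String t2;
        /** Maps each qualified T2 join column to the T1 column it is equated with. */
        final Map<String, String> t2ToT1 = new LinkedHashMap<>();

        Join(String t1, String t2, String... columnNames) {
            this.t1 = t1;
            this.t2 = t2;
            // Join condition: T1.c_1 = T2.c_1 && T1.c_2 = T2.c_2 && ...
            for (String c : columnNames) {
                t2ToT1.put(t2 + "." + c, t1 + "." + c);
            }
        }

        boolean mentions(String table) {
            return t1.equals(table) || t2.equals(table);
        }
    }

    /**
     * Checks the three conditions. usedColumns lists every qualified column
     * that is selected, constrained, or used in an expression. fkOnT1 is an
     * assumed input; the note below explains that the real class cannot
     * currently determine it.
     */
    static boolean isRemovable(Join j, List<Join> allJoins,
                               List<String> usedColumns, boolean fkOnT1) {
        // Condition 1: the only join mentioning T2 is J.
        for (Join other : allJoins) {
            if (other != j && other.mentions(j.t2)) {
                return false;
            }
        }
        // Condition 2: every used T2 column occurs in J's join condition.
        for (String column : usedColumns) {
            if (column.startsWith(j.t2 + ".") && !j.t2ToT1.containsKey(column)) {
                return false;
            }
        }
        // Condition 3: a foreign key constraint exists on T1.c_n.
        return fkOnT1;
    }

    /** After dropping J, replaces every mention of T2.c_n with T1.c_n. */
    static List<String> replaceMentions(Join j, List<String> usedColumns) {
        List<String> result = new ArrayList<>();
        for (String column : usedColumns) {
            result.add(j.t2ToT1.getOrDefault(column, column));
        }
        return result;
    }
}
```

For a join from T1 to T2 on c_1 where only T1.a and T2.c_1 are used, isRemovable returns true (given fkOnT1) and replaceMentions rewrites T2.c_1 to T1.c_1.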
TODO: The third condition is currently not enforced.
This is not a problem in most situations, because d2rq:join is typically
used along an FK constraint. At this point in the code, however, we don't
know the direction of the FK. If the FK is on T2.c_n, then removing
the join may produce incorrect results.
The way 1:n and n:m joins are typically used, condition 2 will exclude most
cases where the FK is on T2.c_n.
Another heuristic catches many 1:1 cases: If a join looks removable, but
switching the order of the two tables around also results in a removable
join, then we don't remove it, because we have no indication in which
direction the FK is pointing and might pick the wrong one, so the safe
choice is not to optimize.
However, there are still cases where a join will be incorrectly removed,
e.g. a 1:1 join with a condition on a column not involved in the join.
This is a bug. The only way to be sure that a join can be removed is to
check the database schema (or ask the user) whether there is indeed an FK
constraint on T1.c_n.
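The 1:1 heuristic described above can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: looksRemovable and shouldRemove are hypothetical names, columns are modelled as "Table.column" strings, and the number of joins mentioning each table is passed in directly. Only conditions 1 and 2 are checked, since the FK direction is unknown.

```java
import java.util.List;

public class SymmetricJoinHeuristicSketch {

    /**
     * Checks conditions 1 and 2 for dropping the given table: exactly one
     * join mentions it, and every used column of that table is one of the
     * join columns.
     */
    static boolean looksRemovable(String dropped, List<String> joinColumns,
                                  List<String> usedColumns,
                                  int joinsMentioningDropped) {
        if (joinsMentioningDropped != 1) {
            return false;
        }
        String prefix = dropped + ".";
        for (String column : usedColumns) {
            if (column.startsWith(prefix)
                    && !joinColumns.contains(column.substring(prefix.length()))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Removes the join only if it looks removable in one direction. If it
     * looks removable in both directions, nothing indicates where the FK
     * points, so the join is kept.
     */
    static boolean shouldRemove(String t1, String t2, List<String> joinColumns,
                                List<String> usedColumns,
                                int joinsMentioningT1, int joinsMentioningT2) {
        boolean forward = looksRemovable(t2, joinColumns, usedColumns, joinsMentioningT2);
        boolean backward = looksRemovable(t1, joinColumns, usedColumns, joinsMentioningT1);
        return forward && !backward;
    }
}
```

When only the join columns of both tables are used, both directions look removable and shouldRemove returns false. When T1 additionally uses a non-join column, shouldRemove returns true regardless of where the FK actually sits, which is the remaining incorrect case noted above.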
TODO: Prune unnecessary aliases after removing joins
- Version:
- $Id: JoinOptimizer.java,v 1.14 2007/10/21 11:58:22 cyganiak Exp $
- Author:
- Richard Cyganiak (richard@cyganiak.de)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
JoinOptimizer
public JoinOptimizer(RDFRelation base)
- Constructs a new JoinOptimizer.
- Parameters:
base - The RDFRelation to be optimized
optimize
public RDFRelation optimize()