Business intelligence systems need to perform complex
queries efficiently, and some of these queries must access hierarchical
data. For example, an employee table may hold an organization's hierarchy
information by listing each employee's name and manager. How can we generate
an organization chart based on the hierarchical data? FirstSQL/J can readily
perform the necessary hierarchical query with the CONNECT BY clause. The
CONNECT BY clause specifies the relationship between parent rows and child
rows in the hierarchy and the starting point of the hierarchy. The CONNECT
BY construct supports recursive queries. A recursive query traverses a tree
structure to produce a flat result set. A single table or a join of tables
can represent a tree in the database.
An example of a single table that uses a tree structure is
the employees table, containing both regular employees and their
managers. Each row in the table represents an employee or manager. A managed
employee references their direct manager with a foreign key.
The CREATE TABLE for employees:
CREATE TABLE employees
(emp_no INT PRIMARY KEY,
name VARCHAR(32) NOT NULL,
hire_date DATE,
salary DECIMAL(7,2),
mgr_no INT REFERENCES employees
)
A simple employee tree:
is represented by the following rows in employees:
emp_no |
name |
hire_date |
salary |
mgr_no |
1001 |
Mary |
1999-01-01 |
40000.00 |
NULL |
1002 |
John |
1999-03-17 |
30000.00 |
1001 |
1003 |
Gail |
1999-04-01 |
30000.00 |
1001 |
1004 |
Bill |
1999-04-21 |
30000.00 |
1001 |
1005 |
Jane |
1999-06-01 |
20000.00 |
1002 |
1006 |
Fred |
1999-06-01 |
20000.00 |
1002 |
1007 |
Ruth |
1999-06-24 |
20000.00 |
1002 |
1008 |
Burt |
1999-06-30 |
20000.00 |
1003 |
1009 |
Lucy |
1999-07-13 |
20000.00 |
1003 |
1010 |
Alan |
1999-07-28 |
20000.00 |
1003 |
1011 |
Beth |
1999-08-03 |
20000.00 |
1003 |
1012 |
Dora |
1999-08-15 |
20000.00 |
1004 |
1013 |
Mark |
1999-09-05 |
20000.00 |
1004 |
1014 |
Lynn |
1999-09-19 |
20000.00 |
1004 |
A recursive query using the CONNECT BY construct creates
a result by traversing the tree. It has the following general format:
SELECT select-list
FROM from-list
[WHERE predicate]
START WITH predicate
CONNECT BY predicate
The START WITH predicate selects the row or rows that are
the root of one or more independent trees. The CONNECT BY predicate links
successive levels of the tree. It connects the root to the second level, the
second level to the third, and so on. In the CONNECT BY predicate, the PRIOR
keyword marks references to the previous level.
The following query traverses the employee tree shown above:
SELECT LEVEL, name
FROM employees
START WITH name = 'Mary'
CONNECT BY PRIOR emp_no = mgr_no
Note: The LEVEL keyword is a system value used in CONNECT
BY queries. It contains the level number of the current row in the tree.
The query result is:
level |
name |
1 |
Mary |
2 |
John |
3 |
Jane |
3 |
Fred |
3 |
Ruth |
2 |
Gail |
3 |
Burt |
3 |
Lucy |
3 |
Alan |
3 |
Beth |
2 |
Bill |
3 |
Dora |
3 |
Mark |
3 |
Lynn |
The processing steps for the first three result rows are:
- Select the root row (Mary) with the START WITH predicate.
- Select the second level row (John) with the CONNECT BY
predicate by matching the emp_no for the root level (1001 for Mary) against
the mgr_no for the candidate rows.
- Select the third level row (Jane) by matching the emp_no
for the prior level (1002 for John) against the mgr_no for the candidate
rows.
When a row is selected for a given level, the descendants
for that row (if any) are selected before the next row at the same level is
processed. In the sample query above, the second level descendants of the
root level (Mary) are John, Gail, Bill. John is selected first followed by
all third level descendants of John before processing Gail.
The from-list for a CONNECT BY may contain outer joins and
nested queries (table subqueries). The only restriction on table subqueries
is that they may not contain a nested CONNECT BY query. Predicate subqueries
and scalar subqueries, on the other hand, may be or contain a CONNECT BY
query.
The tree cannot contain any loops (cycles). Tree loops occur
because of loops in the data or because the query is incorrectly formulated.
If it detects a loop in the tree traversal, the query terminates and throws
a SQL Exception.
Extended Capabilities
FirstSQL/J provides extended capabilities within a CONNECT BY query. This
includes the PRIOR and LEVEL keywords introduced above, plus the connectByPath()
function.
The PRIOR keyword is valid only in the CONNECT BY predicate.
It modifies a column name to reference the previous level of the tree. For
example,
This references the emp_no column in the result
row from the previous level. The expression:
matches the emp_no column from the previous level
with the mgr_no column from the current level. The expression connects
an employee from the current level with its immediate manager on the previous
level of the tree.
The CONNECT BY predicate can have multiple references to
columns on the previous level, each using the PRIOR modifier.
The LEVEL keyword is valid in the CONNECT BY predicate
and in the select-list of the CONNECT BY query. It is a reference to a built-in
system value containing the current level being processed. LEVEL is an integer
value. The root level has a LEVEL value of 1, the next level is numbered 2,
and so on.
Full query example:
SELECT LEVEL, name
FROM employees
START WITH mgr_no IS NULL
CONNECT BY PRIOR emp_no = mgr_no
AND LEVEL <= 2
level |
name |
1 |
Mary |
2 |
John |
2 |
Gail |
2 |
Bill |
This query includes the LEVEL number in the select list.
It also uses the LEVEL number in the CONNECT BY predicate to limit the tree
traversal to 2 level
The connectByPath() function constructs a string path from
the root to the current level that includes all intervening levels. It is
only valid in the select-list of a CONNECT BY query. The connectByPath() function
has the following general format:
SET.connectByPath(column-name, separator)
connectByPath() is a method in the built-in system class
-- SET. Its arguments are:
- column-name -- the name of a character column in the CONNECT
BY query. The connectByPath() function collects the value of this column
from each successive level in constructing the path string.
- separator -- string value used to separate each level value
in the path string.
This is best illustrated by modifying the initial example
query to use the function:
SELECT LEVEL, name, SET.connectByPath(name, '/') path
FROM employees
START WITH name = 'Mary'
CONNECT BY PRIOR emp_no = mgr_no
level |
name |
path |
1 |
Mary |
/Mary |
2 |
John |
/Mary/John |
3 |
Jane |
/Mary/John/Jane |
3 |
Fred |
/Mary/John/Fred |
3 |
Ruth |
/Mary/John/Ruth |
2 |
Gail |
/Mary/Gail |
3 |
Burt |
/Mary/Gail/Burt |
3 |
Lucy |
/Mary/Gail/Lucy |
3 |
Alan |
/Mary/Gail/Alan |
3 |
Beth |
/Mary/Gail/Beth |
2 |
Bill |
/Mary/Bill |
3 |
Dora |
/Mary/Bill/Dora |
3 |
Mark |
/Mary/Bill/Mark |
3 |
Lynn |
/Mary/Bill/Lynn |
The connectByPath() function will accept any string expression
for both arguments. It also has a single argument form:
The 2 argument example above:
SET.connectByPath(name, '/')
is equivalent to:
using the single argument form. For example:
SELECT LEVEL, name, SET.connectByPath('/' || name) path
FROM employees
START WITH name = 'Mary'
CONNECT BY PRIOR emp_no = mgr_no
level |
name |
path |
1 |
Mary |
/Mary |
2 |
John |
/Mary/John |
3 |
Jane |
/Mary/John/Jane |
3 |
Fred |
/Mary/John/Fred |
3 |
Ruth |
/Mary/John/Ruth |
2 |
Gail |
/Mary/Gail |
3 |
Burt |
/Mary/Gail/Burt |
3 |
Lucy |
/Mary/Gail/Lucy |
3 |
Alan |
/Mary/Gail/Alan |
3 |
Beth |
/Mary/Gail/Beth |
2 |
Bill |
/Mary/Bill |
3 |
Dora |
/Mary/Bill/Dora |
3 |
Mark |
/Mary/Bill/Mark |
3 |
Lynn |
/Mary/Bill/Lynn |
Note: SQL scripts for creating and populating the employees
table and the example queries above are included with the distribution.
Copyright © 2003 FFE Software,
Inc. All Rights Reserved WorldWide
|