SQL Tutorial

SELECT Statement -- Extended Query Capabilities

This subsection details the remaining features of SELECT statements. The basics are at SELECT Statement Basics.

The extended features are grouped as follows:

ORDER BY Clause

The ORDER BY clause is optional. If used, it must be the last clause in the SELECT statement. The ORDER BY clause requests sorting for the results of a query.

When the ORDER BY clause is missing, the result rows from a query have no defined order (they are unordered). The ORDER BY clause defines the ordering of rows based on columns from the SELECT clause. The ORDER BY clause has the following general format:

column-1, column-2, ... are column names specified (or implied) in the select list. If a select column is renamed (given a new name in the select entry), the new name is used in the ORDER BY list. ASC and DESC request ascending or descending sort for a column. ASC is the default.

ORDER BY sorts rows using the ordering columns in left-to-right, major-to-minor order. The rows are sorted first on the first column name in the list. If there are any duplicate values for the first column, the duplicates are sorted on the second column (within the first column sort) in the Order By list, and so on. There is no defined inner ordering for rows that have duplicate values for all Order By columns.

Database nulls require special processing in ORDER BY. A null column sorts higher than all regular values; this is reversed for DESC.

In sorting, nulls are considered duplicates of each other for ORDER BY. Sorting on hidden information makes no sense in utilizing the results of a query. This is also why SQL only allows select list columns in ORDER BY.

For convenience when using expressions in the select list, select items can be specified by number (starting with 1). Names and numbers can be intermixed.

Example queries:

Expressions

In the previous subsection on basic Select statements, column values are used in the select list and where predicate. SQL allows a scalar value expression to be used instead. A SQL value expression can be a:

Literals

A literal is a typed value that is self-defining. SQL supports 3 types of literals:

SQL Functions

SQL has the following builtin functions:

System Values

SQL System Values are reserved names used to access builtin values:

SQL Special Constructs

SQL supports a set of special expression constructs:

Expression Operators

Expression operators combine 2 subexpressions to calculate a value. There are 2 basic types -- numeric and string. In expressions, parentheses are used for grouping.

Joining Tables

The FROM clause allows more than 1 table in its list, however simply listing more than one table will very rarely produce the expected results. The rows from one table must be correlated with the rows of the others. This correlation is known as joining.

An example can best illustrate the rationale behind joins. The following query:

Produces: Each row in sp is arbitrarily combined with each row in p, giving 12 result rows (4 rows in sp X 3 rows in p.) This is known as a cartesian product.

A more usable query would correlate the rows from sp with rows from p, for instance matching on the common column -- pno:

This produces: Rows for each part in p are combined with rows in sp for the same part by matching on part number (pno). In this query, the WHERE Clause provides the join predicate, matching pno from p with pno from sp.

The join in this example is known as an inner equi-join. equi meaning that the join predicate uses = (equals) to match the join columns. Other types of joins use different comparison operators. For example, a query might use a greater-than join.

The term inner means only rows that match are included. Rows in the first table that have no matching rows in the second table are excluded and vice versa (in the above join, the row in p with pno P3 is not included in the result.) An outer join includes unmatched rows in the result. See Outer Join below.

More than 2 tables can participate in a join. This is basically just an extension of a 2 table join. 3 tables -- a, b, c, might be joined in various ways:

Plus several other variations. With inner joins, this structure is not explicit. It is implicit in the nature of the join predicates. With outer joins, it is explicit; see below.

This query performs a 3 table join:

It joins s to sp and sp to p, producing: Note that the order of tables listed in the FROM clause should have no significance, nor does the order of join predicates in the WHERE clause.

Outer Joins

An inner join excludes rows from either table that don't have a matching row in the other table. An outer join provides the ability to include unmatched rows in the query results. The outer join combines the unmatched row in one of the tables with an artificial row for the other table. This artificial row has all columns set to null.

The outer join is specified in the FROM clause and has the following general format:

predicate-1 is a join predicate for the outer join. It can only reference columns from the joined tables. The LEFT, RIGHT or FULL specifiers give the type of join: Outer join example:

Self Joins

A query can join a table to itself. Self joins have a number of real world uses. For example, a self join can determine which parts have more than one supplier: As illustrated in the above example, self joins use correlation names to distinguish columns in the select list and where predicate. In this case, the references to the same table are renamed - a and b.

Self joins are often used in subqueries. See Subqueries below.

Subqueries

Subqueries are an identifying feature of SQL. It is called Structured Query Language because a query can nest inside another query.

There are 3 basic types of subqueries in SQL:

All subqueries must be enclosed in parentheses.

Predicate Subqueries

Predicate subqueries are used in the WHERE (and HAVING) clause. Each is a special logical construct. Except for EXISTS, predicate subqueries must retrieve one column (in their select list.)

Scalar Subqueries

The Scalar Subquery can be used anywhere a value can be used. The subquery must reference just one column in the select list. It must also retrieve no more than one row.

When the subquery returns a single row, the value of the single select list column becomes the value of the Scalar Subquery. When the subquery returns no rows, a database null is used as the result of the subquery. Should the subquery retreive more than one row, it is a run-time error and aborts query execution.

A Scalar Subquery can appear as a scalar value in the select list and where predicate of an another query. The following query on the sp table uses a Scalar Subquery in the select list to retrieve the supplier city associated with the supplier number (sno column in sp):

The next query on the sp table uses a Scalar Subquery in the where clause to match parts on the color associated with the part number (pno column in sp): Note that both example queries use outer references. This is normal in Scalar Subqueries. Often, Scalar Subqueries are Aggregate Queries.

Table Subqueries

Table Subqueries are queries used in the FROM clause, replacing a table name. Basically, the result set of the Table Subquery acts like a base table in the from list. Table Subqueries can have a correlation name in the from list. They can also be in outer joins.

The following two queries produce the same result:

Grouping Queries

A Grouping Query is a special type of query that groups and summarizes rows. It uses the GROUP BY Clause.

A Grouping Query groups rows based on common values in a set of grouping columns. Rows with the same values for the grouping columns are placed in distinct groups. Each group is treated as a single row in the query result.

Even though a group is treated as a single row, the underlying rows can be subject to summary operations known as Set Functions whose results can be included in the query. The optional HAVING Clause supports filtering for group rows in the same manner as the WHERE clause filters FROM rows.

For example, grouping the sp table on the pno column produces 2 groups:

Nulls get special treatment by GROUP BY. GROUP BY considers a null as distinct from every other null. Each row that has a null in one of its grouping columns forms a separate group.

Grouping the sp table on the qty column produces 3 groups:

The row where qty is null forms a separate group.

GROUP BY Clause

GROUP BY is an optional clause in a query. It follows the WHERE clause or the FROM clause if the WHERE clause is missing. A query containing a GROUP BY clause is a Grouping Query. The GROUP BY clause has the following general format: column-1 and column-2 are the grouping columns. They must be names of columns from tables in the FROM clause; they can't be expressions.

GROUP BY operates on the rows from the FROM clause as filtered by the WHERE clause. It collects the rows into groups based on common values in the grouping columns. Except nulls, rows with the same set of values for the grouping columns are placed in the same group. If any grouping column for a row contains a null, the row is given its own group.

For example,

In Grouping Queries, the select list can only contain grouping columns, plus literals, outer references and expression involving these elements. Non-grouping columns from the underlying FROM tables cannot be referenced directly. However, non-grouping columns can be used in the select list as arguments to Set Functions. Set Functions summarize columns from the underlying rows of a group.

Set Functions

Set Functions are special summarizing functions used with Grouping Queries and Aggregate Queries. They summarize columns from the underlying rows of a group or aggregate.

Using the Group By example from above, grouping the sp table on the pno column:

A Set Function can compute the total quantities for each group: Null columns are ignored in computing the summary. The Set Function -- SUM, computes the arithmetic sum of a numeric column in a set of grouped/aggregate rows. For example, Set Functions have the following general format: set-function is: The result of the COUNT function is always integer. The result of all other Set Functions is the same data type as the argument.

The Set Functions skip columns with nulls, summarizing non-null values. COUNT counts rows with non-null values, AVG averages non-null values, and so on. COUNT returns 0 when no non-null column values are found; the other functions return null when there are no values to summarize.

A Set Function argument can be a column or an scalar expression.

The DISTINCT and ALL specifiers are optional. ALL specifies that all non-null values are summarized; it is the default. DISTINCT specifies that distinct column values are summarized; duplicate values are skipped. Note: DISTINCT has no effect on MIN and MAX results.

COUNT also has an alternate format:

... which counts the underlying rows regardless of column contents.

Set Function examples:

HAVING Clause

The HAVING Clause is associated with Grouping Queries and Aggregate Queries. It is optional in both cases. In Grouping Queries, it follows the GROUP BY clause. In Aggregate Queries, HAVING follows the WHERE clause or the FROM clause if the WHERE clause is missing.

The HAVING Clause has the following general format:

Like the WHERE Clause, HAVING filters the query result rows. WHERE filters the rows from the FROM clause. HAVING filters the grouped rows (from the GROUP BY clause) or the aggregate row (for Aggregate Queries).

predicate is a logical expression referencing grouped columns and set functions. It has the same restrictions as the select list for Grouping Queries and Aggregate Queries.

If the Having predicate evaluates to true for a grouped or aggregate row, the row is included in the query result, otherwise, the row is skipped (not included in the query result).

For example,

Aggregate Queries

An Aggregate Query can use Set Functions and a HAVING Clause. It is similar to a Grouping Query except there are no grouping columns. The underlying rows from the FROM and WHERE clauses are grouped into a single aggregate row. An Aggregate Query always returns a single row, except when the Having clause is used.

An Aggregate Query is a query containing Set Functions in the select list but no GROUP BY clause. The Set Functions operate on the columns of the underlying rows of the single aggregate row. Except for outer references, any columns used in the select list must be arguments to Set Functions. See Set Functions above.

An aggregate query may also have a Having clause. The Having clause filters the single aggregate row. If the Having predicate evaluates to true, the query result contains the aggregate row. Otherwise, the query result contains no rows. See HAVING Clause above.

For example,

Subqueries are often Aggregate Queries. For example, parts with suppliers: Parts with multiple suppliers:

Union Queries

The SQL UNION operator combines the results of two queries into a composite result. The component queries can be SELECT/FROM queries with optional WHERE/GROUP BY/HAVING clauses. The UNION operator has the following general format: query-1 and query-2 are full query specifications. The UNION operator creates a new query result that includes rows from each component query.

By default, UNION eliminates duplicate rows in its composite results. The optional ALL specifier requests that duplicates be retained in the UNION result.

The component queries of a Union Query can also be Union Queries themselves. Parentheses are used for grouping queries.

The select lists from the component queries must be union-compatible. They must match in degree (number of columns). For Entry Level SQL92, the column descriptor (data type and precision, scale) for each corresponding column must match. The rules for Intermediate Level SQL92 are less restrictive. See Union-Compatible Queries.

Union-Compatible Queries

For Entry Level SQL92, each corresponding column of both queries must have the same column descriptor in order for two queries to be union-compatible. The rules are less restrictive for Intermediate Level SQL92. It supports automatic conversion within type categories. In general, the resulting data type will be the broader type. The corresponding columns need only be in the same data type category:

UNION Examples

SQL Modification Statements

The remaining SQL-Data Statements (SQL DML) are the SQL Modification Statements, described in the next sub-section:
SQL-Data Statements   SQL Tutorial Main Page

Copyright © 2002-2005 FFE Software, Inc. All Rights Reserved WorldWide