SQL Tutorial


SELECT Statement Basics

(http://www.firstsql.com/tutor1.htm)

 

In the subsequent text, the following 3 example tables are used:

p Table (parts)

pno  descr   color
---  ------  -----
P1   Widget  Blue
P2   Widget  Red
P3   Dongle  Green

s Table (suppliers)

sno  name    city
---  ------  ------
S1   Pierre  Paris
S2   John    London
S3   Mario   Rome

sp Table (suppliers & parts)

sno  pno  qty
---  ---  ----
S1   P1   NULL
S2   P1   200
S3   P1   1000
S3   P2   200

The SQL SELECT statement queries data from tables in the database. The statement begins with the SELECT keyword. The basic SELECT statement has 3 clauses:

  • SELECT
  • FROM
  • WHERE

The SELECT clause specifies the table columns that are retrieved. The FROM clause specifies the tables accessed. The WHERE clause specifies which table rows are used. The WHERE clause is optional; if missing, all table rows are used.

For example,

SELECT name FROM s WHERE city='Rome'

This query accesses rows from the s table. It then keeps only those rows where the city column contains Rome. Finally, the query retrieves the name column from each remaining row. Using the example s table, this query produces:

name
-----
Mario

A detailed description of the query actions:

  • The FROM clause accesses the s table. Contents:

sno  name    city
---  ------  ------
S1   Pierre  Paris
S2   John    London
S3   Mario   Rome

  • The WHERE clause filters the rows of the FROM table to use those whose city column contains Rome. This chooses a single row from s:

sno  name   city
---  -----  ----
S3   Mario  Rome

  • The SELECT clause retrieves the name column from the rows filtered by the WHERE clause:

name
-----
Mario
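The three steps above can be mimicked in ordinary code. The following is a minimal Python sketch of the logical order FROM, WHERE, SELECT; it is not how a database engine actually executes the query, and the list s and the variable names are illustrative assumptions standing in for the example s table.

```python
# Rows of the example s table as (sno, name, city) tuples.
s = [
    ("S1", "Pierre", "Paris"),
    ("S2", "John", "London"),
    ("S3", "Mario", "Rome"),
]

# FROM s: start with every row of the table.
from_rows = list(s)

# WHERE city = 'Rome': keep only rows whose city column is 'Rome'.
where_rows = [row for row in from_rows if row[2] == "Rome"]

# SELECT name: project the name column from each remaining row.
result = [(row[1],) for row in where_rows]

print(result)  # [('Mario',)]
```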

SELECT Clause

The SELECT clause is mandatory. It specifies a list of columns to be retrieved from the tables in the FROM clause. It has the following general format:

SELECT [ALL|DISTINCT] select-list

select-list is a list of column names separated by commas. The ALL and DISTINCT specifiers are optional. DISTINCT specifies that duplicate rows are discarded. Two rows are duplicates when every corresponding select-list column has the same value. The default is ALL, which retains duplicate rows.

For example,

SELECT descr, color FROM p

The column names in the select list can be qualified by the appropriate table name:

SELECT p.descr, p.color FROM p
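The effect of the ALL and DISTINCT specifiers described above can be tried on the example p table. The sketch below uses Python's built-in sqlite3 module as a stand-in SQL engine; that choice is an assumption of this example, not something the tutorial prescribes. The ORDER BY is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
""")

# ALL (the default) keeps duplicate rows: 'Widget' appears twice.
all_rows = conn.execute("SELECT descr FROM p ORDER BY descr").fetchall()

# DISTINCT discards duplicate rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT descr FROM p ORDER BY descr").fetchall()

print(all_rows)       # [('Dongle',), ('Widget',), ('Widget',)]
print(distinct_rows)  # [('Dongle',), ('Widget',)]
```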

A column in the select list can be renamed by following the column name with the new name. For example:

SELECT name supplier, city location FROM s

This produces:

supplier  location
--------  --------
Pierre    Paris
John      London
Mario     Rome

A special select list consisting of a single '*' requests all columns in all tables in the FROM clause. For example,

SELECT * FROM sp 

sno  pno  qty
---  ---  ----
S1   P1   NULL
S2   P1   200
S3   P1   1000
S3   P2   200

The * delimiter will retrieve just the columns of a single table when qualified by the table name. For example:

SELECT sp.* FROM sp

This produces the same result as the previous example.

An unqualified * cannot be combined with other elements in the select list; it must stand alone. However, a qualified * can be combined with other elements. For example,

SELECT sp.*, city
FROM sp, s
WHERE sp.sno = s.sno

sno  pno  qty   city
---  ---  ----  ------
S1   P1   NULL  Paris
S2   P1   200   London
S3   P1   1000  Rome
S3   P2   200   Rome

Note: this is an example of a query joining 2 tables.

FROM Clause

The FROM clause always follows the SELECT clause. It lists the tables accessed by the query. For example,

SELECT * FROM s

When the From List contains multiple tables, commas separate the table names. For example,

SELECT sp.*, city
FROM sp, s
WHERE sp.sno = s.sno

When the From List has multiple tables, they must be joined together.

Correlation Names

Like columns in the select list, tables in the from list can be renamed by following the table name with the new name. For example,

SELECT supplier.name FROM s supplier

The new name is known as the correlation (or range) name for the table. Self joins require correlation names.

WHERE Clause

The WHERE clause is optional. When specified, it always follows the FROM clause. The WHERE clause filters rows from the FROM clause tables. Omitting the WHERE clause specifies that all rows are used.

Following the WHERE keyword is a logical expression, also known as a predicate.

The predicate evaluates to a SQL logical value -- true, false or unknown. The most basic predicate is a comparison:

color = 'Red'

This predicate returns:

  • true -- if the color column contains the string value -- 'Red',
  • false -- if the color column contains another string value (not 'Red'), or
  • unknown -- if the color column contains null.

Generally, a comparison expression compares the contents of a table column to a literal, as above. A comparison expression may also compare two columns to each other. Table joins use this type of comparison.

The = (equals) comparison operator compares two values for equality. Additional comparison operators are:

  • > -- greater than
  • < -- less than
  • >= -- greater than or equal to
  • <= -- less than or equal to
  • <> -- not equal to

For example,

SELECT * FROM sp WHERE qty >= 200

sno  pno  qty
---  ---  ----
S2   P1   200
S3   P1   1000
S3   P2   200

Note: In the sp table, the qty column for one of the rows contains null. The comparison - qty >= 200, evaluates to unknown for this row. In the final result of a query, rows with a WHERE clause evaluating to unknown (or false) are eliminated (filtered out).
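The note above can be checked directly: a row whose qty is null is returned neither by the comparison nor by its negation, because both evaluate to unknown. This is a small sketch using Python's sqlite3 module as a stand-in engine (an assumption of this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# Rows where the comparison is true.
at_least_200 = conn.execute(
    "SELECT sno, pno FROM sp WHERE qty >= 200 ORDER BY sno, pno").fetchall()

# Rows where the negated comparison is true.
below_200 = conn.execute(
    "SELECT sno, pno FROM sp WHERE NOT (qty >= 200)").fetchall()

print(at_least_200)  # [('S2', 'P1'), ('S3', 'P1'), ('S3', 'P2')]
print(below_200)     # []  (the S1 row, with null qty, is in neither result)
```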

Both operands of a comparison should be the same data type; however, automatic conversions are performed between numeric, datetime and interval types. The CAST expression provides explicit type conversions.

Extended Comparisons

In addition to the basic comparisons described above, SQL supports extended comparison operators -- BETWEEN, IN, LIKE and IS NULL.

  • BETWEEN Operator

The BETWEEN operator implements a range comparison, that is, it tests whether a value is between two other values. BETWEEN comparisons have the following format:

value-1 [NOT] BETWEEN value-2 AND value-3

This comparison tests if value-1 is greater than or equal to value-2 and less than or equal to value-3. It is equivalent to the following predicate:

value-1 >= value-2 AND value-1 <= value-3

Or, if NOT is included:

NOT (value-1 >= value-2 AND value-1 <= value-3)

For example,

SELECT *
FROM sp
WHERE qty BETWEEN 50 AND 500

sno  pno  qty
---  ---  ---
S2   P1   200
S3   P2   200

  • IN Operator

The IN operator implements comparison to a list of values, that is, it tests whether a value matches any value in a list of values. IN comparisons have the following general format:

value-1 [NOT] IN ( value-2 [, value-3] ... )

This comparison tests if value-1 matches value-2 or matches value-3, and so on. It is equivalent to the following logical predicate:

value-1 = value-2 [ OR value-1 = value-3 ] ...

or if NOT is included:

NOT (value-1 = value-2 [ OR value-1 = value-3 ] ...)

For example,

SELECT name FROM s WHERE city IN ('Rome','Paris')

name
------
Pierre
Mario

  • LIKE Operator

The LIKE operator implements a pattern match comparison, that is, it matches a string value against a pattern string containing wild-card characters.

The wild-card characters for LIKE are percent -- '%' and underscore -- '_'. Underscore matches any single character. Percent matches zero or more characters.

Examples,

Match Value  Pattern  Result
-----------  -------  ------
'abc'        '_b_'    True
'ab'         '_b_'    False
'abc'        '%b%'    True
'ab'         '%b%'    True
'abc'        'a_'     False
'ab'         'a_'     True
'abc'        'a%_'    True
'ab'         'a%_'    True

LIKE comparison has the following general format:

value-1 [NOT] LIKE value-2 [ESCAPE value-3]

All values must be string (character). This comparison uses value-2 as a pattern to match value-1. The optional ESCAPE sub-clause specifies an escape character for the pattern, allowing the pattern to use '%' and '_' (and the escape character) for matching. The ESCAPE value must be a single character string. In the pattern, the ESCAPE character precedes any character to be escaped.

For example, to match a string ending with '%', use:

x LIKE '%/%' ESCAPE '/'

A more contrived example that escapes the escape character:

y LIKE '/%//%' ESCAPE '/'

... matches any string beginning with '%/'.

The optional NOT reverses the result so that:

z NOT LIKE 'abc%'

is equivalent to:

NOT z LIKE 'abc%'
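The wild-card and ESCAPE rules above can be checked against a real engine. The sketch below uses Python's sqlite3 module as a stand-in (an assumption of this example); the helper names like_ and like_escaped are hypothetical. SQLite's LIKE ignores case for ASCII letters by default, which none of these checks depend on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def like_(value, pattern):
    """Evaluate: value LIKE pattern."""
    row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(row[0])

def like_escaped(value, pattern):
    """Evaluate: value LIKE pattern ESCAPE '/'."""
    row = conn.execute("SELECT ? LIKE ? ESCAPE '/'", (value, pattern)).fetchone()
    return bool(row[0])

print(like_("abc", "_b_"))             # True
print(like_("ab", "_b_"))              # False
print(like_("ab", "a%_"))              # True
print(like_escaped("50%", "%/%"))      # True: the string ends with a literal %
print(like_escaped("50", "%/%"))       # False
print(like_escaped("%/x", "/%//%"))    # True: the string begins with %/
```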

  • IS NULL Operator

A database null in a table column has a special meaning: the value of the column is not currently known (missing), although it may become known at a later time. A database null may represent any value in the future, but the value is not available at this time. Since two null columns may eventually be assigned different values, one null can't be compared to another in the conventional way. The following syntax is illegal in standard SQL (many products accept it, but the comparison then evaluates to unknown and selects no rows):

WHERE qty = NULL

A special comparison operator -- IS NULL, tests a column for null. It has the following general format:

value-1 IS [NOT] NULL

This comparison returns true if value-1 contains a null and false otherwise. The optional NOT reverses the result:

value-1 IS NOT NULL

is equivalent to:

NOT value-1 IS NULL

For example,

SELECT * FROM sp WHERE qty IS NULL

sno  pno  qty
---  ---  ----
S1   P1   NULL

Logical Operators

The logical operators are AND, OR, NOT. They take logical expressions as operands and produce a logical result (True, False, Unknown). In logical expressions, parentheses are used for grouping.

  • AND Operator

The AND operator combines two logical operands. The operands are comparisons or logical expressions. It has the following general format:

predicate-1 AND predicate-2

AND returns:

    • True -- if both operands evaluate to true
    • False -- if either operand evaluates to false
    • Unknown -- otherwise (one operand is true and the other is unknown or both are unknown)

The truth table for AND:

AND   T   F   U
---   -   -   -
 T    T   F   U
 F    F   F   F
 U    U   F   U

For example,

SELECT *
FROM sp
WHERE sno = 'S3' AND qty < 500

sno  pno  qty
---  ---  ---
S3   P2   200

  • OR Operator

The OR operator combines two logical operands. The operands are comparisons or logical expressions. It has the following general format:

predicate-1 OR predicate-2

OR returns:

    • True -- if either operand evaluates to true
    • False -- if both operands evaluate to false
    • Unknown -- otherwise (one operand is false and the other is unknown or both are unknown)

The truth table for OR:

OR    T   F   U
---   -   -   -
 T    T   T   T
 F    T   F   U
 U    T   U   U

For example,

SELECT *
FROM s
WHERE sno = 'S3' OR city = 'London'

sno  name   city
---  -----  ------
S2   John   London
S3   Mario  Rome

AND has a higher precedence than OR, so the following expression:

a OR b AND c

is equivalent to:

a OR (b AND c)

  • NOT Operator

The NOT operator inverts the result of a comparison expression or a logical expression. It has the following general format:

NOT predicate-1

The truth table for NOT:

NOT
---
 T    F
 F    T
 U    U

Example query:

SELECT *
FROM sp
WHERE NOT sno = 'S3'

sno  pno  qty
---  ---  ----
S1   P1   NULL
S2   P1   200
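The truth tables for AND, OR and NOT can be written out as a few lines of code. This is a minimal Python sketch of three-valued logic in which None stands in for unknown; the function names and3, or3, not3, where and less_than are hypothetical helpers for illustration, not part of SQL.

```python
# Three-valued logic: True, False, and None standing in for Unknown.

def and3(a, b):
    if a is False or b is False:
        return False          # false if either operand is false
    if a is None or b is None:
        return None           # otherwise unknown if any operand is unknown
    return True               # both operands are true

def or3(a, b):
    if a is True or b is True:
        return True           # true if either operand is true
    if a is None or b is None:
        return None           # otherwise unknown if any operand is unknown
    return False              # both operands are false

def not3(a):
    return None if a is None else not a

def where(rows, predicate):
    """Keep only rows whose predicate is true; false and unknown are dropped."""
    return [row for row in rows if predicate(row) is True]

def less_than(value, limit):
    """Comparison that yields unknown (None) when the value is null."""
    return None if value is None else value < limit

sp = [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)]

# WHERE sno = 'S3' AND qty < 500
result = where(sp, lambda r: and3(r[0] == "S3", less_than(r[2], 500)))
print(result)  # [('S3', 'P2', 200)]
```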

 

 

ORDER BY Clause

The ORDER BY clause is optional. If used, it must be the last clause in the SELECT statement. The ORDER BY clause requests sorting for the results of a query.

When the ORDER BY clause is missing, the result rows from a query have no defined order (they are unordered). The ORDER BY clause defines the ordering of rows based on columns from the SELECT clause. The ORDER BY clause has the following general format:

ORDER BY column-1 [ASC|DESC] [, column-2 [ASC|DESC] ] ...

column-1, column-2, ... are column names specified (or implied) in the select list. If a select column is renamed (given a new name in the select entry), the new name is used in the ORDER BY list. ASC and DESC request ascending or descending sort for a column. ASC is the default.

ORDER BY sorts rows using the ordering columns in left-to-right, major-to-minor order. The rows are sorted first on the first column name in the list. If there are any duplicate values for the first column, the duplicates are sorted on the second column (within the first column sort) in the Order By list, and so on. There is no defined inner ordering for rows that have duplicate values for all Order By columns.

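Database nulls require special processing in ORDER BY. A null sorts higher than all regular values, so nulls come last in an ascending sort and first in a descending (DESC) sort. The SQL standard leaves the placement of nulls relative to other values to the implementation, so other products may sort them low instead.

In sorting, nulls are considered duplicates of each other for ORDER BY. Sorting on hidden information makes the results of a query hard to interpret, which is also why SQL only allows select list columns in ORDER BY.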

For convenience when using expressions in the select list, select items can be specified by number (starting with 1). Names and numbers can be intermixed.

Example queries:

SELECT * FROM sp ORDER BY 3 DESC

sno  pno  qty
---  ---  ----
S1   P1   NULL
S3   P1   1000
S3   P2   200
S2   P1   200

SELECT name, city FROM s ORDER BY name

name    city
------  ------
John    London
Mario   Rome
Pierre  Paris

SELECT * FROM sp ORDER BY qty DESC, sno

sno  pno  qty
---  ---  ----
S1   P1   NULL
S3   P1   1000
S2   P1   200
S3   P2   200
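The major-to-minor ordering and the null rule used in these examples can be sketched as a sort key. The following Python sketch emulates ORDER BY qty DESC, sno with nulls treated as higher than every regular value; it is an illustration under that assumption, not an engine implementation, and order_key is a hypothetical helper name.

```python
sp = [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)]

def order_key(row):
    sno, pno, qty = row
    if qty is None:
        # Nulls sort highest, so in a descending sort they come first.
        return (0, 0, sno)
    # Major key: qty descending (negated). Minor key: sno ascending.
    return (1, -qty, sno)

result = sorted(sp, key=order_key)
for row in result:
    print(row)
# ('S1', 'P1', None)
# ('S3', 'P1', 1000)
# ('S2', 'P1', 200)
# ('S3', 'P2', 200)
```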

Expressions

In the previous subsection on basic Select statements, column values are used in the select list and where predicate. SQL allows a scalar value expression to be used instead. A SQL value expression can be a:

  • Literal -- quoted string, numeric value, datetime value
  • Function Call -- reference to builtin SQL function
  • System Value -- current date, current user, ...
  • Special Construct -- CAST, COALESCE, CASE
  • Numeric or String Operator -- combining sub-expressions

Literals

A literal is a typed value that is self-defining. SQL supports 3 types of literals:

  • String -- ASCII text framed by single quotes ('). Within a literal, a single quote is represented by 2 single quotes ('').
  • Numeric -- numeric digits (at least 1) with an optional decimal point and exponent. The format is

[ddd][[.]ddd][E[+|-]ddd]

Numeric literals with no exponent or decimal point are typed as Integer. Those with a decimal point but no exponent are typed as Decimal. Those with an exponent are typed as Float.

  • Datetime -- datetime literals begin with a keyword identifying the type, followed by a string literal:
    • Date -- DATE 'yyyy-mm-dd'
    • Time -- TIME 'hh:mm:ss[.fff]'
    • Timestamp -- TIMESTAMP 'yyyy-mm-dd hh:mm:ss[.fff]'
    • Interval -- INTERVAL [+|-] string interval-qualifier

The format of the string in the Interval literal depends on the interval qualifier. For year-month intervals, the format is: 'dd[-dd]'. For day-time intervals, the format is '[dd ]dd[:dd[:dd]][.fff]'.

SQL Functions

SQL has the following builtin functions:

  • SUBSTRING(exp-1 FROM exp-2 [FOR exp-3])

Extracts a substring from a string - exp-1, beginning at the integer value - exp-2, for the length of the integer value - exp-3. exp-2 is 1 relative. If FOR exp-3 is omitted, the length of the remaining string is used. Returns the substring.

  • UPPER(exp-1)

Converts any lowercase characters in a string - exp-1 to uppercase. Returns the converted string.

  • LOWER(exp-1)

Converts any uppercase characters in a string - exp-1 to lowercase. Returns the converted string.

  • TRIM([LEADING|TRAILING|BOTH] [FROM] exp-1)
    TRIM([LEADING|TRAILING|BOTH] exp-2 FROM exp-1)

Trims leading, trailing or both characters from a string - exp-1. The trim character is a space, or if exp-2 is specified, it supplies the trim character. If LEADING, TRAILING, BOTH are missing, the default is BOTH. Returns the trimmed string.

  • POSITION(exp-1 IN exp-2)

Searches a string - exp-2, for a match on a substring - exp-1. Returns an integer, the 1 relative position of the match or 0 for no match.

  • CHAR_LENGTH(exp-1)
    CHARACTER_LENGTH(exp-1)

Returns the integer number of characters in the string - exp-1.

  • OCTET_LENGTH(exp-1)

Returns the integer number of octets (8-bit bytes) needed to represent the string - exp-1.

  • EXTRACT(sub-field FROM exp-1)

Returns the numeric sub-field extracted from a datetime value - exp-1. sub-field is YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, TIMEZONE_HOUR or TIMEZONE_MINUTE. TIMEZONE_HOUR and TIMEZONE_MINUTE extract sub-fields from the Timezone portion of exp-1. QUARTER is (MONTH-1)/3+1, using integer division.
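The 1-relative positions used by SUBSTRING and POSITION are easy to get wrong, so here is a small Python emulation of their behaviour for start positions of 1 or more. The function names are hypothetical stand-ins, and quarter assumes the usual calendar convention that months 1 to 3 form quarter 1, months 4 to 6 quarter 2, and so on.

```python
def substring(s, start, length=None):
    """SUBSTRING(s FROM start [FOR length]) with a 1-relative start."""
    begin = start - 1
    return s[begin:] if length is None else s[begin:begin + length]

def position(sub, s):
    """POSITION(sub IN s): 1-relative index of the first match, 0 if none."""
    return s.find(sub) + 1

def quarter(month):
    """Quarter (1-4) of a month number (1-12), using integer division."""
    return (month - 1) // 3 + 1

print(substring("abcdef", 2, 3))  # bcd
print(substring("abcdef", 4))     # def
print(position("cd", "abcdef"))   # 3
print(position("x", "abcdef"))    # 0
print(quarter(4))                 # 2
```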

System Values

SQL System Values are reserved names used to access builtin values:

  • USER -- returns a string with the current SQL authorization identifier.
  • CURRENT_USER -- same as USER.
  • SESSION_USER -- returns a string with the current SQL session authorization identifier.
  • SYSTEM_USER -- returns a string with the current operating system user.
  • CURRENT_DATE -- returns a Date value for the current system date.
  • CURRENT_TIME -- returns a Time value for the current system time.
  • CURRENT_TIMESTAMP -- returns a Timestamp value for the current system timestamp.

SQL Special Constructs

SQL supports a set of special expression constructs:

  • CAST(exp-1 AS data-type)

Converts the value - exp-1, into the specified data-type. Returns the converted value.

  • COALESCE(exp-1, exp-2 [, exp-3] ...)

Returns exp-1 if it is not null, otherwise returns exp-2 if it is not null, otherwise returns exp-3, and so on. Returns null if all values are null.

  • CASE exp-1 { WHEN exp-2 THEN exp-3 } ... [ELSE exp-4] END
    CASE { WHEN predicate-1 THEN exp-3 } ... [ELSE exp-4] END

The first form of the CASE construct compares exp-1 to exp-2 in each WHEN clause. If a match is found, CASE returns exp-3 from the corresponding THEN clause. If no matches are found, it returns exp-4 from the ELSE clause or null if the ELSE clause is omitted.

The second form of the CASE construct evaluates predicate-1 in each WHEN clause. If the predicate is true, CASE returns exp-3 from the corresponding THEN clause. If no predicates evaluate to true, it returns exp-4 from the ELSE clause or null if the ELSE clause is omitted.
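The constructs above can be combined in a single select list. The sketch below runs them against the example sp table using Python's sqlite3 module as a stand-in engine (an assumption of this example); the labels 'unknown', 'large' and 'small' and the threshold 500 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

rows = conn.execute("""
    SELECT sno, pno,
           COALESCE(qty, 0),                       -- null becomes 0
           CASE WHEN qty IS NULL THEN 'unknown'    -- second (predicate) form
                WHEN qty >= 500  THEN 'large'
                ELSE 'small' END
    FROM sp
    ORDER BY sno, pno
""").fetchall()
for row in rows:
    print(row)

# CAST converts the string '42' to an integer before the addition.
cast_value = conn.execute("SELECT CAST('42' AS INTEGER) + 1").fetchone()[0]
print(cast_value)  # 43
```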

Expression Operators

Expression operators combine 2 subexpressions to calculate a value. There are 2 basic types -- numeric and string.

  • String Operators

There is just one string operator - ||, for string concatenation. Both operands of || must be strings. The operator concatenates the second string to the end of the first. For example,

'ab' || 'cd'  ==> 'abcd'

  • Numeric operators

The numeric operators are common to most languages:

    • + -- addition
    • - -- subtraction
    • * -- multiplication
    • / -- division

All numeric operators can be used on the standard numeric data types:

    • Integer -- TINYINT, SMALLINT, INT, BIGINT
    • Exact -- NUMERIC, DECIMAL
    • Approximate -- FLOAT, DOUBLE, REAL

Automatic conversion is provided for numeric operators. If an integer type is combined with an exact type, the integer is converted to exact before the operation. If an exact (or integer) type is combined with an approximate type, it is converted to approximate before the operation.

The + and - operators can also be used as unary operators.

The numeric operators can be applied to datetime values, with some restrictions. The basic rules for datetime expressions are:

    • A date, time, timestamp value can be added to an interval; result is a date, time, timestamp value.
    • An interval value can be subtracted from a date, time, timestamp value; result is a date, time, timestamp value.
    • An interval value can be added to or subtracted from another interval; result is an interval value.
    • An interval can be multiplied by or divided by a standard numeric value; result is an interval value.

A special form can be used to subtract a date, time, timestamp value from another date, time, timestamp value to yield an interval value:

(datetime-1 - datetime-2) interval-qualifier

The interval-qualifier specifies the specific interval type for the result.

A second special form allows a ? parameter to be typed as an interval:

? interval-qualifier

In expressions, parentheses are used for grouping.

Joining Tables

The FROM clause allows more than 1 table in its list; however, simply listing more than one table will very rarely produce the expected results. The rows from one table must be correlated with the rows of the others. This correlation is known as joining.

An example can best illustrate the rationale behind joins. The following query:

SELECT * FROM sp, p

Produces:

sno  pno  qty   pno  descr   color
---  ---  ----  ---  ------  -----
S1   P1   NULL  P1   Widget  Blue
S1   P1   NULL  P2   Widget  Red
S1   P1   NULL  P3   Dongle  Green
S2   P1   200   P1   Widget  Blue
S2   P1   200   P2   Widget  Red
S2   P1   200   P3   Dongle  Green
S3   P1   1000  P1   Widget  Blue
S3   P1   1000  P2   Widget  Red
S3   P1   1000  P3   Dongle  Green
S3   P2   200   P1   Widget  Blue
S3   P2   200   P2   Widget  Red
S3   P2   200   P3   Dongle  Green

Each row in sp is arbitrarily combined with each row in p, giving 12 result rows (4 rows in sp x 3 rows in p). This is known as a Cartesian product.

A more usable query would correlate the rows from sp with rows from p, for instance matching on the common column -- pno:

SELECT *
FROM sp, p
WHERE sp.pno = p.pno

This produces:

sno  pno  qty   pno  descr   color
---  ---  ----  ---  ------  -----
S1   P1   NULL  P1   Widget  Blue
S2   P1   200   P1   Widget  Blue
S3   P1   1000  P1   Widget  Blue
S3   P2   200   P2   Widget  Red

Rows for each part in p are combined with rows in sp for the same part by matching on part number (pno). In this query, the WHERE Clause provides the join predicate, matching pno from p with pno from sp.

The join in this example is known as an inner equi-join. The term equi means that the join predicate uses = (equals) to match the join columns. Other types of joins use different comparison operators. For example, a query might use a greater-than join.

The term inner means only rows that match are included. Rows in the first table that have no matching rows in the second table are excluded and vice versa (in the above join, the row in p with pno P3 is not included in the result.) An outer join includes unmatched rows in the result.
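The difference between the Cartesian product and the inner equi-join can be verified by counting rows. This sketch uses Python's sqlite3 module as a stand-in engine (an assumption of this example) and loads the example sp and p tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# No join predicate: every sp row is paired with every p row.
cross = conn.execute("SELECT * FROM sp, p").fetchall()

# Join predicate in the WHERE clause: only matching pairs remain.
joined = conn.execute("""
    SELECT sp.sno, sp.pno, sp.qty, p.descr, p.color
    FROM sp, p
    WHERE sp.pno = p.pno
    ORDER BY sp.sno, sp.pno
""").fetchall()

print(len(cross))   # 12
print(len(joined))  # 4
print(any(row[1] == "P3" for row in joined))  # False: P3 has no match in sp
```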

More than 2 tables can participate in a join. This is basically just an extension of a 2 table join. 3 tables -- a, b, c, might be joined in various ways:

  • a joins b which joins c
  • a joins b and the join of a and b joins c
  • a joins b and a joins c

Several other variations are possible. With inner joins, this structure is not explicit; it is implicit in the nature of the join predicates. With outer joins, it is explicit.

This query performs a 3 table join:

SELECT name, qty, descr, color
FROM s, sp, p
WHERE s.sno = sp.sno
  AND sp.pno = p.pno

It joins s to sp and sp to p, producing:

name    qty   descr   color
------  ----  ------  -----
Pierre  NULL  Widget  Blue
John    200   Widget  Blue
Mario   1000  Widget  Blue
Mario   200   Widget  Red

Note that the order of tables listed in the FROM clause has no significance for the result, nor does the order of join predicates in the WHERE clause.

Outer Joins

An inner join excludes rows from either table that don't have a matching row in the other table. An outer join provides the ability to include unmatched rows in the query results. The outer join combines the unmatched row in one of the tables with an artificial row for the other table. This artificial row has all columns set to null.

The outer join is specified in the FROM clause and has the following general format:

table-1 { LEFT | RIGHT | FULL } OUTER JOIN table-2 ON predicate-1

predicate-1 is a join predicate for the outer join. It can only reference columns from the joined tables. The LEFT, RIGHT or FULL specifiers give the type of join:

  • LEFT -- only unmatched rows from the left side table (table-1) are retained
  • RIGHT -- only unmatched rows from the right side table (table-2) are retained
  • FULL -- unmatched rows from both tables (table-1 and table-2) are retained

Outer join example:

SELECT p.pno, descr, color, sno, qty
FROM p LEFT OUTER JOIN sp ON p.pno = sp.pno

pno  descr   color  sno   qty
---  ------  -----  ----  ----
P1   Widget  Blue   S1    NULL
P1   Widget  Blue   S2    200
P1   Widget  Blue   S3    1000
P2   Widget  Red    S3    200
P3   Dongle  Green  NULL  NULL
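The outer join example can be reproduced as follows. This is a sketch using Python's sqlite3 module as a stand-in engine (an assumption of this example). The pno column is qualified as p.pno because both joined tables have a column of that name, and SQL nulls appear in Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

rows = conn.execute("""
    SELECT p.pno, descr, color, sno, qty
    FROM p LEFT OUTER JOIN sp ON p.pno = sp.pno
    ORDER BY p.pno, sno
""").fetchall()

# Rows of p with no match in sp are kept, padded with nulls for the sp columns.
unmatched = [row for row in rows if row[3] is None]

print(len(rows))   # 5
print(unmatched)   # [('P3', 'Dongle', 'Green', None, None)]
```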

Self Joins

A query can join a table to itself. Self joins have a number of real world uses. For example, a self join can determine which parts have more than one supplier:

SELECT DISTINCT a.pno
FROM sp a, sp b
WHERE a.pno = b.pno
  AND a.sno <> b.sno

pno
---
P1

As illustrated in the above example, self joins use correlation names to distinguish columns in the select list and where predicate. In this case, the references to the same table are renamed - a and b.

Self joins are often used in subqueries.

Subqueries

Subqueries are an identifying feature of SQL. It is called Structured Query Language because a query can nest inside another query.

There are 3 basic types of subqueries in SQL:

  • Predicate Subqueries -- extended logical constructs in the WHERE (and HAVING) clause.
  • Scalar Subqueries -- standalone queries that return a single value; they can be used anywhere a scalar value is used.
  • Table Subqueries -- queries nested in the FROM clause.

All subqueries must be enclosed in parentheses.

Predicate Subqueries

Predicate subqueries are used in the WHERE (and HAVING) clause. Each is a special logical construct. Except for EXISTS, predicate subqueries must retrieve one column (in their select list.)

  • IN Subquery

The IN Subquery tests whether a scalar value matches the single query column value in any subquery result row. It has the following general format:

value-1 [NOT] IN (query-1)

Using NOT is equivalent to:

NOT value-1 IN (query-1)

For example, to list parts that have suppliers:

SELECT *
FROM p
WHERE pno IN (SELECT pno FROM sp)

pno  descr   color
---  ------  -----
P1   Widget  Blue
P2   Widget  Red

The Self Join example in the previous subsection can be expressed with an IN Subquery:

SELECT DISTINCT pno
FROM sp a
WHERE pno IN (SELECT pno FROM sp b WHERE a.sno <> b.sno)

pno
---
P1

Note that the subquery where clause references a column in the outer query (a.sno). This is known as an outer reference. Subqueries with outer references are sometimes known as correlated subqueries.
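Both IN subqueries, the plain one and the correlated one with an outer reference, can be run against the example tables. The sketch below uses Python's sqlite3 module as a stand-in engine (an assumption of this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# Plain IN subquery: parts that have at least one supplier.
with_suppliers = conn.execute(
    "SELECT pno FROM p WHERE pno IN (SELECT pno FROM sp) ORDER BY pno"
).fetchall()

# Correlated IN subquery: a.sno is an outer reference, so the subquery is
# re-evaluated for each row of the outer query.
multi_supplier = conn.execute("""
    SELECT DISTINCT pno
    FROM sp a
    WHERE pno IN (SELECT pno FROM sp b WHERE a.sno <> b.sno)
""").fetchall()

print(with_suppliers)  # [('P1',), ('P2',)]
print(multi_supplier)  # [('P1',)]
```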

  • Quantified Subqueries

A quantified subquery allows several types of tests and can use the full set of comparison operators. It has the following general format:

value-1 {=|>|<|>=|<=|<>} {ANY|ALL|SOME} (query-1)

The comparison operator specifies how to compare value-1 to the single query column value from each subquery result row. The ANY, ALL, SOME specifiers give the type of match expected. ANY and SOME must match at least one row in the subquery. ALL must match all rows in the subquery.

For example, to list all parts that have suppliers:

SELECT *
FROM p
WHERE pno =ANY (SELECT pno FROM sp)

pno  descr   color
---  ------  -----
P1   Widget  Blue
P2   Widget  Red

A self join is used to list the supplier with the highest quantity of each part (ignoring null quantities):

SELECT *
FROM sp a
WHERE qty >ALL (SELECT qty FROM sp b
                WHERE a.pno = b.pno
                  AND a.sno <> b.sno
                  AND qty IS NOT NULL)

sno  pno  qty
---  ---  ----
S3   P1   1000
S3   P2   200
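Not every engine implements the ANY, ALL and SOME specifiers. SQLite, used here through Python's sqlite3 module as a stand-in engine, does not, so the sketch below rewrites the >ALL query as a NOT EXISTS test. The rewrite is an assumption of this example rather than the tutorial's own method: a row is kept when no other non-null quantity for the same part is greater than or equal to its own, and the extra a.qty IS NULL test reproduces the unknown result that >ALL gives for a null quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

top_suppliers = conn.execute("""
    SELECT sno, pno, qty
    FROM sp a
    WHERE NOT EXISTS (SELECT * FROM sp b
                      WHERE a.pno = b.pno
                        AND a.sno <> b.sno
                        AND b.qty IS NOT NULL
                        AND (a.qty IS NULL OR b.qty >= a.qty))
    ORDER BY pno
""").fetchall()

print(top_suppliers)  # [('S3', 'P1', 1000), ('S3', 'P2', 200)]
```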

  • EXISTS Subqueries

The EXISTS Subquery tests whether a subquery retrieves at least one row, that is, whether a qualifying row exists. It has the following general format

EXISTS(query-1)

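To be useful, an EXISTS subquery normally contains an outer reference, which makes it a correlated subquery. Without one, the subquery returns the same result for every row of the outer query.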

Note: the select list in the EXISTS subquery is not actually used in evaluating the EXISTS, so it can contain any valid select list (though * is normally used).

To list parts that have suppliers:

SELECT *
FROM p
WHERE EXISTS(SELECT * FROM sp WHERE p.pno = sp.pno)

pno  descr   color
---  ------  -----
P1   Widget  Blue
P2   Widget  Red
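EXISTS and its negation can be compared side by side. The sketch below uses Python's sqlite3 module as a stand-in engine (an assumption of this example); the NOT EXISTS query, which lists parts without suppliers, is an added illustration that does not appear in the tutorial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# Parts for which at least one sp row exists.
supplied = conn.execute("""
    SELECT pno FROM p
    WHERE EXISTS (SELECT * FROM sp WHERE p.pno = sp.pno)
    ORDER BY pno
""").fetchall()

# Parts for which no sp row exists.
unsupplied = conn.execute("""
    SELECT pno FROM p
    WHERE NOT EXISTS (SELECT * FROM sp WHERE p.pno = sp.pno)
    ORDER BY pno
""").fetchall()

print(supplied)    # [('P1',), ('P2',)]
print(unsupplied)  # [('P3',)]
```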

Scalar Subqueries

The Scalar Subquery can be used anywhere a value can be used. The subquery must reference just one column in the select list. It must also retrieve no more than one row.

When the subquery returns a single row, the value of the single select list column becomes the value of the Scalar Subquery. When the subquery returns no rows, a database null is used as the result of the subquery. Should the subquery retrieve more than one row, it is a run-time error and aborts query execution.

A Scalar Subquery can appear as a scalar value in the select list and where predicate of another query. The following query on the sp table uses a Scalar Subquery in the select list to retrieve the supplier city associated with the supplier number (sno column in sp):

SELECT pno, qty, (SELECT city FROM s WHERE s.sno = sp.sno)
FROM sp

pno  qty   city
---  ----  ------
P1   NULL  Paris
P1   200   London
P1   1000  Rome
P2   200   Rome

The next query on the sp table uses a Scalar Subquery in the where clause to match parts on the color associated with the part number (pno column in sp):

SELECT *
FROM sp
WHERE 'Blue' = (SELECT color FROM p WHERE p.pno = sp.pno)

sno  pno  qty
---  ---  ----
S1   P1   NULL
S2   P1   200
S3   P1   1000

Note that both example queries use outer references. This is normal in Scalar Subqueries. Often, Scalar Subqueries are Aggregate Queries.
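As an illustration of a Scalar Subquery that is also an Aggregate Query, the sketch below totals the quantity supplied for each part. It uses Python's sqlite3 module as a stand-in engine (an assumption of this example), and the query itself is an added example that is not taken from the tutorial. For P3 the subquery has no rows to summarize, so its value is null (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# The subquery returns one column and one row per outer row;
# p.pno inside it is an outer reference.
totals = conn.execute("""
    SELECT pno, (SELECT SUM(qty) FROM sp WHERE sp.pno = p.pno)
    FROM p
    ORDER BY pno
""").fetchall()

print(totals)  # [('P1', 1200), ('P2', 200), ('P3', None)]
```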

Table Subqueries

Table Subqueries are queries used in the FROM clause, replacing a table name. Basically, the result set of the Table Subquery acts like a base table in the from list. Table Subqueries can have a correlation name in the from list. They can also be in outer joins.

The following two queries produce the same result:

SELECT p.*, qty
FROM p, sp
WHERE p.pno = sp.pno
  AND sno = 'S3'

pno  descr   color  qty
---  ------  -----  ----
P1   Widget  Blue   1000
P2   Widget  Red    200

SELECT p.*, qty
FROM p, (SELECT pno, qty FROM sp WHERE sno = 'S3') sp3
WHERE p.pno = sp3.pno

pno  descr   color  qty
---  ------  -----  ----
P1   Widget  Blue   1000
P2   Widget  Red    200
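The equivalence of the plain join and the Table Subquery form can be checked by running both. The sketch below uses Python's sqlite3 module as a stand-in engine (an assumption of this example). The correlation name sp3 is a hypothetical name chosen here so that the outer WHERE clause has something to refer to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pno TEXT, descr TEXT, color TEXT);
    INSERT INTO p VALUES ('P1','Widget','Blue'), ('P2','Widget','Red'),
                         ('P3','Dongle','Green');
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO sp VALUES ('S1','P1',NULL), ('S2','P1',200),
                          ('S3','P1',1000), ('S3','P2',200);
""")

# Join against the base table, filtering on sno in the outer WHERE clause.
plain_join = conn.execute("""
    SELECT p.*, qty
    FROM p, sp
    WHERE p.pno = sp.pno AND sno = 'S3'
    ORDER BY p.pno
""").fetchall()

# Join against a Table Subquery that has already filtered on sno.
table_subquery = conn.execute("""
    SELECT p.*, qty
    FROM p, (SELECT pno, qty FROM sp WHERE sno = 'S3') sp3
    WHERE p.pno = sp3.pno
    ORDER BY p.pno
""").fetchall()

print(plain_join)
print(plain_join == table_subquery)  # True
```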

Grouping Queries

A Grouping Query is a special type of query that groups and summarizes rows. It uses the GROUP BY Clause.

A Grouping Query groups rows based on common values in a set of grouping columns. Rows with the same values for the grouping columns are placed in distinct groups. Each group is treated as a single row in the query result.

Even though a group is treated as a single row, the underlying rows can be subject to summary operations known as Set Functions whose results can be included in the query. The optional HAVING Clause supports filtering for group rows in the same manner as the WHERE clause filters FROM rows.

For example, grouping the sp table on the pno column produces 2 groups:

sno  pno  qty   group
---  ---  ----  ----------
S1   P1   NULL  'P1' Group
S2   P1   200   'P1' Group
S3   P1   1000  'P1' Group
S3   P2   200   'P2' Group

  • The P1 group contains 3 sp rows with pno='P1'
  • The P2 group contains a single sp row with pno='P2'

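Nulls get special treatment by GROUP BY. As described here, GROUP BY considers a null as distinct from every other null, so each row that has a null in one of its grouping columns forms a separate group. Note that the SQL-92 standard and most current products instead place all rows with a null in the same grouping column into a single group, so check the behavior of the product in use.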

Grouping the sp table on the qty column produces 3 groups:

sno  pno  qty   group
---  ---  ----  ----------
S1   P1   NULL  NULL Group
S2   P1   200   200 Group
S3   P2   200   200 Group
S3   P1   1000  1000 Group

The row where qty is null forms a separate group.

GROUP BY Clause

GROUP BY is an optional clause in a query. It follows the WHERE clause or the FROM clause if the WHERE clause is missing. A query containing a GROUP BY clause is a Grouping Query. The GROUP BY clause has the following general format:

GROUP BY column-1 [, column-2] ...

column-1 and column-2 are the grouping columns. They must be names of columns from tables in the FROM clause; they can't be expressions.

GROUP BY operates on the rows from the FROM clause as filtered by the WHERE clause. It collects the rows into groups based on common values in the grouping columns. Except for nulls, rows with the same set of values for the grouping columns are placed in the same group. If any grouping column for a row contains a null, the row is given its own group (the SQL-92 standard and most current products instead group rows with matching nulls together).

For example,

SELECT pno
FROM sp
GROUP BY pno

pno
---
P1
P2

In Grouping Queries, the select list can only contain grouping columns, plus literals, outer references and expressions involving these elements. Non-grouping columns from the underlying FROM tables cannot be referenced directly. However, non-grouping columns can be used in the select list as arguments to Set Functions. Set Functions summarize columns from the underlying rows of a group.

Set Functions

Set Functions are special summarizing functions used with Grouping Queries and Aggregate Queries. They summarize columns from the underlying rows of a group or aggregate.

Using the Group By example from above, grouping the sp table on the pno column:

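| sno | pno | qty  | group      |
|-----|-----|------|------------|
| S1  | P1  | NULL | 'P1' Group |
| S2  | P1  | 200  | 'P1' Group |
| S3  | P1  | 1000 | 'P1' Group |
| S3  | P2  | 200  | 'P2' Group |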

A Set Function can compute the total quantities for each group:

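| sno | pno | qty  | group      | qty total |
|-----|-----|------|------------|-----------|
| S1  | P1  | NULL | 'P1' Group | 1200      |
| S2  | P1  | 200  | 'P1' Group | 1200      |
| S3  | P1  | 1000 | 'P1' Group | 1200      |
| S3  | P2  | 200  | 'P2' Group | 200       |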

Null column values are ignored in computing the summary. The SUM Set Function computes the arithmetic sum of a numeric column in a set of grouped or aggregate rows. For example,

SELECT pno, SUM(qty)
FROM sp
GROUP BY pno

| pno | SUM(qty) |
|-----|----------|
| P1  | 1200     |
| P2  | 200      |
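The SUM example can be reproduced with the sketch below, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the variable name totals is illustrative. The null qty in the P1 group is skipped, which is why the total is 1200.

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# SUM ignores the NULL qty of supplier S1 when totalling the P1 group.
totals = conn.execute(
    "SELECT pno, SUM(qty) FROM sp GROUP BY pno ORDER BY pno"
).fetchall()
print(totals)  # [('P1', 1200), ('P2', 200)]
```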

Set Functions have the following general format:

set-function ( [DISTINCT|ALL] column-1 )

set-function is:

  • COUNT -- count of rows
  • SUM -- arithmetic sum of numeric column
  • AVG -- arithmetic average of numeric column; equivalent to SUM(column)/COUNT(column) over the non-null values.
  • MIN -- minimum value found in column
  • MAX -- maximum value found in column

The result of the COUNT function is always integer. The result of all other Set Functions is the same data type as the argument.

The Set Functions skip null column values and summarize only the non-null values. COUNT counts rows with non-null values, AVG averages non-null values, and so on. COUNT returns 0 when no non-null column values are found; the other functions return null when there are no values to summarize.
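The null handling described here can be seen by summarizing a set of rows whose qty values are all null. This is a sketch under the assumption that SQLite (via Python's standard sqlite3 module) is used as a stand-in DBMS; the variable name summary is illustrative. In the example data, supplier S1 has a single row with a null qty.

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# Only the S1 row is summarized, and its qty is NULL:
# COUNT(qty) yields 0, the other Set Functions yield NULL (None in Python).
summary = conn.execute(
    "SELECT COUNT(qty), SUM(qty), AVG(qty), MIN(qty), MAX(qty) "
    "FROM sp WHERE sno = 'S1'"
).fetchone()
print(summary)  # (0, None, None, None, None)
```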

A Set Function argument can be a column or a scalar expression.

The DISTINCT and ALL specifiers are optional. ALL specifies that all non-null values are summarized; it is the default. DISTINCT specifies that distinct column values are summarized; duplicate values are skipped. Note: DISTINCT has no effect on MIN and MAX results.

COUNT also has an alternate format:

COUNT(*)

... which counts the underlying rows regardless of column contents.
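To contrast COUNT(*), COUNT(column) and the DISTINCT specifier, here is a short sketch, again assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the variable name counts is illustrative. The qty column of sp holds NULL, 200, 1000 and 200.

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

counts = conn.execute(
    "SELECT COUNT(*), "            # all rows, nulls included
    "       COUNT(qty), "          # rows with a non-null qty
    "       COUNT(DISTINCT qty), " # distinct non-null qty values
    "       SUM(qty), "            # 200 + 1000 + 200
    "       SUM(DISTINCT qty) "    # 200 + 1000, duplicate 200 skipped
    "FROM sp"
).fetchone()
print(counts)  # (4, 3, 2, 1400, 1200)
```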

Set Function examples:

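SELECT pno, MIN(sno), MAX(qty), AVG(qty), COUNT(DISTINCT sno)
FROM sp
GROUP BY pno

| pno | MIN(sno) | MAX(qty) | AVG(qty) | COUNT(DISTINCT sno) |
|-----|----------|----------|----------|---------------------|
| P1  | S1       | 1000     | 600      | 3                   |
| P2  | S3       | 200      | 200      | 1                   |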

SELECT sno, COUNT(*) parts
FROM sp
GROUP BY sno

| sno | parts |
|-----|-------|
| S1  | 1     |
| S2  | 1     |
| S3  | 2     |
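Both Set Function examples can be run as the following sketch, assuming SQLite via Python's standard sqlite3 module as a stand-in DBMS; the variable names per_part and per_supplier are illustrative. One SQLite-specific detail: AVG is returned as a floating point value (600.0 rather than 600).

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# Several Set Functions per part; AVG for P1 is (200 + 1000) / 2.
per_part = conn.execute(
    "SELECT pno, MIN(sno), MAX(qty), AVG(qty), COUNT(DISTINCT sno) "
    "FROM sp GROUP BY pno ORDER BY pno"
).fetchall()
print(per_part)  # [('P1', 'S1', 1000, 600.0, 3), ('P2', 'S3', 200, 200.0, 1)]

# Number of sp rows (parts supplied) per supplier.
per_supplier = conn.execute(
    "SELECT sno, COUNT(*) AS parts FROM sp GROUP BY sno ORDER BY sno"
).fetchall()
print(per_supplier)  # [('S1', 1), ('S2', 1), ('S3', 2)]
```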

HAVING Clause

The HAVING Clause is associated with Grouping Queries and Aggregate Queries. It is optional in both cases. In Grouping Queries, it follows the GROUP BY clause. In Aggregate Queries, HAVING follows the WHERE clause or the FROM clause if the WHERE clause is missing.

The HAVING Clause has the following general format:

HAVING predicate

Like the WHERE Clause, HAVING filters the query result rows. WHERE filters the rows from the FROM clause. HAVING filters the grouped rows (from the GROUP BY clause) or the aggregate row (for Aggregate Queries).

predicate is a logical expression referencing grouped columns and set functions. It has the same restrictions as the select list for Grouping Queries and Aggregate Queries.

If the HAVING predicate evaluates to true for a grouped or aggregate row, the row is included in the query result. Otherwise, the row is skipped (not included in the query result).

For example,

SELECT sno, COUNT(*) parts
FROM sp
GROUP BY sno
HAVING COUNT(*) > 1

| sno | parts |
|-----|-------|
| S3  | 2     |
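The difference between WHERE and HAVING can be sketched as follows, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the variable names having_rows and where_rows are illustrative. HAVING is applied to the grouped rows, while WHERE is applied to the sp rows before grouping.

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# HAVING keeps only the groups with more than one underlying row.
having_rows = conn.execute(
    "SELECT sno, COUNT(*) AS parts FROM sp "
    "GROUP BY sno HAVING COUNT(*) > 1 ORDER BY sno"
).fetchall()
print(having_rows)  # [('S3', 2)]

# WHERE filters sp rows first, so S3 is grouped from its P1 row only.
where_rows = conn.execute(
    "SELECT sno, COUNT(*) AS parts FROM sp "
    "WHERE pno = 'P1' GROUP BY sno ORDER BY sno"
).fetchall()
print(where_rows)  # [('S1', 1), ('S2', 1), ('S3', 1)]
```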

Aggregate Queries

An Aggregate Query can use Set Functions and a HAVING Clause. It is similar to a Grouping Query except there are no grouping columns. The underlying rows from the FROM and WHERE clauses are grouped into a single aggregate row. An Aggregate Query always returns a single row, except when the Having clause is used.

An Aggregate Query is a query containing Set Functions in the select list but no GROUP BY clause. The Set Functions operate on the columns of the underlying rows of the single aggregate row. Except for outer references, any columns used in the select list must be arguments to Set Functions.

An aggregate query may also have a Having clause. The Having clause filters the single aggregate row. If the Having predicate evaluates to true, the query result contains the aggregate row. Otherwise, the query result contains no rows.

For example,

SELECT COUNT(DISTINCT pno) number_parts, SUM(qty) total_parts
FROM sp

| number_parts | total_parts |
|--------------|-------------|
| 2            | 1400        |
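A sketch of the Aggregate Query above, assuming SQLite via Python's standard sqlite3 module as a stand-in DBMS; the variable names and the non-existent supplier 'S9' are illustrative. The second query shows that a single aggregate row is returned even when the WHERE clause leaves no underlying rows.

```python
import sqlite3

# In-memory database holding the tutorial's sp table (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# No GROUP BY: all sp rows collapse into one aggregate row.
aggregate = conn.execute(
    "SELECT COUNT(DISTINCT pno) AS number_parts, SUM(qty) AS total_parts FROM sp"
).fetchall()
print(aggregate)  # [(2, 1400)]

# 'S9' is a hypothetical supplier with no rows; one row still comes back.
empty_aggregate = conn.execute(
    "SELECT COUNT(*), SUM(qty) FROM sp WHERE sno = 'S9'"
).fetchall()
print(empty_aggregate)  # [(0, None)]
```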

Subqueries are often Aggregate Queries. For example, parts with suppliers:

SELECT *
FROM p
WHERE (SELECT COUNT(*) FROM sp WHERE sp.pno=p.pno) > 0

| pno | descr  | color |
|-----|--------|-------|
| P1  | Widget | Blue  |
| P2  | Widget | Red   |

Parts with multiple suppliers:

SELECT *
FROM p
WHERE (SELECT COUNT(DISTINCT sno) FROM sp WHERE sp.pno=p.pno) > 1

| pno | descr  | color |
|-----|--------|-------|
| P1  | Widget | Blue  |
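The two subquery examples can be tried with this sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the variable names with_suppliers and multi_suppliers are illustrative. Each subquery is an Aggregate Query that is evaluated once per row of p through the outer reference p.pno.

```python
import sqlite3

# In-memory database holding the tutorial's p and sp tables (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (pno TEXT, descr TEXT, color TEXT)")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?, ?)",
    [("P1", "Widget", "Blue"), ("P2", "Widget", "Red"), ("P3", "Dongle", "Green")],
)
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# Parts with at least one supplier (P3 has no sp rows, so its count is 0).
with_suppliers = conn.execute(
    "SELECT * FROM p "
    "WHERE (SELECT COUNT(*) FROM sp WHERE sp.pno = p.pno) > 0 ORDER BY pno"
).fetchall()
print(with_suppliers)  # [('P1', 'Widget', 'Blue'), ('P2', 'Widget', 'Red')]

# Parts supplied by more than one distinct supplier.
multi_suppliers = conn.execute(
    "SELECT * FROM p "
    "WHERE (SELECT COUNT(DISTINCT sno) FROM sp WHERE sp.pno = p.pno) > 1"
).fetchall()
print(multi_suppliers)  # [('P1', 'Widget', 'Blue')]
```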

Union Queries

The SQL UNION operator combines the results of two queries into a composite result. The component queries can be SELECT/FROM queries with optional WHERE/GROUP BY/HAVING clauses. The UNION operator has the following general format:

query-1 UNION [ALL] query-2

query-1 and query-2 are full query specifications. The UNION operator creates a new query result that includes rows from each component query.

By default, UNION eliminates duplicate rows in its composite results. The optional ALL specifier requests that duplicates be retained in the UNION result.

The component queries of a Union Query can also be Union Queries themselves. Parentheses are used for grouping queries.

The select lists from the component queries must be union-compatible. They must match in degree (number of columns). For Entry Level SQL92, the column descriptor (data type and precision, scale) for each corresponding column must match. The rules for Intermediate Level SQL92 are less restrictive.
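The effect of the ALL specifier can be sketched as follows, assuming SQLite via Python's standard sqlite3 module as a stand-in DBMS; the variable names union_rows and union_all_rows are illustrative. Both component queries return a single pno column, so they match in degree.

```python
import sqlite3

# In-memory database holding the tutorial's p and sp tables (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (pno TEXT, descr TEXT, color TEXT)")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?, ?)",
    [("P1", "Widget", "Blue"), ("P2", "Widget", "Red"), ("P3", "Dongle", "Green")],
)
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT pno FROM sp UNION SELECT pno FROM p ORDER BY pno"
).fetchall()
print(union_rows)  # [('P1',), ('P2',), ('P3',)]

# UNION ALL keeps every row: 4 from sp plus 3 from p.
union_all_rows = conn.execute(
    "SELECT pno FROM sp UNION ALL SELECT pno FROM p ORDER BY pno"
).fetchall()
print(len(union_all_rows))  # 7
```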

Union-Compatible Queries

For Entry Level SQL92, each corresponding column of both queries must have the same column descriptor in order for two queries to be union-compatible. The rules are less restrictive for Intermediate Level SQL92. It supports automatic conversion within type categories. In general, the resulting data type will be the broader type. The corresponding columns need only be in the same data type category:

  • Character (String) -- fixed/variable length
  • Bit String -- fixed/variable length
  • Exact Numeric (fixed point) -- integer/decimal
  • Approximate Numeric (floating point) -- float/double
  • Datetime -- sub-category must be the same:
    • Date
    • Time
    • Timestamp
  • Interval -- sub-category must be the same:
    • Year-month
    • Day-time

UNION Examples

SELECT * FROM sp
UNION
SELECT CAST(' ' AS VARCHAR(5)), pno, CAST(0 AS INT)
FROM p
WHERE pno NOT IN (SELECT pno FROM sp)

| sno | pno | qty  |
|-----|-----|------|
| S1  | P1  | NULL |
| S2  | P1  | 200  |
| S3  | P1  | 1000 |
| S3  | P2  | 200  |
| ' ' | P3  | 0    |
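The UNION example can be run as the sketch below, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the variable name combined is illustrative. An ORDER BY on column positions (an addition not present in the tutorial's query) is used only to make the row order deterministic. The CAST expressions make the second select list line up with the sno and qty columns of sp.

```python
import sqlite3

# In-memory database holding the tutorial's p and sp tables (assumed setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (pno TEXT, descr TEXT, color TEXT)")
conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?, ?)",
    [("P1", "Widget", "Blue"), ("P2", "Widget", "Red"), ("P3", "Dongle", "Green")],
)
conn.executemany(
    "INSERT INTO sp VALUES (?, ?, ?)",
    [("S1", "P1", None), ("S2", "P1", 200), ("S3", "P1", 1000), ("S3", "P2", 200)],
)

# All sp rows, plus a placeholder row for each part that has no supplier.
combined = conn.execute(
    "SELECT * FROM sp "
    "UNION "
    "SELECT CAST(' ' AS VARCHAR(5)), pno, CAST(0 AS INT) "
    "FROM p WHERE pno NOT IN (SELECT pno FROM sp) "
    "ORDER BY 2, 1"  # sort by pno, then sno
).fetchall()
for row in combined:
    print(row)
```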

 

 

SQL Modification Statements

The SQL Modification


