Uncategorized

SQL

SQL

  • SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++.
  • What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.

Select-From-Where Statements

• The principal form of a query is:

SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of the tables

Our Running Example

• Our SQL queries will be based on the following database schema.

Movie(title, year, length, inColor, studioName, producerC) StarsIn(movieTitle, movieYear, starName) MovieStar(name, address, gender, birthdate) MovieExec(name, address, cert#, netWorth)

Studio(name, address, cert#, netWorth) Find all movies produced by Disney Studios in 1990.

SELECT *
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

Meaning of Single-Relation Query

• Begin with the relation in the FROM clause.

• Apply the selection indicated by the WHERE clause.

• Apply the extended projection indicated by the SELECT clause.

(Extended) Projection in SQL

SELECT title, length
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

SELECT title AS name, length AS duration
FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

SELECT title AS name, length*0.016667 AS lenghtInHours FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

SELECT title AS name, length/60 AS length, ‘hrs.’ AS inHours FROM Movie
WHERE studioName = ‘Disney’ AND year = 1990;

Selection in SQL

  • The selection of the relational algebra is available through the WHERE clause of SQL.
  • We may build expressions by using the operators:

    = <> < > <= >=

  • The string constants are surrounded by single quotes. – studioName = ‘Disney’
  • Numeric constants are for e.g.: -12.34, 1.23E45
  • Boolean operators are: AND, OR, NOT.

    SELECT title
    FROM Movie
    WHERE (year > 1970) AND NOT (inColor=’C’);

Selection in SQL (Cont.)

  • Which Disney movies are after 1970 or have length greater than 90 mins?

    SELECT title
    FROM Movie
    WHERE (year > 1970 OR length < 90) AND studioName=’Disney’;

  • The parenthesis are needed because the precedence of OR is less than that of AND.

Comparision of strings

• Strings can as well be compared (lexicographically) with the same operators:

= <> < > <= >=

• For instance ‘fodder’<‘foo’

‘bar’ < ‘bargain’

Patterns

  • WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches.
  • General form:

    <Attribute> LIKE <pattern>

    <Attribute> NOT LIKE <pattern>

  • Pattern is a quoted string with

    % = “any string”

    _ = “any character.”

  • Examples. Suppose we remember a movie “Star something”. SELECT title
    FROM Movie
    WHERE title LIKE ‘Star %’;

    SELECT title
    FROM Movie
    WHERE title LIKE ‘%”s%’;

    Two consecutive apostrophes in a string represent itself and not the end of the string.

Comparison of Strings (Continued)

  • What if the pattern we wish to use in a LIKE expression involves the characters % or _?
  • We should “escape” their special meaning proceeding them by some escape character.
  • In UNIX and C we use backslash \ as the escape character.
  • SQL allows us to use any character we like.
  • s LIKE ‘x%%x%%’ ESCAPE ‘x’
    • –  x will be the escape character.
    • –  A string that is matched by this pattern is for example: %aaaa%

      SELECT title
      FROM Movie
      WHERE title LIKE ‘x%%x%%’ ESCAPE ‘x’;

Ordering the Input

  • We may ask the tuples produced by a query to be presented in sorted order.
  • ORDER BY <list of attributes>
  • Example. Find the Disney movies of 1990.
    • –  Movie(title, year, length, inColor, studioName, producerC#)
    • –  To get the movies listed by length, shortest first, and among movies of equal length, sort alphabetically:

      SELECT *
      FROM Movie
      WHERE studioName = ‘Disney’ ORDER BY length, title;

  • Ordering is ascending, unless you specify the DESC keyword to an attribute.
  • Ties are broken by the second attribute on the ORDER BY list, etc.

NULL Values

• Tuples in SQL relations can have NULL as a value for one or more components.

• Meaning depends on context. Two common cases:

– Missing value : e.g., we know the length has some value, but we don’t know what it is.

– Inapplicable : e.g., the value of attribute spouse for an unmarried person.

Comparing NULL’s to Values

  • The logic of conditions in SQL is really 3- valued logic: TRUE, FALSE, UNKNOWN.
  • When any value is compared with NULL, the truth value is UNKNOWN.
  • But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).

Three-Valued Logic

  • To understand how AND, OR, and NOT work in 3-valued logic, think of

    – TRUE=1,FALSE=0,andUNKNOWN=_.

  • AND = MIN
  • OR=MAX
  • NOT(x) = 1-x
  • Example:
    TRUE AND (FALSE OR NOT(UNKNOWN)) =

    MIN(1, MAX(0, (1 – _ ))) =
    MIN(1, MAX(0, _ )) = MIN(1, _ ) = _.

Surprising Example

SELECT *
FROM Movie
WHERE length <=120 OR length > 120;

  • Suppose that we have some NULL values in the length.
  • What’s the result?

Products and Joins in SQL

  • SQL has a simple way to couple relations in one query: list each relation in the FROM clause.
    • –  All the relations in the FROM clause are coupled through Cartesian product
    • –  Then we can put conditions in the WHERE clause in order to get the various kinds of join.
  • Example. We want to know the name of the producer of Star Wars.
  • To answer we need the information from both of the relations:
    • –  Movie(title, year, length, inColor, studioName, producerC)
    • –  MovieExec(name, address, cert, netWorth)

      SELECT name
      FROM Movie, MovieExec
      WHERE title = ‘Star Wars’ AND producerC = cert;

Disambiguating Attributes

  • When we involve two or more relations in a query, we can have attributes with the same name among these relations.
  • We solve the problem of disambiguating between them by putting the name of the relation followed by a dot and then the name of the attribute.
  • Example. Suppose we wish to find pairs (star, movie executive) living in the same address.
    • –  MovieStar(name, address, gender, birthdate)
    • –  MovieExec(name, address, cert, netWorth)

      SELECT MovieStar.name, MovieExec.name FROM MovieStar, MovieExec
      WHERE MovieStar.address = MovieExec.address;

Tuple Variables

  • Sometimes we need to ask a query that involves two or more tuples from the same relation. We may list a relation R as many times we want in the from clause but we need a way to refer to each occurrence of R.
  • SQL allows us to define, for each occurrence in the FROM clause, an alias which we call “tuple variable.”
  • Example. We like to know about two stars who share an address.

    SELECT Star1.name, Star2.name
    FROM MovieStar AS Star1, MovieStar AS Star2 WHERE Star1.address = Star2.address

.

AND Star1.name < Star2.name;

Tuple Variables (Continued)

• Why we have the condition
– Star1.name < Star2.name ?

  • Without this condition we would produce also pairs of identical star names.
  • This conditions forces us to produce each pair of stars with a common address only.
  • Why we used the operator < and not <>?
  • If we had used <> the we would have produced pairs of married stars

twice, like:

Star1.name

Alec Baldwin Kim Basinger

Star2.name

Kim Basinger

Alec Baldwin

Conversion to Relational Algebra

  • Another approach to interpret SQL queries is to relate them to relational algebra.
  • Start with the relations in the FROM clause and take their Cartesian Product.
  • Having created the product, we apply a selection operator to it by converting the WHERE clause to a selection condition.
  • Finally from the list of attributes in the SELECT clause we do a projection.

An Unintuitive Consequence of SQL semantics

  • Suppose R, S, T are unary relations each having attribute A alone.
  • We wish to compute R∩(S∪T).
  • We might expect the following SQL query to do the job.

    SELECT R.A
    FROM R, S, T
    WHERE R.A = S.A OR R.A = T.A

  • However, consider the situation in which T is empty. Since R.A = T.A can never be satisfied, we might expect the query to produce exactly R∩T.
  • But using the interpretation the result is empty.

– If we use the conversion to RA, the Cartesian product R x S x T is ∅.

Union, Intersection, and Difference of Queries

  • If two SQL queries produce relations with the same set of attributes then we can combine the queries using the set operations: UNION, INTERSECT and EXCEPT.
  • Example. Suppose we want the names and addresses of all female movie stars who are also movie executives with a net worth over $1,000,000.
    • –  MovieStar(name, address, gender, birthdate)
    • –  MovieExec(name, address, cert, netWorth)

      (SELECT name, address FROM MovieStar WHERE gender = ‘F’)

      INTERSECT

      (SELECT name, address
      FROM MovieExec
      WHERE netWorth > 1000000);

Union, Intersection, and Difference of Queries (Continued)

  • Example. Give the names and addresses of movie stars who are not also movie executives.

    (SELECT name, address FROM MovieStar)

    EXCEPT

    (SELECT name, address FROM MovieExec);

  • Example. We want all the titles and years of movies that appeared in either the Movie or StarsIn relation.

    (SELECT title, year FROM Movie)

    UNION

    (SELECT title, year FROM StarsIn);