Computer Science Study Notes Help

Database System

1 SQL Part Ⅰ

1.1 SQL Introduction

SQL = Structured Query Language

Although over 40 years old, it keeps re-emerging as the standard.

Features:

  • Declarative!

    • Specify what you want, not how to get it.

  • Implemented widely

    • With varying levels of efficiency, completeness.

  • Constrained

    • Not targeted at Turing-complete tasks.

  • General-purpose and feature-rich

    • Many years of added features.

    • Extensible: Callouts to other languages, databases.

1.2 Relational Terminology

Definitions:

  • Database: Set of named relations.

  • Relation (aka Table):

    • Schema: description ( "metadata").

    • Instance: set of data satisfying the schema.

  • Attribute (aka Column, Field)

  • Tuple (aka Record, Row)

  • Cardinality: Number of rows in a table.

Database

Properties:

  • Schema is fixed, unique attribute names, atomic (aka primitive) types.

  • Tables are NOT ordered, they are sets or multisets (bags).

  • Tables are FLAT, no nested attributes.

  • Tables DO NOT prescribe how they are implemented/stored on disk, which is called physical data independence.

1.3 SQL Language

SQL Language:

  1. Two Sublanguages:

    • DDL (Data Definition Language): Define and modify schema.

    • DML (Data Manipulation Language): Queries can be written intuitively.

  2. Relational Database Management System (RDBMS): responsible for efficient evaluation, choose and run algorithms for declarative queries (choice of algorithm must not affect query answer).

SQL DDL Examples:

Sailors

sid

sname

rating

age

1

Fred

7

22

2

Jim

2

39

3

Nancy

8

27

CREATE TABLE Sailors ( sid INTEGER, sname CHAR(100), rating INTEGER, age FLOAT PRIMARY KEY (sid));

Boats

bid

bname

color

101

Nina

red

102

Pinta

blue

103

Santa Maria

red

CREATE TABLE Boats ( bid INTEGER, bname CHAR (20), color CHAR(10), PRIMARY KEY (bid));

Reserves

sid

bid

day

1

102

9/12

2

102

9/13

CREATE TABLE Reserves ( sid INTEGER, bid INTEGER, day DATE, PRIMARY KEY (sid, bid, day), FOREIGN KEY (sid) REFERENCES Sailors, FOREIGN KEY (bid) REFERENCES Boats);

1.4 SQL Queries

Basic Single Table Queries:

SELECT [DISTINCT] <column expression list> FROM <single table> [WHERE <predicate>]
  • Produce all tuples in the table that satisfy the predicate.

  • Output the expressions in the SELECT list,

  • Expression can be a column reference, or an arithmetic expression over column refs.

Example:

SELECT S.name, S.gpa, S.age*2 AS a2 FROM Students [AS] S WHERE S.dept = 'CS' ORDER BY S.gpa DESC, S.name ASC, a2; LIMIT 3;
  1. SELECT, FROM, WHERE lines:

    • Return all unique (name, GPA) pairs from students.

    • DISTINCT specifies removal of duplicate rows before output.

    • Can refer to the students table as S, this is called an alias.

  2. ORDER line:

    • ORDER BY clause specifies output to be sorted.

      • Sorts the output by GPA in descending order, then by name in ascending order.

      • Also computes the value of age*2 and names it a2.

    • Obviously must refer to columns in the output.

      Note the AS clause for naming output columns!

  3. LIMIT line:

    • Only produces the first 3 output rows.

    • Typically used with ORDER BY.

      • Otherwise the output is non- deterministic.

      • Not a "pure" declarative construct in that case – output set depends on algorithm for query processing.

Aggregates:

SELECT [DISTINCT] AVG(S.gpa) FROM Students S WHERE S.dept = 'CS';
  • Before producing output, compute a summary (aka an aggregate) of some arithmetic expression.

  • Produces 1 row of output, with one column of average in this case.

  • Other aggregates: SUM, COUNT, MAX, MIN (and others).

Group By:

SELECT [DISTINCT] AVG(S.gpa), S.dept FROM Students S GROUP BY S.dept
  • Partition table into groups with same GROUP BY column values, can group by a list of columns.

  • Produce an aggregate result per group, cardinality (rows) of output = number of distinct group values.

  • Note: can put grouping columns in SELECT list (in this case, average GPA + department).

Having:

SELECT [DISTINCT] AVG(S.gpa), S.dept FROM Students S GROUP BY S.dept HAVING COUNT(*) > 2
  • The HAVING predicate filters groups

  • HAVING is applied after grouping and aggregation

    • Hence can contain anything that could go in the SELECT list

    • i.e., aggs or GROUP BY columns

  • HAVING can only be used in aggregate queries

  • It's an optional clause

SQL Queries

2 SQL Part Ⅱ

2.1 Join Queries

Join: Form cross product of the tables, output all tuples and select specific columns.

Example:

Sailors

sid

sname

rating

age

1

Popeye

10

22

2

OliveOyl

11

39

3

Garfield

1

27

4

Bob

5

19

Reserves

sid

bid

day

1

102

9/12

2

102

9/13

1

101

10/01

SELECT S.sid, S.sname, R.bid FROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid;
SELECT S.sid, S.sname, R.bid FROM Sailors AS S, Reserves AS R WHERE S.sid=R.sid;

Output

Join Queries

sid

sname

bid

1

Popeye

102

2

OliveOyl

102

Self-Joins

SELECT x.sname AS sname1, x.age AS age1, y.sname AS sname2, y.age AS age2 FROM Sailors AS x, Sailors AS y WHERE x.age > y.age;

Output

sname1

age1

sname2

age2

Popeye

22

Bob

19

OliveOyl

39

Popeye

22

OliveOyl

39

Garfield

27

OliveOyl

39

Bob

19

Garfield

27

Popeye

22

Garfield

27

Bob

19

Inner Join

2.2 Select & Where Advanced

Arithmetic Expressions

SELECT S.age, S.age-5 AS age1, 2*S.age AS age2 FROM Sailors AS S WHERE S.sname = 'Popeye';

SQL Calculator

SELECT log(1000) as three, exp(ln(2)) as two, cos(0) as one, ln(2*3) = ln(2) + ln(3) as sanity;

three

two

one

sanity

3.0

2.0

1.0

1

String Comparison

SELECT S.sname FROM Sailors AS S WHERE S.sname LIKE 'B_%';
SELECT S.sname FROM Sailors AS S WHERE S.sname ~ 'B.*';

Boolean Logic vs. Set Operators

Reserve a red or a green boat

SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND (B.color = 'red' OR B.color = 'green');
SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'red'; UNION ALL SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'green';

Reserve a red and a green boat

-- A boat cannot be red and green at the same time SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND (B.color = 'red' AND B.color = 'green');
SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'red'; INTERSECT SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = 'green';

Set Semantics

Default (Think of each letter as being a tuple in a relation):

R = {A, A, A, A, B, B, C, D}

S = {A, A, B, B, B, C, E}

  • UNION: {A, B, C, D, E}

  • INTERSECT: {A, B, C}

  • EXCEPT: {D}

"ALL": Multiset Semantics

R = {A, A, A, A, B, B, C, D} = {A(4), B(2), C(1), D(1)}

S = {A, A, B, B, B, C, E} = {A(2), B(3), C(1), E(1)}

  • UNION ALL (sum of all cardinalities): {A(6), B(5), C(2), D(1), E(1)}

  • INTERSECT ALL (min of cardinalities): {A(min(4,2)), B(min(2,3)), C(min(1,1)), D(min(1,0)), E(min(0,1))} = {A, A, B, B, C}

  • EXCEPT ALL (subtract cardinalities): {A(4-2), B(2-3), C(1-1), D(1-0), E(0-1)} = {A, A, D}

2.3 Nested Queries

  1. In

    Example

    SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 102); -- Subquery
  2. Not In

    Example

    SELECT S.sname FROM Sailors S WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103); -- Subquery
  3. Exists

    Example 1 (bit odd, but legal)

    SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103); /* The query will output all the sailors if there is at least one sailor who reserves boat 103 or return nothing if there isn't */

    Example 2 (Nested Queries with Correlation)

    SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103 AND R.sid = S.sid); /* The query will output all the sailors who reserve boat 103. * In this example, the subquery will take every tuple of the sailors and run it, * tuple by tuple. */

Example for Division

SELECT S.sname FROM Sailors S WHERE NOT EXISTS (SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.sid FROM Reserves R WHERE R.bid = B.bid AND R.sid = S.sid));

ARGMAX

Example: Sailor with the Highest Rating

SELECT * FROM Sailors S WHERE S.rating >= ALL (SELECT S2.rating FROM Sailors S2);
SELECT * FROM Sailors S ORDER BY S.rating DESC LIMIT 1;
Last modified: 25 November 2024