Part Ⅰ
2 Linked Lists
The sentinel reference always points to a sentinel node.
Sentinel code make it easier to reason about code, and also give you specific goals to strive for in making sure your code works.
2.1 Singly Linked Lists
The first item (if it exists) is at sentinel.next
.
No need to check for special cases since sentinel node is never null!
2.2 Doubly Linked Lists & Deques
3 Union-Find
3.1 Quick Find (Eager Approach)
parent[i]
is the root of, and are connected if and only if they have the same parent[i]
.Find: Check if
and have the same parent[i]
.Union: To merge components containing
and , change all entries whose id equals parent[p]
toparent[q]
.
3.2 Quick Union (Lazy Approach)
parent[i] is the parent of
, root of is parent[parent[...[i]]] (keep going until it doesn't change). Find: Check if
and have the same root. Union: To merge components containing
and , set the parent of 's root to the parent of 's root.
3.3 Quick Union Improvements
Two Improvements: Weighting and Path Compression (WQUPC).
Weighted quick-union
Modify quick-union to avoid tall trees.
Keep track of size of each tree (number of objects).
Balance by linking root of smaller tree to root of larger tree.
Path compression
Just after computing the root of
, set the parent of each examined node to point to that root.
rank[i]
is the size of the tree rooted at. Property: Starting from an empty data structure, any sequence of
union-find operations on objects makes array accesses. In theory, WQUPC is not linear; in practice, it is.
Application: Percoloation, games, dynamic connectivity, least common ancestor, Kruskal's minimum spanning tree algorithm, etc.
4 Bags, Queues and Stacks
Stack: Examine the item most recently added (LIFO: last in first out).
Queue: Examine the item least recently added (FIFO: first in first out).
4.1 Stacks
Defintions
Client: program using operations defined in interface.
Implementation: actual code implementing operations.
Interface: description of data types, basic operations.
Separate interface from implementation!
Benefits:
Clients can't know details of implementation => client has many implementation from which to choose.
Implementation can't know details of client needs => many clients can re-use the same implementation.
Design: creates modular, reusable libraries.
Performance: use optimized implementation where it matters.
4.1.1 Built-In Stacks
4.1.2 Linked-List Implementation
Stack pop
Save item to return.
Delete first node.
Return saved item.
Stack push
Save a link to the list.
Create a new node for the beginning.
Set the instance variables in the new node.
Properties
Every operation takes constant time in the worst case.
A stack with
items uses bytes.
4.1.3 Resizing-Array Implementation
Property: Uses between
when full. when one-quarter full.
Linked-List Implementation | Resizing-Array Implementation |
---|---|
Every operation takes constant time in the worst case. | Every operation takes constant amortized time. |
Uses extra time and space to deal with the links. | Less wasted space. |
4.2 Queues
4.2.1 Built-in Queues
4.2.2 Queue Implementation
Queue dequeue
Save item to return.
Delete first node.
Return saved item.
Queue enqueue
Save a link to the last node.
Create a new node for the end.
Link the new node to the end of the list.
4.3 Generics
Use generics for different types of objects!
The implementation above has already taken generics into account.
4.4 Iterators
Iterators allow clients to iterate through the items in a collection.
Using iterator, we can:
The implementation above has already taken iterators into account.
For more information on iterators, please visit Iterators in C++.
4.5 Bag (Princeton)
The order of items in a bag does not matter.
5 Elementary Sorts
5.1 Selection Sort
Time Complexity:
. Space Complexity:
.
Selection Sort
In iteration
, find index of smallest remaining entry. Swap
a[i]
anda[min]
.
Properties
Use
compares and exchanges. Quadratic time, even if the input is sorted!
Data movement is minimal: linear number of exchanges -> suitable for small number of data.
5.2 Insertion Sort
Time Complexity:
. Space Complexity:
.
Insertion Sort
In iteration
, swap with each larger entry to its left. Entries to the left of
are in ascending order.
Definitions
Inversion: A pair of keys that are out of order.
Example A E E L M O T R X P S
Six inversions: T-R T-P T-S R-P X-P X-S
Partially Sorted: An array is partially sorted if the number of inversions is
.
Property: For partially-sorted arrays, insertion sort runs in linear time.
Proof: Number of exchanges equals the number of inversions (number of compares = exchanges + (N – 1)).
Running Time Analysis
On average: To sort a randomly-ordered array with distinct keys, insertion sort uses
compares and exchanges on average. Best case: If the array is in ascending order, insertion sort makes
compares and exchanges. Worst Case: If the array is in descending order (and no duplicates), insertion sort makes
compares and exchanges.
5.3 Shell Sort
Shell Sort
Move entries more than one position at a time by
-sorting the array. -sort the array: Insertion sort, with stride .
Increment Sequence
Knuth's increment sequence (3x+1): 1, 4, 13, 40, 121, 364, 1093, ... (OK, easy to compute)
Sedgewick's increment sequence: ( merging of
and 1, 5, 19, 41, 109, 209, 505, 929, 2161, ... (Good, tough to beat empirical studies).
Property: With the
5.4 Shuffling
Knuth Shuffle
In iteration
, pick integer between and uniformly at random. Swap
a[i]
anda[r]
.
5.5 Convex Hull
Convex Hull: A convex hull of a set of
Equivalent definitions
Smallest convex set containing all the points.
Smallest area convex polygon enclosing the points.
Convex polygon enclosing the points, whose vertices are points in set.
Geometric properties
Can traverse the convex hull by making only counterclockwise turns.
The vertices of convex hull appear in increasing order of polar angle with respect to point
with lowest -coordinate.
Graham Scan
Choose point
with smallest -coordinate. Sort points by polar angle with
. Consider points in order; discard unless it create a ccw (counterclockwise) turn.
Implementing ccw: Given three points
ccw
Determinant (or cross product) gives
signed area of planar triangle. If signed area
, then -> -> is counterclockwise. If signed area
, then -> -> is clockwise. If signed area
, then -> -> are collinear.
Proof
Applications
Hammer nails perpendicular to plane, stretch elastic rubber band around points
Find shortest path in the plane from
to that avoids a polygonal obstacle.Shortest path is either straight line from
to or it is one of two polygonal chains of convex hull .Given
points in the plane, find a pair of points with the largest Euclidean distance between them.Farthest pair of points are extreme points on convex hull.
Cost of Convex Hull:
Proof: Convex hull reduces to sorting
cost of sorting cost of reduction
Point2D
#ifndef POINT2D_H #define POINT2D_H #include <iostream> class Point2D { public: double x; double y; Point2D(const double x, const double y) : x(x), y(y) {} [[nodiscard]] double distanceTo(const Point2D& that) const; [[nodiscard]] double distanceSquaredTo(const Point2D& that) const; [[nodiscard]] int compareTo(const Point2D& that) const; struct PolarOrder; bool operator==(const Point2D& other) const; friend std::ostream& operator<<(std::ostream& os, const Point2D& p); static int ccw(const Point2D& a, const Point2D& b, const Point2D& c); }; struct Point2D::PolarOrder { Point2D origin; explicit PolarOrder(const Point2D& origin) : origin(origin) {} bool operator()(const Point2D& q1, const Point2D& q2) const; }; #endif // POINT2D_H
Graham Scan
6. Mergesort
6.1 Mergesort
Time complexity:
.Space complexity:
.
Merge Sort
Divide the array into two halves.
Recursively sort each half.
Merge two halves.
In-place: A sorting algorithm is in-place if it uses
Example: Insertion sort, selection sort, shellsort.
Property: Mergesort uses at most
Proof
The number of compares
->
Prove
And we have:
6.2 Bottom-Up Mergesort
Bottom-Up Mergesort
Pass through array, merging subarrays of size 1.
Repeat for subarrays of size 2, 4, 8, 16, ...
6.3 Computational Complexity
Definitions
Computational complexity: Framework to study efficiency of algorithms for solving a particular problem
.Model of computation: Allowable operations.
Cost model: Operation count(s).
Upper bound: Cost guarantee provided by some algorithm for
.Lower bound: Proven limit on cost guarantee of all algorithms for
.Optimal algorithm: Algorithm with best possible cost guarantee for
.
Example: sorting
Model of computation: decision tree (can access information only through compares).
Cost model: Number of compares .
Upper bound:
from mergesort.Lower bound:
.Optimal algorithm = mergesort.
Property: Any compare-based sorting algorithm must use at least
Proof
Assume array consists of
distinct values through .Worst case dictated by height
of decision tree.Binary tree of height
has at most leaves. different orderings => at least leaves.
6.4 Stability
Stability: A stable sort preserves the relative order of items with equal keys.
Methods
Need to carefully check the code ("less than" vs " less than or equal to"). Equal items never move past each other -> stable sort.
Counterexample: Long-distance exchange might move an item past some equal item.
Conclusion
Stable sort: Insertion sort, mergesort, bubble sort, radix sort, etc.
Unstable sort: Selection sort, shellsort, quicksort, heapsort, etc.
7 Quicksort
7.1 Quicksort
Quicksort
Shuffle the array.
Partition so that, for some
:Entry
a[j]
is in place;No larger entry to the left of
;No smaller entry to the right of
.
Sort each piece recursively .
Quicksort Partitioning
Scan
from left to right so long as (a[i] < a[lo]).Scan
from right to left so long as (a[j] > a[lo]).Exchange a[i] with a[j], repeat until
and pointers cross.When pointers cross, Exchange a[lo] with a[j].
Runtime Analysis
Time complexity:
.Best case: Number of compares is
.Worst case: Number of compares is
.
Property: The average number of compares
Proof
Multiply both sides by
and collect terms:Subtract this from the same equation for
:Repeatedly apply above equation:
Approximate sum by integral:
Finally, desired result:
Quicksort is an in-place, non-stable sorting algorithm.
Summary of performance characteristics:
Worst case: Number of compares are quadratic.
More likely that your computer is struck by lightning bolt.
Average case: Number of compares is
.39% more compares than mergesort.
But faster than mergesort in practice because of less data movement.
Random shuffle
Probabilistic guarantee against worst case.
Basis for math model that can be validated with experiments .
Caveat emptor
Many textbook implementations go quadratic if array:
Is sorted or reverse sorted.
Has many duplicates (even if randomized!)
7.2 Selection
Goal: Given an array of
Quick Selection
Partition array so that entry
a[j]
is in place.Repeat in one subarray, depending on
; finished when equals .
Property: Quick-select takes linear time on average.
Proof
Intuitively, each partitioning step splits array approximately in half:
compares.Formal analysis similar to quicksort analysis yields:
to find the median
7.3 Duplicate Keys
Problem of quicksort with duplicate keys
Mergesort with duplicate keys: Always between
and compares.Quicksort with duplicate keys: Algorithm goes quadratic unless partitioning stops on equal keys.
Goal: Partitioning array into 3 parts so that:
Entries between lt and gt equal to partitioning item v.
No larger entries to left of lt.
No smaller entries to right of gt.
Dijkstra 3-way Partitioning
Let v be partitioning item a[lo].
Scan i from left to right.
a[i] < v: exchange a[lt] with a[i]; increment both
and .a[i] > v: exchange a[gt] with a[i]; decrement
.a[i] == v: increment
.
Sorting lower bound: If there are
compares in the worst case (
Property: Quicksort with 3-way partitioning is entropy-optimal.
Conclusion
8 Priority Queues
8.1 API and Elementary Implementations
Applications
Event-driven simulation [customers in a line, colliding particles]
Numerical computation [reducing roundoff error]
Data compression [Huffman codes]
Graph searching [Dijkstra's algorithm, Prim's algorithm]
Number theory [sum of powers]
Artificial intelligence [A* search]
Statistics [maintain largest M values in a sequence]
Operating systems [load balancing, interrupt handling]
Discrete optimization [bin packing, scheduling]
Spam filtering [Bayesian spam filter]
Built-in Implementation
Unordered Array Implementation
Ordered Array Implementation
8.2 Binary Heap
8.1.1 Concepts & Properties
Definitions:
Binary tree: Empty or node with links to left and right binary trees.
Complete binary tree: Perfectly balanced, except for bottom level.
Binary heap: Array representation of a heap-ordered complete binary tree.
Property of complete binary tree: Height of complete binary tree with
Proof: Height only increases when
Properties of Binary Heap:
Key in nodes.
Parent's key no smaller than children's keys.
Largest key is
a[1]
, which is root of binary tree.Can use array indices to move through tree.
Parent of node at
is at .Children of node at
are at and .
8.1.2 Binary-Heap Implementation of Priority Queues
Violation in Binary Heap (Child's key becomes larger key than its parent's key
Exchange key in child with key in parent.
Repeat until heap order restored.
Insertion in Binary Heap
Add node at end, then swim it up.
At most
compares.
Demotion in Binary Heap (Parent's key becomes smaller than one (or both) of its children's
Exchange key in parent with key in larger child.
Repeat until heap order restored.
Deletion in Binary Heap
Exchange root with node at end, then sink it down.
At most
compares.
8.3 Indexed Priority Queue
Indexed Priority Queue: Associate an index between
Client can insert and delete-the-minimum.
Client can change the key by specifying the index.
Indexed Priority Queue
Start with same code as MinPQ.
Maintain parallel arrays keys[], pq[] and qp[] so that:
keys[i] is the priority of i.
pq[i] is the index of the key in heap position i.
qp[i] is the heap position of the key with index i .
Use swim(qp[i]) implement decreaseKey(i, key).
8.4 Heapsort
Heapsort
Heap construction: Build max heap using bottom-up method.
Sortdown: Repeatedly delete the largest remaining item.
Properties
Heap construction uses
compares and exchanges.Heapsort uses
compares and exchanges.Heapsort is an in-place algorithm with
worst-case.