Database System

Last Modified: 2/12/2025

1 SQL Part Ⅰ

1.1 SQL Introduction

2 Disks & Drives

2.1 DBMS Architectures

Favicon

Features

2.2 Storage Techniques

Memory Hierarchy

Favicon

2.2.1 Disks

Disk Components

Favicon

Platters spin together (around 15000 rpm). Arm assembly can move in or out to position a head on a desired track, but only one head reads/writes at any one time.

Disk Operation & Access Time

Disks read and write data in sector-size blocks with access time of three components.

Tavg transfer=1RPM×1average # sectors/tracks×60s1minT_{\text{avg transfer}}=\frac{1}{\text{RPM}} \times \frac{1}{\text{average \# sectors/tracks}} \times \frac{60s}{1min}

Disk Capacity

Disk capacity is determined by the following technology factors.

Capacity=# bytessector×average # sectorstrack×# trackssurface×# surfacesplatter×# plattersdisk\text{Capacity} = \frac{\text{\# bytes}}{\text{sector}} \times \frac{\text{average \# sectors}}{\text{track}} \times \frac{\text{\# tracks}}{\text{surface}} \times \frac{\text{\# surfaces}}{\text{platter}} \times \frac{\text{\# platters}}{\text{disk}}

2.2.2 Random Access Memory (RAM)

Random-access memory (RAM) is a form of electronic computer memory that can be read and changed in any order, typically used to store working data and machine code. A random-access memory device allows data items to be read or written in almost the same amount of time irrespective of the physical location of data inside the memory.

RAM is a type of volatile memory (a type of memory that requires power to maintain the stored information). There are two widely used forms of RAM: SRAM and DRAM (there are SDRAM and so forth as well).

SRAMDRAM
CompositionTypically 4 or 6 transistors1 transitor + 1 capacitor
Basic StructureSRAM StructureDRAM Structure
Relative Access Time1×1 \times10×10 \times
RefreshingDoes not require refreshing, loses data if power is turned offRequires constant refreshing to maintain data, loses data if not refreshed
ApplicationsCache memoriesMain memories, frame buffer

2.2.3 Solid State Disks (SSDs)

Key Terminology

For reading in SSDs, the process is straightforward:

  1. Address Mapping: The SSD controller translates logical block addresses (LBAs) to physical block addresses (PBAs).
  2. Read: The SSD controller reads the data from the physical block address.

For writing in SSDs, the process is more complex:

  1. Address Mapping: Same as in the reading procedure.
  2. Check Page State: Check if the page to be written already contains data. If the page is part of a block that has been recently erased (all ‘1’s), SSD can directly write to the page. Otherwise, the SSD must:
    • Read block contents, identify & copy valid pages (not the target page we want to modify) to a new, erased block (write amplification).
    • SSD controller uses the new modified data and writes the new content to the corresponding page in the new block, updates the mapping table, and the original block is ready for garbage collection.

2.2.4 Locality

Principle of Locality

#include <iostream>
#include <vector>
int main() {
std::vector<int> data(1000, 1);
int sum = 0;
// Temporal locality: 'sum' is accessed repeatedly in the loop
for (int i = 0; i < data.size(); ++i) {
sum += data[i]; // Spatial locality: 'data[i]' is accessed sequentially
}
std::cout << "Sum: " << sum << std::endl;
return 0;
}

/* TODO: Add more examples */

2.3 Database Files

For database system, tables are stored as logical files on disk. Each file contains a collection of pages, and each page contains a collection of records. Normally, a record represents a row (tuple) in a table, and it contains a collection of fields (attributes).

API for higher levels of DBMS

Pages are managed by:

Procedure

  1. The DBMS wants to access a specific record. It determines the logical page number containing that record, and asks the buffer manager for the page with that logical page number.
  2. Buffer Hit/Miss:

For DBMS, several of the possible ways of organizing records in files are:

2.3.1 Heap Files

Unordered heap files: Collection of records in no particular order.

Heap Files