Database System
Last Modified: 1/18/2025
1 SQL Part Ⅰ
1.1 SQL Introduction
2 Disks & Drives
2.1 DBMS Architectures
Features- Organized in layers
- Each layer abstracts the layer below: Manage complexity (easy to use) & Performance assumptions (assume the perfomrance of lower layers)
- Two cross-cutting issues related to storage and memory management: Concurrency control & Recovery
2.2 Storage Techniques
Memory Hierarchy
2.2.1 Disks
Disk Components
Platters spin together (around 15000 rpm). Arm assembly can move in or out to position a head on a desired track, but only one head reads/writes at any one time.
Disk Operation & Access TimeDisks read and write data in sector-size blocks with access time of three components.
- Seek Time: Moving arms to position disk head on track that contains that contains the target sector.
- Rotational Delay: Waiting for block to rotate the first bit of the target sector to pass under the head.
- Transfer Time: Moving data to/from disk surface (Read/Write the contents of the sector).
Disk capacity is determined by the following technology factors.
- Recording Density (bits/in): The number of bits that can be squeezed into a 1-inch segment of a track.
- Track density (tracks/in): The number of tracks that can be squeezed into a 1-inch segment of the radius extending from the center of the platter.
- Areal density (bits/in2): The product of the recording density and the track density.
2.2.2 Random Access Memory (RAM)
Random-access memory (RAM) is a form of electronic computer memory that can be read and changed in any order, typically used to store working data and machine code. A random-access memory device allows data items to be read or written in almost the same amount of time irrespective of the physical location of data inside the memory.
RAM is a type of volatile memory (a type of memory that requires power to maintain the stored information). There are two widely used forms of RAM: SRAM and DRAM (there are SDRAM and so forth as well).
SRAM | DRAM | |
---|---|---|
Composition | Typically 4 or 6 transistors | 1 transitor + 1 capacitor |
Basic Structure | ||
Relative Access Time | ||
Refreshing | Does not require refreshing, loses data if power is turned off | Requires constant refreshing to maintain data, loses data if not refreshed |
Applications | Cache memories | Main memories, frame buffer |
2.2.3 Solid State Disks (SSDs)
Key Terminology
- Cells: The fundamental units of storage, storing a single bit of information.
- Pages: Groups of cells form pages. These are the smallest units that can be read or written to. Typical page sizes are 4KB or 8KB.
- Blocks: Groups of pages form blocks. Blocks are the smallest units that can be erased. Typical block sizes are 1-2MB.
- Planes/Dies/Channels: These are ways to parallelize read/write access.
For reading in SSDs, the process is straightforward:
- Address Mapping: The SSD controller translates logical block addresses (LBAs) to physical block addresses (PBAs).
- Read: The SSD controller reads the data from the physical block address.
For writing in SSDs, the process is more complex:
- Address Mapping: Same as in the reading procedure.
- Check Page State: Check if the page to be written already contains data. If the
page is part of a block that has been recently erased (all ‘1’s), SSD can directly
write to the page. Otherwise, the SSD must:
- Read block contents, identify & copy valid pages (not the target page we want to modify) to a new, erased block (write amplification).
- SSD controller uses the new modified data and writes the new content to the corresponding page in the new block, updates the mapping table, and the original block is ready for garbage collection.
Note
- For SSDs, high charge means ‘1’, and low charge means ‘0’ (related to floating-gate transistor). Writing ‘1’ to ‘0’ requires removing charge, and resetting requires an electrical impulse!
- Flash blocks only support 2k-3k erasures before failure, so wear-leveling (by spreading erasures evenly across all blocks) is important!
- For SSDs, read sequences aren’t necessary, but random writing is slower than sequential writing!