x86-64 Architecture
6 Introduction to Assembly Language
6.1 Assembly Language
Assembly (also known as Assembly language, ASM): A low-level programming language where the program instructions match a particular architecture's operations.
Architecture: (also ISA: instruction set architecture) The parts of a processor design that one needs to understand for writing assembly/machine code.
Properties:
Splits a program into many small instructions that each do one single part of the process.
Each architecture will have a different set of operations that it supports (although there are similarities).
Assembly is not portable to other architectures.
Complex/Reduced Instruction Set Computing
Early trend - add more and more instructions to do elaborate operations
Complex Instruction Set Computing (CISC)
Difficult to learn and comprehend language
Less work for the compiler
Complicated hardware runs more slowly
Opposite philosophy later began to dominate
Reduced Instruction Set Computing (RISC)
Simple (and smaller) instruction set makes it easier to build fast hardware.
Let software do the complicated operations by composing simpler ones.
6.2 Registers
Assembly uses registers to store values. Registers are:
Small memories of a fixed size.
Can be read or written.
Limited in number.
Very fast and low power to access.
Property | Registers | Memory |
---|---|---|
Speed | Fast | Slow |
Size | Small, e.g., 32 registers * 32 bits = 128 bytes | Big, e.g., 4-32 GB |
Connection: When there are more variables than registers, keep the most frequently used ones in registers and move the rest to memory.
x86-64 Registers
%rax %eax | %r8 %r8d |
%rbx %ebx | %r9 %r9d |
%rcx %ecx | %r10 %r10d |
%rdx %edx | %r11 %r11d |
%rsi %esi | %r12 %r12d |
%rdi %edi | %r13 %r13d |
%rsp %esp | %r14 %r14d |
%rbp %ebp | %r15 %r15d |
6.3 x86-64 Instructions
In high-level languages, variable types determine operation.
In assembly, operation determines type, i.e., how register contents are treated.
Operations
Transfer data between memory and register
Load data from memory into register
Store register data into memory
Perform arithmetic function on register or memory data
Transfer control
Unconditional jumps to/from procedures
Conditional branches
Indirect branches
6.3.1 Moving Data Instructions
movq source, dest
Operand Types
Immediate: Constant integer data. Like C constants, but prefix with '$'.
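Register: One of the 16 integer registers, e.g., %rax or %r13.
Memory: 8 consecutive bytes of memory at the address given by a register (for movq), e.g., (%rax).
The 8-byte size corresponds to the q suffix of movq. A mov instruction without a suffix is size-dependent: the assembler infers the operand size from the register operands, and when no operand fixes the size (for example, an immediate stored to memory) it may fall back to a 32-bit operation.
Prefer the explicit suffixes (movq, movl, movw, movb) whenever possible.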
Memory Addressing Mode
Normal (R) => pointer dereferencing in C
movq (%rcx), %rax
equals: rax = *rcx
Displacement D(R) => accessing data at a fixed offset from a base address
movq 8(%rbp), %rdx
equals: rdx = *(long *)((char *)rbp + 8)
Complete Mode: D(Rb,Ri,S) => accesses the address Rb + S * Ri + D
D: Constant "displacement" 1, 2, or 4 bytes
Rb: Base register: Any of 16 integer registers
Ri: Index register: Any, except for %rsp
S: Scale: 1, 2, 4, or 8
Example:
movl (%rbx,%rdi,4), %eax
Calculate the address: %rbx + (4 * %rdi) (e.g., if %rdi holds 5, base address + 4 * 5 = base address + 20).
Access the 32-bit value at that calculated address.
Move the retrieved value into the %eax register.
6.3.2 Address Computation Instructions
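leaq src, dst
src is an address-mode expression; leaq sets dst to the computed address without accessing memory.
It can also be used for arithmetic expressions of the form x + k * y with k = 1, 2, 4, or 8 (e.g., leaq (%rdi,%rdi,2), %rax computes 3 * %rdi).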
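The address arithmetic shared by the complete addressing mode and leaq can be modeled in plain C. This is only a sketch: registers are modeled as 64-bit integers, and the helper names `effective_address` and `times_five` are hypothetical, not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for the complete mode D(Rb,Ri,S): Rb + S*Ri + D.
   Hypothetical helper; registers are modeled as 64-bit integers. */
uint64_t effective_address(int64_t d, uint64_t rb, uint64_t ri, unsigned s)
{
    assert(s == 1 || s == 2 || s == 4 || s == 8); /* only legal scales */
    return rb + ri * s + (uint64_t)d;
}

/* What `leaq (%rdi,%rdi,4), %rax` computes: x + 4*x = 5*x.
   Only the address is computed; memory is never read. */
uint64_t times_five(uint64_t x)
{
    return effective_address(0, x, x, 4);
}
```

A mov with the same operand would go on to read memory at that address, whereas leaq stops after the addition, which is why compilers use it for small multiplications.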
6.3.3 Arithmetic Instructions
Two Operand Instructions:
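Format | Computation |
---|---|
addq src, dest | dest = dest + src |
subq src, dest | dest = dest - src |
imulq src, dest | dest = dest * src |
salq src, dest | dest = dest << src (also called shlq) |
sarq src, dest | dest = dest >> src (arithmetic) |
shrq src, dest | dest = dest >> src (logical) |
xorq src, dest | dest = dest ^ src |
andq src, dest | dest = dest & src |
orq src, dest | dest = dest \| src |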
One Operand Instructions:
Format | Computation |
---|---|
incq dest | dest = dest + 1 |
decq dest | dest = dest - 1 |
negq dest | dest = - dest |
notq dest | dest = ~ dest |
Example
6.4 C, Assembly & Machine Code
Compile with the following code (debugging-friendly):
Use the following command to generate the assembly code:
Use the following command to disassemble the machine code:
7 Control Flow Instructions
7.1 Conditional Codes
Processor State (Partial)
Temporary Data: %rax, ...
Location of runtime stack: %rsp, ...
Location of current code control point: %rip, ...
Status of recent tests: CF, ZF, SF, OF, ... => Conditional Codes!
Conditional Codes
Single-bit registers. GDB prints them together as one "rflags" register or, more commonly, "eflags" (the lower 32 bits).
Implicitly set (as a side effect) by arithmetic operations.
CF (Carry Flag): Carry out of, or borrow into, the most significant bit, i.e., overflow or underflow in the unsigned range.
ZF (Zero Flag): The result of an operation is zero.
SF (Sign Flag): The result of an operation, interpreted as a signed two's complement number, is negative (MSB is 1).
OF (Overflow Flag): Overflow in signed two's complement.
Instructions
cmp a, b
Computes b - a and sets the condition codes based on the result, but does not change b! Used for comparisons such as if (a < b).
test a, b
Computes b & a and sets the condition codes based on the result, but does not change b!
Most common use: test %rX, %rX to compare %rX to zero.
Second most common use: test %rX, %rY to test if any of the 1-bits in %rY are also 1 in %rX.
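How the four flags come out of a cmp can be modeled in C. The sketch below assumes the AT&T operand order (cmpq a, b computes b - a); the struct `flags` and the functions `cmp_flags` and `jl_taken` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four condition codes. */
struct flags { bool cf, zf, sf, of; };

/* Flags produced by `cmpq a, b`, which computes b - a and discards it. */
struct flags cmp_flags(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t r = ub - ua;                 /* wraps modulo 2^64 */
    bool sign_a = a < 0, sign_b = b < 0, sign_r = (r >> 63) != 0;
    struct flags f;
    f.cf = ub < ua;                       /* unsigned borrow */
    f.zf = (r == 0);
    f.sf = sign_r;                        /* MSB of the result */
    /* Signed overflow of b - a: operands differ in sign and the
       result's sign differs from b's. */
    f.of = (sign_a != sign_b) && (sign_r != sign_b);
    return f;
}

/* A signed "less" jump after `cmpq a, b` is taken when SF != OF,
   which holds exactly when b < a as signed numbers. */
bool jl_taken(int64_t a, int64_t b)
{
    struct flags f = cmp_flags(a, b);
    return f.sf != f.of;
}
```

Comparing SF with OF rather than looking at SF alone is what keeps the signed comparison correct when the subtraction overflows, for example when b is the most negative value and a is positive.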
7.2 Conditional Branches
Jumping
Instruction | Condition | Description |
---|---|---|
jmp | 1 | Unconditional |
je | ZF | Equal/Zero |
jne | ~ZF | Not Equal/Not Zero |
js | SF | Negative |
jns | ~SF | Non-Negative |
jg | ~(SF^OF) & ~ZF | Greater (Signed) |
jge | ~(SF^OF) | Greater or Equal (Signed) |
jl | SF^OF | Less (Signed) |
jle | (SF^OF) \| ZF | Less or Equal (Signed) |
ja | ~CF & ~ZF | Above (unsigned) |
jb | CF | Below (unsigned) |
SetX Instructions: Set the low-order byte of the destination to 0 or 1 based on combinations of condition codes. Does not alter the remaining 7 bytes!
Instruction | Condition | Description |
---|---|---|
sete | ZF | Equal/Zero |
setne | ~ZF | Not Equal/Not Zero |
sets | SF | Negative |
setns | ~SF | Non-Negative |
setg | ~(SF^OF) & ~ZF | Greater (Signed) |
setge | ~(SF^OF) | Greater or Equal (Signed) |
setl | SF^OF | Less (Signed) |
setle | (SF^OF) \| ZF | Less or Equal (Signed) |
seta | ~CF & ~ZF | Above (unsigned) |
setb | CF | Below (unsigned) |
Normal version vs. goto version
Example
7.3 Loops
Example
7.4 Switch Statements
Example
8 Program Optimization
Limitations of Optimizing Compilers
Operate under a fundamental constraint: must not cause any change in program behavior.
Behavior that may be obvious to programmers can be obfuscated by languages and code styles.
When in doubt, the compiler must be conservative.
...
8.1 Generally Useful Optimizations
Code Motion: Reduce the frequency with which a computation is performed

    /* Before: i * n is recomputed on every iteration */
    void set_row(double *a, double *b, long i, long n) {
        long j;
        for (j = 0; j < n; j++) {
            a[i * n + j] = b[j];
        }
    }

    /* After: i * n is computed once, outside the loop */
    void set_row(double *a, double *b, long i, long n) {
        long j;
        long ni = n * i;
        for (j = 0; j < n; j++) {
            a[ni + j] = b[j];
        }
    }

Reduction in Strength: Shift, add instead of multiply or divide

    /* Before: one multiplication per outer iteration */
    for (int i = 0; i < n; i++) {
        int ni = n * i;
        for (int j = 0; j < n; j++)
            a[ni + j] = b[j];
    }

    /* After: the multiplication is replaced by a running addition */
    int ni = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            a[ni + j] = b[j];
        ni += n;
    }

Share Common Subexpressions

    /* Sum neighbors of i, j */
    up = val[(i - 1) * n + j];
    down = val[(i + 1) * n + j];
    left = val[i * n + j - 1];
    right = val[i * n + j + 1];
    sum = up + down + left + right; // 3 multiplications

    long inj = i * n + j;
    up = val[inj - n];
    down = val[inj + n];
    left = val[inj - 1];
    right = val[inj + 1];
    sum = up + down + left + right; // 1 multiplication

Remove unnecessary procedure calls
8.2 Optimization Blockers
Procedure Calls
    /* Convert string to lower case with quadratic performance */
    void lower(char *s) {
        size_t i;
        for (i = 0; i < strlen(s); i++) {
            if (s[i] >= 'A' && s[i] <= 'Z')
                s[i] += 'a' - 'A';
        }
    }

    /* Optimized code */
    void lower(char *s) {
        size_t i;
        size_t len = strlen(s);
        for (i = 0; i < len; i++) {
            if (s[i] >= 'A' && s[i] <= 'Z')
                s[i] += 'a' - 'A';
        }
    }

strlen() takes linear time to scan the string until it reaches the null character, and the first version calls it on every iteration.
Why couldn't the compiler move strlen() out of the loop?
Procedure may have side effects.
Function may not return the same value for a given argument.
Compiler treats a procedure call as a black box => weak optimization!
You can use GCC with inline functions and optimization -O1, but it is better to do your own code motion.
Memory Alias
    void sum_rows1(double *a, double *b, long n) {
        long i, j;
        for (i = 0; i < n; i++) {
            b[i] = 0;
            for (j = 0; j < n; j++) {
                b[i] += a[i * n + j];
            }
        }
    }

    int main() {
        double A[9] = {0, 1, 2, 4, 8, 16, 32, 64, 128};
        double *B = A + 3; /* B aliases part of A */
        sum_rows1(A, B, 3);
        return 0;
    }

Memory and registers have to pass values over and over again!
The compiler cannot keep b[i] in a register, because b may alias a: every update of b[i] may change an element of a that is read later, so b[i] must be written to memory on every iteration.
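A common way around the aliasing problem is to accumulate in a local variable, which tells the compiler that the running sum cannot overlap with a. The sketch below is one such rewrite; the name `sum_rows2` is chosen here for illustration and is not taken from the notes.

```c
/* Same result as sum_rows1 whenever a and b do not overlap, but the
   running sum lives in a local variable that can stay in a register. */
void sum_rows2(const double *a, double *b, long n)
{
    for (long i = 0; i < n; i++) {
        double val = 0;
        for (long j = 0; j < n; j++)
            val += a[i * n + j];
        b[i] = val;          /* one store per row instead of one per element */
    }
}
```

With the aliased call from the example (B = A + 3), the two versions really do differ: sum_rows1 leaves 3, 22, 224 in B, while this version leaves 3, 27, 224. That difference is exactly why the compiler is not allowed to make this transformation on its own.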
8.3 Exploit Instruction-Level Parallelism
CPE (Cycles Per Element)
Convenient way to express performance of program that operates on vectors or lists.
Cycles = CPE*n + Overhead, CPE is slope of line.
Modern CPU Design
Pipelined Functional Units
Divide computation into stages.
Pass partial computations from stage to stage.
Loop Unrolling: Reason: Breaks sequential dependency
No Loop Unrolling
Loop Unrolling (
Loop Unrolling (
Loop Unrolling (
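One way unrolling can break the sequential dependency is to keep several independent accumulators. The sketch below unrolls by two with two accumulators; the function name `product_2x2` and the choice of multiplication as the combining operation are assumptions made for illustration.

```c
#include <stddef.h>

/* Unroll by 2 with 2 accumulators. The two multiply chains do not depend
   on each other, so a pipelined multiplier can overlap them. */
double product_2x2(const double *v, size_t n)
{
    double acc0 = 1.0, acc1 = 1.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        acc0 *= v[i];        /* even-indexed elements */
        acc1 *= v[i + 1];    /* odd-indexed elements */
    }
    for (; i < n; i++)       /* leftover element when n is odd */
        acc0 *= v[i];
    return acc0 * acc1;
}
```

Because floating-point multiplication is not associative, this reordering can change rounding slightly compared with a strictly left-to-right loop, which is one reason compilers do not apply it to floating-point code by default.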
8.4 Dealing with Conditions
Instruction Control Unit must work well ahead of Execution Unit to generate enough operations to keep EU busy!
Branch Prediction
Guess which way branch will go.
Begin executing instructions at predicted position, but don’t actually modify register or memory data.
If the prediction is correct, keep the results; if it is mispredicted, discard the speculative work and recover.
9 Memory Hierarchy
10 Linking
Example
main.c
sum.c
Compile with the following command:
Advantages of linkers
Modularity: Program can be written as a collection of smaller source files, rather than one monolithic mass. Can build libraries of common functions.
Efficiency: Separate compilation and libraries.
Types of linking
Static Linking: Executable files and running memory images contain only the library code they actually use.
Dynamic Linking: Executable files contain no library code. During execution, single copy of library code can be shared across all executing processes.
10.1 Static Linking
Static Linking
Symbol resolution
Programs define and reference symbols (global variables, functions, static variable (with static attribute)).
Symbol definitions are stored in object file (by assembler) in symbol table. During symbol resolution step, the linker associates each symbol reference with exactly one symbol definition.
Relocation
Merges separate code and data sections into single sections.
Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable.
Updates all references to these symbols to reflect their new positions.
Three types of object files
Relocatable object file (.o file): Contains code and data in a form that can be combined with other relocatable object files to form an executable object file.
Executable object file (a.out file): Contains code and data in a form that can be copied directly into memory and then executed.
Shared object file (.so file): Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time.
10.2 ELF Object Files
Executable and Linkable Format (ELF): One unified format for relocatable object files, executable object files & shared object files
Elf header: Word size, byte ordering, file type (.o, exec, .so), machine type, etc.
Segment header table (only for executable object file): Page size, virtual address memory segments (sections), segment sizes (indicate where different segments of the code go, like stack, shared library, etc.).
.text section (read only): code.
.rodata section (read only): Read only data: jump tables, string constants, ...
.data section: Initialized global and static variables. Local C variables are maintained at run time on the stack and do not appear in either the .data or .bss sections.
.bss section (Block Started by Symbol): (Precisely, in modern gcc) Uninitialized static variables, along with any global or static variables that are initialized to zero. Has a section header but occupies no space in the object file.
.symtab section: Symbol table for functions and global variables' names, section names and locations.
.rel.text section: Relocation info for .text section, addresses of instructions that will need to be modified in the executable.
.rel.data section: Relocation info for .data section, addresses of pointer data that will need to be modified in the merged executable.
.debug section: Info for symbolic debugging (gcc -g).
Section header table: Offsets and sizes of each section.
Linker Symbols
Global symbols: Symbols defined by module m that can be referenced by other modules. These correspond to nonstatic C functions and global variables.
External symbols: Global symbols that are referenced by module m but defined by some other module. These correspond to nonstatic C functions and global variables that are defined in other modules.
Local symbols: Symbols that are defined and referenced exclusively by module m. These correspond to static C functions and global variables that are defined with the static attribute.
Example (symbols.c)
incr, foo, main, and printf will be in the symbol table of symbols.o.
Types of Duplicate Symbol Definitions
strong: procedures and initialized globals.
weak: uninitialized globals, or ones declared with the specifier extern.
Linker's Symbol Rules
Multiple strong symbols are not allowed. Each item can be defined only once, otherwise will cause linker error.
Given a strong symbol and multiple weak symbols, choose the strong symbol.
If there are multiple weak symbols, pick an arbitrary one (can override this with gcc -fno-common).
Three types of symbols that do not have entries in the section header table
ABS: Symbols that should not be relocated
UNDEF: Symbols that are referenced in this object module but defined elsewhere.
COMMON: Uninitialized data objects that are not yet allocated (Precisely, uninitialized global variables).
11 Exceptional Control Flow
From startup to shutdown, each CPU core simply reads and executes (interprets) a sequence of instructions, one at a time (as seen externally). Jumps, calls, and returns alone are not sufficient to react to changes in system state; the mechanisms that handle such changes are called Exceptional Control Flow (ECF).
Exceptional Control Flow exists at all levels of a computer system
Low Level Mechanisms
Exceptions: Change in control flow in response to a system event (i.e., change in system state), implemented using combination of hardware and OS software.
High Level Mechanisms
Process context switch: Implemented by OS software and hardware timer.
Signals: Implemented by OS software.
Nonlocal jumps: Implemented by C runtime library.
11.1 Exceptions
An exception is a transfer of control to the OS kernel in response to some event (i.e., change in processor state).
Kernel is the memory-resident part of the OS.
Examples of events: Divide by 0, arithmetic overflow, page fault, I/O request completes, typing Ctrl-C
Exception Handling
Each type of event has a unique exception number k. k = index into the exception table (a.k.a. interrupt vector). Handler k is called each time exception k occurs.
Classes of Exceptions
Asynchronous Exceptions: Caused by events external to the processor.
Interrupts:
Indicated by setting the processor’s interrupt pin.
Handler returns to "next" instruction.
Examples
Timer interrupt: Every few ms, an external timer chip triggers an interrupt, used by the kernel to take back control from user programs (avoiding infinite loops, etc.).
I/O interrupt from external device: Hitting Ctrl-C at the keyboard, arrival of a packet from a network, arrival of data from a disk.
Synchronous Exceptions: Caused by events that occur as a result of executing an instruction.
Traps
Intentional, set program up to "trip the trap" and do something.
Examples: system calls, gdb breakpoints.
Returns control to "next" instruction.
Faults
Unintentional but possibly recoverable.
Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions.
Either re-executes faulting ("current") instruction or aborts.
Aborts
Unintentional and unrecoverable.
Examples: illegal instruction, parity error, machine check.
Aborts current program.
System Calls: Whenever a program wants to cause an effect outside its own process, it must ask the kernel for help.
Read/write files.
Get current time.
Allocate RAM (sbrk).
Create new processes.
System Calls Examples: Opening Files
User calls:
open(filename, options)
Calls the __open function, which invokes the system call instruction syscall.
%rax contains the syscall number on entry and the return value on exit. A negative return value is an error corresponding to negative errno.
Almost like a function call, but executed by the kernel. Plus, the "address" of the "function" is in %rax.
open_examples.c
open_examples.s
11.2 Processes
Process: An instance of a running program. It is identified by Process ID (PID), user account, command name.
Two key abstractions:
Private address space: Each program seems to have exclusive use of main memory. This is provided by kernel mechanism called virtual memory.
Logical control flow: Each program seems to have exclusive use of the CPU. This is provided by kernel mechanism called context switching.
From startup to shutdown, each CPU core simply reads and executes a sequence of machine instructions, one at a time. This sequence is the CPU’s control flow (or flow of control).
11.2.1 Context Switch
Processes are managed by a shared chunk of memory-resident OS code called the kernel. The kernel is not a separate process, but rather runs as part of some existing process.
Control flow passes from one process to another via a context switch.
Context Switching (Uniprocessor)
Single processor executes multiple processes concurrently.
Process executions interleaved (multitasking).
Address spaces managed by virtual memory system.
Context Switching (Uniprocessor)
Save current registers in memory
Schedule next process for execution
Load saved registers and switch address space (context switch)
Context Switching (Multicore)
Multiple CPUs on single chip.
Share main memory (and some caches).
Each can execute a separate process. Scheduling of processes onto cores done by kernel.
11.2.2 Concurrent Flows
Concurrent Flow: A logical flow whose execution overlaps in time with another flow.
Parallel Flows: Two flows are running concurrently on different processor cores or computers.
11.2.3 System Calls
On error, most system-level functions return −1 and set the global variable errno to indicate the cause.
Can simplify somewhat using an error-reporting function:
Simplify even further by using Stevens-style error-handling wrappers:
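Both ideas can be sketched as follows. The names `unix_error` and `Fork` follow the common Stevens-style convention (a capitalized wrapper with the same interface as the call it wraps); the message text and exit code are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Error-reporting function: print msg plus the errno text, then exit. */
void unix_error(const char *msg)
{
    fprintf(stderr, "%s: %s\n", msg, strerror(errno));
    exit(1);
}

/* Stevens-style wrapper: same interface as fork, but it never returns -1,
   so callers do not need their own error check. */
pid_t Fork(void)
{
    pid_t pid = fork();
    if (pid < 0)
        unix_error("Fork error");
    return pid;
}
```

A caller can then write `pid = Fork();` and go straight to the parent and child branches, which keeps the main logic free of repeated error-handling code.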
11.2.4 Process Control
Obtaining Process IDs
pid_t getpid(void): Returns PID of current process.
pid_t getppid(void): Returns PID of parent process.
Process States
Running: Process is either executing instructions, or it could be executing instructions if there were enough CPU cores.
Blocked/Sleeping: Process cannot execute any more instructions until some external event happens (usually I/O).
Stopped: Process has been prevented from executing by user action (control-Z).
Terminated/Zombie: Process is finished. Parent process has not yet been notified.
Terminating Processes
Receiving a signal whose default action is to terminate.
Returning from the main routine.
Calling the exit function.
void exit(int status): Terminates with an exit status of status. Convention is that normal return status is 0, nonzero on error. Another way to explicitly set the exit status is to return an integer value from the main routine.
Creating Processes: Parent process creates a new running child process by calling pid_t fork(void)
Returns 0 to the child process, and child's PID to parent process.
Child is almost identical to parent:
Child gets an identical (but separate) copy of the parent's virtual address space.
Child gets identical copies of the parent's open file descriptors.
Child has a different PID than the parent.
Example
Call once, return twice.
Concurrent execution: Can't predict execution order of parent and child.
Duplicate but separate address space: x has a value of 1 when fork returns in parent and child, and subsequent changes to x are independent.
Shared open files: stdout is the same in both parent and child.
Process Graphs: Topological sort of the graph corresponds to a feasible total ordering.
Reaping Child Processes: When process terminates, it still consumes system resources.
Reaping Child Processes
Performed by parent on terminated child (using wait or waitpid).
Parent is given exit status information.
Kernel then deletes zombie child process.
If parent doesn't reap...
If the parent terminates without reaping, the orphaned child is reaped by the init process (pid == 1).
Unless it was init that terminated! Then need to reboot...
Only need explicit reaping in long-running processes, e.g., shells and servers.
wait: Synchronizing with Children
Parent reaps a child with one of these system calls:
pid_t wait(int *status)
Suspends current process until one of its children terminates, returns PID of child, records exit status in status.
pid_t waitpid(pid_t pid, int *status, int options)
More flexible version of wait: can wait for a specific child or group of children, and can be told to return immediately if there are no children to reap.
wait Example 1
Feasible Outputs: HC HP CT Bye or HP HC CT Bye
wait Status Codes
Return value of wait is the PID of the child process that terminated.
If status != NULL, then the integer it points to will be set to a value that indicates the exit status.
More information than the value passed to exit.
Must be decoded, using macros defined in sys/wait.h: WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG, WIFSTOPPED, WSTOPSIG, WIFCONTINUED...
wait Example 2
If multiple children have completed, they are reaped in arbitrary order.
Can use macros WIFEXITED and WEXITSTATUS to get information about exit status.
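Reaping a child and decoding its status can be sketched as below. The helper name `run_child_and_reap` is hypothetical; it forks a child that exits immediately with a given code, reaps it with waitpid, and decodes the status with the macros listed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, reap it, and return the decoded
   exit status. Returns -1 if fork or waitpid fails or the child did not
   exit normally. Hypothetical helper for illustration. */
int run_child_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFEXITED(status))           /* child called exit or returned */
        return WEXITSTATUS(status);  /* low 8 bits of the exit code */
    return -1;                       /* e.g., killed by a signal */
}
```

The raw status integer is not the exit code itself, which is why the check with WIFEXITED comes before the call to WEXITSTATUS.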
execve: Loading and Running Programs
int execve(char *filename, char *argv[], char *envp[])
Loads and runs in the current process:
Executable file filename: Can be an object file or a script file beginning with #!interpreter (e.g., #!/bin/bash).
... with argument list argv (by convention, argv[0] == filename).
... and environment variable list envp: "name=value" strings (e.g., USER=droh). Related functions: getenv, putenv, printenv.
Overwrites code, data, and stack. Retains PID, open files, and signal context.
Called once and never returns (except if there is an error).
Example: Execute "/bin/ls -lt /usr/include" in child process using current environment
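A minimal sketch of this pattern is shown below. The helper name `run_in_child` and the exit code 127 for a failed execve are assumptions made for illustration; `environ` is the standard global holding the current environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the current environment */

/* Run argv[0] with arguments argv in a child process, using the current
   environment. Returns the child's exit status, or -1 on failure.
   Hypothetical helper for illustration. */
int run_in_child(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, environ);
        _exit(127);      /* reached only if execve failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command in the example, the argument list would be `{"/bin/ls", "-lt", "/usr/include", NULL}`. The code after execve runs only on failure, because a successful execve replaces the child's program and never returns.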
execve and process memory layout
To load and run a new program a.out in the current process using execve:
Free vm_area_struct's and page tables for old areas.
Create vm_area_struct's and page tables for new areas.
Programs and initialized data backed by object files.
.bss and stack backed by anonymous files.
Set PC to entry point in .text.
11.3 Shell
Simple Shell Implementation
Simple Shell eval Function
Problems with Simple Shell Example
Shell designed to run indefinitely: should not accumulate unneeded resources.
Background jobs could run the entire computer out of memory, more likely, run out of PIDs.
11.4 Signals
Signals: A signal is a small message that notifies a process that an event of some type has occurred in the system.
Akin to exceptions and interrupts.
Sent from the kernel (sometimes at the request of another process) to a process.
Signal type is identified by a small integer ID (1-30).
Only information in a signal is its ID and the fact that it arrived.
ID | Name | Default Action | Corresponding Event |
---|---|---|---|
2 | SIGINT | Terminate | User typed Ctrl-C |
9 | SIGKILL | Terminate | Kill program (cannot override or ignore) |
11 | SIGSEGV | Terminate | Segmentation violation |
14 | SIGALRM | Terminate | Timer Signal |
17 | SIGCHLD | Ignore | Child stopped or terminated |
Terminology
Kernel sends a signal to a destination process by updating some state in the context of the destination process, for one of the following reasons:
Kernel has detected a system event such as divide-by-zero (SIGFPE) or the termination of a child process (SIGCHLD).
Another process has invoked the kill system call to explicitly request the kernel to send a signal to the destination process.
A destination process receives a signal when it is forced by the kernel to react in some way to the signal.
Some possible ways to react:
Ignore the signal (do nothing).
Terminate the process (with optional core dump).
Catch the signal by executing a user-level function called signal handler.
A signal is pending if sent but not yet received.
There can be at most one pending signal of each type.
A pending signal is received at most once.
Important: Signals are not queued.
Kernel sets bit k in pending when a signal of type k is sent.
Kernel clears bit k in pending when a signal of type k is received.
A process can block the receipt of certain signals.
Blocked signals can be sent, but will not be received until the signal is unblocked.
Some signals cannot be blocked (SIGKILL, SIGSTOP) or can only be blocked when sent by other processes (SIGSEGV, SIGILL, etc).
The set of blocked signals can be set and cleared by using the sigprocmask function. It is also referred to as the signal mask.
11.4.1 Sending Signals
Sending Signals with the /bin/kill Program: The /bin/kill program sends an arbitrary signal to a process or process group.
kill -9 1234 sends SIGKILL to process 1234.
kill -9 -1234 sends SIGKILL to every process in process group 1234.
Sending Signals from the Keyboard: Typing Ctrl-C (Ctrl-Z) causes the kernel to send a SIGINT (SIGTSTP) to every job in the foreground process group.
SIGINT: default action is to terminate each process.
SIGTSTP: default action is to stop (suspend) each process.
Sending Signals with the kill Function: int kill(pid_t pid, int sig)
If pid is greater than zero, then the kill function sends signal number sig to process pid.
If pid is equal to zero, then kill sends signal sig to every process in the process group of the calling process, including the calling process itself.
If pid is less than zero, then kill sends signal sig to every process in process group |pid| (the absolute value of pid).
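The first case (pid greater than zero) can be sketched as follows. The helper name `signal_child` is hypothetical; it forks a child that only waits, sends it a signal with kill, and reports which signal terminated it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sleeps forever, send it `sig`, reap it, and return
   the number of the signal that terminated it (or -1 on failure).
   `sig` must be a signal that terminates the child, e.g. SIGKILL;
   otherwise waitpid would block forever. Hypothetical helper. */
int signal_child(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();             /* child: wait for a signal */
    }
    if (kill(pid, sig) < 0)      /* pid > 0: send to that one process */
        return -1;
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

No synchronization with the child is needed here: if the signal arrives before the child has reached pause, it simply stays pending until the child is scheduled.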
11.4.2 Receiving Signals
Suppose the kernel is returning from an exception handler and is ready to pass control to process p. The kernel computes pnb = pending & ~blocked, the set of pending nonblocked signals for process p.
If (pnb == 0), pass control to next instruction in the logical flow for p.
Else,
Choose least nonzero bit k in pnb and force process p to receive signal k.
The receipt of the signal triggers some action by p.
Repeat for all nonzero k in pnb.
Pass control to next instruction in logical flow for p.
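The pending and blocked bit vectors can be modeled in user-level C to see why signals are not queued. This is a toy model, not kernel code: the struct `sigstate` and the functions `sig_send` and `sig_receive` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of per-process signal state: bit k set means signal k. */
struct sigstate { uint32_t pending; uint32_t blocked; };

/* Sending sets bit k; sending twice leaves the same single bit set. */
void sig_send(struct sigstate *s, int k)
{
    assert(k >= 1 && k <= 30);
    s->pending |= UINT32_C(1) << k;
}

/* Receive the lowest-numbered pending nonblocked signal and clear its
   pending bit. Returns the signal number, or 0 if none is deliverable. */
int sig_receive(struct sigstate *s)
{
    uint32_t pnb = s->pending & ~s->blocked;
    if (pnb == 0)
        return 0;
    int k = 1;                       /* bit 0 is never set by sig_send */
    while ((pnb & (UINT32_C(1) << k)) == 0)
        k++;
    s->pending &= ~(UINT32_C(1) << k);
    return k;
}
```

If signal 2 is sent twice while it is blocked, only one receipt follows after unblocking, because a single bit cannot count how many times it was set.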
Types of Default Actions
The process terminates.
The process stops until restarted by a SIGCONT signal.
The process ignores the signal.
Installing Signal Handlers: The signal function (handler_t *signal(int signum, handler_t *handler)) modifies the default action associated with the receipt of signal signum. The handler argument is one of:
SIG_IGN: ignore signals of type signum.
SIG_DFL: revert to the default action on receipt of signals of type signum.
Otherwise, handler is the address of a user-level signal handler. When the handler executes its return statement, control passes back to the instruction in the control flow of the process that was interrupted by receipt of the signal.
A signal handler is a separate logical flow (not process) that runs concurrently with the main program, but this flow exists only until the handler returns to the main program.
Another way of seeing this:
Nested Signal Handlers: Handlers can be interrupted by other handlers
11.4.3 Blocking & Unblocking Signals
Implicit blocking mechanism: Kernel blocks any pending signals of the type currently being handled.
Explicit blocking and unblocking mechanism: sigprocmask function.
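Explicit blocking can be sketched with sigprocmask and sigpending. The function name `blocked_signal_is_deferred`, the handler `on_usr1`, and the use of SIGUSR1 are assumptions made for illustration; the sketch is meant for a single-threaded program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;                 /* only set a flag and return */
}

/* Returns 1 if SIGUSR1 stayed pending while blocked and its handler ran
   only after unblocking; returns 0 otherwise. */
int blocked_signal_is_deferred(void)
{
    struct sigaction sa;
    sigset_t mask, prev, pend;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return 0;
    got_usr1 = 0;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    sigprocmask(SIG_BLOCK, &mask, &prev);    /* block SIGUSR1 */

    raise(SIGUSR1);                          /* sent, but not received */
    sigpending(&pend);
    int deferred = sigismember(&pend, SIGUSR1) == 1 && got_usr1 == 0;

    sigprocmask(SIG_SETMASK, &prev, NULL);   /* unblock: handler runs */
    return deferred && got_usr1 == 1;
}
```

Saving the previous mask in `prev` and restoring it afterwards, rather than unblocking SIGUSR1 outright, avoids accidentally unblocking a signal that the caller had already blocked.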
11.5 Writing Signal Handlers
Guidelines for Writing Safe Handlers
G0: Keep your handlers as simple as possible. e.g., set a global flag and return.
G1: Call only async-signal-safe functions in your handlers. printf, sprintf, malloc, and exit are not safe!
G2: Save and restore errno on entry and exit, so that other handlers don't overwrite your value of errno.
G3: Protect accesses to shared data structures by temporarily blocking all signals to prevent possible corruption.
G4: Declare global variables as volatile to prevent compiler from storing them in a register.
G5: Declare global flags as volatile sig_atomic_t. A flag is a variable that is only read or written (e.g., flag = 1, not flag++). A flag declared this way does not need to be protected like other globals.
Function is async-signal-safe if either reentrant (e.g., all variables stored on stack frame) or non-interruptible by signals.
Safe Formatted Output:
Correct Signal Handling: Must wait for all terminated child processes, can't use signals to count events, such as children terminating.
Synchronizing Flows to Avoid Races: Simple shell with a subtle synchronization error because it assumes parent runs before child.
Portable Signal Handling
Different systems have different signal-handling semantics. Use the sigaction function (int sigaction(int signum, struct sigaction *act, struct sigaction *oldact);), which allows users to clearly specify the signal-handling semantics they want when they install a handler.
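A common way to package sigaction is a small wrapper with the same interface as signal. The sketch below assumes the semantics most programs want (block only the signal type being handled, restart interrupted system calls); the names `Signal`, `usr2_handler`, and `usr2_seen` are chosen for illustration, and the handler follows the guidelines G0, G2, and G5.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>

typedef void handler_t(int);

/* Install `handler` for `signum` with explicit semantics. Returns the
   previous handler, or SIG_ERR if sigaction fails. */
handler_t *Signal(int signum, handler_t *handler)
{
    struct sigaction action, old_action;

    action.sa_handler = handler;
    sigemptyset(&action.sa_mask);   /* block only the type being handled */
    action.sa_flags = SA_RESTART;   /* restart interrupted syscalls */
    if (sigaction(signum, &action, &old_action) < 0)
        return SIG_ERR;
    return old_action.sa_handler;
}

static volatile sig_atomic_t usr2_flag = 0;   /* G5: flag type */

/* G0: as simple as possible; G2: errno saved and restored. */
void usr2_handler(int sig)
{
    int saved_errno = errno;
    (void)sig;
    usr2_flag = 1;
    errno = saved_errno;
}

/* Lets the main program poll the flag. */
int usr2_seen(void)
{
    return usr2_flag;
}
```

A typical use is `Signal(SIGUSR2, usr2_handler);` during startup, after which the main loop checks `usr2_seen()` instead of doing real work inside the handler.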
Explicitly Waiting for Signals: Handlers for program explicitly waiting for SIGCHLD to arrive.
Waiting for Signals with sigsuspend, equivalent to:
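As specified by POSIX, sigsuspend(&mask) behaves like an atomic version of sigprocmask(SIG_SETMASK, &mask, &prev); pause(); sigprocmask(SIG_SETMASK, &prev, NULL);. The sketch below uses it to wait for SIGCHLD without a race and without busy-waiting; the names `wait_for_child_with_sigsuspend` and `on_sigchld` are hypothetical, and the code is meant for a single-threaded program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t sigchld_seen = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    sigchld_seen = 1;
}

/* Fork a child and sleep until SIGCHLD arrives, then reap the child.
   Returns 0 on success, -1 on failure. */
int wait_for_child_with_sigsuspend(void)
{
    struct sigaction sa;
    sigset_t mask, prev;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, &prev);   /* block before fork: no race */
    sigchld_seen = 0;

    pid_t pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &prev, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);                           /* child terminates at once */

    while (!sigchld_seen)
        sigsuspend(&prev);   /* atomically unblock SIGCHLD and sleep */
    sigprocmask(SIG_SETMASK, &prev, NULL);

    int status;
    return waitpid(pid, &status, 0) == pid ? 0 : -1;
}
```

With a plain pause in the loop, a SIGCHLD that arrives between the test of the flag and the call to pause would be missed and the parent could sleep forever; sigsuspend closes that window because the unblocking and the sleep happen in one step.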