Computer Architecture
In this part, we will explore more about computer architecture, both RISC-Ⅴ and x86-64 architecture.
1 Number Representation
1.1 Number Base
Commonly Used Number Bases:
Decimal (base 10)
Symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Notation:
Understandable by humans.
Binary (base 2)
Symbols: 0, 1
Notation:
Converting numbers to base 2 lets us represent numbers as bits!
Hexadecimal (base 16)
Symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Notation:
A convenient shorthand for writing long sequences of bits.
Group of bits:
1 byte = 8 bits 2 hex digits
different values 1 nibble = 4 bits 1 hex digit
different values
Conversions from ten to other bases: Leftover algorithm
Convert
Check the powers of the base. For base-4: 256, 64, 16, 4, 1.
How many multiples of 64 fit in my number (73)?
left over. How many multiples of 16 fit in my remaining number (9)?
Still
left over. How many multiples of 4 fit in my remaining number (9)?
left over. How many multiples of 1 fit in my remaining number (1)?
left over, which means we're done!
For example,
1.2 Unsigned Integers
Can represent
different numbers, simply convert decimal to binary! Smallest number:
represents . Largest number:
represents .
Overflow for Unisgned Integers:
Overflow: Exceed the largest value and wrap back around to 0, e.g.,
. Negative overflow: Try to go below 0 and wrap around to large values, e.g.,
.
Conclusion:
Unsigned Integer | |
---|---|
Can represent negative numbers | χ |
Doing math is easy | ✓ |
Every bit sequence represents a unique number | ✓ |
1.3 Signed Integers
Idea: Use the left-most bit to indicate if the number is positive (0) or negative (1). This is called sign-magnitude representation.
Smallest number:
represents . Largest number:
represents .
Sign-Magnitude | |
---|---|
Can represent negative numbers | ✓ |
Doing math is easy | χ |
Every bit sequence represents a unique number | χ |
1.4 One's Complement
Idea: If the number is negative, flip the bits.
is . is . Left-most bit acts like a sign bit. If it's 1, someone flipped the bits, so number must be negative.
Smallest number:
represents . Largest number:
represents .
One's Complement | |
---|---|
Can represent negative numbers | ✓ |
Doing math is easy | ✓ |
Every bit sequence represents a unique number | χ |
1.5 Two's Complement
Idea: If the number is negative, flip the bits, and add one (because we shifted to avoid double-zero).
Smallest number:
represents . Largest number:
represents .
To convert two's complement to a signed decimal number:
If left-most digit is 0: Read it as unsigned
If left-most digit is 1:
Flip the bits, and add 1.
Convert to base-10, and stick a negative sign in front.
Example: What is
Flip the bits:
Add one:
In base-10:
To convert two's complement to a signed decimal number:
If number is positive: Just convert it to base-2
If number is negative:
Pretend it's unsigned, and convert to base-2
Flip the bits, and add 1
Example: What is
In base-2:
Flip the bits:
Add one:
Two's Complement | |
---|---|
Can represent negative numbers | ✓ |
Doing math is easy | ✓ |
Every bit sequence represents a unique number | ✓ |
1.6 Bias Notation
Idea: Just like unsigned, but shifted on the number line.
Smallest number:
represents . Largest number:
represents .
To convert from bias to decimal:
Read as unsigned decimal.
Add the bias.
Example: Assuming standard bias, what is
, so standard bias is: . Read as unsigned: 1
Add the bias:
To convert from decimal to bias notation:
Subtract the bias.
Convert to unsigned binary.
Example: What is
, so standard bias is: . Subtract the bias:
Write in base-2:
Bias Notation | |
---|---|
Can represent negative numbers | ✓ |
Doing math is easy | χ |
Every bit sequence represents a unique number | ✓ |
1.7 Sign Extension
Leftmost is the most significant bit (MSB).
Rightmost is the least significant bit (LSB).
Want to represent the same number using more bits than before.
Easy for positive numbers (add leading 0's), more complicated for negative numbers.
Sign and magnitude: add 0's after the sign bit.
One's/Two's Complement: copy MSB.
Example:
Sign and magnitude:0b1101 = 0b10000101.
One's/Two's complement:
.
2 C Introduction
2.1 Variable C Types
char: A char takes up to 1 byte.
7 bits are enough to store a char (
Typecasting in C: C is a "weakly" typed language, you can typecast from any type to any other.
Union
Unions are similar to structs, but all members share the same memory location, and union only provides enough space for the largest element.
Enum: A group of related integer constants.
CPP (C Preprocessor) Macro
Prior to compilation, preprocess by performing string replacement in the program based on all #define
macros.
For example, #define PI 3.14159 => Replace all PI with (3.14159) => In effect, makes PI a "constant".
2.2 Addresses & Pointers
Definitions
Address: An address refers to a particular memory location.
Pointer: A pointer is a variable that contains the address of another variable.
The size of an address (and thus, the size of a pointer) in bytes depend on architecture, e.g., for 32-bit, have
possible addresses. byte-addressed = each of its addresses points to a unique byte
word-addressed = each of its addresses points to a unique word
By value vs. By reference
2.3 Array
Declaration & Initialization
A better pattern: single source of truth!
Pointer Arithmetic
pointer+n: Add n*sizeof("whatever pointer is pointing to")
pointer-n: Subtract n*sizeof("whatever pointer is pointing to")
Examples
Potential Pitfalls
Array bounds are not checked during element access
Consequence: We can accidentally access off the the end of the array!
An array is passed to a function as a pointer
Consequence: The array size is lost! Be careful with sizeof()!
Declared arrays are only allocated when the scope is valid
2.4 Strings
C String: A C string is just an array of characters, followed by a null terminator.
String functions (accessible with #include <string.h>
int strlen(char* string):
Return the length of the string (not including the null term)
int strcmp(char* str1, char* str2):
Return 0 if str1 and str2 are identical (no str1 == str2, since this will be checking if they point to the same memory location!)
char* strcpy(char* dst, char* src):
Copy contents of string src to the memory at dst. Caller must ensure that dst has enough memory to hold the data to be copied.
2.5 Word Alignment
sizeof()
The C and C++ programming languages define byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment", most commonly means the number of bites for a
char
(8 bits).sizeof()
returns the size in bytes of the type.sizeof(char)
is always 1!Depending on the computer architecture, a byte may consist of 8 or more bits, the exact number being recorded in CHAR_BIT.
For example, since
sizeof(char)
is defined to be 1 and assuming the integer type is four bytes long, the following code fragment prints 1,4:char c; printf ("%zu,%zu\n", sizeof c, sizeof (int));
Example use:
Struct Alignment and Padding
Some processors will not allow you to address 32b values without being on 4-byte boundaries.
Others will just be very slow if you try to access " unaligned" memory.
The actual layout on a 32-bit architecture would be:
sizeof(hello) = 16
Improvement:
sizeof(hello) = 12
2.6 Compilation, Run, Debug & Disassembly
Example
Compile & Run
Compile the program into assembly code
$ gcc -S -o hello.s hello.cThen we get hello.s file:
.file "hello.c" .text .section .rodata .LC0: .string "x + y = %d\n" .text .globl main .type main, @function main: .LFB0: .cfi_startproc endbr64 pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset 6, -16 movq %rsp, %rbp .cfi_def_cfa_register 6 subq $16, %rsp movl $5, -8(%rbp) movl $6, -4(%rbp) movl -8(%rbp), %edx movl -4(%rbp), %eax addl %edx, %eax movl %eax, %esi leaq .LC0(%rip), %rax movq %rax, %rdi movl $0, %eax call printf@PLT movl $0, %eax leave .cfi_def_cfa 7, 8 ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Ubuntu 13.2.0-23ubuntu4) 13.2.0" .section .note.GNU-stack,"",@progbits .section .note.gnu.property,"a" .align 8 .long 1f - 0f .long 4f - 1f .long 5 0: .string "GNU" 1: .align 8 .long 0xc0000002 .long 3f - 2f 2: .long 0x3 3: .align 8 4:Compile the assembly code into an executable
$ gcc -o hello hello.sRun the executable
$ ./helloThen we get the folllowing in the bash:
x + y = 11Disassemble the executable
$ objdump -d hello > hello1.sThen we get the hello1.s file:
hello: file format elf64-x86-64 Disassembly of section .init: 0000000000001000 <_init>: 1000: f3 0f 1e fa endbr64 1004: 48 83 ec 08 sub $0x8,%rsp 1008: 48 8b 05 d9 2f 00 00 mov 0x2fd9(%rip),%rax # 3fe8 <__gmon_start__@Base> 100f: 48 85 c0 test %rax,%rax 1012: 74 02 je 1016 <_init+0x16> 1014: ff d0 call *%rax 1016: 48 83 c4 08 add $0x8,%rsp 101a: c3 ret Disassembly of section .plt: 0000000000001020 <.plt>: 1020: ff 35 9a 2f 00 00 push 0x2f9a(%rip) # 3fc0 <_GLOBAL_OFFSET_TABLE_+0x8> 1026: ff 25 9c 2f 00 00 jmp *0x2f9c(%rip) # 3fc8 <_GLOBAL_OFFSET_TABLE_+0x10> 102c: 0f 1f 40 00 nopl 0x0(%rax) 1030: f3 0f 1e fa endbr64 1034: 68 00 00 00 00 push $0x0 1039: e9 e2 ff ff ff jmp 1020 <_init+0x20> 103e: 66 90 xchg %ax,%ax Disassembly of section .plt.got: 0000000000001040 <__cxa_finalize@plt>: 1040: f3 0f 1e fa endbr64 1044: ff 25 ae 2f 00 00 jmp *0x2fae(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5> 104a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) Disassembly of section .plt.sec: 0000000000001050 <printf@plt>: 1050: f3 0f 1e fa endbr64 1054: ff 25 76 2f 00 00 jmp *0x2f76(%rip) # 3fd0 <printf@GLIBC_2.2.5> 105a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) Disassembly of section .text: 0000000000001060 <_start>: 1060: f3 0f 1e fa endbr64 1064: 31 ed xor %ebp,%ebp 1066: 49 89 d1 mov %rdx,%r9 1069: 5e pop %rsi 106a: 48 89 e2 mov %rsp,%rdx 106d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp 1071: 50 push %rax 1072: 54 push %rsp 1073: 45 31 c0 xor %r8d,%r8d 1076: 31 c9 xor %ecx,%ecx 1078: 48 8d 3d ca 00 00 00 lea 0xca(%rip),%rdi # 1149 <main> 107f: ff 15 53 2f 00 00 call *0x2f53(%rip) # 3fd8 <__libc_start_main@GLIBC_2.34> 1085: f4 hlt 1086: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 108d: 00 00 00 0000000000001090 <deregister_tm_clones>: 1090: 48 8d 3d 79 2f 00 00 lea 0x2f79(%rip),%rdi # 4010 <__TMC_END__> 1097: 48 8d 05 72 2f 00 00 lea 0x2f72(%rip),%rax # 4010 <__TMC_END__> 109e: 48 39 f8 cmp %rdi,%rax 10a1: 74 15 je 10b8 <deregister_tm_clones+0x28> 10a3: 48 8b 05 36 2f 00 00 mov 0x2f36(%rip),%rax # 3fe0 <_ITM_deregisterTMCloneTable@Base> 10aa: 48 85 c0 test %rax,%rax 10ad: 74 09 je 10b8 <deregister_tm_clones+0x28> 10af: ff e0 jmp *%rax 10b1: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 10b8: c3 ret 10b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 00000000000010c0 <register_tm_clones>: 10c0: 48 8d 3d 49 2f 00 00 lea 0x2f49(%rip),%rdi # 4010 <__TMC_END__> 10c7: 48 8d 35 42 2f 00 00 lea 0x2f42(%rip),%rsi # 4010 <__TMC_END__> 10ce: 48 29 fe sub %rdi,%rsi 10d1: 48 89 f0 mov %rsi,%rax 10d4: 48 c1 ee 3f shr $0x3f,%rsi 10d8: 48 c1 f8 03 sar $0x3,%rax 10dc: 48 01 c6 add %rax,%rsi 10df: 48 d1 fe sar $1,%rsi 10e2: 74 14 je 10f8 <register_tm_clones+0x38> 10e4: 48 8b 05 05 2f 00 00 mov 0x2f05(%rip),%rax # 3ff0 <_ITM_registerTMCloneTable@Base> 10eb: 48 85 c0 test %rax,%rax 10ee: 74 08 je 10f8 <register_tm_clones+0x38> 10f0: ff e0 jmp *%rax 10f2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 10f8: c3 ret 10f9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 0000000000001100 <__do_global_dtors_aux>: 1100: f3 0f 1e fa endbr64 1104: 80 3d 05 2f 00 00 00 cmpb $0x0,0x2f05(%rip) # 4010 <__TMC_END__> 110b: 75 2b jne 1138 <__do_global_dtors_aux+0x38> 110d: 55 push %rbp 110e: 48 83 3d e2 2e 00 00 cmpq $0x0,0x2ee2(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5> 1115: 00 1116: 48 89 e5 mov %rsp,%rbp 1119: 74 0c je 1127 <__do_global_dtors_aux+0x27> 111b: 48 8b 3d e6 2e 00 00 mov 0x2ee6(%rip),%rdi # 4008 <__dso_handle> 1122: e8 19 ff ff ff call 1040 <__cxa_finalize@plt> 1127: e8 64 ff ff ff call 1090 <deregister_tm_clones> 112c: c6 05 dd 2e 00 00 01 movb $0x1,0x2edd(%rip) # 4010 <__TMC_END__> 1133: 5d pop %rbp 1134: c3 ret 1135: 0f 1f 00 nopl (%rax) 1138: c3 ret 1139: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) 0000000000001140 <frame_dummy>: 1140: f3 0f 1e fa endbr64 1144: e9 77 ff ff ff jmp 10c0 <register_tm_clones> 0000000000001149 <main>: 1149: f3 0f 1e fa endbr64 114d: 55 push %rbp 114e: 48 89 e5 mov %rsp,%rbp 1151: 48 83 ec 10 sub $0x10,%rsp 1155: c7 45 f8 05 00 00 00 movl $0x5,-0x8(%rbp) 115c: c7 45 fc 06 00 00 00 movl $0x6,-0x4(%rbp) 1163: 8b 55 f8 mov -0x8(%rbp),%edx 1166: 8b 45 fc mov -0x4(%rbp),%eax 1169: 01 d0 add %edx,%eax 116b: 89 c6 mov %eax,%esi 116d: 48 8d 05 90 0e 00 00 lea 0xe90(%rip),%rax # 2004 <_IO_stdin_used+0x4> 1174: 48 89 c7 mov %rax,%rdi 1177: b8 00 00 00 00 mov $0x0,%eax 117c: e8 cf fe ff ff call 1050 <printf@plt> 1181: b8 00 00 00 00 mov $0x0,%eax 1186: c9 leave 1187: c3 ret Disassembly of section .fini: 0000000000001188 <_fini>: 1188: f3 0f 1e fa endbr64 118c: 48 83 ec 08 sub $0x8,%rsp 1190: 48 83 c4 08 add $0x8,%rsp 1194: c3 retDebug the executable
$ gdb helloThen we can get the following in the bash:
(gdb) break main Breakpoint 1 at 0x1151 (gdb) r Starting program: /home/kekeandzeyu/sample/hello Downloading separate debug info for system-supplied DSO at 0x7ffff7fc3000 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Breakpoint 1, 0x0000555555555151 in main () (gdb) si 0x0000555555555155 in main () (gdb) quit
3 Bits, Bytes & Integers
3.1 Bits Manipulation
Types of manipulation
Bit-level Manipulation
A&B (And): A&B = 1 when both A=1 and B=1
A|B (Or): A|B = 1 when either A=1 or B=1
~A (Not): ~A = 1 when A=0
A^B (Exclusive-Or (Xor): A^B = 1 when either A=1 or B=1, but not both
Examples
int a = 105; // 01101001 int b = 85; // 01010101 printf("a & b = %d\n", a & b); // 65 (01000001) printf("a | b = %d\n", a | b); // 125 (01111101) printf("a ^ b = %d\n", a ^ b); // 60 (0111100) printf("~a = %d\n", ~a); // -106 (10010110, two's complement)You can also use this to represent & manipulate sets, do this on char data type, etc.
Logic Manipulation
View 0 as "False"
Anything nonzero as "True"
Always return 0 or 1
Early termination
Examples
unsigned char value1 = 0x41; // Hexadecimal 41 (decimal 65) unsigned char value2 = 0x00; // Hexadecimal 00 (decimal 0) printf("!0x41 = 0x%02X\n", !value1); // Output: 0x00 printf("!0x00 = 0x%02X\n", !value2); // Output: 0x01 printf("!!0x41 = 0x%02X\n", !!value1); // Output: 0x01 printf("0x69 && 0x55 = %d\n", 0x69 && 0x55); // Output: 1 printf("0x69 || 0x55 = %d\n", 0x69 || 0x55); // Output: 1Shift Operations
Left Shift (x << y): Shift bit-vector x left y positions
Throw away extra bits on left
Fill with 0’s on right
Examples
Right Shift (x >> y): Shift bit-vector x left y positions
Throw away extra bits on right
Logical shift: Fill with 0's on left
Arithmetic shift: Replicate most significant bit on left
Examples
Undefined Behavior: Shift amount < 0 or >= word size
3.2 Multiplication & Division
3.2.1 Power-of-2 Multiply with Shift
u << k
gives
Examples
Most machines shift and add faster than multiply.
Compiler generates this code automatically.
3.2.2 Division
Unsigned power-of-2 division
u >> k
gives, use logical shift Examples
Division
Computed
Hex
Binary
x
15213
15213
3B6D
00111011 01101101
x >> 1
7606.5
7606
1DB6
00011101 10110110
x >> 4
950.8125
950
03B6
00000011 10110110
x >> 8
59.5
59
003B
00000000 00111011
Signed power-of-2 division
Want to compute
(Round to 0) Compute as
In C:
(x + (1 << k)-1) >> k
4 C Memory Layout
Program's address space contains 4 regions:
Stack: Local variables, grow downwards.
Heap: Space requested via
malloc()
and used with pointers; resizes dynamically, grow upward.Static Data: Global or static variables, does not grow or shrink.
Code: Loaded when program starts, does not change.
Storage for C Programs
Declared outside a function: Static Data
Declared inside a function: Stack
Freed when function returns.
Dynamically allocated (i.e.,
malloc
,calloc
&realloc
): Heap.
4.1 Stack
A stack frame includes:
Location of caller function
Function arguments
Space for local variables
Stack pointer (SP) tells where lowest (current) stack frame is.
When procedure ends, stack pointer is moved back (but data remains (garbage!)); frees memory for future stack frames (LIFO-Last In First Out).
Stack Misuse
4.2 Static Data
Place for variables that persist, and data doesn't subject to comings and goings like function calls, e.g. string literals, global variables.
String literal example:
char * str = "hi"
.char str[] = "hi"
is on stack!Size does not change, but sometimes part of the data can be writable.
4.3 Code
Copy of your code goes here, C code becomes data too!
Does (should) not change, typically read-only.
4.4 Endianness
Endianness
Big Endian: Descending numerical significance with ascending memory addresses.
Little Endian: Ascending numerical significance with ascending memory addresses.
4.5 Heap
Stack is not permanent - when the function returns, the memory will be deallocated and turn into garbage.
Dynamically allocated memory goes on the Heap, more permanent and persistent than Stack.
malloc(n)
Allocates a continuous block of n bytes of uninitialized memory (contains garbage!)
Returns a pointer to the beginning of an allocated block; NULL indicates failed request (check for this!)
- int *p = (int *) malloc(n * sizeof(int))
sizeof()
makes code more portable.malloc()
returnsvoid *
; typecast will help you catch coding errors when pointer types don't match.
calloc(n, size)
- void* calloc(size_t nmemb, size_t size)
nmemb is the number of the members
size is the size of each member
Like malloc, except it initializes the meory to 0.
Example for allocating space for 5 integers.
int *p = (int*)calloc(5, sizeof(int));
realloc()
Use it when you need more or less memory in an array.
- void *realloc(void *ptr, size_t size)
Takes in a ptr that has been the return of malloc/calloc/realloc and a new size.
Returns a pointer with now size space (or NULL) and copies any content from ptr.
Realloc can move or keep the address same, so DO NOT rely on old ptr values.
free()
Release memory on the heap: Pass the pointer p to the beginning of allocated block; releases the whole block.
p must be the address originally returned by m/c/realloc(), otherwise throws system exception.
Don't call
free()
on a block that has already been released or on NULL.Make sure you don't lose the original address.
4.6 Common Mistakes
Memory error types
Segmentation Fault: Segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory (a memory access violation).
Bus error: Bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name.
5 Floating Point
Floating point: floating-point arithmetic (FP) is arithmetic that represents subsets of real numbers using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base.
This allows us to represent more numbers without losing accuracy, from very small fractions to very large numbers.
5.1 IEEE 754 Floating Point Standard
Single precision (SP)
Sign Bit: 1 means negative, 0 means positive.
Exponent: IEEE 754 uses "biased exponent" representation.
Designers wanted FP numbers to be used even if no FP hardware; e.g., sort records with FP numbers using integer compares.
2's complement poses a problem (because negative numbers look bigger)
Bias notation preserves the linearity of value.
Significand:
To pack more bits, leading 1 implicit for normalized numbers.
1 + 23 bits single, 1 + 52 bits double.
Always true: 0 < Significand < 1 (for normalized numbers).
Bias notation preserves the linearity of value!
IEEE 754 uses bias of -127 for single prec, -1023 for double prec.
Example
1 10000001 11100000000000000000000
5.2 Special Numbers
Zero
Sign bit can be either 0 or 1.
Exponent is 0.
Significand is 0.
+0
-0
Infinity
OK to do further computations with ∞
Max exponent = 255.
All zero significand.
+∞
-∞
NaN (Not a Number)
, ,Useful for debugging.
Op(NaN, some number) = NaN
Max exponent = 255.
Non-zero significand.
+NaN
-Nan
Denorm Numbers (denormalized numbers)
Use Exp = 0, Significand ≠ 0 to represent very small numbers.
No leading 0!
Careful! Implicit exponent = -126 when Exp = 0x00
Smallest denorm:
Bigest denorm:
Smallest norm:
No uneven gap! Increments by
Conclusion
Exponent | Significand | Meaning |
---|---|---|
Non-zero | ||
Anything | ||
Non-zero | NaN |
5.3 Other Representations
binary64: 11 exponential bits, 52 significand bits
binary128: 15 exponent bits, 112 significand bits
binary256: 19 exponent bits, 238 significand bits
binary 16/fp16: 1/5/10 bits
5.4 Limitations
Two limitations
Numbers may get overflow or underflow when expressed in floating points, so rounding occurs and can get unexpected results.
In addition, the gap between adjacent floating points gets wider, due to increasing exponential number.
Floating points are denser around 0!
FP addition is not associative!
Example
float a = 1.0e20; float b = 1.0; float c = -1.0e20; float d = (a + b) + c; float e = a + (b + c); printf("a+b+c = %f\n", d); // 0.000000 printf("a+(b+c) = %f\n", e); // 1.000000IEEE FP Rounding Modes
Round towards
: ALWAYS round "up".Round towards
: ALWAYS round "down".Truncate: Just drop the last bits.
Unbiased (default mode): Midway? Round to even.
FP Addition
De-normalize to match exponents.
Add significands to get resulting one.
Keep the same exponent.
Normalize (possibly changing exponent).