Computer Architecture

Last Modified: 2/12/2025

In this part, we will explore computer architecture, covering both x86-64 and RISC-V.

1 Bits, Bytes & Number Representation

1.1 Number Base

In computer science, there are three commonly-used number bases: binary, decimal & hexadecimal.

  1. Binary (base 2)
    • Symbols: 0, 1
    • Notation: $101011_2=\texttt{0b101011}$
    • Converting numbers to base 2 lets us represent numbers as bits!
  2. Decimal (base 10)
    • Symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    • Notation: $9472_{10}=9472$
    • Understandable by humans, used in our daily life.
  3. Hexadecimal (base 16)
    • Symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
    • Notation: $2A5D_{16}=\texttt{0x2A5D}$
    • A convenient shorthand for writing long sequences of bits.

Conversion between bases

  1. Convert from other bases to base-10: write out the powers of the base.

    For example, $AC2_{16}=(10 \times 16^2)+(12 \times 16^1)+(2 \times 16^0)=2754$

  2. Convert from base-10 to other bases: use the “leftover algorithm”.

    For example, convert $73_{10}$ to base-4. For base-4, the powers of the base include 256, 64, 16, 4, 1. Since 256 is larger than 73, start from 64.

    • How many multiples of $64$ fit in $73$? One, with $73 - 64 = 9$ left over.
    • How many multiples of $16$ fit in $9$? Zero, so still $9$ left over.
    • How many multiples of $4$ fit in $9$? Two, with $9 - 2 \times 4 = 1$ left over.
    • How many multiples of $1$ fit in $1$? One, with $1 - 1 = 0$ left over, which means we are done!

    Reading off the counts in order gives the digits. Therefore, $73_{10}=1021_4$.

When converting between different bases, you can refer to the table below.

Decimal   Binary   Hexadecimal
0         0000     0
1         0001     1
2         0010     2
3         0011     3
4         0100     4
5         0101     5
6         0110     6
7         0111     7
8         1000     8
9         1001     9
10        1010     A
11        1011     B
12        1100     C
13        1101     D
14        1110     E
15        1111     F

1.2 Integer Representation

1.2.1 Unsigned Integers

Properties

Conclusion

Property                                        Unsigned Integer
Can represent negative numbers                  No
Doing math is easy                              Yes
Every bit sequence represents a unique number   Yes

1.2.2 Signed Integers

Properties

Property                                        Sign-Magnitude
Can represent negative numbers                  Yes
Doing math is easy                              No (the sign bit needs special handling)
Every bit sequence represents a unique number   No (there are two zeros, +0 and -0)

1.2.3 One’s Complement

Properties

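Property                                        One’s Complement
Can represent negative numbers                  Yes
Doing math is easy                              No (a carry out of the leftmost bit must be added back in)
Every bit sequence represents a unique number   No (there are two zeros, +0 and -0)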

1.2.4 Two’s Complement

Properties

Conversion between two’s complement and signed integer

  1. Two’s Complement -> Signed Integer

    • If left-most digit is 0: Read it as unsigned
    • If left-most digit is 1:
      • Flip the bits, and add 1
      • Convert to base-10, and stick a negative sign in front

    Example: What is $\texttt{0b1110 1100}$ in decimal?

    • Flip the bits: $\texttt{0b0001 0011}$
    • Add one: $\texttt{0b0001 0100}$
    • In base-10: $-20$

  2. Signed Integer -> Two’s Complement

    • If number is positive: Just convert it to base-2
    • If number is negative:
      • Pretend it’s unsigned, and convert to base-2
      • Flip the bits, and add 1

    Example: What is $-20$ in two’s complement binary?

    • Pretend it’s unsigned ($20$) and convert to base-2: $\texttt{0b0001 0100}$
    • Flip the bits: $\texttt{0b1110 1011}$
    • Add one: $\texttt{0b1110 1100}$

Property                                        Two’s Complement
Can represent negative numbers                  Yes
Doing math is easy                              Yes
Every bit sequence represents a unique number   Yes

1.2.5 Bias Notation

Properties

Property                                        Bias Notation
Can represent negative numbers                  Yes
Doing math is easy                              No (the bias must be accounted for in arithmetic)
Every bit sequence represents a unique number   Yes

1.2.6 Sign Extension

In a binary representation, the leftmost bit is the most significant bit (MSB), and the rightmost bit is the least significant bit (LSB).

Example

2 C Introduction

2.1 Variable C Types

2.2 Addresses & Pointers

Every memory location in a computer has an address and holds a value. A pointer variable (or pointer for short) is like any other variable, except that the value it stores is a memory address.

The size of an address (and therefore of a pointer) depends on the architecture of the computer. For example, on a 32-bit system a pointer is 4 bytes, while on a 64-bit system a pointer is 8 bytes.

Example

pointer.cpp
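#include <stdio.h>

int main(void) {
    int *p;       // declaration
    int x = 3;
    p = &x;       // assign the address of x to p
    printf("%p %d\n", (void *)p, *p); // %p prints an address
    // Example output (the address varies between runs): 0xc7977ff934 3
    *p = 5;       // changes value of x to 5
    void *p1;     // can be used to store any address
    p1 = &x;
    return 0;
}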

2.3 Arrays

Declaration & Initialization:

array.cpp
int arr[5];
int arr1[] = {1, 2, 3, 4, 5};
printf("%d\n", arr1[2]);

A better pattern: single source of truth!

// The size is defined in exactly one place, as a compile-time constant.
#define ARRAY_SIZE 5
int arr[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++) {
    arr[i] = i;
}

2.4 Strings

In C, a string is stored as an array of characters. The end of the string is marked by a special character, the null character \0.

string.cpp
char str[] = "Hello, World!";
char str1[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
char str2[6] = "Hello";

Here are some commonly-used string functions declared in the string.h header:

  1. size_t strlen ( const char * str ): Returns the length of the string (not including the null terminator).
  2. char * strcpy ( char * destination, const char * source ): Copies the C string pointed to by source into the array pointed to by destination, including the terminating null character (and stopping at that point).
  3. char * strncpy ( char * destination, const char * source, size_t num ): Copies the first num characters of source to destination. If the end of the source C string (which is signaled by a null character) is found before num characters have been copied, destination is padded with zeros until a total of num characters have been written to it. If source has num or more characters, no null terminator is written to destination.
  4. int strcmp ( const char * str1, const char * str2 ): Compares the C string str1 to the C string str2. Returns 0 if str1 and str2 are identical, a negative value if str1 sorts before str2, and a positive value otherwise.
  5. int strncmp ( const char * str1, const char * str2, size_t num ): Compares up to num characters of the C string str1 to those of the C string str2.
  6. char * strcat ( char * destination, const char * source ): Appends a copy of the source string to the destination string. The terminating null character in destination is overwritten by the first character of source, and a null character is included at the end of the new string formed by the concatenation of both in destination.
  7. char * strncat ( char * destination, const char * source, size_t num ): Appends at most the first num characters of source to destination, plus a terminating null character.