Lesson 17 - Computer Architecture#

Lesson Outcomes#

By the end of this lesson, you should be able to:

  • Explain the major components of computer architecture: processor, memory, and I/O devices.

  • Describe how stack overflow occurs and why it matters.

  • Differentiate among the three types of system buses: address bus, data bus, and control bus.

  • Compare and contrast the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU).


1. Architectural Foundation — The Von Neumann Model#

Modern computer systems are built on the Von Neumann Architecture, which introduced the stored-program concept.

The key idea:

Instructions and data are stored together in memory and treated identically by the processor.

This architecture consists of four major components:

  1. Input

  2. Processor (CPU)

  3. Memory

  4. Output

All communication between these components occurs over structured communication pathways called buses.

Because instructions and data share the same memory and communication pathways, coordination and control are essential. That coordination is the responsibility of the processor.

Understanding this structure allows us to understand how programs execute, how errors occur, and how systems can be optimized or exploited.


2. The Processor (CPU)#

The Central Processing Unit (CPU) is the control and computational engine of the system.

It repeatedly performs the fetch–decode–execute cycle, billions of times per second.

The CPU does not permanently store programs. Instead, it:

  • Retrieves instructions from memory

  • Interprets them

  • Executes them

  • Moves to the next instruction
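The cycle described above can be sketched as a short simulation. The instruction set (LOAD, ADD, HALT), the accumulator, and the memory layout are illustrative assumptions, not a real ISA:

```python
# Toy fetch–decode–execute loop. The instruction set and memory layout
# are illustrative assumptions, not a real instruction set architecture.

memory = [
    ("LOAD", 7),     # load the value 7 into the accumulator
    ("ADD", 5),      # add 5 to the accumulator
    ("HALT", None),  # stop execution
]

pc = 0               # Program Counter: index of the next instruction
accumulator = 0      # a single general-purpose register

while True:
    opcode, operand = memory[pc]   # fetch
    pc += 1                        # advance to the next instruction
    if opcode == "LOAD":           # decode + execute
        accumulator = operand
    elif opcode == "ADD":
        accumulator += operand
    elif opcode == "HALT":
        break

print(accumulator)  # 12
```

Note that the loop itself never changes: only the contents of memory decide what the machine does, which is the stored-program concept in miniature.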

Internally, the CPU consists of several coordinated components:


2.1 Control Unit#

The Control Unit (CU) directs instruction flow.

It manages the fetch–decode–execute cycle by coordinating:

  • Program Counter (PC)

  • Instruction Register (IR)

  • Status flags

The Control Unit determines:

  • Which instruction to fetch

  • Whether execution proceeds sequentially

  • Whether execution branches to a new address

Without the Control Unit, computation would be unstructured and chaotic.


2.2 Clock#

The clock is the processor’s timing signal.

Measured in Hertz (Hz), it determines how frequently operations occur.

Each clock pulse:

  • Synchronizes data movement

  • Coordinates register updates

  • Ensures predictable sequencing

The clock provides rhythm. The Control Unit provides direction.


2.3 Arithmetic Logic Unit (ALU)#

The Arithmetic Logic Unit (ALU) performs the actual computation.

It executes:

  • Arithmetic operations (ADD, SUB, MUL)

  • Logical operations (AND, OR, NOT)

  • Comparisons

When we refer to a processor’s computational strength, we are largely referring to the capability and throughput of the ALU.
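The operations listed above can be modeled as a small function that returns a result together with a status flag, the kind of signal the Control Unit consults when deciding whether to branch. The operation names follow the list above; the flag behavior is a simplifying assumption:

```python
# A toy ALU: maps an operation name to a result plus a "zero" status flag.
# The operation set mirrors the lesson's examples; names are illustrative.

def alu(op, a, b=0):
    ops = {
        "ADD": a + b,
        "SUB": a - b,
        "MUL": a * b,
        "AND": a & b,
        "OR":  a | b,
        "NOT": ~a,
        "CMP": a - b,   # comparisons are often implemented as subtraction
    }
    result = ops[op]
    zero_flag = (result == 0)   # a status flag the Control Unit can test
    return result, zero_flag

print(alu("ADD", 6, 7))   # (13, False)
print(alu("CMP", 5, 5))   # equal operands: result 0, zero flag True
```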


2.4 Registers and Cache#

Inside the CPU are small, high-speed storage locations.

Registers#

Registers:

  • Store operands

  • Hold intermediate results

  • Contain addresses

They are the fastest accessible memory in the system.

Cache#

The CPU includes multi-level cache (L1, L2, L3) that holds recently used instructions and data. By reducing how often the processor must wait on slower main memory, cache significantly improves overall performance.


3. Memory Architecture#

Memory stores both:

  • Program instructions

  • Data values

It is divided into two broad categories.


3.1 Primary Memory (Main Memory)#

Primary memory is directly accessible by the CPU.

It stores:

  • Currently executing programs

  • Active data

Two types:

RAM (Random Access Memory)#

  • Volatile

  • Loses data when power is removed

  • Used for active computation

ROM (Read-Only Memory)#

  • Non-volatile

  • Stores permanent instructions such as firmware

Primary memory acts as the processor’s workspace.


3.2 Secondary Memory#

Secondary memory stores data long term.

Examples:

  • Solid-state drives

  • Magnetic disks

  • Flash storage

Important distinction:

The processor does not execute instructions directly from secondary storage.

Data must first be loaded into primary memory.

Beyond storing instructions and data, memory organization also affects program behavior and reliability. One critical example is the stack.


4. Stack Overflow in Memory#

The stack is a region of primary memory reserved during program execution.

It stores:

  • Function parameters

  • Local variables

  • Return addresses

Each function call adds data to the stack. When the function completes, that data is removed.


4.1 What Is a Stack Overflow?#

A stack overflow occurs when a program attempts to use more stack memory than allocated.

Common causes:

  • Infinite recursion

  • Deep function nesting

  • Large local variable allocation

Consequences:

  • Program crashes

  • Unpredictable behavior

  • Data corruption

  • Security vulnerabilities

From a cybersecurity perspective, stack overflows are dangerous because they can allow attackers to alter execution flow.
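This is easy to demonstrate safely in Python: the interpreter guards the stack with a recursion limit and raises `RecursionError` rather than letting the process corrupt memory, but the underlying cause, one new frame per call with no base case, is the same:

```python
# Demonstrating stack exhaustion through infinite recursion.
# Each call pushes a new frame (parameters, locals, return address)
# onto the stack; with no base case, frames accumulate until the
# interpreter's guard limit is hit.

import sys

def recurse(depth=0):
    return recurse(depth + 1)   # no base case: frames accumulate forever

try:
    recurse()
except RecursionError:
    print("stack limit reached near depth", sys.getrecursionlimit())
```

In languages without such a guard, such as C, the same pattern typically crashes the process with a segmentation fault, or worse, silently overwrites adjacent memory.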


5. System Bus — Communication Pathways#

The processor, memory, and I/O devices communicate through a system bus.

There are three types of buses:

  1. Address Bus

  2. Data Bus

  3. Control Bus

Each has a specific role.


5.1 Address Bus#

The address bus specifies where data will be read from or written to.

It carries location information only.

Memory and I/O devices wait for address information before responding.
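Because each address line carries one bit, an n-bit address bus can select 2^n distinct locations. A quick sketch:

```python
# Each address line carries one binary digit, so an n-bit address bus
# can name 2**n distinct memory locations.

def addressable_locations(bus_width_bits):
    return 2 ** bus_width_bits

for width in (16, 32, 64):
    print(f"{width}-bit address bus: {addressable_locations(width)} locations")
```

This is why the jump from 32-bit to 64-bit processors mattered: a 32-bit address bus can name only about 4 billion locations (4 GiB of byte-addressable memory), while 64 bits removes that ceiling for any practical purpose.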


5.2 Data Bus#

The data bus carries the actual data being transferred.

It moves:

  • Instructions

  • Numerical values

  • Input/output signals

The width of the data bus determines how many bits are transferred at once.
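One concrete consequence: doubling the bus width halves the number of transfers needed to move the same data. A small sketch (the rounding logic assumes a partial final transfer still costs one bus cycle):

```python
# A wider data bus moves more bits per transfer, so fewer transfers
# are needed to copy the same amount of data.

def transfers_needed(total_bytes, bus_width_bits):
    bytes_per_transfer = bus_width_bits // 8
    # ceiling division: a partial final transfer still costs one cycle
    return -(-total_bytes // bytes_per_transfer)

print(transfers_needed(64, 32))  # 16 transfers on a 32-bit bus
print(transfers_needed(64, 64))  # 8 transfers on a 64-bit bus
```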


5.3 Control Bus#

The control bus coordinates communication.

It determines:

  • Whether the operation is read or write

  • When the transfer occurs

  • Which device has permission to use the bus

Without the control bus, communication would be uncoordinated.


6. Input and Output (I/O)#

I/O devices allow interaction with external systems.

Examples:

  • Keyboard

  • Monitor

  • Printer

They connect to the system through I/O interfaces such as:

  • USB ports

  • HDMI ports

  • Network interface cards

All communication ultimately reduces to binary signals:

  • High voltage (logic 1)

  • Low voltage (logic 0)

The rules governing signal order and timing are defined by communication protocols.


7. Worked Example — Keyboard Button Press#

Let us walk through a conceptual example of pressing a key on a keyboard.

Step 1 — Key Press#

You press a key (for example, the letter “J”).

The keyboard detects the physical action and generates a digital signal.


Step 2 — I/O Interface Communication#

The keyboard sends the signal through its I/O interface.

The interface converts the signal into a standardized format (e.g., USB protocol).


Step 3 — Interrupt Signal#

The I/O interface sends an interrupt signal to the processor via the control bus.

This tells the processor:

“New input is available.”


Step 4 — Address Placement#

The processor places the address of the keyboard interface on the address bus.

This indicates:

“I am ready to read from this device.”


Step 5 — Data Transfer#

The keyboard interface places the encoded key value onto the data bus.

The processor reads this value.


Step 6 — Write to Memory#

The processor writes the input value to a specified memory location.

Now the key press is stored in RAM.


Step 7 — Output Operation#

If the program requires display:

  • The processor places the output value onto the data bus.

  • The monitor’s I/O interface receives a read command.

  • The monitor reads the data.

  • The character appears on the screen.

This entire sequence occurs extremely quickly and is coordinated by the control bus and clock.
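The seven steps can be condensed into a toy simulation. Every name here, the device address, the RAM location, the encoding, is an illustrative assumption rather than a real hardware interface:

```python
# A toy walkthrough of the key-press sequence above. The device
# address, the RAM location, and the encoding are all illustrative
# assumptions, not real hardware values.

KEYBOARD_ADDR = 0x10    # hypothetical address of the keyboard interface
ram = {}                # stands in for primary memory

def key_press(char):
    encoded = ord(char)                  # Steps 1-2: detect and encode
    interrupt = True                     # Step 3: interrupt via control bus
    if interrupt:
        address_bus = KEYBOARD_ADDR      # Step 4: CPU selects the device
        data_bus = encoded               # Step 5: device drives the data bus
        value = data_bus                 #         CPU reads the data bus
        ram[0x2000] = value              # Step 6: write to memory
        return chr(ram[0x2000])          # Step 7: echo to the display

print(key_press("J"))  # J
```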



8. CPU vs GPU#

Both CPUs and GPUs perform computation, but they are optimized differently.

| Feature     | CPU                     | GPU                               |
|-------------|-------------------------|-----------------------------------|
| Function    | General-purpose tasks   | Specialized parallel tasks        |
| Processing  | Sequential              | Parallel                          |
| Core Count  | 2–64 cores              | Thousands of cores                |
| Best For    | Complex decision-making | Repetitive mathematical workloads |


Interpreting the CPU vs GPU Architecture Diagram#

This diagram visually compares how silicon resources are allocated inside a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU).

The color legend identifies components:

  • 🔴 Red = Control Unit (instruction processing)

  • 🟢 Green = ALUs (computation power)

  • 🔵 Blue = Cache Memory (high-speed storage)

  • 🟡 Yellow = Main Memory (actively used data storage)


CPU Architecture (Left Side)#

The CPU dedicates a large portion of its silicon to:

  • A large Control Unit

  • Fewer but powerful ALUs

  • Substantial cache memory

This reflects its design goal:
Optimize for flexibility and complex decision-making.

The CPU excels when:

  • Execution paths change frequently

  • Branch prediction matters

  • Operating system coordination is required

  • Single-thread performance is critical


GPU Architecture (Right Side)#

The GPU dedicates most of its silicon to:

  • A very large number of ALUs

  • Minimal control logic per core

  • Smaller distributed cache

This reflects its design goal:
Maximize arithmetic throughput through parallelism.

GPUs excel when:

  • The same mathematical operation is repeated across large datasets

  • Workloads are structured and predictable

  • Massive parallel computation is possible


Why CPUs and GPUs Excel at Different Tasks#

Although both perform computations, their architectural priorities differ.

CPUs Excel At#

  • Branch-heavy code

  • Operating systems

  • Control-dominated tasks

These workloads require:

  • Rapid instruction redirection

  • Complex scheduling

  • Precise state management

CPUs include advanced branch prediction and strong control logic to handle these efficiently.


GPUs Excel At#

  • Graphics rendering

  • Matrix math

  • Machine learning

  • Scientific simulations

For example:

\[ C = A \times B \]

Each element of matrix \(C\) can be computed independently. GPUs distribute these computations across thousands of cores, dramatically increasing throughput.
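This independence is easy to see in code. The sketch below computes each element of \(C\) as a separate task, using a thread pool as a stand-in for GPU cores:

```python
# Each element C[i][j] depends only on row i of A and column j of B,
# so all elements can be computed independently. A GPU exploits exactly
# this structure; here a thread pool stands in for its cores.

from concurrent.futures import ThreadPoolExecutor

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
n = len(A)

def element(ij):
    i, j = ij
    return sum(A[i][k] * B[k][j] for k in range(n))

with ThreadPoolExecutor() as pool:
    flat = list(pool.map(element, [(i, j) for i in range(n) for j in range(n)]))

C = [flat[i * n:(i + 1) * n] for i in range(n)]
print(C)  # [[19, 22], [43, 50]]
```

For a 2×2 matrix the parallelism is negligible, but the same decomposition scales to millions of elements, which is where thousands of GPU cores pay off.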


Core Architectural Tradeoff#

  • CPU → Optimized for complex control and sequential logic

  • GPU → Optimized for large-scale parallel numerical computation

Modern systems combine both:

  • The CPU manages coordination and branching.

  • The GPU accelerates structured mathematical workloads.

Matching the workload to the architecture is essential for performance optimization.