Lesson 17 - Computer Architecture#
Lesson Outcomes#
By the end of this lesson, you should be able to:
Explain the major components of computer architecture: processor, memory, and I/O devices.
Describe how stack overflow occurs and why it matters.
Differentiate among the three types of system buses: address bus, data bus, and control bus.
Compare and contrast the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU).
1. Architectural Foundation — The Von Neumann Model#
Modern computer systems are built on the Von Neumann Architecture, which introduced the stored-program concept.

The key idea:
Instructions and data are stored together in memory and treated identically by the processor.
This architecture consists of four major components:
Input
Processor (CPU)
Memory
Output
All communication between these components occurs over structured communication pathways called buses.
Because instructions and data share the same memory and communication pathways, coordination and control are essential. That coordination is the responsibility of the processor.
Understanding this structure allows us to understand how programs execute, how errors occur, and how systems can be optimized or exploited.
2. The Processor (CPU)#
The Central Processing Unit (CPU) is the control and computational engine of the system.

It repeatedly performs the fetch–decode–execute cycle, billions of times per second.
The CPU does not permanently store programs. Instead, it:
Retrieves instructions from memory
Interprets them
Executes them
Moves to the next instruction
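The steps above can be sketched in code. This is a toy model, not a real processor: the instruction set (`LOAD`, `ADD`, `HALT`), the tuple-based memory layout, and the single accumulator register are invented for illustration; real CPUs fetch binary machine code.

```python
# Toy sketch of the fetch-decode-execute cycle.
# Instructions live in "memory" alongside data, per the Von Neumann model.
memory = [
    ("LOAD", 7),   # put the literal 7 into the accumulator
    ("ADD", 5),    # add 5 to the accumulator
    ("HALT", 0),   # stop execution
]

pc = 0    # Program Counter: address of the next instruction
acc = 0   # accumulator register

while True:
    opcode, operand = memory[pc]   # fetch
    pc += 1                        # advance the Program Counter
    if opcode == "LOAD":           # decode + execute
        acc = operand
    elif opcode == "ADD":
        acc += operand
    elif opcode == "HALT":
        break

print(acc)  # 12
```

Note how the Program Counter advances by default but could be overwritten by a branch instruction, which is exactly the Control Unit's job described above.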
Internally, the CPU consists of several coordinated components:
2.1 Control Unit#
The Control Unit (CU) directs instruction flow.
It manages the fetch–decode–execute cycle by coordinating:
Program Counter (PC)
Instruction Register (IR)
Status flags
The Control Unit determines:
Which instruction to fetch
Whether execution proceeds sequentially
Whether execution branches to a new address
Without the Control Unit, computation would be unstructured and chaotic.
2.2 Clock#
The clock is the processor’s timing signal.
Measured in Hertz (Hz), it determines how frequently operations occur.
Each clock pulse:
Synchronizes data movement
Coordinates register updates
Ensures predictable sequencing
The clock provides rhythm. The Control Unit provides direction.
2.3 Arithmetic Logic Unit (ALU)#
The Arithmetic Logic Unit (ALU) performs the actual computation.
It executes:
Arithmetic operations (ADD, SUB, MUL)
Logical operations (AND, OR, NOT)
Comparisons
When we refer to a processor’s computational strength, we are largely referring to the capability and throughput of the ALU.
2.4 Registers and Cache#
Inside the CPU are small, high-speed storage locations.
Registers#
Registers:
Store operands
Hold intermediate results
Contain addresses
They are the fastest accessible memory in the system.
Cache#
The CPU includes multi-level cache that holds recently used instructions and data.
By serving most accesses without touching slower main memory, it significantly improves overall performance.
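Why a small cache helps can be shown with a minimal simulation. This sketch assumes an invented fully-associative cache with least-recently-used (LRU) eviction and an illustrative size of four entries; real CPU caches are larger and organized into sets and lines.

```python
# Minimal sketch: count cache hits for a repeated access pattern
# using a tiny LRU cache. Sizes and addresses are illustrative.
from collections import OrderedDict

def simulate(accesses, cache_size):
    cache = OrderedDict()   # address -> present, ordered by recency
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used
            cache[addr] = True
    return hits

# A loop touching the same 4 addresses hits after the first pass:
pattern = [0, 1, 2, 3] * 10
print(simulate(pattern, cache_size=4))  # 36 hits out of 40 accesses
```

Programs that reuse a small working set of addresses, like the loop above, are exactly the workloads that cache accelerates.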
3. Memory Architecture#
Memory stores both:
Program instructions
Data values
It is divided into two broad categories.

3.1 Primary Memory (Main Memory)#
Primary memory is directly accessible by the CPU.
It stores:
Currently executing programs
Active data
Two types:
RAM (Random Access Memory)#
Volatile
Loses data when power is removed
Used for active computation
ROM (Read-Only Memory)#
Non-volatile
Stores permanent instructions such as firmware
Primary memory acts as the processor’s workspace.
3.2 Secondary Memory#
Secondary memory stores data long term.
Examples:
Solid-state drives
Magnetic disks
Flash storage
Important distinction:
The processor does not execute instructions directly from secondary storage.
Data must first be loaded into primary memory.
Beyond storing instructions and data, memory organization also affects program behavior and reliability. One critical example is the stack.
4. Stack Overflow in Memory#
The stack is a region of primary memory reserved during program execution.
It stores:
Function parameters
Local variables
Return addresses
Each function call adds data to the stack. When the function completes, that data is removed.
4.1 What Is a Stack Overflow?#
A stack overflow occurs when a program attempts to use more stack memory than has been allocated to it.
Common causes:
Infinite recursion
Deep function nesting
Large local variable allocation
Consequences:
Program crashes
Unpredictable behavior
Data corruption
Security vulnerabilities
From a cybersecurity perspective, stack overflows are dangerous because they can allow attackers to alter execution flow.
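The "infinite recursion" cause is easy to demonstrate. A hedge on this sketch: CPython guards its call stack with a recursion limit and raises a catchable `RecursionError` before the underlying memory region actually overflows; in languages like C, the same unbounded recursion would typically crash the process outright.

```python
# Demonstration of unbounded recursion exhausting the call stack.
import sys

def recurse(n):
    # Each call pushes a new frame (parameters, locals, return
    # information) onto the stack; nothing is ever popped.
    return recurse(n + 1)

sys.setrecursionlimit(1000)  # keep the demo small and fast

try:
    recurse(0)
except RecursionError as err:
    print("stack exhausted:", err)
```

The interpreter's guard is itself a defensive measure: detecting the overflow in a controlled way prevents the unpredictable behavior and exploitation risks described above.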

5. System Bus — Communication Pathways#
The processor, memory, and I/O devices communicate through a system bus.
There are three types of buses:
Address Bus
Data Bus
Control Bus

Each has a specific role.
5.1 Address Bus#
The address bus specifies where data will be read from or written to.
It carries location information only.
Memory and I/O devices wait for address information before responding.
5.2 Data Bus#
The data bus carries the actual data being transferred.
It moves:
Instructions
Numerical values
Input/output signals
The width of the data bus determines how many bits are transferred at once.
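The effect of bus width is simple arithmetic: a wider bus moves more bytes per transfer, so the same payload needs fewer transfers. This sketch uses invented payload sizes purely for illustration.

```python
# Sketch: number of bus transfers needed to move a payload,
# as a function of data-bus width. Values are illustrative.

def transfers_needed(payload_bytes, bus_width_bits):
    bytes_per_transfer = bus_width_bits // 8
    # Ceiling division: a partial final chunk still costs a transfer.
    return -(-payload_bytes // bytes_per_transfer)

print(transfers_needed(64, 8))    # 64 transfers on an 8-bit bus
print(transfers_needed(64, 32))   # 16 transfers on a 32-bit bus
print(transfers_needed(64, 64))   # 8 transfers on a 64-bit bus
```

This is one reason the move from 32-bit to 64-bit data buses mattered for memory bandwidth.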
5.3 Control Bus#
The control bus coordinates communication.
It determines:
Whether the operation is read or write
When the transfer occurs
Which device has permission to use the bus
Without the control bus, communication would be uncoordinated.
6. Input and Output (I/O)#
I/O devices allow interaction with external systems.
Examples:
Keyboard
Monitor
Printer
They connect to the system through I/O interfaces such as:
USB ports
HDMI ports
Network interface cards
All communication ultimately reduces to binary signals:
High voltage (logic 1)
Low voltage (logic 0)
The rules governing signal order and timing are defined by communication protocols.
7. Understanding USB Signaling — High Speed vs Low Speed Links#

The figure illustrates how USB transmits binary data using differential signaling over two wires:
D+
D−
Two versions are shown:
Figure A — High Speed Link
Figure B — Low Speed Link
Although the voltage levels and timing differ slightly between speeds, the fundamental communication principles remain the same.
7.1 Differential Signaling in USB#
USB does not transmit data on a single wire. Instead, it uses differential signaling, meaning:
Data is represented by the voltage difference between D+ and D−.
Noise immunity is improved because external interference tends to affect both wires equally.
The receiver interprets the relative voltage levels, not the absolute voltage of one wire.
When:
D+ is high and D− is low, this represents one logical state.
D+ is low and D− is high, this represents the opposite logical state.
This improves reliability and reduces electromagnetic interference.
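The noise-immunity argument can be sketched numerically. This is a deliberately simplified model: the voltage values are illustrative, and real USB additionally applies NRZI encoding to the J/K line states, which this sketch omits.

```python
# Sketch: a differential receiver decodes the SIGN of (D+ - D-),
# not the absolute voltage of either wire, so noise that shifts
# both wires equally (common-mode noise) cancels out.

def decode(d_plus, d_minus):
    return 1 if (d_plus - d_minus) > 0 else 0

# Three illustrative symbol samples (D+, D-) in volts:
clean = [(3.3, 0.0), (0.0, 3.3), (3.3, 0.0)]
# The same samples with 0.5 V of interference added to BOTH wires:
noisy = [(dp + 0.5, dm + 0.5) for dp, dm in clean]

print([decode(dp, dm) for dp, dm in clean])  # [1, 0, 1]
print([decode(dp, dm) for dp, dm in noisy])  # [1, 0, 1]
```

A single-ended receiver comparing one wire against a fixed threshold would not enjoy this cancellation, which is why USB chose the differential scheme.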
7.2 Voltage Levels and Line States#
The diagram shows voltage levels labeled:
\(V_{DD}\) (High voltage)
\(V_{SS}\) (Low voltage / ground)
\(V_{SE}\) (Single-ended threshold reference)
The two wires (D+ and D−) toggle between high and low voltages to represent encoded data.
Why This Matters#
Although the waveform diagram appears to show purely electrical behavior, those voltage transitions ultimately represent digital information that ends up stored inside the computer’s memory system. When a USB device transmits data, the differential voltage changes on D+ and D− are interpreted by the processor’s I/O interface as binary values. Those binary values are then placed onto the data bus, while the destination location is specified by the address bus. Once written into memory, the information is physically stored inside memory cells: as charge on capacitors in DRAM, or as stable transistor states in SRAM. In other words, the voltage transitions you see in the USB signal diagram are not just communication artifacts; they are the physical origin of the data that eventually becomes stored charge in memory arrays. This connects the physical layer of communication directly to the memory architecture studied earlier in this lesson.
8. CPU vs GPU#
Both CPUs and GPUs perform computation, but they are optimized differently.
| Feature | CPU | GPU |
|---|---|---|
| Function | General-purpose tasks | Specialized parallel tasks |
| Processing | Sequential | Parallel |
| Core Count | 2–64 cores | Thousands of cores |
| Best For | Complex decision-making | Repetitive mathematical workloads |
Interpreting the CPU vs GPU Architecture Diagram#

This diagram visually compares how silicon resources are allocated inside a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU).
The color legend identifies components:
🔴 Red = Control Unit (instruction processing)
🟢 Green = ALUs (computation power)
🔵 Blue = Cache Memory (high-speed storage)
🟡 Yellow = Main Memory (actively used data storage)
CPU Architecture (Left Side)#
The CPU dedicates a large portion of its silicon to:
A large Control Unit
Fewer but powerful ALUs
Substantial cache memory
This reflects its design goal:
Optimize for flexibility and complex decision-making.
The CPU excels when:
Execution paths change frequently
Branch prediction matters
Operating system coordination is required
Single-thread performance is critical
GPU Architecture (Right Side)#
The GPU dedicates most of its silicon to:
A very large number of ALUs
Minimal control logic per core
Smaller distributed cache
This reflects its design goal:
Maximize arithmetic throughput through parallelism.
GPUs excel when:
The same mathematical operation is repeated across large datasets
Workloads are structured and predictable
Massive parallel computation is possible
Why CPUs and GPUs Excel at Different Tasks#
Although both perform computations, their architectural priorities differ.
CPUs Excel At#
Branch-heavy code
Operating systems
Control-dominated tasks
These workloads require:
Rapid instruction redirection
Complex scheduling
Precise state management
CPUs include advanced branch prediction and strong control logic to handle these efficiently.
GPUs Excel At#
Graphics rendering
Matrix math
Machine learning
Scientific simulations
For example, in matrix multiplication \(C = AB\), each element
\(c_{ij} = \sum_{k} a_{ik}\, b_{kj}\)
depends only on row \(i\) of \(A\) and column \(j\) of \(B\), so every element of \(C\) can be computed independently. GPUs distribute these computations across thousands of cores, dramatically increasing throughput.
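The independence of the elements is visible in plain code. This sketch computes the product sequentially; the comment marks the loop iterations that a GPU would hand to separate threads. The function name and small matrices are invented for illustration.

```python
# Sketch of why matrix multiplication parallelizes well: each C[i][j]
# reads only row i of A and column j of B, and writes only its own cell.

def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):        # every (i, j) pair is independent work;
        for j in range(p):    # a GPU assigns each one to its own thread
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(m))
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Because no iteration depends on another's result, there is no ordering constraint to enforce, which is precisely the "structured and predictable" workload shape GPUs are built for.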
Core Architectural Tradeoff#
CPU → Optimized for complex control and sequential logic
GPU → Optimized for large-scale parallel numerical computation
Modern systems combine both:
The CPU manages coordination and branching.
The GPU accelerates structured mathematical workloads.
Matching the workload to the architecture is essential for performance optimization.