SoC Design Course - This article is part of a series.
Part 4: This Article

Introduction

In the previous posts, we covered digital logic fundamentals and computer arithmetic. Now we arrive at one of the most important concepts in computer architecture:

What exactly can a CPU do?

The answer is defined by the Instruction Set Architecture (ISA) — the complete specification of every instruction the processor understands. Think of it as a contract between software and hardware:

  • Software (compilers, operating systems, applications) promises to express all computation using only the instructions defined in the ISA.
  • Hardware (the processor) promises to execute every instruction correctly and predictably.

This separation is powerful because it allows software and hardware to evolve independently, as long as both sides honor the contract.


1. What Is an ISA?

1.1 Definition

An Instruction Set Architecture specifies:

  1. Instructions: The operations the CPU can perform (add, subtract, load, store, branch, etc.)
  2. Data types: What kinds of data the CPU can operate on (integers, floating-point, vectors)
  3. Registers: How many registers are available and their purpose
  4. Memory model: How the CPU accesses memory (addressing modes, alignment, endianness)
  5. Encoding: How instructions are represented as binary bit patterns

1.2 The ISA as an Abstraction Layer

┌─────────────────────────┐
│     Application          │  (Python, Java, C++)
├─────────────────────────┤
│     Operating System     │  (Linux, Windows, RTOS)
├─────────────────────────┤
│     Compiler             │  (GCC, LLVM, Clang)
├═════════════════════════╡
│     ISA                  │  ◄── THE CONTRACT
├═════════════════════════╡
│     Microarchitecture    │  (Pipeline, Cache, OoO)
├─────────────────────────┤
│     Logic / RTL          │  (Gates, Flip-flops)
├─────────────────────────┤
│     Physics / Silicon    │  (Transistors, Metal layers)
└─────────────────────────┘

Everything above the ISA is software. Everything below is hardware implementation. The ISA is the boundary.

Key insight: Multiple different microarchitectures can implement the same ISA. For example:

  • Intel’s Alder Lake and AMD’s Zen 4 both implement the x86-64 ISA, but with completely different internal designs
  • ARM’s Cortex-A78 and Cortex-A55 both implement ARMv8-A, but one is high-performance while the other is energy-efficient

1.3 Why ISA Matters for SoC Design

When designing an SoC, the choice of ISA determines:

| Aspect | Impact |
| --- | --- |
| Software ecosystem | What compilers, OS, and libraries are available |
| Hardware complexity | How many gates are needed to implement the decoder |
| Performance | How efficiently the ISA maps to the microarchitecture |
| Power efficiency | Simpler ISAs generally lead to simpler, lower-power designs |
| Licensing cost | Proprietary ISAs (ARM, x86) require licensing; open ISAs (RISC-V) are free |

2. Anatomy of an Instruction

Every instruction tells the CPU three things:

  1. What to do (the operation) → encoded in the opcode
  2. What to do it to (the data) → specified by operands
  3. Where to put the result → specified by the destination operand

2.1 A Simple Example

Consider this high-level operation:

c = a + b;

In assembly (RISC-V):

add  x3, x1, x2    # x3 = x1 + x2

The instruction has four fields:

| Field | Value | Meaning |
| --- | --- | --- |
| Operation | add | Addition |
| Destination | x3 | Where to store the result |
| Source 1 | x1 | First operand |
| Source 2 | x2 | Second operand |

2.2 Instruction Fields

In general, instructions contain these types of fields:

┌──────────┬──────────┬──────────┬──────────┬──────────┐
│  Opcode  │   Dest   │  Source1 │  Source2 │  Other   │
│  (what)  │ (where)  │  (from)  │  (from)  │ (extra)  │
└──────────┴──────────┴──────────┴──────────┴──────────┘

| Field | Purpose |
| --- | --- |
| Opcode | Identifies the operation (add, sub, load, branch, etc.) |
| rd (destination register) | The register that receives the result |
| rs1, rs2 (source registers) | Registers providing input operands |
| Immediate | A constant value embedded directly in the instruction |
| funct | Additional opcode bits for distinguishing similar operations |

3. Types of Instructions

A typical ISA provides four main categories of instructions:

3.1 Arithmetic and Logic Instructions

Perform computation on register values:

| Operation | Example (RISC-V) | Meaning |
| --- | --- | --- |
| Add | add x3, x1, x2 | x3 = x1 + x2 |
| Subtract | sub x3, x1, x2 | x3 = x1 - x2 |
| AND | and x3, x1, x2 | x3 = x1 & x2 |
| OR | or x3, x1, x2 | x3 = x1 \| x2 |
| XOR | xor x3, x1, x2 | x3 = x1 ^ x2 |
| Shift Left | sll x3, x1, x2 | x3 = x1 << x2 |
| Set Less Than | slt x3, x1, x2 | x3 = (x1 < x2) ? 1 : 0 |

With immediate values (constant operands):

| Operation | Example | Meaning |
| --- | --- | --- |
| Add Immediate | addi x3, x1, 10 | x3 = x1 + 10 |
| AND Immediate | andi x3, x1, 0xFF | x3 = x1 & 0xFF |
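
To make the semantics in the tables above concrete, here is a minimal Python sketch of a 32-bit ALU. The helper names `alu` and `to_signed` are illustrative only and do not come from any RISC-V tool. The sketch assumes RV32 (32-bit registers) and covers just the operations listed above.

```python
MASK32 = 0xFFFFFFFF  # assumption: RV32, so every register holds 32 bits


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(op: str, a: int, b: int) -> int:
    """Hypothetical helper: 32-bit result of a register-register operation."""
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "xor":
        result = a ^ b
    elif op == "sll":
        result = a << (b & 0x1F)  # RV32 uses only the low 5 bits of the shift amount
    elif op == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0  # signed comparison
    else:
        raise ValueError(f"unsupported op: {op}")
    return result & MASK32  # results wrap around to 32 bits


print(hex(alu("add", 0xFFFFFFFF, 1)))  # wraps around to 0x0
print(alu("slt", 0xFFFFFFFF, 1))       # 0xFFFFFFFF is -1 when signed, so this prints 1
```

Note that overflow simply wraps: the base ISA has no flags register, so software checks for overflow explicitly when it needs to.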

3.2 Memory Access Instructions (Load/Store)

Transfer data between registers and memory:

   Registers                    Memory
  ┌────────┐                 ┌────────────┐
  │   x1   │ ──── Store ──► │ Address A  │
  │   x2   │ ◄─── Load ──── │ Address B  │
  │   ...  │                 │    ...     │
  └────────┘                 └────────────┘

| Operation | Example | Meaning |
| --- | --- | --- |
| Load Word | lw x3, 0(x1) | x3 = Memory[x1 + 0] |
| Store Word | sw x3, 8(x1) | Memory[x1 + 8] = x3 |
| Load Byte | lb x3, 0(x1) | x3 = sign-extend(Memory[x1]) |
| Load Byte Unsigned | lbu x3, 0(x1) | x3 = zero-extend(Memory[x1]) |

The syntax offset(base) means: compute the memory address as base register + offset.
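
The difference between lb and lbu is easiest to see with a byte whose top bit is set. The following sketch models memory as a Python `bytearray`; the function names `effective_address` and `load_byte` are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base: int, offset: int) -> int:
    """Address computed by offset(base): base register plus signed offset."""
    return (base + offset) & MASK32


def load_byte(memory: bytearray, base: int, offset: int, signed: bool) -> int:
    """Model lb (signed=True) or lbu (signed=False) on a 32-bit machine."""
    byte = memory[effective_address(base, offset)]
    if signed and byte & 0x80:
        return byte | 0xFFFFFF00  # sign-extend: copy bit 7 into bits 31..8
    return byte                   # zero-extend: upper 24 bits stay 0


memory = bytearray([0x7F, 0x80])
print(hex(load_byte(memory, 0, 1, signed=True)))   # 0xffffff80
print(hex(load_byte(memory, 0, 1, signed=False)))  # 0x80
```

Both loads read the same byte from memory; only the way the upper 24 bits of the destination register are filled differs.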

3.3 Control Flow Instructions (Branch/Jump)

Change the order of instruction execution:

Conditional branches (decide based on comparison):

| Operation | Example | Meaning |
| --- | --- | --- |
| Branch if Equal | beq x1, x2, label | if (x1 == x2) goto label |
| Branch if Not Equal | bne x1, x2, label | if (x1 != x2) goto label |
| Branch if Less Than | blt x1, x2, label | if (x1 < x2) goto label |
| Branch if ≥ | bge x1, x2, label | if (x1 >= x2) goto label |

Unconditional jumps:

| Operation | Example | Meaning |
| --- | --- | --- |
| Jump and Link | jal x1, label | x1 = PC+4; goto label |
| Jump and Link Register | jalr x1, 0(x2) | x1 = PC+4; goto (x2+0) |

jal is used for function calls — it saves the return address in the destination register before jumping.
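
The control-flow rules above can be sketched as functions that compute the next PC. This is a simplified model under stated assumptions: register values are passed in as 32-bit patterns, `offset` is the already-decoded signed byte offset to the label, and all function names are invented for this example.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def branch_taken(op: str, a: int, b: int) -> bool:
    """Evaluate the branch condition; blt and bge compare as signed values."""
    if op == "beq":
        return a == b
    if op == "bne":
        return a != b
    if op == "blt":
        return to_signed(a) < to_signed(b)
    if op == "bge":
        return to_signed(a) >= to_signed(b)
    raise ValueError(f"unsupported branch: {op}")


def branch_next_pc(pc: int, op: str, a: int, b: int, offset: int) -> int:
    """Taken branches go to pc + offset; otherwise fall through to pc + 4."""
    step = offset if branch_taken(op, a, b) else 4
    return (pc + step) & MASK32


def jal(pc: int, offset: int) -> tuple[int, int]:
    """Return (link value written to rd, jump target)."""
    return (pc + 4) & MASK32, (pc + offset) & MASK32


def jalr(pc: int, base: int, offset: int) -> tuple[int, int]:
    """Like jal, but the target is base + offset with its lowest bit cleared."""
    return (pc + 4) & MASK32, (base + offset) & MASK32 & ~1


link, target = jal(0x100, 0x20)
print(hex(link), hex(target))  # 0x104 0x120
```

A function return is then just a jalr through the saved link register, which sends execution back to the instruction after the call.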

3.4 System Instructions

Special operations for OS interaction and hardware control:

| Operation | Example | Purpose |
| --- | --- | --- |
| ECALL | ecall | System call (request OS service) |
| EBREAK | ebreak | Debugger breakpoint |
| FENCE | fence | Memory ordering barrier |
| CSR Read/Write | csrrw x1, csr, x2 | Access control/status registers |

4. Instruction Encoding

4.1 Why Encoding Matters

Every instruction must be stored in memory as a sequence of bits. The encoding format determines:

  • How the CPU decodes (interprets) instructions
  • How much memory instructions consume
  • How complex the decoder hardware needs to be

4.2 Fixed-Length vs. Variable-Length

| Approach | Example ISA | Pros | Cons |
| --- | --- | --- | --- |
| Fixed-length | RISC-V (32-bit) | Simple decoding, easy pipelining | May waste bits |
| Variable-length | x86 (1–15 bytes) | Compact code | Complex decoder |

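RISC-V's base ISA uses fixed 32-bit instructions, so every base instruction is exactly 4 bytes, which makes the hardware decoder much simpler. An optional compressed extension adds 16-bit encodings for common instructions; a core that implements it must handle a mix of 2-byte and 4-byte instructions.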
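
Even with the 16-bit compressed extension, finding the instruction length stays cheap: standard 32-bit encodings always have their two lowest bits set to 11, while 16-bit encodings use the other three combinations. The sketch below assumes only these two lengths are in use (longer encodings are ignored), and the function name is hypothetical.

```python
def instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes from its lowest 16 bits.

    Assumption: only 16-bit (compressed) and 32-bit encodings occur.
    """
    return 4 if (first_halfword & 0b11) == 0b11 else 2


print(instruction_length(0x81B3))  # low half of add x3, x1, x2 -> 4
print(instruction_length(0x0001))  # c.nop, a compressed instruction -> 2
```

Compare this with x86, where the decoder may need to examine prefixes, opcode bytes, and addressing bytes before it knows where the next instruction starts.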

4.3 RISC-V Base Instruction Formats

RISC-V defines six instruction formats, all exactly 32 bits wide:

R-type:  [  funct7  |  rs2  |  rs1  | funct3 |   rd   | opcode ]
         [  31:25   | 24:20 | 19:15 | 14:12  |  11:7  |  6:0   ]

I-type:  [     imm[11:0]    |  rs1  | funct3 |   rd   | opcode ]
         [      31:20       | 19:15 | 14:12  |  11:7  |  6:0   ]

S-type:  [ imm[11:5] |  rs2  |  rs1  | funct3 |imm[4:0]| opcode ]
         [  31:25    | 24:20 | 19:15 | 14:12  |  11:7  |  6:0   ]

B-type:  [imm[12|10:5]| rs2  |  rs1  | funct3 |imm[4:1|11]|opcode]
         [   31:25    | 24:20| 19:15 | 14:12  |  11:7  |  6:0   ]

U-type:  [          imm[31:12]          |   rd   | opcode ]
         [            31:12             |  11:7  |  6:0   ]

J-type:  [  imm[20|10:1|11|19:12]       |   rd   | opcode ]
         [            31:12             |  11:7  |  6:0   ]

Design principle: Notice that rs1, rs2, and rd occupy the same bit positions in every format that includes them. This allows the register file to be read before the instruction is fully decoded, a critical optimization for pipelined processors.
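
As a worked example of the R-type layout, the following sketch packs the fields of add x3, x1, x2 into a 32-bit word and then pulls them back out. The opcode and funct values for add (opcode 0110011, funct3 000, funct7 0000000) come from the RISC-V base ISA; the function names are made up for this illustration.

```python
def encode_r_type(funct7: int, rs2: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack R-type fields into a 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def decode_fields(word: int) -> dict[str, int]:
    """Extract the fixed-position fields shared by the base formats."""
    return {
        "opcode": word & 0x7F,          # bits 6:0
        "rd": (word >> 7) & 0x1F,       # bits 11:7
        "funct3": (word >> 12) & 0x7,   # bits 14:12
        "rs1": (word >> 15) & 0x1F,     # bits 19:15
        "rs2": (word >> 20) & 0x1F,     # bits 24:20
        "funct7": (word >> 25) & 0x7F,  # bits 31:25
    }


# add x3, x1, x2
word = encode_r_type(funct7=0b0000000, rs2=2, rs1=1, funct3=0b000, rd=3, opcode=0b0110011)
print(hex(word))            # 0x2081b3
print(decode_fields(word))  # rd=3, rs1=1, rs2=2 recovered from fixed positions
```

Because the register fields never move, `decode_fields` can extract rs1 and rs2 without first checking the opcode, which mirrors how hardware starts the register read in parallel with decoding.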

4.4 Format Usage

| Format | Used For | Example |
| --- | --- | --- |
| R-type | Register-register ALU ops | add x3, x1, x2 |
| I-type | Immediate ALU ops, loads | addi x3, x1, 10 / lw x3, 0(x1) |
| S-type | Stores | sw x3, 8(x1) |
| B-type | Conditional branches | beq x1, x2, label |
| U-type | Upper immediate | lui x3, 0x12345 |
| J-type | Unconditional jumps | jal x1, label |
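
A decoder picks the format from the 7-bit opcode and then reassembles the immediate accordingly. The sketch below uses the major opcodes of the base ISA for the lookup and shows the immediate extraction for two formats: I-type (contiguous bits) and B-type (scrambled bits). Function and table names are illustrative assumptions.

```python
# Major opcodes of the RV32I base ISA mapped to their instruction format.
FORMAT_BY_OPCODE = {
    0b0110011: "R",  # register-register ALU ops
    0b0010011: "I",  # immediate ALU ops
    0b0000011: "I",  # loads
    0b1100111: "I",  # jalr
    0b0100011: "S",  # stores
    0b1100011: "B",  # conditional branches
    0b0110111: "U",  # lui
    0b0010111: "U",  # auipc
    0b1101111: "J",  # jal
}


def instruction_format(word: int) -> str:
    """Look up the format letter from bits 6:0 of the instruction."""
    return FORMAT_BY_OPCODE[word & 0x7F]


def i_immediate(word: int) -> int:
    """I-type: imm[11:0] sits in bits 31:20; sign-extend from bit 11."""
    imm = (word >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def b_immediate(word: int) -> int:
    """B-type: gather imm[12|10:5] and imm[4:1|11]; bit 0 is always 0."""
    imm = (((word >> 31) & 0x1) << 12) | (((word >> 7) & 0x1) << 11) \
        | (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1)
    return imm - 0x2000 if imm & 0x1000 else imm


print(instruction_format(0x00A08193), i_immediate(0x00A08193))  # addi x3, x1, 10 -> I 10
print(instruction_format(0xFE000EE3), b_immediate(0xFE000EE3))  # beq x0, x0, -4  -> B -4
```

The B-type immediate looks scrambled, but the arrangement keeps most immediate bits in the same positions as in the S-type format, which reduces multiplexing in hardware.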

5. Registers

5.1 Why Registers?

Registers are the fastest storage in a computer — they are built directly into the CPU and can be accessed in a single clock cycle (or even less). Memory access, by contrast, takes many cycles.

Speed Hierarchy:
  Registers    ──→  ~0.5 ns   (within CPU)
  L1 Cache     ──→  ~1–2 ns
  L2 Cache     ──→  ~5–10 ns
  Main Memory  ──→  ~50–100 ns  (roughly 100–200× slower than registers!)
  SSD          ──→  ~100 μs

5.2 RISC-V Register File

RISC-V has 32 general-purpose registers, each 32 bits wide (in RV32I) or 64 bits (in RV64I):

| Register | ABI Name | Purpose |
| --- | --- | --- |
| x0 | zero | Hardwired to 0 (always reads as 0) |
| x1 | ra | Return address |
| x2 | sp | Stack pointer |
| x3 | gp | Global pointer |
| x4 | tp | Thread pointer |
| x5–x7 | t0–t2 | Temporaries |
| x8 | s0/fp | Saved register / Frame pointer |
| x9 | s1 | Saved register |
| x10–x11 | a0–a1 | Function arguments / return values |
| x12–x17 | a2–a7 | Function arguments |
| x18–x27 | s2–s11 | Saved registers |
| x28–x31 | t3–t6 | Temporaries |

Why is x0 hardwired to 0? It simplifies many operations:

  • add x3, x1, x0 → move (copy x1 to x3)
  • addi x0, x0, 0 → nop (no operation)
  • slt x3, x0, x1 → test if x1 > 0
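
A register file with a hardwired zero can be modeled in a few lines. This is a behavioral sketch, not a hardware description; the class name `RegisterFile` and its methods are hypothetical, and 32-bit registers (RV32I) are assumed.

```python
class RegisterFile:
    """Behavioral model of 32 registers where x0 always reads as 0."""

    def __init__(self) -> None:
        self._regs = [0] * 32

    def read(self, index: int) -> int:
        return self._regs[index]

    def write(self, index: int, value: int) -> None:
        if index != 0:  # writes to x0 are silently discarded
            self._regs[index] = value & 0xFFFFFFFF


regs = RegisterFile()
regs.write(1, 42)
regs.write(0, 99)                           # no effect: x0 stays 0
regs.write(3, regs.read(1) + regs.read(0))  # add x3, x1, x0 acts as a move
print(regs.read(0), regs.read(3))           # 0 42
```

Because x0 absorbs writes, it also serves as a destination for instructions whose result is not needed.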

5.3 Register Design Trade-offs

| More Registers | Fewer Registers |
| --- | --- |
| Fewer memory accesses (faster) | Simpler hardware |
| More bits needed per instruction | Shorter instructions |
| Larger register file (more area/power) | Less context switch overhead |

RISC-V’s choice of 32 registers is a well-established sweet spot — enough to keep most operands in registers, but not so many that instruction encoding becomes bloated (5 bits per register specifier × 3 registers = 15 bits, leaving room for opcode and immediates in 32-bit instructions).


6. The Program Counter (PC)

6.1 What Is the PC?

The Program Counter is a special register that holds the memory address of the current instruction being executed. After each instruction, the PC is typically updated to point to the next instruction:

$$ PC_{next} = PC + 4 \quad \text{(for 32-bit fixed-length instructions)} $$

The exception is a taken branch or a jump, which redirects execution elsewhere.

6.2 Program Execution Flow

Memory:
┌──────────┬──────────────────┐
│ Address  │ Instruction      │
├──────────┼──────────────────┤
│ 0x0000   │ addi x1, x0, 5  │ ◄── PC starts here
│ 0x0004   │ addi x2, x0, 3  │
│ 0x0008   │ add  x3, x1, x2 │
│ 0x000C   │ sw   x3, 0(x4)  │
│ 0x0010   │ beq  x3, x5, L  │ ── Branch: if taken, PC jumps to L
│ 0x0014   │ addi x1, x1, 1  │
│ 0x0018   │ ...              │ ◄── L (branch target)
└──────────┴──────────────────┘

The CPU repeats this cycle endlessly:

┌────────────────────────────────┐
│   1. FETCH instruction at PC   │
│   2. DECODE the instruction    │
│   3. EXECUTE the operation     │
│   4. UPDATE the PC             │
│              │                 │
│              ▼                 │
│       (repeat forever)         │
└────────────────────────────────┘

This is the fetch-decode-execute cycle — the fundamental heartbeat of every processor.
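
The cycle can be sketched as a short loop. To keep the example small, this toy interpreter makes several simplifying assumptions: instructions are stored as already-decoded tuples rather than 32-bit words, only addi and add are supported, and execution stops when the PC leaves the program. The program is the first three instructions from the memory listing above.

```python
MASK32 = 0xFFFFFFFF

# Address -> (operation, rd, rs1, rs2 or immediate); hypothetical pre-decoded form.
program = {
    0x0000: ("addi", 1, 0, 5),  # addi x1, x0, 5
    0x0004: ("addi", 2, 0, 3),  # addi x2, x0, 3
    0x0008: ("add", 3, 1, 2),   # add  x3, x1, x2
}


def run(program: dict, max_steps: int = 100) -> tuple[list[int], int]:
    """Run until the PC leaves the program; return (registers, final PC)."""
    regs = [0] * 32
    pc = 0
    for _ in range(max_steps):
        if pc not in program:
            break
        op, rd, rs1, src = program[pc]        # 1. FETCH (and trivially DECODE)
        if op == "addi":                      # 3. EXECUTE
            value = regs[rs1] + src
        elif op == "add":
            value = regs[rs1] + regs[src]
        else:
            raise ValueError(f"unsupported op: {op}")
        if rd != 0:                           # x0 stays hardwired to 0
            regs[rd] = value & MASK32
        pc += 4                               # 4. UPDATE the PC
    return regs, pc


regs, pc = run(program)
print(regs[1], regs[2], regs[3], hex(pc))  # 5 3 8 0xc
```

A real processor never stops on its own as this loop does; it keeps fetching whatever the PC points at.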


7. Operand Types

Instructions can get their data from three sources:

7.1 Register Operands

Data comes from the register file. This is the fastest option.

add  x3, x1, x2   # All operands are registers

7.2 Immediate Operands

A constant value is encoded directly in the instruction bits. No memory or register lookup needed.

addi x3, x1, 42   # 42 is the immediate value

Immediates have limited range because they must fit within the instruction:

  • I-type: 12 bits → range $[-2048, +2047]$
  • U-type: 20 bits → for loading upper bits of large constants

Loading a full 32-bit constant requires two instructions:

lui  x3, 0x12345    # Load upper 20 bits: x3 = 0x12345000
addi x3, x3, 0x678  # Add lower 12 bits:  x3 = 0x12345678
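
One subtlety: addi sign-extends its 12-bit immediate, so when bit 11 of the constant is set, the lower part is effectively negative and the upper part must be incremented by one to compensate. The sketch below computes the two immediates for any 32-bit constant; `split_constant` and `rebuild` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def split_constant(value: int) -> tuple[int, int]:
    """Return (lui immediate, addi immediate) that together build value."""
    value &= MASK32
    lower = value & 0xFFF
    if lower >= 0x800:                         # addi would treat this as negative
        lower -= 0x1000
    upper = ((value - lower) >> 12) & 0xFFFFF  # compensates when lower < 0
    return upper, lower


def rebuild(upper: int, lower: int) -> int:
    """What lui followed by addi leaves in the register (32-bit wraparound)."""
    return ((upper << 12) + lower) & MASK32


print([hex(x) for x in split_constant(0x12345678)])  # ['0x12345', '0x678']
print(split_constant(0x12345FFF))                    # (74566, -1), i.e. lui 0x12346, addi -1
```

Assemblers handle this adjustment automatically when expanding pseudo-instructions such as li.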

7.3 Memory Operands

Data is loaded from or stored to memory at a computed address:

lw  x3, 8(x1)     # x3 = Memory[x1 + 8]
sw  x3, 8(x1)     # Memory[x1 + 8] = x3

In RISC architectures like RISC-V, only load and store instructions access memory. All computation happens on registers. This is called a load-store architecture.


8. Instruction Execution: Putting It All Together

Let’s trace through a complete example — computing a[3] = a[1] + a[2]:

Given: base address of array a is in x10, each element is 4 bytes (word).

# Step 1: Load a[1] into x5
lw   x5, 4(x10)     # x5 = Memory[x10 + 4] = a[1]

# Step 2: Load a[2] into x6
lw   x6, 8(x10)     # x6 = Memory[x10 + 8] = a[2]

# Step 3: Add them
add  x7, x5, x6     # x7 = x5 + x6 = a[1] + a[2]

# Step 4: Store result into a[3]
sw   x7, 12(x10)    # Memory[x10 + 12] = x7, so a[3] = a[1] + a[2]

Execution trace:

Step  PC      Instruction         State Changed
────  ──────  ──────────────────  ─────────────────────
  1   0x0000  lw  x5, 4(x10)     x5 ← Memory[x10+4]
  2   0x0004  lw  x6, 8(x10)     x6 ← Memory[x10+8]
  3   0x0008  add x7, x5, x6     x7 ← x5 + x6
  4   0x000C  sw  x7, 12(x10)    Memory[x10+12] ← x7
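
The same four steps can be replayed in Python to check the result. This sketch assumes a word-granular memory modeled as a dictionary keyed by byte address, an arbitrary base address of 0x100, and made-up array contents; none of these values come from the example above.

```python
MASK32 = 0xFFFFFFFF


def run_array_example(memory: dict[int, int], base: int) -> list[int]:
    """Replay lw, lw, add, sw for a[3] = a[1] + a[2]; return the registers."""
    regs = [0] * 32
    regs[10] = base                          # x10 holds the base address of a
    regs[5] = memory[regs[10] + 4]           # lw  x5, 4(x10)
    regs[6] = memory[regs[10] + 8]           # lw  x6, 8(x10)
    regs[7] = (regs[5] + regs[6]) & MASK32   # add x7, x5, x6
    memory[regs[10] + 12] = regs[7]          # sw  x7, 12(x10)
    return regs


# Hypothetical contents: a = [10, 20, 30, 0] stored at 0x100.
memory = {0x100: 10, 0x104: 20, 0x108: 30, 0x10C: 0}
regs = run_array_example(memory, base=0x100)
print(regs[7], memory[0x10C])  # 50 50
```

Each element offset is the index multiplied by 4 because every word occupies four bytes.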

9. Design Principles Behind ISA

Several guiding principles shape good ISA design:

Principle 1: Simplicity Favors Regularity

All RISC-V register-register arithmetic instructions share the same form: op rd, rs1, rs2. The immediate variants follow the same pattern, with a constant in place of rs2. This regularity makes the hardware decoder simple and fast.

Principle 2: Smaller Is Faster

RISC-V has 32 registers — not 64 or 128. A smaller register file is faster to access, consumes less power, and requires fewer bits in each instruction to specify.

Principle 3: Good Design Demands Compromise

The ISA must balance competing goals:

  • Large immediates (more flexibility) vs. short instructions (less memory)
  • Many instruction types (more expressiveness) vs. simple decoder (less hardware)

Principle 4: Make the Common Case Fast

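The most frequently used instructions should be the simplest and fastest. RISC-V's base integer ISA (RV32I) is deliberately small: the ratified specification lists 40 instructions (earlier versions counted 47, before the CSR and FENCE.I instructions moved into separate extensions). That is just enough for a complete computer, but no more.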


10. Summary

| Concept | Key Takeaway |
| --- | --- |
| ISA | The contract between software and hardware; defines what the CPU can do |
| Instruction types | Arithmetic/logic, memory access, control flow, system |
| Encoding | How instructions are represented in binary; RISC-V uses fixed 32-bit formats |
| Registers | 32 fast storage locations (x0–x31) inside the CPU |
| Program Counter | Tracks the address of the current instruction |
| Operands | Can come from registers, immediates, or memory |
| Load-store architecture | Only load/store instructions access memory; all computation uses registers |
| Design principles | Simplicity, regularity, and making the common case fast |

In the next post ([SoC-05]), we will dive deeper into memory addressing modes, compare CISC vs. RISC architectures, and explore the design philosophy of RISC-V.


This post is part of the SoC Design Course series. Navigate to the next post to continue your learning journey.
