
Virtual Memory Management

Video: But, what is Virtual Memory?

Virtual memory is a memory management technique that gives each program the illusion of having its own complete memory space, even when physical memory is limited.

The Problem Virtual Memory Solves

Imagine you have a computer with only 4GB of RAM, but you want to run:

  • A web browser (needs a maximum of 2GB)
  • A video editor (needs a maximum of 3GB)
  • A database (needs a maximum of 2GB)
  • An email client (needs a maximum of 1GB)

Total needed: 8GB
Available: 4GB
Without virtual memory: Only 2 programs could run at once
With virtual memory: All programs can run simultaneously!

How Virtual Memory Works

Think of virtual memory like a banking system:

  1. Each program gets its own "account numbers" (virtual address space)
  2. Money (data) is stored in vaults (physical memory + disk)
  3. When you need money, the bank retrieves it from the vault (page fault handling)
  4. If the vaults are full, some money goes to long-term storage (swapping to disk)

Virtual Pages and Physical Frames

What are Virtual Pages?

A virtual page is a fixed-size block of memory that your program thinks it owns. Think of it like a bank account number - it's an abstract identifier (like account #12345) that represents where money is held, but the actual money might be stored in different physical locations.

What are Physical Frames?

A physical frame is the actual piece of physical memory (RAM) where your data is stored. Think of it like a vault or safe - it's the real physical location where money (data) is actually kept.

How They Work Together

Virtual pages are like bank account numbers - abstract identifiers that programs use to reference their data. Physical frames are the actual vaults where the real data is stored. Key Point: Multiple programs can share the same physical memory! When one program finishes using a physical frame, another program can reuse it for its own virtual pages - just as a vault can be reassigned to a new account once the old account is closed.

The Connection: Page Tables

Page tables are like a bank's account ledger that maps virtual pages to physical frames:

cpp
Virtual Page 1 → Physical Frame 5
Virtual Page 2 → Physical Frame 12
Virtual Page 3 → Physical Frame 3
Virtual Page 4 → Not in memory (on disk)

How it works: When your program tries to access a virtual page (account number), the page table (ledger) tells the system which physical frame (vault) actually contains the data, or whether the data needs to be loaded from disk first.

If a page is not being used for some time, the OS will swap it to disk to make room for other pages. This allows the program to continue running even if it is not using all of its memory. It also enables the program to access more memory than is physically available.

Page Table Entry Structure

Each page table entry contains:

  • Physical Frame Number (52 bits): Points to the actual physical memory location
  • Control Bits (12 bits total): Represents the state of the page. For example, is this page in physical memory? Can this page be modified? Can user programs access this?

Multi-Level Page Tables

Modern systems use a 4-level 512-ary tree to manage large address spaces efficiently:

Level 4: 512 entries → Points to Level 3 tables
Level 3: 512 entries → Points to Level 2 tables
Level 2: 512 entries → Points to Level 1 tables
Level 1: 512 entries → Points to physical memory pages

Why 4 levels?

  • Efficiency: Only create page tables for memory regions you actually use
  • Space savings: Don't waste memory on unused address ranges
  • Scalability: Can handle massive address spaces (256TB on x86_64)

Address Translation: How It Works

Step-by-Step Process

When your program accesses virtual address 0x00007F8486856DEF:

  1. Split the address into parts (each table index is 9 bits, so it ranges from 0 to 511):
    • Level 4 index: 0x0FF (bits 39-47)
    • Level 3 index: 0x012 (bits 30-38)
    • Level 2 index: 0x034 (bits 21-29)
    • Level 1 index: 0x056 (bits 12-20)
    • Page offset: 0xDEF (bits 0-11)
  2. Follow the chain:
    • Look up entry 0x0FF in the Level 4 table
    • Follow the pointer to the Level 3 table, look up entry 0x012
    • Follow the pointer to the Level 2 table, look up entry 0x034
    • Follow the pointer to the Level 1 table, look up entry 0x056
    • Get the physical frame number from the Level 1 entry
  3. Calculate the final address:
    • Combine the physical frame number with page offset 0xDEF

What Happens During Page Faults?

If any step fails, a page fault occurs:

Minor Page Fault: The page is already in physical memory but is not yet mapped in this process's page table (for example, a page shared with another process, or a newly allocated page that just needs zeroing)

  • Map the existing (or freshly zeroed) frame into the page table
  • Resume program execution - no disk I/O is required

Major Page Fault: The page must be fetched from disk (a swapped-out page, or a file-backed page not yet in memory)

  • Load the page from disk into physical memory
  • Update the page table entry
  • Resume program execution - slow, because it waits on disk I/O

Protection Fault: Access violation

  • Program tried to access memory it doesn't have permission for
  • Operating system typically terminates the program (e.g., with a segmentation fault)

Page Replacement: When Memory is Full

The Problem

When physical memory is full and a new page is needed, the OS must decide which existing page to remove.

1. First-In-First-Out (FIFO)

How it works: Remove the page that has been in memory the longest. Real-world analogy: Like a queue at a restaurant - first person in line gets served first. Example:

cpp
Memory: [Page A, Page B, Page C, Page D]
New page needed: Page E
FIFO removes: Page A (oldest)
Result: [Page B, Page C, Page D, Page E]

Pros: Simple to implement Cons: Poor performance - it doesn't consider whether a page is still in use, so an actively used page can be evicted and must then be loaded right back into memory.

2. Least Recently Used (LRU)

How it works: Remove the page that hasn't been accessed for the longest time. Real-world analogy: Like a bank - accounts that haven't been accessed recently get moved to long-term storage. Example:

cpp
Memory: [Page A, Page B, Page C, Page D]
Access pattern: A, B, C, A, B, D
LRU removes: Page C (least recently used)

Pros: Good performance, considers actual usage Cons: Complex to implement, requires tracking access times

3. Clock Algorithm (Second Chance)

How it works: Give each page a "second chance" before removing it. Real-world analogy: Like a clock hand that sweeps around memory, giving each page a chance to prove it's still needed. The steps:

  1. Each page has a "reference bit" (R)
  2. When page is accessed, set R = 1
  3. Clock hand moves around memory
  4. If R = 1, set R = 0 and move to next page
  5. If R = 0, remove this page

Example:

cpp
Memory: [Page A(R=1), Page B(R=0), Page C(R=1), Page D(R=0)]
Clock hand at Page A
Action: A has R=1 → set R=0, advance to Page B
Next: B has R=0 → evict Page B

Pros: Good performance, simple implementation Cons: Only approximates LRU's decisions, though it is much simpler to implement

Performance Comparison

Algorithm   Performance   Complexity   Real-world Use
FIFO        Poor          Simple       Rarely used
LRU         Excellent     Complex      Used in theory
Clock       Good          Medium       Used in practice

Thrashing: When the System Gets Overwhelmed

What is Thrashing?

Thrashing occurs when the system spends more time swapping pages in and out of memory than doing actual work.

Real-World Analogy

Think of a bank during a financial crisis:

  • Customers keep withdrawing and depositing the same money
  • Bank tellers spend all day moving money between vaults and storage
  • Very little actual banking gets done!

How to Detect Thrashing

Symptoms:

  • High page fault rate: 100+ page faults per second
  • Low CPU utilization: CPU is idle while waiting for disk I/O
  • Poor system responsiveness: Everything feels slow
  • High disk activity: Disk light constantly blinking

Example:

cpp
Normal system: 5 page faults/second, 80% CPU usage
Thrashing system: 200 page faults/second, 10% CPU usage

How to Prevent Thrashing

1. Reduce the Number of Active Processes

Example: Instead of running 50 processes, run only 20.

2. Increase Physical Memory

Example: Upgrade from 8GB to 16GB RAM.

3. Optimize Memory Usage

Example: Close unnecessary browser tabs, reduce database cache size.

4. Use Working Set Management

Working set: The set of pages a process needs to run efficiently.

Example:

  • Process A working set: 50MB
  • Process B working set: 30MB
  • Process C working set: 20MB
  • Total needed: 100MB
  • Available memory: 80MB
  • Solution: Suspend Process C until more memory is available

Fragmentation: Memory Wastage

1. External Fragmentation

What it is: Free memory exists but is scattered in small, non-contiguous blocks. Real-world analogy: Like a parking lot where cars are parked randomly, leaving small gaps that can't fit a large truck. Example:

cpp
Memory layout: [Process A][ ][Process B][ ][Process C][ ][Process D]
Free spaces: 30MB + 25MB + 10MB = 65MB total
Problem: Can't allocate 40MB because no single block is large enough

Impact:

  • Memory appears full even when plenty is available
  • Allocation requests fail
  • System performance degrades

2. Internal Fragmentation

What it is: Allocated memory blocks are larger than what was requested. Real-world analogy: Like buying a small item that comes in a large box - you pay for empty space. Example:

cpp
System uses 4KB pages:
- Request 1000 bytes → Get 4096 bytes (3096 bytes wasted)
- Request 5000 bytes → Get 8192 bytes (3192 bytes wasted)
- Request 2000 bytes → Get 4096 bytes (2096 bytes wasted)
Total waste: 8,384 bytes out of 16,384 bytes (~51% waste)

Impact:

  • Reduced memory efficiency
  • Higher memory costs
  • Slower performance

Solutions to Fragmentation

1. Memory Compaction

What it is: Moving allocated blocks together to create large free spaces. Real-world analogy: Like reorganizing books on a shelf to create one large empty space. Example:

cpp
Before: [A][ ][B][ ][C][ ][D]
After:  [A][B][C][D][        ]

Pros: Eliminates external fragmentation Cons: Expensive operation, requires updating all pointers

2. Buddy Allocation

What it is: Always allocate memory in power-of-2 sizes. Real-world analogy: Like cutting paper - always cut in half, then half again. Example:

cpp
Request 3KB → Round up to 4KB → Split 8KB block → Allocate 4KB

Pros: Predictable, easy to merge adjacent blocks Cons: Can cause internal fragmentation

3. Slab Allocation

What it is: Pre-allocate pools for commonly used sizes. Real-world analogy: Like a restaurant that pre-cooks popular dishes. Example:

cpp
Pool 1: 1000 blocks of 64 bytes each
Pool 2: 500 blocks of 128 bytes each
Pool 3: 250 blocks of 256 bytes each

Pros: Fast allocation, no fragmentation within pools Cons: Only works for common object sizes

Performance Optimization

Spatial Locality

What it is: Tendency to access memory locations near recently accessed locations. Real-world analogy: Like reading a book - you're more likely to read the next page than jump to a random page. Good examples:

  • Sequential array access: for (int i = 0; i < 1000; i++) array[i] = 0;
  • Matrix operations: Accessing elements row by row
  • File reading: Reading consecutive bytes

Performance impact:

  • Good spatial locality: Fewer page faults, better cache performance
  • Poor spatial locality: More page faults, slower execution

Temporal Locality

What it is: Tendency to access the same memory locations repeatedly. Real-world analogy: Like your desk - you keep frequently used items close at hand. Good examples:

  • Loop variables: The counter i is accessed repeatedly
  • Function parameters: Accessed multiple times within a function
  • Global variables: Frequently used data

Performance impact:

  • Good temporal locality: Data stays in cache, fast access
  • Poor temporal locality: Data gets evicted from cache, slow access

Optimization Strategies

In systems where performance is critical, it is important to avoid TLB misses and page faults. A page fault can cause milliseconds of latency where your program is waiting for the page to be loaded into memory and sitting idle. That is wasted time that could be used to process more data! Therefore you must be aware of the following strategies to optimize your program's memory usage.

1. Page Size Selection

Large Pages (2MB, 1GB):

  • Pros: Fewer page table entries, better for large datasets
  • Cons: More internal fragmentation

Small Pages (4KB):

  • Pros: Less internal fragmentation, better memory utilization
  • Cons: More page table entries, more TLB misses

When to use each:

  • Large pages: Databases, scientific computing
  • Small pages: General applications, web servers

1.5. Hugepages

What are hugepages? Hugepages are memory pages that are significantly larger than the standard 4KB pages. Common sizes include 2MB and 1GB pages on x86_64 systems. How hugepages help performance:

  1. Reduced TLB Pressure: The Translation Lookaside Buffer (TLB) is a small, fast cache that stores recent virtual-to-physical address translations. With standard 4KB pages, the TLB can only cache a limited number of translations. Hugepages dramatically increase the amount of memory that can be cached in the TLB.

Example: If your TLB can hold 64 entries:

  • With 4KB pages: 64 × 4KB = 256KB of memory cached
  • With 2MB hugepages: 64 × 2MB = 128MB of memory cached
  2. Fewer Page Table Entries: Large pages require fewer page table entries, reducing memory overhead and improving page table walk performance.
  3. Better Memory Locality: Hugepages ensure that large data structures are stored in contiguous physical memory, improving cache performance and reducing memory access latency.

When to use hugepages:

  • Databases: Large database buffers and working sets
  • Scientific Computing: Large matrices and datasets
  • High-Performance Computing: Applications with large memory footprints
  • Virtual Machines: Large guest memory allocations

Example - Database Performance:

cpp
Standard 4KB pages:
- Database buffer: 8GB
- TLB entries needed: ~2 million
- TLB misses: Frequent, causing page table walks

2MB hugepages:
- Database buffer: 8GB  
- TLB entries needed: ~4,000
- TLB misses: Rare, much better performance

How to use hugepages:

  1. Reserve hugepages at boot time:
shell
# In /etc/default/grub
GRUB_CMDLINE_LINUX="hugepagesz=2M hugepages=1024"
  2. Mount the hugepage filesystem:
shell
mount -t hugetlbfs nodev /mnt/huge
  3. Allocate hugepages in your application:
cpp
// C example
void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
                 -1, 0);

Trade-offs:

  • Pros: Better performance for large memory workloads, reduced TLB misses
  • Cons: More internal fragmentation, requires system configuration, not suitable for small allocations

2. Memory Mapping

What it is: Using memory-mapped files instead of traditional file I/O. Real-world analogy: Instead of copying a book page by page, put the entire book on a table and read directly from it. Benefits:

  • Faster access (no system call overhead)
  • Shared memory between processes
  • Lazy loading (pages loaded only when accessed)

Examples:

  • Database files for fast query processing
  • Configuration files for quick access
  • Large data files to avoid loading entire file

3. Prefetching

What it is: Loading data into memory before it's needed. Real-world analogy: Like a waiter who brings your next course before you finish the current one. Types:

  • Hardware prefetching: CPU automatically detects patterns
  • Software prefetching: Program explicitly requests data
  • OS prefetching: Operating system predicts and loads pages

Examples:

  • Sequential file reading: Prefetch next few pages
  • Array processing: Prefetch next array elements
  • Database queries: Prefetch related data pages

Key Takeaways

Virtual Memory Concepts

  1. Virtual pages provide abstraction and protection
  2. Page tables map virtual to physical addresses
  3. Multi-level page tables reduce memory overhead
  4. TLB caches page table entries for performance

Page Replacement

  1. FIFO: Simple but poor performance
  2. LRU: Good performance but complex implementation
  3. Clock: Good compromise between performance and complexity
  4. Working set: Important for thrashing prevention

Thrashing

  1. High page fault rate with low CPU utilization
  2. Working set exceeds available physical memory
  3. Prevention: Reduce processes, increase memory, optimize patterns
  4. Detection: Monitor page fault rates and CPU utilization

Fragmentation

  1. External: Free memory scattered in small blocks
  2. Internal: Allocated blocks larger than requested
  3. Mitigation: Compaction, buddy allocation, slab allocation
  4. Impact: Reduced memory utilization and performance

Performance Optimization

  1. Spatial locality: Access nearby memory locations
  2. Temporal locality: Reuse recently accessed data
  3. Page size: Choose appropriate page size for workload
  4. Memory mapping: Use mmap() for large I/O operations
  5. NUMA awareness: Allocate memory on local nodes