Process Scheduling

Process scheduling is the heart of a multitasking operating system. The scheduler decides which process runs when, balancing fairness, efficiency, and responsiveness. Scheduling processes and threads is the kernel's responsibility.

The scheduler creates the illusion that many processes run at the same time. On each core it rapidly switches between processes, saving the state of the current process and loading the state of the next; this time-slicing is why you can run far more processes on your system than you have cores.

Process Hierarchy and Init System

The Init Process (PID 1)

The init process is the first process started by the kernel during system boot and is assigned PID (Process ID, a unique identifier for each process) 1. It serves as the root of the process tree and has several critical responsibilities:

  1. Process Tree Root: All processes in the system are descendants of init
  2. Orphan Adoption: When a parent process dies before its children, init automatically adopts the orphaned processes
  3. Zombie Cleanup: Init collects exit status from adopted processes, preventing zombies
  4. System Services: Starts and manages essential system services

Process Hierarchy Structure

On modern Linux, systemd is itself the init process, so it runs as PID 1 (PID 2 is the kernel's kthreadd):

text
systemd (PID 1) - init: system and service manager
├── sshd (PID 100) - SSH daemon
│   ├── sshd (PID 101) - SSH session
│   └── bash (PID 102) - user shell
│       ├── vim (PID 103) - text editor
│       └── gcc (PID 104) - compiler
├── apache2 (PID 200) - web server
│   ├── apache2 (PID 201) - worker process
│   └── apache2 (PID 202) - worker process
└── cron (PID 300) - task scheduler
    └── backup.sh (PID 301) - backup script

Why Init Must Be PID 1

The init process must have PID 1 for several reasons:

  1. Kernel Expectation: The kernel expects PID 1 to exist and be the ultimate parent
  2. Orphan Adoption: When a process becomes orphaned, the kernel automatically reparents it to PID 1 (or to a designated subreaper on modern Linux)
  3. System Stability: Ensures no process is left without a parent
  4. Boot Process: The kernel starts init directly, making it the first user process

Orphaned Process Handling

When a parent process terminates before its children:

cpp
// Example of orphaned process creation
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    pid_t child_pid = fork();

    if (child_pid == 0) {
        // Child process
        printf("Child PID: %d, Parent PID: %d\n", getpid(), getppid());
        sleep(10);  // Child outlives the parent
        printf("Child after parent died - Parent PID: %d\n", getppid());
        exit(0);
    } else {
        // Parent process
        printf("Parent PID: %d, Child PID: %d\n", getpid(), child_pid);
        sleep(2);   // Parent exits after 2 seconds
        exit(0);    // Parent exits, child becomes orphaned
    }
}

Output:

text
Parent PID: 1234, Child PID: 1235
Child PID: 1235, Parent PID: 1234
Child after parent died - Parent PID: 1  // Adopted by init (on some systems, by a subreaper with a different PID)

Zombie Processes and Cleanup

What Are Zombie Processes?

A zombie process is a process that has completed execution but still has an entry in the process table (the kernel's table of all processes). The process is "dead" but not yet "buried": its exit status is waiting to be collected by its parent.

Why Zombies Are Created

Zombies are created when:

  1. A child process calls exit() or _exit(), and
  2. The parent process doesn't call wait() or waitpid() to collect the exit status;
  3. The kernel then keeps the process table entry around until the parent collects the status

The Problem with Zombies

cpp
// This creates zombie processes
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    for (int i = 0; i < 10; i++) {
        pid_t child_pid = fork();

        if (child_pid == 0) {
            // Child exits immediately
            printf("Child %d exiting\n", getpid());
            exit(0);
        } else {
            // Parent doesn't wait for children
            printf("Created child %d\n", child_pid);
        }
    }

    // Parent sleeps, children become zombies
    printf("Parent sleeping, check for zombies with: ps aux | grep Z\n");
    sleep(30);

    return 0;
}

Check for zombies:

shell
ps aux | awk '$8 ~ /Z/'
# Zombie processes have state 'Z' in the STAT column (often shown as <defunct>)

Why Parents Must Collect Children

Parents must collect their children's exit status for several reasons:

  1. Resource Cleanup: Process table entries are limited system resources
  2. Exit Status: The parent needs to know how the child terminated
  3. System Stability: Too many zombies can exhaust process table space
  4. Proper Process Management: Ensures complete process lifecycle

Proper Child Cleanup

cpp
// Proper child cleanup
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t child_pid = fork();

    if (child_pid == 0) {
        // Child process
        printf("Child working...\n");
        sleep(2);
        exit(42);  // Exit with status 42
    } else {
        // Parent process
        int status;
        pid_t terminated_pid = wait(&status);

        if (WIFEXITED(status)) {
            printf("Child %d exited with status: %d\n",
                   terminated_pid, WEXITSTATUS(status));
        }
    }

    return 0;
}

Double Fork Technique (Daemon Creation)

To create a daemon process that won't create zombies:

cpp
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t first_child = fork();

    if (first_child == 0) {
        // First child
        pid_t second_child = fork();

        if (second_child == 0) {
            // Second child (grandchild) - this becomes the daemon
            // (a full daemon would also call setsid() and redirect stdio)
            printf("Daemon process (PID: %d)\n", getpid());
            printf("Parent: %d (will be 1 after first child exits)\n", getppid());

            // Daemon work here
            while (1) {
                sleep(1);
                // Daemon tasks
            }
        } else {
            // First child exits immediately
            exit(0);
        }
    } else {
        // Parent waits for first child
        wait(NULL);
        printf("First child completed, grandchild adopted by init\n");
    }

    return 0;
}

Scheduling Goals

The scheduler must balance several competing goals:

  1. Fairness: All processes should get CPU time
  2. Efficiency: Minimize context switch overhead
  3. Responsiveness: Interactive processes should respond quickly
  4. Throughput: Maximize overall system performance
  5. Real-time requirements: Meet deadlines for critical tasks

Process States

A process can be in one of the following states:

  • Running: The process is currently executing on a core.
  • Ready: The process is ready to run on a core.
  • Blocked: The process is waiting for an event to happen. This could be waiting for a resource to become available, or waiting for an I/O operation to complete.
  • Zombie: The process has finished executing and is waiting for the parent process to collect its exit status.

Priority-Based Scheduling

Static Priority Scheduling

Each process has a static priority that determines its base scheduling priority:

cpp
// Priority levels (Linux-style: lower number = higher priority)
#define MAX_PRIO        139   // numerically largest = lowest priority
#define MIN_PRIO        0     // numerically smallest = highest priority
#define DEFAULT_PRIO    120   // corresponds to nice 0

// Nice values (user-adjustable priority, -20..19)
#define NICE_TO_PRIO(nice) (DEFAULT_PRIO + (nice))
#define PRIO_TO_NICE(prio) ((prio) - DEFAULT_PRIO)

// Set process priority (kernel-side sketch)
int setpriority(int which, id_t who, int prio) {
    struct task_struct *task = find_task(who);
    if (!task) return -ESRCH;

    // Validate priority range
    if (prio < MIN_PRIO || prio > MAX_PRIO)
        return -EINVAL;

    // Update priority
    task->priority = prio;

    // Recalculate dynamic priority
    recalculate_priority(task);

    return 0;
}

Dynamic Priority Adjustment

The scheduler adjusts priorities based on process behavior:

cpp
// Dynamic priority calculation (lower value = higher priority)
void recalculate_priority(struct task_struct *task) {
    int bonus = task->sleep_avg / 32;  // Bonus for I/O-bound processes
    int penalty = task->cpu_time / 32; // Penalty for CPU-bound processes

    // Subtract the bonus (raises priority), add the penalty (lowers it)
    task->dynamic_priority = task->priority - bonus + penalty;

    // Clamp to valid range
    if (task->dynamic_priority < MIN_PRIO)
        task->dynamic_priority = MIN_PRIO;
    if (task->dynamic_priority > MAX_PRIO)
        task->dynamic_priority = MAX_PRIO;
}

Priority-Based Scheduler

cpp
// Priority-based scheduler (lowest dynamic_priority value wins)
struct task_struct *schedule_priority(void) {
    struct task_struct *task;
    struct task_struct *next = NULL;
    int highest_priority = MAX_PRIO + 1;

    // Find the highest-priority (lowest-valued) runnable task
    list_for_each_entry(task, &run_queue, run_list) {
        if (task->state == TASK_RUNNING &&
            task->dynamic_priority < highest_priority) {
            highest_priority = task->dynamic_priority;
            next = task;
        }
    }

    return next;
}

Round-Robin Scheduling

Round-robin gives each process a time slice and cycles through them:

cpp
// Round-robin scheduler
struct task_struct *schedule_round_robin() {
    struct task_struct *current = get_current_task();
    struct task_struct *next = NULL;

    // Find next task in round-robin order
    if (current->time_slice > 0) {
        // Current task still has time
        next = current;
    } else {
        // Find next task in queue
        next = find_next_task(current);
        if (!next) {
            next = get_first_task(); // Wrap around
        }

        // Reset time slice
        next->time_slice = DEFAULT_TIME_SLICE;
    }

    return next;
}

// Time slice management
void update_time_slice(struct task_struct *task) {
    if (task->time_slice > 0) {
        task->time_slice--;
    }

    // If time slice expired, reschedule
    if (task->time_slice == 0) {
        schedule();
    }
}

Real-Time Scheduling

Real-Time Requirements

Real-time systems must meet strict timing requirements:

  1. Hard Real-Time: Missing a deadline is a system failure
  2. Soft Real-Time: Missing a deadline degrades performance
  3. Deterministic: Predictable response times
  4. Low Latency: Minimal interrupt and context switch overhead

Rate Monotonic (RM) Scheduling

Assigns priorities based on task frequency (higher frequency = higher priority):

cpp
// Rate Monotonic scheduling
struct rt_task {
    int priority;
    unsigned long period;      // Task period
    unsigned long deadline;    // Relative deadline
    unsigned long next_release; // Next release time
    void (*function)(void);    // Task function
};

// Rate Monotonic priority assignment
void assign_rm_priorities(struct rt_task *tasks, int num_tasks) {
    // Sort tasks by period (ascending)
    qsort(tasks, num_tasks, sizeof(struct rt_task), compare_periods);

    // Assign priorities (shorter period = higher priority)
    for (int i = 0; i < num_tasks; i++) {
        tasks[i].priority = num_tasks - i;  // Higher number = higher priority
    }
}

// Rate Monotonic scheduler
struct rt_task *schedule_rm(struct rt_task *tasks, int num_tasks) {
    struct rt_task *highest_priority = NULL;
    int max_priority = 0;

    for (int i = 0; i < num_tasks; i++) {
        if (tasks[i].next_release <= current_time && 
            tasks[i].priority > max_priority) {
            max_priority = tasks[i].priority;
            highest_priority = &tasks[i];
        }
    }

    return highest_priority;
}

Earliest Deadline First (EDF) Scheduling

Assigns priorities based on absolute deadlines (earlier deadline = higher priority):

cpp
// EDF scheduling
struct edf_task {
    int id;
    unsigned long period;
    unsigned long deadline;
    unsigned long next_release;       // Next release time
    unsigned long absolute_deadline;  // Absolute deadline
    void (*function)(void);
};

// EDF scheduler: among released tasks, pick the earliest absolute deadline
struct edf_task *schedule_edf(struct edf_task *tasks, int num_tasks) {
    struct edf_task *earliest_deadline = NULL;
    unsigned long earliest = ULONG_MAX;

    for (int i = 0; i < num_tasks; i++) {
        if (tasks[i].next_release <= current_time &&
            tasks[i].absolute_deadline < earliest) {
            earliest = tasks[i].absolute_deadline;
            earliest_deadline = &tasks[i];
        }
    }

    return earliest_deadline;
}

// Advance release times and absolute deadlines at the end of each period
void update_edf_deadlines(struct edf_task *tasks, int num_tasks) {
    for (int i = 0; i < num_tasks; i++) {
        if (current_time >= tasks[i].absolute_deadline) {
            // Task completed (or missed its deadline); start the next period
            tasks[i].next_release += tasks[i].period;
            tasks[i].absolute_deadline += tasks[i].period;
        }
    }
}

CPU Affinity and Load Balancing

CPU Affinity

Binding processes to specific CPU cores reduces cache misses and improves performance:

cpp
// Set CPU affinity
int set_cpu_affinity(pid_t pid, cpu_set_t *mask) {
    struct task_struct *task = find_task(pid);
    if (!task) return -ESRCH;

    // Update CPU affinity mask
    task->cpus_allowed = *mask;

    // If task is running on disallowed CPU, migrate it
    if (!cpu_isset(task_cpu(task), &task->cpus_allowed)) {
        migrate_task(task);
    }

    return 0;
}

// Example: bind the calling process to CPU 0 from userspace using the
// Linux syscall (requires #define _GNU_SOURCE and #include <sched.h>)
cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(0, &mask);
sched_setaffinity(0, sizeof(mask), &mask);  // 0 = calling process

Since each core has its own dedicated L1/L2 cache, binding a process to a specific core can improve performance by keeping that process's data in the local cache. If the process is not bound, it can be scheduled on a different core, where the resulting cache misses degrade performance.

Load Balancing

Distributing work across multiple CPUs:

cpp
// Load balancer
void load_balance(int cpu) {
    struct rq *this_rq = cpu_rq(cpu);
    struct rq *busiest_rq = find_busiest_queue();

    if (!busiest_rq || busiest_rq == this_rq)
        return;

    // Calculate load difference
    int load_diff = busiest_rq->load.weight - this_rq->load.weight;

    if (load_diff > LOAD_BALANCE_THRESHOLD) {
        // Migrate tasks from busiest to current CPU
        migrate_tasks(busiest_rq, this_rq, load_diff / 2);
    }
}

Key Concepts Summary

Scheduling Algorithms

  • Priority-based: Higher priority processes run first
  • Round-robin: Equal time slices for all processes
  • Rate Monotonic: Priority based on task frequency
  • Earliest Deadline First: Priority based on deadlines

Performance Considerations

  • CPU Affinity: Reduces cache misses
  • Load Balancing: Distributes work across CPUs
  • Scheduling Overhead: Minimize context switches
  • Real-time Requirements: Meet strict deadlines

Optimization Strategies

  • Process Priority: Set appropriate priorities
  • CPU Binding: Use CPU affinity for performance-critical tasks
  • Load Distribution: Balance work across multiple cores
  • Real-time Scheduling: Use appropriate algorithms for time-critical tasks