Appearance
Process Scheduling
Process scheduling is the heart of multitasking operating systems. The scheduler decides which process runs when, balancing fairness, efficiency, and responsiveness. The kernel is responsible for scheduling the processes and threads.
The scheduler actually makes sure that multiple processes can run at the same time. On each core, it saves the state of the current process and loads the state of the next process; giving the illusion of concurrent execution. That's why you'll notice that you can run more processes on your system than you have cores.
Process Hierarchy and Init System
The Init Process (PID 1)
The init process is the first process started by the kernel during system boot and is assigned PIDProcess ID - a unique identifier for each process 1. It serves as the root of the process tree and has several critical responsibilities:
- Process Tree Root: All processes in the system are descendants of init
- Orphan Adoption: When a parent process dies before its children, init automatically adopts the orphaned processes
- Zombie Cleanup: Init collects exit status from adopted processes, preventing zombies
- System Services: Starts and manages essential system services
Process Hierarchy Structure
cpp
init (PID 1)
├── systemd (PID 2) - system and service manager
│ ├── sshd (PID 100) - SSH daemon
│ │ ├── sshd (PID 101) - SSH session
│ │ └── bash (PID 102) - User shell
│ │ ├── vim (PID 103) - Text editor
│ │ └── gcc (PID 104) - Compiler
│ └── apache2 (PID 200) - Web server
│ ├── apache2 (PID 201) - Worker process
│ └── apache2 (PID 202) - Worker process
└── cron (PID 300) - Task scheduler
└── backup.sh (PID 301) - Backup scriptWhy Init Must Be PID 1
The init process must have PID 1 for several reasons:
- Kernel Expectation: The kernel expects PID 1 to exist and be the ultimate parent
- Orphan Adoption: When a process becomes orphaned, the kernel automatically assigns it to PID 1
- System Stability: Ensures no process is left without a parent
- Boot Process: The kernel starts init directly, making it the first user process
Orphaned Process Handling
When a parent process terminates before its children:
cpp
// Example of orphaned process creation
int main() {
pid_t child_pid = fork();
if (child_pid == 0) {
// Child process
printf("Child PID: %d, Parent PID: %d\n", getpid(), getppid());
sleep(10); // Child sleeps for 10 seconds
printf("Child after parent died - Parent PID: %d\n", getppid());
exit(0);
} else {
// Parent process
printf("Parent PID: %d, Child PID: %d\n", getpid(), child_pid);
sleep(2); // Parent dies after 2 seconds
exit(0); // Parent exits, child becomes orphaned
}
}Output:
cpp
Parent PID: 1234, Child PID: 1235
Child PID: 1235, Parent PID: 1234
Child after parent died - Parent PID: 1 // Now adopted by initZombie Processes and Cleanup
What Are Zombie Processes?
A zombie process is a process that has completed execution but still has an entry in the process tableA table that keeps track of all the processes in the system. The process is "dead" but not yet "buried" - its exit status is waiting to be collected by its parent.
Why Zombies Are Created
Zombies are created when:
- A child process calls
exit()or_exit() - The parent process doesn't call
wait()orwaitpid()to collect the exit status - The kernel keeps the process table entry until the parent collects the status
The Problem with Zombies
cpp
// This creates zombie processes
int main() {
for (int i = 0; i < 10; i++) {
pid_t child_pid = fork();
if (child_pid == 0) {
// Child exits immediately
printf("Child %d exiting\n", getpid());
exit(0);
} else {
// Parent doesn't wait for children
printf("Created child %d\n", child_pid);
}
}
// Parent sleeps, children become zombies
printf("Parent sleeping, check for zombies with: ps aux | grep Z\n");
sleep(30);
return 0;
}Check for zombies:
cpp
ps aux | grep Z
# Shows zombie processes with state 'Z'Why Parents Must Collect Children
Parents must collect their children's exit status for several reasons:
- Resource Cleanup: Process table entries are limited system resources
- Exit Status: The parent needs to know how the child terminated
- System Stability: Too many zombies can exhaust process table space
- Proper Process Management: Ensures complete process lifecycle
Proper Child Cleanup
cpp
// Proper child cleanup
int main() {
pid_t child_pid = fork();
if (child_pid == 0) {
// Child process
printf("Child working...\n");
sleep(2);
exit(42); // Exit with status 42
} else {
// Parent process
int status;
pid_t terminated_pid = wait(&status);
if (WIFEXITED(status)) {
printf("Child %d exited with status: %d\n",
terminated_pid, WEXITSTATUS(status));
}
}
return 0;
}Double Fork Technique (Daemon Creation)
To create a daemon process that won't create zombies:
cpp
int main() {
pid_t first_child = fork();
if (first_child == 0) {
// First child
pid_t second_child = fork();
if (second_child == 0) {
// Second child (grandchild) - this becomes the daemon
printf("Daemon process (PID: %d)\n", getpid());
printf("Parent: %d (will be 1 after first child exits)\n", getppid());
// Daemon work here
while (1) {
sleep(1);
// Daemon tasks
}
} else {
// First child exits immediately
exit(0);
}
} else {
// Parent waits for first child
wait(nullptr);
printf("First child completed, grandchild adopted by init\n");
}
return 0;
}Scheduling Goals
The scheduler must balance several competing goals:
- Fairness: All processes should get CPU time
- Efficiency: Minimize context switch overhead
- Responsiveness: Interactive processes should respond quickly
- Throughput: Maximize overall system performance
- Real-time requirements: Meet deadlines for critical tasks
Process States
A process can be in one of the following states:
- Running: The process is currently executing on a core.
- Ready: The process is ready to run on a core.
- Blocked: The process is waiting for an event to happen. This could be waiting for a resource to become available, or waiting for an I/O operation to complete.
- Zombie: The process has finished executing and is waiting for the parent process to collect its exit status.
Priority-Based Scheduling
Static Priority Scheduling
Each process has a static priority that determines its base scheduling priority:
cpp
// Priority levels (Linux-style)
#define MAX_PRIO 139
#define MIN_PRIO 0
#define DEFAULT_PRIO 120
// Nice values (user-adjustable priority)
#define NICE_TO_PRIO(nice) (DEFAULT_PRIO + (nice))
#define PRIO_TO_NICE(prio) ((prio) - DEFAULT_PRIO)
// Set process priority
int setpriority(int which, id_t who, int prio) {
struct task_struct *task = find_task(who);
if (!task) return -ESRCH;
// Validate priority range
if (prio < MIN_PRIO || prio > MAX_PRIO)
return -EINVAL;
// Update priority
task->priority = prio;
// Recalculate dynamic priority
recalculate_priority(task);
return 0;
}Dynamic Priority Adjustment
The scheduler adjusts priorities based on process behavior:
cpp
// Dynamic priority calculation
void recalculate_priority(struct task_struct *task) {
int bonus = task->sleep_avg / 32; // Bonus for I/O bound processes
int penalty = task->cpu_time / 32; // Penalty for CPU bound processes
task->dynamic_priority = task->priority + bonus - penalty;
// Clamp to valid range
if (task->dynamic_priority < MIN_PRIO)
task->dynamic_priority = MIN_PRIO;
if (task->dynamic_priority > MAX_PRIO)
task->dynamic_priority = MAX_PRIO;
}Priority-Based Scheduler
cpp
// Priority-based scheduler
struct task_struct *schedule_priority() {
struct task_struct *next = NULL;
int highest_priority = MAX_PRIO + 1;
// Find highest priority ready task
list_for_each_entry(task, &run_queue, run_list) {
if (task->state == TASK_RUNNING &&
task->dynamic_priority < highest_priority) {
highest_priority = task->dynamic_priority;
next = task;
}
}
return next;
}Round-Robin Scheduling
Round-robin gives each process a time slice and cycles through them:
cpp
// Round-robin scheduler
struct task_struct *schedule_round_robin() {
struct task_struct *current = get_current_task();
struct task_struct *next = NULL;
// Find next task in round-robin order
if (current->time_slice > 0) {
// Current task still has time
next = current;
} else {
// Find next task in queue
next = find_next_task(current);
if (!next) {
next = get_first_task(); // Wrap around
}
// Reset time slice
next->time_slice = DEFAULT_TIME_SLICE;
}
return next;
}
// Time slice management
void update_time_slice(struct task_struct *task) {
if (task->time_slice > 0) {
task->time_slice--;
}
// If time slice expired, reschedule
if (task->time_slice == 0) {
schedule();
}
}Real-Time Scheduling
Real-Time Requirements
Real-time systems must meet strict timing requirements:
- Hard Real-Time: Missing a deadline is a system failure
- Soft Real-Time: Missing a deadline degrades performance
- Deterministic: Predictable response times
- Low Latency: Minimal interrupt and context switch overhead
Rate Monotonic (RM) Scheduling
Assigns priorities based on task frequency (higher frequency = higher priority):
cpp
// Rate Monotonic scheduling
struct rt_task {
int priority;
unsigned long period; // Task period
unsigned long deadline; // Relative deadline
unsigned long next_release; // Next release time
void (*function)(void); // Task function
};
// Rate Monotonic priority assignment
void assign_rm_priorities(struct rt_task *tasks, int num_tasks) {
// Sort tasks by period (ascending)
qsort(tasks, num_tasks, sizeof(struct rt_task), compare_periods);
// Assign priorities (shorter period = higher priority)
for (int i = 0; i < num_tasks; i++) {
tasks[i].priority = num_tasks - i; // Higher number = higher priority
}
}
// Rate Monotonic scheduler
struct rt_task *schedule_rm(struct rt_task *tasks, int num_tasks) {
struct rt_task *highest_priority = NULL;
int max_priority = 0;
for (int i = 0; i < num_tasks; i++) {
if (tasks[i].next_release <= current_time &&
tasks[i].priority > max_priority) {
max_priority = tasks[i].priority;
highest_priority = &tasks[i];
}
}
return highest_priority;
}Earliest Deadline First (EDF) Scheduling
Assigns priorities based on absolute deadlines (earlier deadline = higher priority):
cpp
// EDF scheduling
struct edf_task {
int id;
unsigned long period;
unsigned long deadline;
unsigned long absolute_deadline; // Absolute deadline
void (*function)(void);
};
// EDF scheduler
struct edf_task *schedule_edf(struct edf_task *tasks, int num_tasks) {
struct edf_task *earliest_deadline = NULL;
unsigned long earliest = ULONG_MAX;
for (int i = 0; i < num_tasks; i++) {
if (tasks[i].absolute_deadline <= current_time &&
tasks[i].absolute_deadline < earliest) {
earliest = tasks[i].absolute_deadline;
earliest_deadline = &tasks[i];
}
}
return earliest_deadline;
}
// Update absolute deadlines
void update_edf_deadlines(struct edf_task *tasks, int num_tasks) {
for (int i = 0; i < num_tasks; i++) {
if (current_time >= tasks[i].absolute_deadline) {
// Task missed deadline or completed
tasks[i].absolute_deadline += tasks[i].period;
}
}
}CPU Affinity and Load Balancing
CPU Affinity
Binding processes to specific CPU cores reduces cache misses and improves performance:
cpp
// Set CPU affinity
int set_cpu_affinity(pid_t pid, cpu_set_t *mask) {
struct task_struct *task = find_task(pid);
if (!task) return -ESRCH;
// Update CPU affinity mask
task->cpus_allowed = *mask;
// If task is running on disallowed CPU, migrate it
if (!cpu_isset(task_cpu(task), &task->cpus_allowed)) {
migrate_task(task);
}
return 0;
}
// Example: Bind process to CPU 0
cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(0, &mask);
set_cpu_affinity(getpid(), &mask);Since each core has its own dedicated L1/L2 cache, binding a process to a specific core can improve performance by keeping the same process's data in the local cache. If the process is not bounded, it can get scheduled on a different core and cache misses would result in degraded performance.
Load Balancing
Distributing work across multiple CPUs:
cpp
// Load balancer
void load_balance(int cpu) {
struct rq *this_rq = cpu_rq(cpu);
struct rq *busiest_rq = find_busiest_queue();
if (!busiest_rq || busiest_rq == this_rq)
return;
// Calculate load difference
int load_diff = busiest_rq->load.weight - this_rq->load.weight;
if (load_diff > LOAD_BALANCE_THRESHOLD) {
// Migrate tasks from busiest to current CPU
migrate_tasks(busiest_rq, this_rq, load_diff / 2);
}
}Key Concepts Summary
Scheduling Algorithms
- Priority-based: Higher priority processes run first
- Round-robin: Equal time slices for all processes
- Rate Monotonic: Priority based on task frequency
- Earliest Deadline First: Priority based on deadlines
Performance Considerations
- CPU Affinity: Reduces cache misses
- Load Balancing: Distributes work across CPUs
- Scheduling Overhead: Minimize context switches
- Real-time Requirements: Meet strict deadlines
Optimization Strategies
- Process Priority: Set appropriate priorities
- CPU Binding: Use CPU affinity for performance-critical tasks
- Load Distribution: Balance work across multiple cores
- Real-time Scheduling: Use appropriate algorithms for time-critical tasks