
The C++ Compiler

Video: In 54 Minutes, Understand the whole C and C++ compilation process

The GNU Compiler Collection (GCC) is one of the most widely used compilers for C++ development. It compiles C++ source code into executable programs and can emit the debug information that debuggers rely on. It is also the compiler used by the GNU project and is the default compiler on many Linux distributions.

Your First C++ Program

Let's start with the simplest possible C++ program:

cpp
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

To compile this program with GCC, you would type:

cpp
g++ hello.cpp -o hello

This command tells GCC to:

  • Take your source file (hello.cpp)
  • Compile it into an executable program
  • Name the output file hello

Compiler Flags

Compiler flags are options you give to GCC to control how it compiles your code. They're like settings that tell the compiler what to do.

Basic Compilation

The simplest way to compile a C++ program:

cpp
g++ hello.cpp -o hello

This command:

  • g++ - tells your computer to use the C++ compiler
  • hello.cpp - the source file to compile
  • -o hello - name the output executable "hello"

Warning Flags

Warnings are the compiler's way of saying "this might be a problem":

cpp
g++ -Wall -Wextra hello.cpp -o hello

  • -Wall: Enable most common warnings
  • -Wextra: Enable additional warnings

Why use warnings? They help you catch mistakes before your program runs. It's like having a spell-checker for your code.
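As a small illustration, the sketch below contains two harmless-looking oversights. The function and file names are hypothetical. When compiled with -Wall -Wextra, GCC typically reports both the unused variable and the unused parameter; the exact wording of the messages varies between compiler versions.

```cpp
// warnings_demo.cpp (hypothetical file name)

// 'scale' is never used: typically reported as -Wunused-parameter
// when -Wall and -Wextra are given together.
int total_price(int price, int quantity, int scale) {
    int discount = 10;  // never used: typically reported as -Wunused-variable by -Wall
    return price * quantity;
}
```

The code still compiles and returns the right product, which is exactly why warnings are useful: they point at code that is legal but probably not what you meant.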

Optimization Flags

Optimization flags ask the compiler to generate faster code, usually at the cost of longer compile times:

cpp
g++ -O2 hello.cpp -o hello

  • -O0: No optimization (fastest to compile, slowest to run)
  • -O1: Basic optimization
  • -O2: Good optimization (recommended for most programs)
  • -O3: More aggressive optimization (can make the program bigger and is not always faster than -O2)

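Think of it like the difference between following the obvious route (no optimization) and letting a navigator find a faster route to the same destination (optimization). A correct program produces the same results either way; only the speed and size of the generated code change.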
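To make this concrete, here is a minimal sketch of the kind of code an optimizer may transform. The function name sum_to is hypothetical, and what a given GCC version actually does with the loop is not guaranteed.

```cpp
// Sums the integers 1..n with a plain loop.
// The result is the same at every optimization level; only the
// generated machine code (and therefore its speed and size) may differ.
unsigned long long sum_to(unsigned int n) {
    unsigned long long total = 0;
    for (unsigned long long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

One way to see what the optimizer changed is to compile a file containing this function to assembly once with -O0 and once with -O2, then compare the two outputs.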

Debug Information

When you're learning or fixing bugs, you want debug information:

cpp
g++ -g hello.cpp -o hello

The -g flag adds information that helps debuggers show you exactly where problems occur in your source code.

C++ Standard Flags

The C++ committee releases a new standard every few years. To use the new features, you need to use the corresponding flag. For example, to use C++20 features, you need to use the -std=c++20 flag.

cpp
g++ -std=c++20 hello.cpp -o hello

Of course, you need to make sure that the compiler version you are using supports the standard you are trying to use.
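If you are unsure which standard a given command line selects, one option is to inspect the predefined __cplusplus macro. The sketch below is illustrative only: the function name cpp_standard_name is hypothetical, and older GCC releases with experimental support for a standard may report an intermediate value.

```cpp
// Reports which C++ standard the compiler is targeting, based on the
// value of the predefined __cplusplus macro.
const char* cpp_standard_name() {
#if __cplusplus >= 202002L
    return "C++20 or later";
#elif __cplusplus >= 201703L
    return "C++17";
#elif __cplusplus >= 201402L
    return "C++14";
#elif __cplusplus >= 201103L
    return "C++11";
#else
    return "pre-C++11";
#endif
}
```

Printing the result of cpp_standard_name() from main and recompiling with different -std= values shows how the flag changes what the compiler accepts.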

The Build Process

There are three steps to the build process:

  1. Preprocessing
  2. Compilation
  3. Linking

The first two steps are done by the compiler; the third is done by the linker. We'll discuss each step in more detail below.

Step 1: Preprocessing

Before GCC compiles your code, it first runs a preprocessor. The preprocessor handles special instructions that start with #. These are called preprocessor directives.

This includes the #include and #define directives, as well as conditional directives such as #ifndef.

To inspect the preprocessor's output, use the -E flag:

cpp
g++ -E hello.cpp -o hello.i

#include Directive

The most common preprocessor directive is #include. It tells the preprocessor to copy the contents of another file into your current file.

cpp
#include <iostream>  // Include the iostream library
#include "myheader.h" // Include your own header file

Internally, the preprocessor is just copying the contents of the other file at the point of the directive in your current file.

#define Directive

The #define directive creates a simple text replacement (called a macro):

cpp
#define PI 3.14159
#define MAX_SIZE 100

int radius = 5;
double area = PI * radius * radius;  // Becomes: 3.14159 * radius * radius

It is important to note that a macro is a literal text replacement, not a function. Therefore, you may see unintended side effects when using macros. For example:

cpp
#define SQUARE(x) x * x
int x = SQUARE(2); // Becomes: 2 * 2
int y = SQUARE(2 + 2); // Becomes: 2 + 2 * 2 + 2, which is 8, not 16
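A common way to avoid this pitfall is to parenthesize the macro's argument and its whole body, or to use a real function instead. The sketch below shows both; the names SQUARE_UNSAFE, SQUARE_SAFE, and square are hypothetical and chosen only for comparison.

```cpp
// The unparenthesized macro from the text, renamed here for comparison.
#define SQUARE_UNSAFE(x) x * x

// Parenthesizing each use of the argument and the whole expression
// keeps operator precedence from changing the meaning.
#define SQUARE_SAFE(x) ((x) * (x))

// A constexpr function avoids text substitution entirely and
// evaluates its argument exactly once.
constexpr int square(int x) { return x * x; }
```

With these definitions, SQUARE_UNSAFE(2 + 2) evaluates to 8, while SQUARE_SAFE(2 + 2) and square(2 + 2) both evaluate to 16.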

Header Files

Header files (.h or .hpp) contain declarations: they tell the compiler "this function or class exists, and here's how to use it." They usually don't contain the actual implementation, which lives in a source file (.cpp).

cpp
// math_utils.h - Header file
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

// Declaration: "This function exists"
int add(int a, int b);
double multiply(double x, double y);

#endif

cpp
// math_utils.cpp - Source file
#include "math_utils.h"

// Definition: "Here's what the function actually does"
int add(int a, int b) {
    return a + b;
}

double multiply(double x, double y) {
    return x * y;
}
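The split between declaration and definition can also be seen in a single file. The following is a minimal sketch under that simplification; sum_of_three is a hypothetical helper that is not part of math_utils.

```cpp
// Declaration (what a header such as math_utils.h would provide).
int add(int a, int b);

// This function can call add() because the declaration is visible,
// even though the compiler has not seen the definition yet.
int sum_of_three(int a, int b, int c) {
    return add(add(a, b), c);
}

// Definition (what a source file such as math_utils.cpp would provide).
int add(int a, int b) {
    return a + b;
}
```

The compiler only needs the declaration to compile sum_of_three; the definition can come later in the same file or from a separately compiled source file.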

Include Guards

When you include a header file, the preprocessor literally copies all its contents into your file. If you include the same header twice, you get duplicate content, which causes errors.

Include guards prevent this:

cpp
#ifndef MATH_UTILS_H  // "If MATH_UTILS_H is not defined yet..."
#define MATH_UTILS_H  // "...then define it and include the content"

// Header content here

#endif  // End the conditional inclusion

#ifndef only lets the preprocessor continue into the block if the macro named after it has not been defined. Once inside the conditional block, we define the macro and then write the content of the header file.

If the same header is included a second time in the same translation unit, the macro MATH_UTILS_H is already defined, so the condition is false and the preprocessor skips everything up to #endif. This prevents the header's content from accidentally being included twice.

Most modern compilers support the #pragma once directive, which is a shorthand for these include guards. Note that #pragma once is not part of the C++ standard, so some compilers may not support it.
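To see the guard at work without creating separate files, the sketch below pastes the same guarded content twice, which is roughly what the preprocessor would produce if a header were included twice. The names POINT_H, Point, and sum_coordinates are hypothetical.

```cpp
// First inclusion: POINT_H is not defined yet, so Point gets defined.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Second inclusion of the same text: POINT_H is already defined, so the
// preprocessor skips everything up to #endif and Point is not redefined.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Uses the single definition of Point that survived preprocessing.
int sum_coordinates(Point p) {
    return p.x + p.y;
}
```

Removing the #ifndef, #define, and #endif lines from both copies would leave two definitions of Point in one translation unit, which the compiler rejects as a redefinition.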

Translation Units

A translation unit is what your source file becomes after the preprocessor has finished processing all the #include directives and other preprocessor commands.

Simple example:

cpp
// main.cpp
#include <iostream>
#include "myheader.h"

int main() {
    std::cout << "Hello!" << std::endl;
    return 0;
}

After preprocessing, main.cpp becomes a much larger file containing:

  • All the contents of <iostream>
  • All the contents of "myheader.h"
  • Your original code

This expanded file is the translation unit that the compiler actually processes. A translation unit contains both declarations and some definitions so that the compiler can fully understand the code and generate the corresponding machine code.

Step 2: Compilation

The compiler turns your C++ code into machine code, stored in an object file (hello.o):

cpp
g++ -c hello.cpp -o hello.o

You can also turn the output of preprocessing into assembly code using the -S flag:

cpp
g++ -S hello.i -o hello.s

In this step, the compiler goes over the entire translation unit and generates assembly code for the functions and classes it contains. Remember that once a translation unit is created, the compiler doesn't need to look at any other files to generate the assembly code for the functions and classes that are already in the translation unit.

Step 3: Linking

The linker combines your code with libraries to create an executable:

cpp
g++ hello.o -o hello

This is the step where any external code or libraries are "linked" into the binary. When you use standard headers such as <iostream>, the linker uses a standard library implementation, such as libstdc++ (the default with GCC) or libc++, to provide the compiled implementation of the functions and classes declared in the header.

Basically, while the header files provide declarations to the compiler, the linker needs to find the actual definitions to make the binary executable.

Summary

You've learned:

  • What a compiler is and how GCC works
  • How preprocessor directives work
  • The difference between headers and source files
  • How to use compiler flags

Don't worry if you don't understand everything perfectly yet. These concepts will become clearer as you write more code. The important thing is to start simple and build up gradually!

Questions

Q: What is a translation unit in C++?

A translation unit is a single source file (.cpp) after all preprocessing directives have been processed. It's the basic unit of compilation that the compiler processes independently.

Q: What is the purpose of include guards in header files?

Include guards prevent multiple definitions and declarations when a header is included more than once in a translation unit, avoiding compilation errors.

Q: Which compiler flag enables all warnings?

Despite its name, -Wall enables most common warnings, not all of them. -Wextra enables additional warnings, and -pedantic warns about code that does not strictly conform to ISO C++. -O2 is an optimization flag and has nothing to do with warnings.

Q: What is the One Definition Rule (ODR) violation?

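An ODR violation occurs when a non-inline function or variable is defined more than once in a program, or when a class or inline function is defined differently in different translation units. Duplicate definitions of functions and variables usually lead to linker errors; other violations may go undiagnosed.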

Q: What does the -g flag do in GCC?

The -g flag includes debug information in the compiled object files, allowing debuggers to provide meaningful information about source code locations.