Skip to content

std::vector

Video: Back to Basics: Classic STL - Bob Steagall - CppCon 2021

std::vector is the most commonly used container in C++. It's a dynamic array that automatically manages memory, provides fast random access, and offers excellent performance for most use cases. Understanding std::vector is essential for effective C++ programming.

What is std::vector?

std::vector is a sequence container that represents a dynamic array. It provides:

  • Dynamic sizing: Grows and shrinks automatically
  • Fast random access: O(1) access to any element
  • Contiguous memory: Elements are stored in adjacent memory locations
  • Automatic memory management: Handles allocation and deallocation

Basic Usage

Creating and Initializing Vectors

cpp
#include <vector>
#include <iostream>

// Vector with initial size
std::vector<int> sized_vec(10);  // 10 elements, all 0
// Vector from initializer list
std::vector<int> init_vec = {1, 2, 3, 4, 5};

// Vector with custom allocator (advanced)
std::vector<int, std::allocator<int>> custom_vec;

Adding and Removing Elements

cpp
std::vector<int> vec = {1, 2, 3};

// Add elements at the end
vec.push_back(4);        // vec = {1, 2, 3, 4}
vec.emplace_back(5);     // vec = {1, 2, 3, 4, 5}

// Insert elements at specific positions
vec.insert(vec.begin() + 1, 10);  // vec = {1, 10, 2, 3, 4, 5}

// Remove elements
vec.pop_back();          // vec = {1, 10, 2, 3, 4}
vec.erase(vec.begin() + 1);  // vec = {1, 2, 3, 4}

// Clear all elements
vec.clear();             // vec = {}

emplace_back is similar to push_back, but it constructs the element in place, which is more efficient than push_back if the element is already constructed.

Performance Characteristics

Time Complexity

OperationTime ComplexityNotes
Random AccessO(1)Direct memory access
Insert/Delete at EndO(1) amortizedpush_back, pop_back
Insert/Delete in MiddleO(n)Requires shifting elements
SearchO(n)Linear search (use algorithms for better performance)

Memory Management

cpp
std::vector<int> vec;

// Check current size and capacity
std::cout << "Size: " << vec.size() << std::endl;           // 0
std::cout << "Capacity: " << vec.capacity() << std::endl;   // 0

// Add elements and watch capacity grow
vec.push_back(1);  // size=1, capacity=1
vec.push_back(2);  // size=2, capacity=2
vec.push_back(3);  // size=3, capacity=4 (doubled!)
vec.push_back(4);  // size=4, capacity=4
vec.push_back(5);  // size=5, capacity=8 (doubled again!)

// Reserve memory to avoid reallocations
vec.reserve(100);  // Allocates space for 100 elements
std::cout << "Capacity after reserve: " << vec.capacity() << std::endl;  // 100

Key Insight: Vectors grow exponentially (typically doubling) to maintain O(1) amortized performance for push_back operations.

Common Operations

Accessing Elements

cpp
std::vector<int> vec = {10, 20, 30, 40, 50};

// Bounds-checked access
int first = vec.at(0);    // 10, throws if out of bounds
int last = vec.at(vec.size() - 1);  // 50

// Unchecked access (faster)
int first_fast = vec[0];  // 10, undefined behavior if out of bounds
int last_fast = vec[vec.size() - 1];  // 50

// Front and back access
int front = vec.front();  // 10
int back = vec.back();    // 50

// Safe access with bounds checking
if (!vec.empty()) {
    int first_safe = vec[0];  // Safe because we checked
}

Iterating Over Vectors

cpp
std::vector<int> vec = {1, 2, 3, 4, 5};

// Range-based for loop (C++11)
for (const auto& item : vec) {
    std::cout << item << " ";  // 1 2 3 4 5
}

// Range-based for loop with modification
for (auto& item : vec) {
    item *= 2;  // Double each element
}

// Traditional index-based loop
for (size_t i = 0; i < vec.size(); ++i) {
    std::cout << vec[i] << " ";
}

// Iterator-based loop
for (auto it = vec.begin(); it != vec.end(); ++it) {
    std::cout << *it << " ";
}

// Reverse iteration
for (auto it = vec.rbegin(); it != vec.rend(); ++it) {
    std::cout << *it << " ";  // 5 4 3 2 1
}

Finding and Searching

cpp
#include <algorithm>

std::vector<int> vec = {10, 20, 30, 40, 50};

// Find element
auto it = std::find(vec.begin(), vec.end(), 30);
if (it != vec.end()) {
    std::cout << "Found 30 at position: " << (it - vec.begin()) << std::endl;
}

// Find element with custom predicate
auto even_it = std::find_if(vec.begin(), vec.end(),
                           (int x) { return x % 2 == 0; });
if (even_it != vec.end()) {
    std::cout << "First even number: " << *even_it << std::endl;
}

// Count occurrences
int count = std::count(vec.begin(), vec.end(), 20);  // 1

// Check if element exists
bool has_30 = std::find(vec.begin(), vec.end(), 30) != vec.end();  // true

Memory Management Best Practices

Pre-allocating Memory

cpp
std::vector<int> vec;

// Reserve memory when you know the size
vec.reserve(1000);  // Allocate space for 1000 elements

// Add elements without reallocation
for (int i = 0; i < 1000; ++i) {
    vec.push_back(i);  // No reallocation needed
}

// Shrink to fit (reduce capacity to match size)
vec.shrink_to_fit();
std::cout << "Size: " << vec.size() << std::endl;           // 1000
std::cout << "Capacity: " << vec.capacity() << std::endl;   // 1000

Efficient Construction

cpp
// Avoid multiple push_back calls when possible
std::vector<int> inefficient;
for (int i = 0; i < 1000; ++i) {
    inefficient.push_back(i);  // May cause multiple reallocations
}

// More efficient: reserve first
std::vector<int> efficient;
efficient.reserve(1000);
for (int i = 0; i < 1000; ++i) {
    efficient.push_back(i);  // No reallocations
}

// Most efficient: construct directly
std::vector<int> direct(1000);
for (int i = 0; i < 1000; ++i) {
    direct[i] = i;  // Direct assignment, no bounds checking
}

Advanced Features

Cache Locality Benefits

One of std::vector's biggest advantages is cache locality. Since elements are stored contiguously in memory, accessing elements sequentially provides excellent cache performance:

cpp
std::vector<int> data(1000000);

// Excellent cache performance - sequential access
for (int i = 0; i < data.size(); ++i) {
    data[i] = i * 2;  // Cache-friendly: adjacent memory locations
}

// Also cache-friendly - iterators follow memory layout
for (auto it = data.begin(); it != data.end(); ++it) {
    *it += 1;
}

// Range-based for loop is also cache-friendly
for (auto& item : data) {
    item *= 2;
}

Why This Matters:

  • CPU Cache Lines: Modern CPUs fetch data in 64-byte cache lines
  • Spatial Locality: Accessing adjacent elements maximizes cache line utilization
  • Prefetching: CPU can predict and prefetch the next elements
  • Performance Impact: Can be 10-100x faster than random access patterns

Compare with Linked Lists:

cpp
std::list<int> linked_data(1000000);

// Poor cache performance - elements scattered in memory
for (auto it = linked_data.begin(); it != linked_data.end(); ++it) {
    *it *= 2;  // Cache misses: elements not adjacent
}

std::vector<bool> Specialization

std::vector<bool> is a specialized template that stores booleans as individual bits rather than bytes, saving memory but with some trade-offs:

cpp
#include <vector>

std::vector<bool> flags(8, true);  // 8 bits, stored in 1 byte

// Access individual bits
flags[0] = false;   // Set first bit to false
flags[1] = true;    // Set second bit to true

// Check bit values
bool first = flags[0];   // false
bool second = flags[1];  // true

// Size and capacity
std::cout << "Size: " << flags.size() << std::endl;        // 8
std::cout << "Capacity: " << flags.capacity() << std::endl; // 1 (byte)

Key Differences from Regular Vectors:

  1. Memory Layout: 8 bits per byte instead of 1 element per byte
  2. Reference Type: flags[0] returns a proxy object, not a bool&
  3. No Direct References: Cannot take address of individual bits
  4. Memory Efficiency: 8x less memory usage for boolean data

Common Pitfalls:

cpp
std::vector<bool> flags = {true, false, true};

// WRONG: Can't take address of a bit
// bool* ptr = &flags[0];  // Compilation error!

// WRONG: Can't bind to non-const reference
// bool& ref = flags[0];   // Compilation error!

// CORRECT: Use auto or const reference
auto bit = flags[0];           // Proxy object
const auto& const_bit = flags[0];  // Const reference to proxy

// CORRECT: Store in regular bool
bool value = flags[0];         // Copy the bit value

When to Use std::vector<bool>:

cpp
// GOOD: Large boolean collections where memory matters
std::vector<bool> large_flags(1000000);  // ~125KB instead of 1MB

// GOOD: Bit flags, permissions, presence/absence tracking
std::vector<bool> user_permissions(1000);
std::vector<bool> item_availability(50000);

// CONSIDER ALTERNATIVES: When you need references or addresses
// Use std::vector<char> or std::vector<uint8_t> instead
std::vector<char> alternative_flags(1000);  // Can take addresses

std::vector<bool> has some performance penalties because it requires bitwise operations to return a reference to a bit.

Common Pitfalls and Solutions

1. Iterator Invalidation

cpp
std::vector<int> vec = {1, 2, 3, 4, 5};

// WRONG: Modifying vector while iterating
for (auto it = vec.begin(); it != vec.end(); ++it) {
    if (*it == 3) {
        vec.erase(it);  // Iterator becomes invalid!
        break;  // Must break or use erase return value
    }
}

// CORRECT: Use erase return value
for (auto it = vec.begin(); it != vec.end();) {
    if (*it == 3) {
        it = vec.erase(it);  // erase returns next valid iterator
    } else {
        ++it;
    }
}

// CORRECT: Use remove-erase idiom
vec.erase(std::remove(vec.begin(), vec.end(), 3), vec.end());

2. Performance Issues

cpp
// WRONG: Frequent reallocations
std::vector<int> vec;
for (int i = 0; i < 1000000; ++i) {
    vec.push_back(i);  // Many reallocations
}

// CORRECT: Reserve memory
std::vector<int> vec;
vec.reserve(1000000);
for (int i = 0; i < 1000000; ++i) {
    vec.push_back(i);  // No reallocations
}

3. Bounds Checking

cpp
std::vector<int> vec = {1, 2, 3};

// WRONG: Unchecked access
int value = vec[10];  // Undefined behavior

// CORRECT: Bounds checking
if (10 < vec.size()) {
    int value = vec[10];
} else {
    // Handle out of bounds
}

// CORRECT: Use at() for automatic bounds checking
try {
    int value = vec.at(10);  // Throws std::out_of_range
} catch (const std::out_of_range& e) {
    // Handle exception
}

When to Use std::vector

Use std::vector when:

  • You need fast random access to elements
  • You mostly add/remove elements at the end
  • You need contiguous memory layout
  • You want automatic memory management
  • You're working with algorithms that expect random access iterators
  • Performance is critical (Low-Latency, real-time systems)
  • Cache locality matters (large data processing)

Questions

Q: What is the time complexity of inserting an element at the end of a std::vector?

Inserting at the end (push_back) is O(1) amortized. While individual operations might be O(n) due to reallocation, the average cost over many operations is O(1) because the vector grows exponentially.

Q: What happens when you call reserve() on a std::vector?

reserve() allocates memory for the specified capacity but doesn't change the size. This prevents reallocation when adding elements up to that capacity, improving performance.

Q: Which of the following is NOT a valid way to iterate over a std::vector?

for (auto item : vec) item = 0; doesn't modify the original vector because 'item' is a copy, not a reference. Use 'auto& item' to modify elements.

Q: What is the difference between size() and capacity() in std::vector?

size() returns the current number of elements in the vector, while capacity() returns the total memory allocated (which can be larger than size to avoid frequent reallocations).

Q: When should you use std::vector over other containers?

std::vector is ideal when you need fast random access and mostly add/remove elements at the end. For frequent middle insertions/deletions, consider std::list or std::deque.