POSIX Shared Memory

Imagine you have two processes that need to work with the same data. Maybe one process is collecting market data and another needs to analyze it in real-time. How do you get data from one process to another? What if you want the fastest possible way to share data between processes?

That's where shared memory comes in. Instead of copying data between processes, you create a region of memory that both processes can access directly. Think of it like a shared whiteboard - both processes can read and write to the same physical memory.

The Basic Idea: One Memory, Multiple Processes

Let's start with a simple example. You have a trading system where:

  • Process A receives market data and updates prices
  • Process B needs to read those prices to make trading decisions

With traditional IPC, Process A would write the data to a pipe or message queue, and Process B would read it. Each time, the data gets copied from one process to another.

With shared memory, you create a region of memory that both processes can see. Process A writes directly to this shared region, and Process B reads directly from it. No copying involved.

How Does It Work?

The key insight is that while each process has its own virtual address space, you can map the same physical memory pages into multiple processes' address spaces.

Here's the basic flow:

  1. Create a shared memory object (like creating a file, but in memory)
  2. Set its size (how much data you want to share)
  3. Map it into each process's address space
  4. Now both processes can access the same physical memory

The beauty is that once it's set up, accessing shared memory is just like accessing regular memory - no system calls needed.
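
The four steps above map directly onto three POSIX calls. Here is a minimal sketch (the region name passed in is an arbitrary example, and error handling is reduced to returning `nullptr`):

```cpp
#include <fcntl.h>      // O_CREAT, O_RDWR
#include <sys/mman.h>   // shm_open, mmap, shm_unlink
#include <unistd.h>     // ftruncate, close
#include <cassert>
#include <cstddef>

// Create (or open) a shared memory object, size it, and map it into this
// process. Returns a pointer to the shared bytes, or nullptr on failure.
void* create_shared_region(const char* name, size_t size) {
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);   // 1. create the object
    if (fd == -1) return nullptr;
    if (ftruncate(fd, size) == -1) {                   // 2. set its size
        close(fd);
        return nullptr;
    }
    void* addr = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);              // 3. map it in
    close(fd);  // the mapping stays valid after the fd is closed
    return addr == MAP_FAILED ? nullptr : addr;        // 4. plain memory now
}
```

A second process calls the same `shm_open`/`mmap` sequence with the same name and sees the same bytes. On older Linux systems you may need to link with `-lrt`.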

The Performance Advantage

Why is this faster? Let's compare:

Traditional IPC (pipes/message queues):

  • Process A: Write data to kernel buffer
  • Kernel: Copy data from user space to kernel space
  • Kernel: Copy data from kernel space to Process B's user space
  • Process B: Read data from kernel buffer

Shared Memory:

  • Process A: Write directly to shared memory
  • Process B: Read directly from shared memory

No kernel involvement for data transfer. No copying. Just direct memory access.

The Synchronization Problem

Here's where it gets interesting. Shared memory gives you speed, but it also gives you a new problem: synchronization.

Imagine both processes trying to update the same price at the same time:

  • Process A reads the current price: $100
  • Process B reads the current price: $100
  • Process A adds $5: $105
  • Process B subtracts $3: $97
  • Process A writes: $105
  • Process B writes: $97

The final price is $97, but it should be $102 ($100 + $5 - $3). This is a classic race condition.

Solving the Synchronization Problem

You need some way to coordinate access to the shared memory. The most common approaches are:

  • Mutexes: Only one process can access the data at a time
  • Semaphores: Control how many processes can access the data
  • Atomic operations: Use CPU instructions that are guaranteed to be atomic

For our trading example, you might use a mutex:

  • Process A locks the mutex, updates the price, unlocks the mutex
  • Process B locks the mutex (waiting if A holds it), reads the price, unlocks the mutex

This ensures that only one process can modify the data at a time.
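
For the mutex to coordinate two separate processes, it must itself live in the shared region and be initialized as process-shared. A minimal sketch (the struct and function names are illustrative, not a standard API):

```cpp
#include <pthread.h>
#include <cassert>

// Lives inside the shared memory region, so both processes see the same lock.
struct SharedPrice {
    pthread_mutex_t lock;  // must be initialized PTHREAD_PROCESS_SHARED
    double price;
};

// One-time setup, done by whichever process creates the region.
void init_shared_price(SharedPrice* sp, double initial) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&sp->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    sp->price = initial;
}

// The whole read-modify-write happens under the lock, so the lost-update
// race from the example above cannot occur.
void update_price(SharedPrice* sp, double delta) {
    pthread_mutex_lock(&sp->lock);
    sp->price += delta;
    pthread_mutex_unlock(&sp->lock);
}
```

With this, the $5 and -$3 updates serialize: whichever process locks second sees the other's result, and the final price is $102 as intended.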

Real-World Example: Market Data Distribution

Let's walk through a real example. You're building a high-frequency trading system with these components:

  1. Market Data Receiver: Receives price updates from exchanges
  2. Order Manager: Manages trading orders
  3. Risk Engine: Calculates risk metrics
  4. Strategy Engine: Makes trading decisions

All these components need access to the same market data. Here's how you'd design it:

Shared Memory Layout:

```
[Header] [Price Data] [Order Book] [Risk Data]
```

Header contains:

  • Version number
  • Timestamp
  • Number of instruments
  • Status flags

Price Data contains:

  • Array of price updates for each instrument
  • Each update has: instrument ID, price, timestamp, volume

Synchronization:

  • Reader-writer locks for different data sections
  • Atomic operations for simple counters
  • Memory barriers to ensure proper ordering
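
The "atomic operations plus memory barriers" part of this design can be sketched as a seqlock-style version counter in the header: the writer bumps the counter to an odd value before touching the data and back to even afterwards, and readers retry if they see an odd or changed counter. This is a simplified sketch (field names are illustrative, and a real layout would pad hot fields to cache-line boundaries):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Header at the start of the shared region.
struct ShmHeader {
    std::atomic<uint64_t> seq{0};  // odd while a write is in progress
    uint32_t version = 1;
    uint32_t num_instruments = 0;
};

struct PriceSlot { double price; uint64_t timestamp_ns; };

// Single writer: publish a new price without blocking readers.
void publish_price(ShmHeader* h, PriceSlot* slot, double px, uint64_t ts) {
    h->seq.fetch_add(1, std::memory_order_relaxed);       // now odd
    std::atomic_thread_fence(std::memory_order_release);  // increment before data
    slot->price = px;
    slot->timestamp_ns = ts;
    h->seq.fetch_add(1, std::memory_order_release);       // now even: complete
}

// Reader: returns false if a write raced with the read; caller retries.
bool read_price(const ShmHeader* h, const PriceSlot* slot, PriceSlot* out) {
    uint64_t s1 = h->seq.load(std::memory_order_acquire);
    if (s1 & 1) return false;                             // writer active
    *out = *slot;
    std::atomic_thread_fence(std::memory_order_acquire);  // data before recheck
    uint64_t s2 = h->seq.load(std::memory_order_relaxed);
    return s1 == s2;                                      // unchanged: consistent
}
```

The appeal for market data is that readers never block the writer: a slow strategy engine can retry its read without stalling the data receiver.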

When Shared Memory Makes Sense

Shared memory isn't always the right choice. Here's when it makes sense:

Use shared memory when:

  • You need the absolute fastest communication between processes
  • You're sharing large amounts of data
  • Processes communicate frequently
  • You can handle the synchronization complexity

Avoid shared memory when:

  • You need simple, occasional communication
  • You want network transparency (shared memory is local only)
  • You can't afford the complexity of proper synchronization
  • You need strong process isolation for security

The Complexity Trade-off

Shared memory gives you performance, but it comes with complexity:

What you gain:

  • Ultra-low latency (nanoseconds instead of microseconds)
  • Maximum throughput
  • No data copying overhead
  • Memory efficiency (shared physical pages)

What you lose:

  • Simplicity (you must handle synchronization)
  • Safety (race conditions can corrupt data)
  • Debugging ease (shared memory bugs are notoriously hard to debug)
  • Portability (not all systems support POSIX shared memory)

Common Patterns

Once you understand the basics, you'll see these patterns emerge:

Producer-Consumer: One process writes, another reads

  • Market data receiver → Strategy engine
  • Order generator → Order manager

Multiple Readers, Single Writer: Many processes read, one writes

  • Market data (many consumers, one source)
  • Configuration data (many processes read, admin process writes)

Multiple Producers, Multiple Consumers: Complex coordination

  • Multiple market data sources → Multiple strategy engines
  • Requires careful synchronization design
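
The "multiple readers, single writer" pattern maps naturally onto a process-shared reader-writer lock placed in the shared region: all consumers can read concurrently, and the single producer takes the write lock only while publishing. A minimal sketch (struct and function names are illustrative):

```cpp
#include <pthread.h>
#include <cassert>

// Lives in shared memory; the rwlock must be marked process-shared.
struct SharedTable {
    pthread_rwlock_t lock;
    double prices[64];
};

// One-time setup by the creating process.
void init_table(SharedTable* t) {
    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init(&attr);
    pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_rwlock_init(&t->lock, &attr);
    pthread_rwlockattr_destroy(&attr);
    for (double& p : t->prices) p = 0.0;
}

// Single writer: exclusive access while publishing.
void set_price(SharedTable* t, int i, double px) {
    pthread_rwlock_wrlock(&t->lock);
    t->prices[i] = px;
    pthread_rwlock_unlock(&t->lock);
}

// Many readers: read locks are held concurrently by any number of processes.
double get_price(SharedTable* t, int i) {
    pthread_rwlock_rdlock(&t->lock);
    double px = t->prices[i];
    pthread_rwlock_unlock(&t->lock);
    return px;
}
```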

Integration with Other IPC

Shared memory doesn't exist in isolation. You often combine it with other IPC mechanisms:

Shared Memory + Pipes: Use shared memory for bulk data, pipes for control messages

  • Example: Market data in shared memory, order commands via pipes

Shared Memory + Signals: Use shared memory for data, signals for urgent notifications

  • Example: Price data in shared memory, "new data available" signals

Shared Memory + Message Queues: Use shared memory for large data, message queues for small control messages

  • Example: Order book in shared memory, order acknowledgments via message queues

Questions

Q: Which function creates a shared memory object?

shm_open() creates or opens a shared memory object.

Q: What is the purpose of ftruncate() with shared memory?

ftruncate() sets the size of the shared memory object.

Q: How do you remove a shared memory object?

shm_unlink() removes the shared memory object from the system.

Implement a shared memory ring buffer that can be used by multiple processes. Include proper synchronization using semaphores.

cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <cstring>
#include <string>

struct RingBufferHeader {
    size_t head;
    size_t tail;
    size_t capacity;
};

class SharedMemoryRingBuffer {
private:
    static constexpr size_t BUFFER_CAPACITY = 16;
    static constexpr size_t SHM_SIZE = sizeof(RingBufferHeader) + sizeof(int) * BUFFER_CAPACITY;

    std::string shm_name;
    std::string sem_empty_name;
    std::string sem_full_name;
    std::string sem_mutex_name;

    int shm_fd;
    RingBufferHeader* header;
    int* buffer;

    sem_t* sem_empty;  // Counts empty slots