
C++ Concurrency Tutorial: Threads, Mutex, and Thread Safety Explained


Modern computers have multiple CPU cores. A single-threaded program uses one core while the rest sit idle. Concurrency lets your program run multiple tasks simultaneously — downloading a file while processing data, handling multiple network connections at once, or parallelising computation across all your CPU cores.

C++11 introduced a standard threading library, making concurrent programming portable and practical. This tutorial covers the essentials: creating threads, protecting shared data with mutexes, and writing thread-safe code.


What Is a Thread?

A thread is an independent sequence of execution within a program. Every program starts with one thread (the main thread). You can create additional threads that run concurrently — potentially in parallel on multiple CPU cores.

Main thread:   Task A ─────────────────────────── done
Thread 2:           Task B ───────────────── done
Thread 3:           Task C ────────── done
                    ↕ Running at the same time

Creating Threads with std::thread

Include <thread> and construct a std::thread with a callable (function, lambda, or functor):

#include <iostream>
#include <thread>

void printMessage(const std::string& msg) {
    std::cout << msg << "\n";
}

int main() {
    std::thread t1(printMessage, "Hello from thread 1");
    std::thread t2(printMessage, "Hello from thread 2");

    t1.join();  // Wait for t1 to finish
    t2.join();  // Wait for t2 to finish

    std::cout << "Both threads finished.\n";
    return 0;
}

join() blocks the calling thread until the target thread finishes. Always join (or detach) a thread before it goes out of scope — not joining causes std::terminate().

Using lambdas

int main() {
    std::thread t([]() {
        std::cout << "Lambda running in a thread\n";
    });
    t.join();
    return 0;
}

Passing arguments

void compute(int id, int n) {
    int result = 0;
    for (int i = 0; i < n; i++) result += i;
    std::cout << "Thread " << id << " result: " << result << "\n";
}

int main() {
    std::thread t1(compute, 1, 1000);
    std::thread t2(compute, 2, 2000);

    t1.join();
    t2.join();
    return 0;
}

Arguments after the function are forwarded to it. Note: they are copied by default. Use std::ref() to pass by reference:

void increment(int& counter) { counter++; }

int main() {
    int value = 0;
    std::thread t(increment, std::ref(value));
    t.join();
    std::cout << value << "\n";  // 1
    return 0;
}

The Problem: Data Races

When two or more threads access the same data concurrently and at least one writes, you have a data race — undefined behavior that causes crashes, corruption, or silently wrong results.

A classic example:

#include <iostream>
#include <thread>

int counter = 0;

void increment() {
    for (int i = 0; i < 100000; i++) {
        counter++;  // NOT thread-safe!
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);

    t1.join();
    t2.join();

    std::cout << "Counter: " << counter << "\n";  // Should be 200000
    // But you might get 143821 or 187654 — different every run!
    return 0;
}

Why? counter++ looks like a single step, but it typically compiles to three operations:

  1. Read counter from memory
  2. Add 1
  3. Write counter back to memory

If both threads read the same value before either writes, one increment is lost. This is a classic race condition.


std::mutex: Protecting Shared Data

A mutex (mutual exclusion) is a lock. Only one thread can hold it at a time. Other threads that try to acquire it will block until the holder releases it.

#include <iostream>
#include <thread>
#include <mutex>

int counter = 0;
std::mutex mtx;

void increment() {
    for (int i = 0; i < 100000; i++) {
        mtx.lock();    // Acquire the lock
        counter++;     // Safe — only one thread here at a time
        mtx.unlock();  // Release the lock
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);

    t1.join();
    t2.join();

    std::cout << "Counter: " << counter << "\n";  // Always 200000
    return 0;
}

The section between lock() and unlock() is called a critical section — code that only one thread executes at a time.

Never use lock() / unlock() directly in real code. If an exception is thrown between them, the mutex stays locked forever, and any thread that later tries to acquire it blocks indefinitely. Use RAII wrappers instead.


std::lock_guard: Exception-Safe Locking

std::lock_guard is an RAII wrapper that acquires the mutex on construction and releases it on destruction — automatically, even if an exception is thrown.

#include <mutex>

std::mutex mtx;
int counter = 0;

void increment() {
    for (int i = 0; i < 100000; i++) {
        std::lock_guard<std::mutex> lock(mtx);  // Lock acquired here
        counter++;
        // Lock released automatically at end of scope
    }
}

This is the idiomatic way to use a mutex in modern C++. The lock is always released when lock goes out of scope — no exceptions, no forgetting.


std::unique_lock: More Flexible Locking

std::unique_lock is similar to std::lock_guard but more flexible. Use it when you need to unlock and re-lock mid-scope, defer locking until later, transfer lock ownership, or wait on a condition variable:

#include <mutex>

std::mutex mtx;

void example() {
    std::unique_lock<std::mutex> lock(mtx);  // Locks immediately

    // Do work...

    lock.unlock();   // Manually unlock mid-scope
    // Do non-critical work...
    lock.lock();     // Re-lock

    // lock releases on scope exit
}

For simple cases, prefer lock_guard. For condition variables or deferred locking, use unique_lock.


Avoiding Deadlock

A deadlock occurs when two or more threads each hold a lock the other needs — and both wait forever.

Thread 1: holds mutex A, waiting for mutex B
Thread 2: holds mutex B, waiting for mutex A
→ Both wait forever

To avoid deadlock, always acquire multiple mutexes in the same order in every thread, or better, let std::lock acquire them for you:

std::mutex mtxA, mtxB;

void safeTransfer() {
    // Acquires both locks atomically — no deadlock
    std::lock(mtxA, mtxB);
    std::lock_guard<std::mutex> lockA(mtxA, std::adopt_lock);
    std::lock_guard<std::mutex> lockB(mtxB, std::adopt_lock);

    // Both protected now
}

Condition Variables: Waiting for Events

A condition variable lets a thread sleep until another thread signals that some condition may be true. It pairs with std::unique_lock (not lock_guard) because wait() must be able to unlock and re-lock the mutex.

Classic producer-consumer example:

#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>

std::queue<int> dataQueue;
std::mutex mtx;
std::condition_variable cv;
bool done = false;

void producer() {
    for (int i = 0; i < 5; i++) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            dataQueue.push(i);
            std::cout << "Produced: " << i << "\n";
        }
        cv.notify_one();  // Wake up the consumer
    }

    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;
    }
    cv.notify_one();
}

void consumer() {
    while (true) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, []{ return !dataQueue.empty() || done; });

        while (!dataQueue.empty()) {
            int value = dataQueue.front();
            dataQueue.pop();
            std::cout << "Consumed: " << value << "\n";
        }

        if (done) break;
    }
}

int main() {
    std::thread t1(producer);
    std::thread t2(consumer);

    t1.join();
    t2.join();
    return 0;
}

cv.wait(lock, predicate) atomically releases the lock and waits. When notify_one() is called, it reacquires the lock and checks the predicate. If true, it continues; if false, it waits again. Always pass a predicate to handle spurious wakeups.


std::atomic: Lightweight Synchronisation

For simple shared variables like counters or flags, std::atomic provides thread-safe access without a mutex:

#include <atomic>
#include <thread>
#include <iostream>

std::atomic<int> counter{0};

void increment() {
    for (int i = 0; i < 100000; i++) {
        counter++;  // Atomic increment — thread-safe, no mutex needed
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);

    t1.join();
    t2.join();

    std::cout << "Counter: " << counter << "\n";  // Always 200000
    return 0;
}

std::atomic operations are lock-free for simple types on most platforms and significantly faster than mutex-protected access. Use it for counters, flags (such as a stop signal), and other single-word shared values; anything larger, or any multi-step update, still needs a mutex.


Practical Example: Parallel Sum

Split a large computation across multiple threads:

#include <iostream>
#include <thread>
#include <vector>
#include <numeric>

void partialSum(const std::vector<int>& data, int start, int end, long long& result) {
    result = 0;
    for (int i = start; i < end; i++) {
        result += data[i];
    }
}

int main() {
    const int SIZE = 1000000;
    std::vector<int> data(SIZE, 1);  // 1 million 1s

    long long sum1 = 0, sum2 = 0;
    int mid = SIZE / 2;

    // Split work across two threads
    std::thread t1(partialSum, std::cref(data), 0, mid, std::ref(sum1));
    std::thread t2(partialSum, std::cref(data), mid, SIZE, std::ref(sum2));

    t1.join();
    t2.join();

    long long total = sum1 + sum2;
    std::cout << "Total: " << total << "\n";  // 1000000
    return 0;
}

Each thread operates on a separate portion of the array — no shared writes, no mutex needed. This is the ideal pattern for parallel computation.


Hardware Concurrency

Check how many threads your hardware can run truly in parallel:

unsigned int cores = std::thread::hardware_concurrency();
std::cout << "Hardware threads: " << cores << "\n";  // e.g., 8

Creating far more threads than CPU cores doesn’t speed things up — the OS has to context-switch between them, adding overhead.


Common Mistakes

Forgetting to join a thread

void doWork() { /* ... */ }

int main() {
    std::thread t(doWork);
    // Forgot t.join() or t.detach() — std::terminate() called!
    return 0;
}

Always call join() or detach() before the thread object is destroyed.

Accessing moved-from thread object

std::thread t(doWork);
std::thread t2 = std::move(t);  // t is now empty

t.join();   // BUG: t no longer owns the thread
t2.join();  // CORRECT

Holding a lock longer than necessary

// BAD: lock held during slow I/O
void processFile() {
    std::lock_guard<std::mutex> lock(mtx);
    auto data = readFileFromDisk();  // Slow! Other threads blocked during entire I/O
    processData(data);
}

// GOOD: lock held only during data access
void processFile() {
    auto data = readFileFromDisk();  // No lock during slow I/O
    std::lock_guard<std::mutex> lock(mtx);
    processData(data);  // Lock only for the critical section
}

Keep critical sections as short as possible.


Summary

C++ concurrency starts with std::thread for running code in parallel. The core challenge is protecting shared data — use std::mutex with std::lock_guard (never raw lock()/unlock()). Use std::condition_variable for thread coordination when one thread needs to wait for another. For simple shared counters and flags, std::atomic is faster and simpler than a mutex.

The golden rule: if multiple threads touch the same data and any of them writes, you need synchronisation.


Take Your C++ Further

If you’re looking to go deeper with C++, the C++ Better Explained Ebook is perfect for you — whether you’re a complete beginner or looking to solidify your understanding. Just $19.

👉 Get the C++ Better Explained Ebook — $19


