Multithreading in C++: Threads, Mutexes, and Writing Thread-Safe Code
Introduction: Why Concurrency Matters in Modern Software
Imagine you’re building a web server. While one request is waiting for data from the database, shouldn’t you be able to handle other requests simultaneously? Or picture a desktop application: while the program loads a large file from disk, shouldn’t the UI remain responsive?
This is where multithreading comes in. Modern C++ gives you powerful tools to write concurrent programs that can do multiple things at the same time. But with great power comes great responsibility—threads introduce complexity, and mistakes can lead to subtle, hard-to-catch bugs.
In this guide, we’ll build a solid mental model of multithreading, starting from the basics and working up to practical, production-ready patterns.
What is a Thread? Building the Mental Model
Think of a thread as a “worker” that executes code independently. If your program is a factory, a single-threaded program is like having one worker doing all the jobs sequentially. A multithreaded program is like hiring multiple workers who can tackle different jobs in parallel.
Here’s the crucial insight: all threads in a single process share the same memory. This is powerful (threads can easily communicate), but dangerous (if two workers modify the same piece of equipment without coordinating, chaos ensues).
In C++, threads are managed by the operating system. You can’t control exactly when each thread runs—the OS scheduler decides that. This unpredictability is a key source of threading bugs.
Creating Threads with std::thread
The std::thread class, introduced in C++11, makes thread creation straightforward:
#include <iostream>
#include <thread>
void worker() {
std::cout << "Hello from thread!" << std::endl;
}
int main() {
// Create a thread that runs the worker function
std::thread t(worker);
// The main thread continues here while t runs in parallel
std::cout << "Main thread continues" << std::endl;
return 0; // DANGER: t is still running!
}
Problem: main returns while t may still be running. Destroying a std::thread that is still joinable calls std::terminate(), aborting the program.
Solution: Use join() to wait for the thread to finish:
int main() {
std::thread t(worker);
std::cout << "Main thread continues" << std::endl;
t.join(); // Wait for t to complete
std::cout << "Thread finished" << std::endl;
return 0;
}
Passing Arguments to Threads
Threads can run functions with parameters:
void greet(const std::string& name, int id) {
std::cout << "Hello " << name << " (thread " << id << ")" << std::endl;
}
int main() {
std::thread t1(greet, "Alice", 1);
std::thread t2(greet, "Bob", 2);
t1.join();
t2.join();
return 0;
}
Important: Arguments are copied into the thread by default. If you want to pass by reference, use std::ref:
void modify(int& value) {
value += 10;
}
int main() {
int x = 5;
std::thread t(modify, std::ref(x));
t.join();
std::cout << x << std::endl; // Prints: 15
return 0;
}
Using Lambda Functions
Modern C++ threads often use lambdas:
int main() {
std::thread t([]() {
std::cout << "Running in a thread" << std::endl;
});
t.join();
return 0;
}
With captured variables:
int main() {
int value = 42;
std::thread t([value]() {
std::cout << "Value: " << value << std::endl;
});
t.join();
return 0;
}
Joining and Detaching Threads
join(): Waiting for Completion
join() blocks the calling thread until the target thread finishes. Use this when you need the thread’s work to complete before continuing.
detach(): Fire and Forget
detach() releases the thread, letting it run independently:
std::thread t(worker);
t.detach(); // Thread runs independently
// Warning: Be careful with detached threads!
Caution: With detached threads, you lose control. The thread might still be running when your program exits, potentially causing crashes. Use detach() only when you’re certain about the thread’s lifetime.
The Race Condition Problem: A Real Example
Here’s where multithreading gets tricky. Suppose two threads increment a shared counter:
#include <thread>
#include <iostream>
int counter = 0;
void increment() {
for (int i = 0; i < 100000; ++i) {
counter++; // DANGER: Race condition!
}
}
int main() {
std::thread t1(increment);
std::thread t2(increment);
t1.join();
t2.join();
std::cout << "Counter: " << counter << std::endl; // Expect 200000, but...?
return 0;
}
Run this several times. You’ll likely get a different result each time: 176543, 194201, 188900… almost never exactly 200000. Why?
The operation counter++ is not atomic. At the CPU level, it’s three steps:
- Load the current value of counter into a register
- Increment the register
- Store the register back to counter
With two threads running simultaneously, this can happen:
- Thread 1: Load counter (0)
- Thread 2: Load counter (0)
- Thread 1: Increment (1), store back to counter
- Thread 2: Increment (1), store back to counter
- Result: counter is 1, but we did 2 increments!
This is a race condition—the outcome depends on the unpredictable ordering of thread execution.
Mutexes: std::mutex and How to Use Them
A mutex (mutual exclusion) is a lock. Only one thread can hold a mutex at a time. When a thread needs to access shared data, it locks the mutex, does its work, then unlocks it.
#include <thread>
#include <mutex>
#include <iostream>
int counter = 0;
std::mutex counter_mutex;
void increment() {
for (int i = 0; i < 100000; ++i) {
counter_mutex.lock();
counter++;
counter_mutex.unlock();
}
}
int main() {
std::thread t1(increment);
std::thread t2(increment);
t1.join();
t2.join();
std::cout << "Counter: " << counter << std::endl; // Now consistently 200000
return 0;
}
Problem: If an exception is thrown between lock() and unlock(), the mutex is never released. Every other thread that tries to lock it then blocks forever.
std::lock_guard and std::unique_lock: RAII Locking
RAII (Resource Acquisition Is Initialization) applies to locking too. std::lock_guard automatically unlocks when it goes out of scope:
void increment() {
for (int i = 0; i < 100000; ++i) {
std::lock_guard<std::mutex> lock(counter_mutex);
counter++;
// Mutex automatically unlocked here
}
}
This is exception-safe. Even if an exception occurs inside the block, the lock is released.
std::unique_lock is more flexible—you can manually unlock and move ownership:
std::unique_lock<std::mutex> lock(counter_mutex);
counter++;
lock.unlock();
// lock can be re-locked or moved to another variable
Best practice: Use std::lock_guard for simple cases and std::unique_lock when you need flexibility.
Deadlock: What It Is and How to Avoid It
Deadlock occurs when threads are stuck waiting for each other. Classic example:
std::mutex m1, m2;
void thread1_func() {
std::lock_guard<std::mutex> l1(m1); // Thread 1 locks m1
std::this_thread::sleep_for(std::chrono::milliseconds(100));
std::lock_guard<std::mutex> l2(m2); // Thread 1 waits for m2
}
void thread2_func() {
std::lock_guard<std::mutex> l2(m2); // Thread 2 locks m2
std::this_thread::sleep_for(std::chrono::milliseconds(100));
std::lock_guard<std::mutex> l1(m1); // Thread 2 waits for m1
}
Both threads wait forever. Thread 1 holds m1 and wants m2; Thread 2 holds m2 and wants m1.
How to avoid deadlock:
- Always acquire locks in the same order across all threads
- Use std::lock() to acquire multiple mutexes safely:
void thread1_func() {
std::unique_lock<std::mutex> l1(m1, std::defer_lock);
std::unique_lock<std::mutex> l2(m2, std::defer_lock);
std::lock(l1, l2); // Safe: acquires both or neither
// ... critical section ...
}
std::atomic: Lock-Free Thread-Safe Operations
For simple data types, atomic operations are usually faster than mutexes. Atomics provide thread-safe access without explicit locks:
#include <atomic>
std::atomic<int> counter(0); // Initialize to 0
void increment() {
for (int i = 0; i < 100000; ++i) {
counter++; // Atomic increment, thread-safe!
}
}
Behind the scenes, the CPU provides atomic instructions. No mutex overhead.
Other atomic operations:
std::atomic<int> x(0);
x.store(5); // Atomic write
int val = x.load(); // Atomic read
x.exchange(10); // Swap and return old value
int expected = 10, new_val = 20;
x.compare_exchange_strong(expected, new_val); // CAS: sets x to new_val only if x == expected
When to use atomics: Simple counters, flags, or state variables. For complex data structures, use mutexes.
Condition Variables: Signalling Between Threads
Condition variables let threads wait for a specific event. Classic use case: producer-consumer pattern.
#include <condition_variable>
#include <queue>
std::queue<int> data_queue;
std::mutex queue_mutex;
std::condition_variable cv;
void producer() {
for (int i = 0; i < 5; ++i) {
{
std::lock_guard<std::mutex> lock(queue_mutex);
data_queue.push(i);
std::cout << "Produced: " << i << std::endl;
}
cv.notify_one(); // Wake one waiting thread
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
}
void consumer() {
while (true) { // note: this consumer never exits; a real one needs a shutdown flag (see the pattern below)
std::unique_lock<std::mutex> lock(queue_mutex);
cv.wait(lock, []() { return !data_queue.empty(); }); // Wait for data
if (!data_queue.empty()) {
int val = data_queue.front();
data_queue.pop();
std::cout << "Consumed: " << val << std::endl;
}
}
}
The wait() call releases the mutex while waiting, then reacquires it when notified. This prevents deadlocks and is more efficient than busy-waiting.
Thread-Local Storage with thread_local
Variables declared thread_local are independent for each thread:
thread_local int thread_id_cache = -1;
void worker(int id) {
thread_id_cache = id; // Each thread has its own copy
std::cout << "Thread " << thread_id_cache << std::endl;
}
Useful for caches, stateful objects, or thread-specific data.
Common Concurrency Patterns
Producer-Consumer Pattern
One or more threads produce data; others consume it. Use a queue and condition variables to coordinate:
std::queue<std::string> message_queue;
std::mutex queue_lock;
std::condition_variable queue_cv;
bool done = false;
void producer(const std::vector<std::string>& messages) {
for (const auto& msg : messages) {
{
std::lock_guard<std::mutex> lock(queue_lock);
message_queue.push(msg);
}
queue_cv.notify_one();
}
{
std::lock_guard<std::mutex> lock(queue_lock);
done = true;
}
queue_cv.notify_all();
}
void consumer() {
while (true) {
std::unique_lock<std::mutex> lock(queue_lock);
queue_cv.wait(lock, []() { return !message_queue.empty() || done; });
if (message_queue.empty() && done) break;
std::string msg = message_queue.front();
message_queue.pop();
std::cout << "Processing: " << msg << std::endl;
}
}
Thread Pool Concept
Instead of creating a new thread for each task, reuse a pool of worker threads. Tasks are queued, and workers pick them up as they become available. This avoids the overhead of thread creation.
std::async and std::future: Higher-Level Concurrency
For simpler cases, std::async handles thread management for you:
#include <future>
int expensive_calculation() {
std::this_thread::sleep_for(std::chrono::seconds(2));
return 42;
}
int main() {
std::future<int> result = std::async(std::launch::async, expensive_calculation);
std::cout << "Doing other work..." << std::endl;
int value = result.get(); // Block until calculation completes
std::cout << "Result: " << value << std::endl;
return 0;
}
std::async returns a std::future that you can wait on with get(). Cleaner than manual thread management for many scenarios.
Practical Example: Parallel File Processing
Process multiple files in parallel:
#include <vector>
#include <thread>
#include <fstream>
std::vector<std::string> files = {"file1.txt", "file2.txt", "file3.txt"};
std::mutex result_mutex;
std::vector<std::string> results;
void process_file(const std::string& filename) {
std::ifstream file(filename);
std::string content((std::istreambuf_iterator<char>(file)),
std::istreambuf_iterator<char>());
// Process content...
std::string result = "Processed: " + filename;
{
std::lock_guard<std::mutex> lock(result_mutex);
results.push_back(result);
}
}
int main() {
std::vector<std::thread> workers;
for (const auto& file : files) {
workers.emplace_back(process_file, file);
}
for (auto& worker : workers) {
worker.join();
}
std::cout << "Processed " << results.size() << " files" << std::endl;
return 0;
}
Performance Considerations
- Lock Contention: The more threads compete for the same mutex, the slower things get. Use fine-grained locking (protect small critical sections).
- Context Switching: Too many threads cause the OS to spend more time switching between threads than actually running them. A good rule of thumb: number of threads ≈ number of CPU cores (std::thread::hardware_concurrency() reports the latter).
- False Sharing: Two threads modifying different variables that happen to sit on the same CPU cache line force that line to be invalidated constantly. Align data carefully.
- Atomic vs Mutex: Atomics are faster for simple operations. Mutexes are clearer for protecting complex data structures.
- Lock-Free Data Structures: For extreme performance, use lock-free queues or other structures, but they’re complex to implement correctly.
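The false-sharing point can be addressed with alignas: padding each hot variable out to its own cache line keeps one thread’s writes from invalidating the other’s line. A sketch assuming a 64-byte cache line (the common x86 size; C++17’s std::hardware_destructive_interference_size is the portable query, where available):

```cpp
#include <cstddef>

// Each counter occupies its own 64-byte cache line, so two threads
// incrementing a and b no longer ping-pong the same line.
struct Counters {
    alignas(64) long a;
    alignas(64) long b;
};

static_assert(sizeof(Counters) == 128, "each member gets a full line");
```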
Conclusion
Multithreading is powerful but requires careful design. Start with clear mental models: threads share memory, race conditions arise from unsynchronized access, and mutexes/atomics protect shared data. Use RAII-style locking (std::lock_guard), avoid deadlock by consistent lock ordering, and test thoroughly—threading bugs are notoriously hard to reproduce.
The patterns shown here—producer-consumer, fine-grained locking, condition variables—form the foundation of robust concurrent C++ code. Master these, and you’ll write programs that are both correct and efficient.
What aspect of multithreading do you want to explore next? Drop a question in the comments!
Take Your C++ Further
If you’re looking to go deeper with C++, the C++ Better Explained Ebook is perfect for you — whether you’re a complete beginner or looking to solidify your understanding. Just $19.
👉 Get the C++ Better Explained Ebook — $19