Shallow Copy vs. Deep Copy

C++ Notes: Shallow Copy vs. Deep Copy

In C++, copying an object means duplicating its member variables. How that duplication treats pointers (and the data they point to) is what distinguishes a shallow copy from a deep copy. The distinction is critical when your class manages dynamic resources, such as memory allocated with new.

1. Shallow Copy

A shallow copy duplicates the member variables of an object as-is. If a member is a primitive type (like int or double), its value is copied. If a member is a pointer, the address stored in the pointer is copied, not the data it points to.

How It Happens

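This is the default behavior provided by the C++ compiler. If you don't define your own copy constructor and copy assignment operator, the compiler generates them for you, and they copy each member in turn (a memberwise copy). For a raw pointer member, that is a shallow copy: only the address is duplicated.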

The Problem

After a shallow copy, two different objects end up with pointers pointing to the exact same memory location. This leads to two major issues:

  1. Double Free: When the first object is destroyed, its destructor frees the shared memory. When the second object is destroyed, its destructor frees the same memory again. Deleting the same memory twice is undefined behavior; in practice it often corrupts the heap or crashes the program.

  2. Unintended Side Effects & Dangling Pointers: If one object modifies the data through its pointer, the change is visible to the other object, which is often not the intended behavior. If one object is destroyed while the other is still in use, the survivor is left with a "dangling pointer" to memory that has already been freed.

Example: The Problem with Shallow Copy

#include <iostream>
#include <cstring>

class ShallowString {
public:
    char* data;
    int size;

    ShallowString(const char* str = "") {
        size = strlen(str);
        data = new char[size + 1];
        strcpy(data, str);
        std::cout << "Constructor called. Data at " << (void*)data << std::endl;
    }

    ~ShallowString() {
        std::cout << "Destructor called. Deleting data at " << (void*)data << std::endl;
        delete[] data; // Runs once per object; after a shallow copy the same block is deleted twice
    }

    // The compiler-generated copy constructor would do this:
    // ShallowString(const ShallowString& other) 
    //     : data(other.data), size(other.size) {}
};

int main() {
    std::cout << "--- Creating s1 ---" << std::endl;
    ShallowString s1("hello");

    std::cout << "\n--- Copying s1 to s2 (shallow copy) ---" << std::endl;
    ShallowString s2 = s1; // Shallow copy happens here

    std::cout << "\ns1.data points to: " << (void*)s1.data << std::endl;
    std::cout << "s2.data points to: " << (void*)s2.data << " (Same address!)" << std::endl;

    std::cout << "\n--- Program ends, destructors are called ---" << std::endl;
    // s2 is destroyed first, deleting the data.
    // Then s1 is destroyed and deletes the same data again:
    // undefined behavior, which typically shows up as a crash.
    return 0;
}

2. Deep Copy

A deep copy, in addition to copying the member variables, also duplicates the resources they point to. It allocates new memory for the copy and then copies the data from the original resource into the newly allocated space.

How It's Implemented

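To perform a deep copy, you must define the relevant special member functions yourself instead of relying on the compiler-generated ones. This is known as the Rule of Three (or the Rule of Five in C++11 and later, which adds the move constructor and move assignment operator). If a class manages a resource, it should implement: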

  1. Copy Constructor: Creates a new object as a copy of another.

  2. Copy Assignment Operator: Overwrites an existing object with a copy of another.

  3. Destructor: Cleans up the resource.

The Solution

By implementing a deep copy, each object manages its own independent resource. The destruction of one object has no effect on the other.

Example: Implementing a Deep Copy

#include <iostream>
#include <cstring>

class DeepString {
public:
    char* data;
    int size;

    // 1. Constructor
    DeepString(const char* str = "") {
        size = strlen(str);
        data = new char[size + 1]; // Allocate memory
        strcpy(data, str);         // Copy data
        std::cout << "Constructor called. Data at " << (void*)data << std::endl;
    }

    // 2. Destructor
    ~DeepString() {
        std::cout << "Destructor called. Deleting data at " << (void*)data << std::endl;
        delete[] data; // Safely deletes its own data
    }

    // 3. Copy Constructor (DEEP COPY)
    DeepString(const DeepString& other) {
        size = other.size;
        data = new char[size + 1];  // <-- ALLOCATE NEW MEMORY
        strcpy(data, other.data);   // <-- COPY THE CONTENT
        std::cout << "DEEP COPY Constructor called. New data at " << (void*)data << std::endl;
    }

    // 4. Copy Assignment Operator (DEEP COPY)
    DeepString& operator=(const DeepString& other) {
        std::cout << "DEEP COPY Assignment called." << std::endl;
        // 1. Self-assignment check
        if (this == &other) {
            return *this;
        }

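        // 2. Allocate the new buffer and copy into it first, so that
        //    this object is left unchanged if new throws
        char* newData = new char[other.size + 1];
        std::strcpy(newData, other.data);

        // 3. Release the old resource and take ownership of the new one
        delete[] data;
        data = newData;
        size = other.size;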

        // 4. Return a reference to the current object
        return *this;
    }
};

int main() {
    std::cout << "--- Creating s1 ---" << std::endl;
    DeepString s1("world");

    std::cout << "\n--- Copying s1 to s2 (deep copy) ---" << std::endl;
    DeepString s2 = s1; // Deep copy constructor is called

    std::cout << "\ns1.data points to: " << (void*)s1.data << std::endl;
    std::cout << "s2.data points to: " << (void*)s2.data << " (Different address!)" << std::endl;

    std::cout << "\n--- Program ends, destructors are called ---" << std::endl;
    // s2 is destroyed, deleting its own data.
    // s1 is destroyed, deleting its own separate data. NO CRASH!
    return 0;
}

3. Summary and Best Practices

| Feature | Shallow Copy | Deep Copy |
| --- | --- | --- |
| Pointers | Copies the pointer's address. Both objects point to the same memory. | Allocates new memory and copies the content. Each object has its own resource. |
| Implementation | Default compiler behavior. No extra code needed. | Must be manually implemented (Copy Constructor, Copy Assignment, Destructor). |
| Ownership | Creates shared ownership of a resource, which is dangerous with raw pointers. | Each object has exclusive ownership of its resource. |
| Issues | Leads to double-free errors, dangling pointers, and unintended data modification. | Safe for resource management but requires more code and overhead (memory allocation is slow). |
| When to use | Safe only when a class contains no raw pointers to dynamic resources, or if shared ownership is explicitly desired (and managed, e.g., with std::shared_ptr). | Required whenever a class manages a raw dynamic resource (new/delete). |