C++ Notes: Shallow Copy vs. Deep Copy
In C++, when you copy an object, the compiler needs to know how to duplicate its member variables. The way this duplication happens is defined by either a shallow copy or a deep copy. The distinction is critical when your class manages dynamic resources, such as memory allocated with new.
1. Shallow Copy
A shallow copy duplicates the member variables of an object as-is. If a member is a primitive type (like int or double), its value is copied. If a member is a pointer, the address stored in the pointer is copied, not the data it points to.
How It Happens
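This is the default behavior provided by the C++ compiler. If you don't define your own copy constructor and copy assignment operator, the compiler generates them for you, and they copy each member in turn (a memberwise copy). For a raw pointer member, that means copying the address only, which is a shallow copy.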
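To make the default behavior concrete, here is a minimal sketch. The Record struct and the copy_and_modify function are hypothetical names invented for illustration. Record owns nothing (its pointer refers to a variable supplied by the caller), so the aliasing can be observed without any risk of a double free.

```cpp
#include <cassert>

// Hypothetical type for illustration: one value member, one pointer member.
// No copy constructor is declared, so the compiler generates a memberwise one.
struct Record {
    int count;    // copied by value
    int* target;  // only the stored address is copied
};

// Copies `original` with the implicit copy constructor, then modifies the copy.
// Assumes original.target points to a live int.
Record copy_and_modify(const Record& original) {
    Record copy = original;  // compiler-generated memberwise copy
    copy.count += 1;         // changes only the copy's own int
    *copy.target = 99;       // changes the int that BOTH objects point to
    return copy;
}

void demo_default_copy() {
    int shared = 10;  // the caller owns this int, not Record
    Record a{1, &shared};
    Record b = copy_and_modify(a);

    assert(a.count == 1 && b.count == 2);  // value members are independent
    assert(a.target == b.target);          // pointer members hold the same address
    assert(*a.target == 99);               // a sees the write made through the copy
}
```

Calling demo_default_copy() from main runs the checks. The value member behaves like an independent copy, while the pointer member behaves like an alias.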
The Problem
After a shallow copy, two different objects end up with pointers pointing to the exact same memory location. This leads to two major issues:
- Double Free: When the first object is destroyed, its destructor frees the shared memory. When the second object is destroyed, its destructor tries to free the same memory again. This is undefined behavior, and in practice it often shows up as a crash or heap corruption.
- Dangling Pointers & Unintended Side Effects: If one object modifies the data through its pointer, the change is visible to the other object, which is often not the intended behavior. If one object is destroyed, the other object is left with a "dangling pointer" pointing to invalid memory.
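A real double free is undefined behavior, so it cannot be demonstrated reliably. The sketch below simulates it instead: the destructor increments a counter where a real class would call delete[]. FakeBlock, Handle, and releases_after_shallow_copy are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a heap block: it counts releases instead of being freed.
struct FakeBlock {
    int release_count = 0;
};

// Owner-like class with no user-defined copy operations, so copies are shallow.
class Handle {
public:
    explicit Handle(FakeBlock* block) : block_(block) {}
    ~Handle() { ++block_->release_count; }  // stands in for `delete[] data`
    FakeBlock* block() const { return block_; }

private:
    FakeBlock* block_;
};

// Creates one handle, copies it, lets both go out of scope, and reports
// how many times the single block was "released".
int releases_after_shallow_copy() {
    FakeBlock block;
    {
        Handle first(&block);
        Handle second = first;  // shallow copy: same block_ address
        assert(first.block() == second.block());
    }  // second is destroyed, then first; each destructor "releases" the block
    return block.release_count;
}
```

The function returns 2 for a single block. With a real delete[] in the destructor, that second release would be the double free described above.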
Example: The Problem with Shallow Copy
#include <iostream>
#include <cstring>

class ShallowString {
public:
    char* data;
    int size;

    ShallowString(const char* str = "") {
        size = static_cast<int>(std::strlen(str));
        data = new char[size + 1];
        std::strcpy(data, str);
        std::cout << "Constructor called. Data at " << static_cast<void*>(data) << std::endl;
    }

    ~ShallowString() {
        std::cout << "Destructor called. Deleting data at " << static_cast<void*>(data) << std::endl;
        delete[] data; // After a shallow copy, two objects run this on the same address (double free)
    }

    // No copy constructor is declared, so the compiler generates one equivalent to:
    // ShallowString(const ShallowString& other)
    //     : data(other.data), size(other.size) {}
};

int main() {
    std::cout << "--- Creating s1 ---" << std::endl;
    ShallowString s1("hello");

    std::cout << "\n--- Copying s1 to s2 (shallow copy) ---" << std::endl;
    ShallowString s2 = s1; // Shallow copy happens here

    std::cout << "\ns1.data points to: " << static_cast<void*>(s1.data) << std::endl;
    std::cout << "s2.data points to: " << static_cast<void*>(s2.data) << " (Same address!)" << std::endl;

    std::cout << "\n--- Program ends, destructors are called ---" << std::endl;
    // s2 is destroyed first, deleting the data.
    // Then s1 is destroyed and deletes the same data again:
    // undefined behavior, typically a crash.
    return 0;
}
2. Deep Copy
A deep copy, in addition to copying the member variables, also duplicates the resources they point to. It allocates new memory for the copy and then copies the data from the original resource into the newly allocated space.
How It's Implemented
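To perform a deep copy, you must define the relevant special member functions yourself instead of relying on the compiler-generated ones. This is known as the Rule of Three (or the Rule of Five in C++11 and later, which adds the move constructor and move assignment operator). If your class manages a resource, you should implement: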
- Copy Constructor: Creates a new object as a copy of another.
- Copy Assignment Operator: Overwrites an existing object with a copy of another.
- Destructor: Cleans up the resource.
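The three functions listed above make up the Rule of Three. As a sketch of what the Rule of Five adds, the class below also defines a move constructor and a move assignment operator. Buffer, its members, and the clone helper are hypothetical names chosen for this illustration, and the moved-from state (null pointer, size 0) is one possible design choice, not the only valid one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class Buffer {
public:
    explicit Buffer(const char* text = "") {
        assert(text != nullptr);  // precondition of this sketch
        size_ = std::strlen(text);
        data_ = clone(text, size_);
    }

    ~Buffer() { delete[] data_; }  // delete[] on nullptr is a no-op

    // Copy constructor: deep copy into a fresh allocation.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(clone(other.data_, other.size_)) {}

    // Copy assignment: allocate first, so a failed allocation leaves *this intact.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            char* fresh = clone(other.data_, other.size_);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    // Move constructor: take over the pointer and leave the source empty.
    Buffer(Buffer&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Move assignment: release our block, then take over the source's block.
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    // Allocates n + 1 chars and copies n chars from src (src may be null
    // when copying from a moved-from Buffer); always null-terminates.
    static char* clone(const char* src, std::size_t n) {
        char* p = new char[n + 1];
        if (src != nullptr) std::memcpy(p, src, n);
        p[n] = '\0';
        return p;
    }

    std::size_t size_ = 0;
    char* data_ = nullptr;
};
```

A copy such as Buffer b = a; allocates a new block, while Buffer c = std::move(a); reuses a's block and leaves a holding a null pointer. In both cases each block has exactly one owner, so each destructor frees memory at most once.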
The Solution
By implementing a deep copy, each object manages its own independent resource. The destruction of one object has no effect on the other.
Example: Implementing a Deep Copy
#include <iostream>
#include <cstring>

class DeepString {
public:
    char* data;
    int size;

    // 1. Constructor
    DeepString(const char* str = "") {
        size = static_cast<int>(std::strlen(str));
        data = new char[size + 1]; // Allocate memory
        std::strcpy(data, str);    // Copy data
        std::cout << "Constructor called. Data at " << static_cast<void*>(data) << std::endl;
    }

    // 2. Destructor
    ~DeepString() {
        std::cout << "Destructor called. Deleting data at " << static_cast<void*>(data) << std::endl;
        delete[] data; // Safely deletes its own data
    }

    // 3. Copy Constructor (DEEP COPY)
    DeepString(const DeepString& other) {
        size = other.size;
        data = new char[size + 1];     // <-- ALLOCATE NEW MEMORY
        std::strcpy(data, other.data); // <-- COPY THE CONTENT
        std::cout << "DEEP COPY Constructor called. New data at " << static_cast<void*>(data) << std::endl;
    }

    // 4. Copy Assignment Operator (DEEP COPY)
    DeepString& operator=(const DeepString& other) {
        std::cout << "DEEP COPY Assignment called." << std::endl;
        // 1. Self-assignment check
        if (this == &other) {
            return *this;
        }
        // 2. Allocate the new resource and copy into it first,
        //    so that a failed allocation leaves this object unchanged
        char* fresh = new char[other.size + 1];
        std::strcpy(fresh, other.data);
        // 3. Deallocate the old resource and take over the new one
        delete[] data;
        data = fresh;
        size = other.size;
        // 4. Return a reference to the current object
        return *this;
    }
};

int main() {
    std::cout << "--- Creating s1 ---" << std::endl;
    DeepString s1("world");

    std::cout << "\n--- Copying s1 to s2 (deep copy) ---" << std::endl;
    DeepString s2 = s1; // Deep copy constructor is called

    std::cout << "\ns1.data points to: " << static_cast<void*>(s1.data) << std::endl;
    std::cout << "s2.data points to: " << static_cast<void*>(s2.data) << " (Different address!)" << std::endl;

    std::cout << "\n--- Program ends, destructors are called ---" << std::endl;
    // s2 is destroyed, deleting its own data.
    // s1 is destroyed, deleting its own separate data. NO CRASH!
    return 0;
}
3. Summary and Best Practices
Feature | Shallow Copy | Deep Copy |
---|---|---|
Pointers | Copies the pointer's address. Both objects point to the same memory. | Allocates new memory and copies the content. Each object has its own resource. |
Implementation | Default compiler behavior. No extra code needed. | Must be manually implemented (Copy Constructor, Copy Assignment, Destructor). |
Ownership | Creates shared ownership of a resource, which is dangerous with raw pointers. | Each object has exclusive ownership of its resource. |
Issues | Leads to double-free errors, dangling pointers, and unintended data modification. | Safe for resource management but requires more code and adds overhead (every copy allocates memory and copies the data). |
When to use | Safe only when a class contains no raw pointers to dynamic resources, or if shared ownership is explicitly desired (and managed, e.g., with std::shared_ptr). | Required whenever a class manages a raw dynamic resource (new/delete). |
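The "When to use" row mentions std::shared_ptr for deliberate shared ownership. The sketch below shows that case next to a class whose only member is a std::string, where the compiler-generated copy is already a deep copy because std::string copies its own buffer. SharedText, OwnedText, and demo_standard_owners are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Shared ownership made explicit: copies share one string, and the last
// owner to be destroyed frees it exactly once.
struct SharedText {
    std::shared_ptr<std::string> text;
};

// Exclusive ownership with no hand-written copy code: std::string's own
// copy constructor performs the deep copy.
struct OwnedText {
    std::string text;
};

void demo_standard_owners() {
    SharedText s1{std::make_shared<std::string>("hello")};
    SharedText s2 = s1;  // memberwise copy of the shared_ptr
    assert(s1.text.get() == s2.text.get());  // same underlying string
    assert(s1.text.use_count() == 2);        // both owners are tracked

    OwnedText o1{"world"};
    OwnedText o2 = o1;  // memberwise copy, deep because of std::string
    o2.text[0] = 'W';
    assert(o1.text == "world");  // the original is unaffected
    assert(o2.text == "World");
}
```

Neither struct declares a destructor, copy constructor, or copy assignment operator. The standard library members handle ownership themselves, which avoids the double free without any manual new or delete.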