In Java, deep copy and shallow copy are two ways of copying objects. The main difference lies in the handling of reference types inside objects.
Shallow copy
Definition:
A shallow copy creates a new object and copies each field value of the original into it as-is. For a field of reference type, only the reference (the address) is copied, so the new and old objects share the same referenced object in memory. Therefore, modifying that shared object through one copy is also visible through the other. Reassigning the field itself to a different object affects only the copy on which it was reassigned.
Advantage:
- Saves resources: nested objects are neither allocated nor copied.
Suitable for:
- Scenarios where the copy is only read and never written;
- Objects with a simple structure, or cases where saving time and memory matters more than independence;
Code:
class Person implements Cloneable {
    String name;
    Address address; // Assume Address is a class

    @Override
    public Person clone() {
        try {
            // Object.clone() copies each field value as-is, so address is shared, not duplicated
            return (Person) super.clone();
        } catch (CloneNotSupportedException e) {
            // This should never happen since Person implements Cloneable
            throw new AssertionError(e);
        }
    }
}
Object.clone(), inherited from the root class Object, performs a shallow copy by default. Likewise, a copy constructor or a custom method such as copy() that assigns reference type fields directly, without copying the objects they refer to, also produces a shallow copy.
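The sharing described above can be observed directly. The following is a minimal sketch, assuming a mutable Address class with a single city field and simple constructors; these details are illustrative and not part of the original example.

```java
// Hypothetical Address class, assumed mutable for the sake of the demo.
class Address {
    String city;

    Address(String city) {
        this.city = city;
    }
}

class Person implements Cloneable {
    String name;
    Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    @Override
    public Person clone() {
        try {
            // Copies field values only; for 'address' that value is a reference.
            return (Person) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class ShallowCopyDemo {
    public static void main(String[] args) {
        Person original = new Person("Alice", new Address("Paris"));
        Person copy = original.clone();

        System.out.println(copy != original);                 // true: a new Person exists
        System.out.println(copy.address == original.address); // true: the Address is shared

        copy.address.city = "Berlin";                         // mutate the shared Address
        System.out.println(original.address.city);            // Berlin

        copy.address = new Address("Rome");                   // reassign the copy's field only
        System.out.println(original.address.city);            // Berlin (unchanged by the reassignment)
    }
}
```

Mutating the shared Address shows up in both objects, while reassigning the field on the copy leaves the original untouched.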
Deep copy
Definition:
A deep copy creates a new object and also recursively copies every object reachable through its reference type fields. The source object and the copy are therefore completely independent: modifying one does not affect the other.
Advantage:
- No resource advantage: a deep copy costs more time and memory than a shallow copy. Its benefit is independence, which makes it the better fit when the copy will be written to;
Suitable for:
- Scenarios that require write operations;
- Cases where a completely independent copy is required so that modifications do not affect each other;
Example of implementing deep copy with a custom method (the same logic can also be placed in an overridden clone() method):
// For each reference type field, manually call its copy method (if that class also supports deep copying) to create a new instance.
class Person implements Cloneable {
    String name;
    Address address;

    public Person deepClone() {
        Person clone = new Person();
        clone.name = this.name; // String is immutable, so sharing the reference is safe
        // Assume the Address class has a public clone() that itself performs a deep copy
        clone.address = (this.address == null) ? null : this.address.clone();
        return clone;
    }
}
Note: A deep copy is only as deep as its nested types allow. Check that every nested reference type also implements deep copying; otherwise the fields inside the nested object are still shared with the original, and the result is a shallow copy at that level.
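To make the independence concrete, here is a minimal sketch, assuming a hypothetical Address class with one city field that overrides clone() itself. Because Address holds only an immutable String, its field-by-field clone is already a full copy.

```java
// Hypothetical nested type; it must provide its own copy logic for Person's copy to be deep.
class Address implements Cloneable {
    String city;

    Address(String city) {
        this.city = city;
    }

    @Override
    public Address clone() {
        try {
            // Only an immutable String field, so the default field copy is sufficient here.
            return (Address) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class Person {
    String name;
    Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    Person deepClone() {
        // Copy the nested object instead of copying the reference to it.
        Address addressCopy = (address == null) ? null : address.clone();
        return new Person(name, addressCopy);
    }
}

class DeepCopyDemo {
    public static void main(String[] args) {
        Person original = new Person("Alice", new Address("Paris"));
        Person copy = original.deepClone();

        System.out.println(copy.address != original.address); // true: separate Address objects

        copy.address.city = "Berlin";
        System.out.println(original.address.city);            // Paris
        System.out.println(copy.address.city);                // Berlin
    }
}
```

If Address later gained a mutable reference type field of its own, its clone() would need to copy that field too, which is the situation the note above warns about.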
Background explanation:
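Reference types are classes (including String), interfaces, and arrays, as opposed to primitive types such as int, double, and boolean. A variable of reference type holds a reference to an object, while a variable of primitive type holds the value itself.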
How to implement deep copy:
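- Override the clone() method so that it also copies reference type fields
- Use third-party libraries such as Apache Commons Lang
- Manually implement deep copy logic (for example, a copy constructor or a custom copy method)
- Use serialization: serialize the object and deserialize it into a new instance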
- …
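Of the approaches listed, serialization is the least obvious, so a sketch may help. This is a minimal illustration using only the standard java.io classes; DeepCopyUtil, deepCopy, and the Person and Address shapes are hypothetical names chosen for the example, and every class in the object graph is assumed to implement Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    String city;

    Address(String city) {
        this.city = city;
    }
}

class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

// Hypothetical helper: writes the object graph to a byte array and reads it back.
final class DeepCopyUtil {
    private DeepCopyUtil() {
    }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T source) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(source); // serializes source and everything it references
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject(); // rebuilds a fresh, unshared object graph
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deep copy via serialization failed", e);
        }
    }
}

class SerializationCopyDemo {
    public static void main(String[] args) {
        Person original = new Person("Alice", new Address("Paris"));
        Person copy = DeepCopyUtil.deepCopy(original);

        copy.address.city = "Berlin";
        System.out.println(original.address.city); // Paris
        System.out.println(copy.address.city);     // Berlin
    }
}
```

This approach needs no per-class copy code, but it is typically slower than a hand-written copy, skips transient fields, and fails at run time if any referenced class is not Serializable. Apache Commons Lang provides a comparable helper, SerializationUtils.clone, built on the same mechanism.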
Distinguishing between deep and shallow copy in code
Look at the assignment:
- Assigning an object reference to another variable (e.g. Object obj2 = obj1;), whether through ordinary assignment or method parameter passing, copies only the reference. No new object is created, so strictly speaking this is not even a shallow copy: both variables point to the same object in memory.
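A short sketch of plain assignment, assuming a hypothetical mutable Address class with a city field:

```java
// Hypothetical mutable class used only for illustration.
class Address {
    String city;

    Address(String city) {
        this.city = city;
    }
}

class AssignmentDemo {
    public static void main(String[] args) {
        Address first = new Address("Paris");
        Address second = first;              // copies the reference, not the object

        System.out.println(first == second); // true: one object, two variables

        second.city = "Berlin";              // change made through the second variable
        System.out.println(first.city);      // Berlin
    }
}
```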
Look at method parameter passing:
- Java always passes arguments by value. When an object is passed to a method, the value passed is a copy of the object reference (in effect, a copy of the pointer), not a new copy of the object itself. Modifying the object's contents inside the method therefore affects the caller's object, which is the same sharing seen in a shallow copy. Reassigning the parameter itself does not affect the caller's variable. If the method instead creates a new instance, also copies the objects behind its reference type fields, and returns it, the result is a deep copy.
- When a method is called, check whether only a reference is passed or an actual copy of the object’s contents is made.
- Check if there is any logic in the code that explicitly creates new instances for reference type fields. This is a key sign of deep copy.
Note: Primitive types behave differently from reference types. Assigning or passing a primitive always copies the value itself, so the distinction between deep and shallow copy does not apply.
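The parameter-passing rules above can be checked with a small experiment. This is a minimal sketch; the Address class and the method names relocate, replace, increment, and copyOf are hypothetical and exist only for the demonstration.

```java
// Hypothetical mutable class used only for illustration.
class Address {
    String city;

    Address(String city) {
        this.city = city;
    }
}

class ParameterPassingDemo {
    // Mutates the object that the caller's variable also refers to.
    static void relocate(Address address) {
        address.city = "Berlin";
    }

    // Rebinds only the local parameter; the caller's variable is unaffected.
    static void replace(Address address) {
        address = new Address("Rome");
    }

    // Receives a copy of the primitive value; the caller's variable is unaffected.
    static void increment(int n) {
        n++;
    }

    // Explicitly creates a new instance, the key sign of a real copy.
    static Address copyOf(Address address) {
        return new Address(address.city);
    }

    public static void main(String[] args) {
        Address home = new Address("Paris");

        relocate(home);
        System.out.println(home.city); // Berlin: the shared object was modified

        replace(home);
        System.out.println(home.city); // Berlin: reassignment inside the method had no effect

        int count = 1;
        increment(count);
        System.out.println(count);     // 1: primitives are copied by value

        Address copy = copyOf(home);
        copy.city = "Madrid";
        System.out.println(home.city); // Berlin: the copy is independent
    }
}
```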