Deep Dive into Java String: Memory Layout, Immutability, and Optimization Techniques
This article explores Java String internals across JDK versions, explaining its memory representation, immutability, substring behavior, and how techniques like StringBuilder, intern(), and shared objects can dramatically reduce memory usage from gigabytes to megabytes in high‑scale applications.
String Body Dissection
To understand String deeply, we start from its basic composition.
Over successive releases, the JDK developers have optimized String many times to save memory and improve performance.
Java 6 and Earlier
Data is stored in a char[] array. The String object uses offset and count fields to locate its own characters within that array, so several String objects can share one array.
Why can sharing a char array cause memory leaks?
String(int offset, int count, char value[]) {
this.value = value;
this.offset = offset;
this.count = count;
}
public String substring(int beginIndex, int endIndex) {
// check boundary
return new String(offset + beginIndex, endIndex - beginIndex, value);
}
Calling substring() creates a new String object, but its value still points to the same underlying array, which can keep the large original array alive and cause memory leaks when many substrings of a huge string are kept.
JDK 7 & 8
The offset and count fields were removed, reducing the memory footprint of String objects.
New substring implementation copies the required range:
public String(char value[], int offset, int count) {
this.value = Arrays.copyOfRange(value, offset, offset + count);
}
public String substring(int beginIndex, int endIndex) {
int subLen = endIndex - beginIndex;
return new String(value, beginIndex, subLen);
}
This eliminates sharing of the internal char[] and prevents the previous memory‑leak scenario.
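The difference between the two substring strategies can be modelled outside the JDK. The following is a minimal sketch, assuming a hypothetical SharedSlice class that mimics the offset/count layout described above; it is not the JDK source, and the method names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical model of the Java 6 layout: a view (offset, count) over a char[].
final class SharedSlice {
    final char[] value;  // backing array, possibly shared with other slices
    final int offset;
    final int count;

    SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Java 6 style: the new slice reuses the same backing array.
    SharedSlice sharedSubstring(int begin, int end) {
        return new SharedSlice(value, offset + begin, end - begin);
    }

    // JDK 7+ style: the new slice owns a copy of only the requested range.
    SharedSlice copiedSubstring(int begin, int end) {
        char[] copy = Arrays.copyOfRange(value, offset + begin, offset + end);
        return new SharedSlice(copy, 0, copy.length);
    }

    // Number of chars this slice keeps reachable, regardless of its own length.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        SharedSlice whole = new SharedSlice(big, 0, big.length);

        System.out.println(whole.sharedSubstring(0, 5).retainedChars()); // 1000000
        System.out.println(whole.copiedSubstring(0, 5).retainedChars()); // 5
    }
}
```

In this model the five-character shared slice still pins the full million-character array, while the copied slice retains only what it needs, at the cost of an extra array copy per call.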
Java 9
The internal storage changed from char[] to byte[] with an additional coder field to indicate the encoding (0 for Latin‑1, 1 for UTF‑16). This reduces memory for strings that contain only single‑byte characters.
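To make the saving concrete, here is a rough sketch of the coder decision. The class and method names are hypothetical, the logic only imitates the rule stated above (Latin‑1 when every character fits in one byte, UTF‑16 otherwise), and it ignores object and array headers.

```java
// Sketch of the idea behind the coder field; not the JDK implementation.
final class CompactStringEstimate {
    static final int LATIN1 = 0;
    static final int UTF16 = 1;

    // Latin-1 is possible only if every char fits in a single byte.
    static int coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Bytes needed for the character data alone: 1 per char for Latin-1, 2 for UTF-16.
    static int payloadBytes(String s) {
        return s.length() << coderFor(s);
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes("hello"));       // 5
        System.out.println(payloadBytes("h\u00e9llo"));  // 5
        System.out.println(payloadBytes("hello\u4e16")); // 12
    }
}
```

Note that a single character outside Latin‑1 switches the whole string to two bytes per character.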
String Immutability
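String is declared final, and its internal array (char[] before Java 9, byte[] since) is private and final. Because no public method modifies or exposes that array, a String cannot change once it has been created.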
Immutability brings several benefits:
Security – the value cannot change after validation.
High‑performance caching – hash codes remain stable, enabling efficient use in hash‑based collections.
String constant pool – identical literals share a single instance, saving memory.
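A small sketch of what immutability means in practice, using only standard String and HashMap calls. The class name and sample values are invented for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ImmutabilityDemo {
    // "Modifying" methods return new objects; the original keeps its value.
    static boolean originalUnchanged() {
        String original = "token";
        String upper = original.toUpperCase(Locale.ROOT); // new String object
        String longer = original.concat("-v2");           // new String object
        return original.equals("token")
                && upper.equals("TOKEN")
                && longer.equals("token-v2");
    }

    // A String key's value and hash never change, so a later lookup still finds it.
    static int lookupAfterAttemptedChange() {
        Map<String, Integer> cache = new HashMap<>();
        String key = "user:42";
        cache.put(key, 1);
        key.replace('4', '5'); // result discarded; key itself is untouched
        return cache.get("user:42");
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged());          // true
        System.out.println(lookupAfterAttemptedChange()); // 1
    }
}
```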
String creation methods:
String str1 = "example";
String str2 = new String("example");
The first checks the constant pool; the second always creates a new object on the heap.
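The difference shows up when comparing references with == rather than contents with equals(). A brief sketch, with a hypothetical class name:

```java
final class LiteralVsNew {
    // Two identical literals resolve to the same pooled instance.
    static boolean literalsShareInstance() {
        String a = "example";
        String b = "example";
        return a == b;
    }

    // new String(...) yields a distinct object with equal contents.
    static boolean newCreatesDistinctObject() {
        String a = "example";
        String b = new String("example");
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());    // true
        System.out.println(newCreatesDistinctObject()); // true
    }
}
```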
Optimization Practices
Optimizing Massive String Concatenation
Because String is immutable, repeated concatenation can create many temporary objects. Before JDK 9, javac rewrote simple concatenations to use StringBuilder, but inside a loop that means a new StringBuilder and a new String on every iteration. Since JDK 9 the compiler emits an invokedynamic call instead, yet each iteration still produces a new String.
String str = "small";
for (int i = 0; i < 1000; i++) {
str += i;
}
Compiled version (simplified):
String str = "small";
for (int i = 0; i < 1000; i++) {
str = (new StringBuilder(String.valueOf(str))).append(i).toString();
}
Therefore, explicit use of a single StringBuilder (or StringBuffer when the buffer is shared between threads) is recommended.
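A minimal sketch of the explicit version next to the original loop. The class and method names are illustrative only.

```java
final class ConcatDemo {
    // Original pattern: every += builds a brand-new String.
    static String withPlus(int n) {
        String str = "small";
        for (int i = 0; i < n; i++) {
            str += i;
        }
        return str;
    }

    // Recommended pattern: one builder is reused for the whole loop.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder("small");
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withBuilder(3));                          // small012
        System.out.println(withPlus(1000).equals(withBuilder(1000))); // true
    }
}
```

Both methods return the same text; the builder version simply avoids allocating an intermediate String per iteration.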
Reducing Highly Repetitive String Data
String.intern() returns the canonical instance of a string from the string pool, so duplicate values end up sharing one object, which can dramatically shrink memory usage. Example from Twitter: address information originally required ~20 GB; after interning repeated fields, it dropped to a few hundred megabytes.
SharedLocation sharedLocation = new SharedLocation();
sharedLocation.setCity(messageInfo.getCity().intern());
sharedLocation.setCountryCode(messageInfo.getCountryCode().intern());
sharedLocation.setRegion(messageInfo.getRegion().intern());
Simple demonstration:
String a = new String("abc").intern();
String b = new String("abc").intern();
System.out.print(a == b); // prints true
String Split Optimization
The split() method takes a regular expression as its delimiter, and complex patterns can cause costly backtracking and high CPU usage. When the delimiter is a fixed character or string, replacing split() with indexOf() and manual parsing can be more efficient.
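One way to do this is sketched below for a single-character delimiter. The IndexOfSplit class is a hypothetical helper, not a JDK API.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexOfSplit {
    // Splits on one delimiter character without using the regex engine.
    // Empty fields are kept, including trailing ones.
    static List<String> split(String input, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(delimiter, start)) != -1) {
            parts.add(input.substring(start, end));
            start = end + 1;
        }
        parts.add(input.substring(start)); // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,,c", ',')); // [a, b, , c]
        System.out.println(split("a,b,", ',').size()); // 3
    }
}
```

Unlike String.split(regex) with its default limit, this sketch does not drop trailing empty strings, so callers that rely on that behavior would need to trim the result.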
Quiz
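Three strings are created in different ways; determine which pairs refer to the same object:
String str1 = "abc";
String str2 = new String("abc");
String str3 = str2.intern();
System.out.println(str1 == str2); // false: str2 is a separate heap object
System.out.println(str2 == str3); // false: str3 is the pooled instance, str2 is not
System.out.println(str1 == str3); // true: both refer to the pooled "abc"
Understanding the constant pool and interning explains why str1 and str3 refer to the same object.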
IT Services Circle