
Deep Dive into Java String: Memory Layout, Immutability, and Optimization Techniques

This article explores Java String internals across JDK versions, explaining its memory representation, immutability, substring behavior, and how techniques like StringBuilder, intern(), and shared objects can dramatically reduce memory usage from gigabytes to megabytes in high‑scale applications.

IT Services Circle

Anatomy of a String

To understand String deeply, we start from its basic composition.

Across JDK releases, the JDK developers have reworked the internal layout of String several times to save memory and improve performance.

Java 6 and Earlier

Data is stored in a char[] array. The String object uses offset and count fields to locate its characters within that array, so several String objects can share one array.

Why can sharing a char array cause memory leaks?
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

public String substring(int beginIndex, int endIndex) {
    // check boundary
    return new String(offset + beginIndex, endIndex - beginIndex, value);
}

Calling substring() creates a new String object, but its value still points to the same underlying array, which can keep the large original array alive and cause memory leaks when many substrings of a huge string are kept.
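
The sharing behavior can be sketched with a small model class. SharedString below is a hypothetical stand-in for the JDK 6 layout, not the real java.lang.String; it only illustrates how a three-character substring keeps a million-element array reachable, and how copying the visible range releases it.

```java
import java.util.Arrays;

class SubstringSharingDemo {
    public static void main(String[] args) {
        SharedString huge = new SharedString(new char[1_000_000], 0, 1_000_000);
        SharedString small = huge.substring(0, 3);
        // The tiny substring still references the full backing array.
        System.out.println(small.value.length);           // 1000000
        // After a defensive copy only three chars remain reachable.
        System.out.println(small.compact().value.length); // 3
    }
}

// Hypothetical model of the JDK 6 field layout (value, offset, count).
final class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index of the first visible char
    final int count;    // number of visible chars

    SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // JDK 6 style: reuse the same array and only adjust offset and count.
    SharedString substring(int beginIndex, int endIndex) {
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    // Defensive copy: keep only the characters that are actually in view.
    SharedString compact() {
        return new SharedString(Arrays.copyOfRange(value, offset, offset + count), 0, count);
    }
}
```

On those older JDKs, a commonly cited workaround followed the same idea as compact(): wrapping the result as new String(s.substring(...)) so that only the needed characters were retained.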

JDK 7 & 8

The offset and count fields were removed (as of JDK 7u6), reducing the memory footprint of each String object.

The new substring implementation copies the required range (simplified):

public String(char value[], int offset, int count) {
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}

public String substring(int beginIndex, int endIndex) {
    int subLen = endIndex - beginIndex;
    return new String(value, beginIndex, subLen);
}

This eliminates sharing of the internal char[] and prevents the previous memory‑leak scenario.

Java 9

The internal storage changed from char[] to byte[], with an additional coder field that indicates the encoding (0 for Latin‑1, 1 for UTF‑16). Strings whose characters all fit in Latin‑1 use one byte per character instead of two, roughly halving their character storage.
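
As a rough illustration of the coder decision, the sketch below estimates how many bytes the backing array would need. estimatedPayloadBytes is a hypothetical helper, not a JDK API, and it assumes compact strings are enabled (the default); it ignores object headers and the other fields.

```java
class CompactStringEstimate {
    // Hypothetical helper: approximates the backing byte[] length under the Java 9+ scheme.
    static int estimatedPayloadBytes(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return s.length() * 2; // coder 1 (UTF-16): two bytes per char
            }
        }
        return s.length();             // coder 0 (Latin-1): one byte per char
    }

    public static void main(String[] args) {
        System.out.println(estimatedPayloadBytes("hello"));       // 5
        System.out.println(estimatedPayloadBytes("h\u00e9llo"));  // 5, U+00E9 fits in Latin-1
        System.out.println(estimatedPayloadBytes("hello\u4e2d")); // 12, one CJK char forces UTF-16
    }
}
```

Note that a single character outside Latin‑1 switches the whole string to two bytes per character.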

String Immutability

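String is declared final, so it cannot be subclassed, and its internal array (char[] before Java 9, byte[] since) is private and final and is never modified after construction, making the object immutable.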

Immutability brings several benefits:

Security – the value cannot change after validation.

High‑performance caching – hash codes remain stable, enabling efficient use in hash‑based collections.

String constant pool – identical literals share a single instance, saving memory.
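
The benefits listed above can be observed in a few lines. This is a minimal sketch; ImmutabilityDemo and withSuffix are illustrative names, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

class ImmutabilityDemo {
    // Derives a new string; the receiver itself is never modified.
    static String withSuffix(String s) {
        return s.concat("!");
    }

    public static void main(String[] args) {
        String key = "token";
        Map<String, Integer> hits = new HashMap<>();
        hits.put(key, 1);

        String derived = withSuffix(key);
        System.out.println(key);               // token
        System.out.println(derived);           // token!
        // The key's hash code cannot change, so the lookup still succeeds.
        System.out.println(hits.get("token")); // 1

        // Identical literals resolve to one pooled instance.
        String a = "example";
        String b = "example";
        System.out.println(a == b);            // true
    }
}
```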

String creation methods:

String str1 = "example";

String str2 = new String("example");

The first form reuses the instance from the string constant pool (adding it on first use); the second always creates a new object on the heap, separate from the pooled literal.

Optimization Practices

Optimizing Massive String Concatenation

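Because String is immutable, repeated concatenation can create many temporary objects. Before JDK 9, javac rewrote each concatenation expression to use StringBuilder (JDK 9 and later emit an invokedynamic call instead), but neither strategy helps inside a loop: every iteration still builds a new intermediate String, and with the StringBuilder rewrite a new StringBuilder as well.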

String str = "small";
for (int i = 0; i < 1000; i++) {
    str += i;
}

Compiled version (simplified):

String str = "small";
for (int i = 0; i < 1000; i++) {
    str = (new StringBuilder(String.valueOf(str))).append(i).toString();
}

Therefore, explicitly using a single StringBuilder across the loop (or StringBuffer when the same builder is shared between threads) is recommended.
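
A minimal sketch of the recommended form, using the same loop as above. The class and method names are illustrative; buildNaive is included only for comparison.

```java
class ConcatDemo {
    // One builder for the whole loop: no intermediate String per iteration.
    static String buildWithBuilder(int n) {
        StringBuilder sb = new StringBuilder("small");
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // The original pattern: each += produces a new String.
    static String buildNaive(int n) {
        String str = "small";
        for (int i = 0; i < n; i++) {
            str += i;
        }
        return str;
    }

    public static void main(String[] args) {
        System.out.println(buildWithBuilder(3)); // small012
        System.out.println(buildWithBuilder(1000).equals(buildNaive(1000))); // true
    }
}
```

Both methods produce the same result; the difference is only in how many temporary objects are allocated along the way.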

Reducing Highly Repetitive String Data

String.intern() returns the canonical instance of a string from the JVM's string pool, so duplicate values end up sharing a single object, which can shrink memory usage dramatically. Example from Twitter: address information originally required ~20 GB; after interning repeated fields, it dropped to a few hundred megabytes.

SharedLocation sharedLocation = new SharedLocation();
sharedLocation.setCity(messageInfo.getCity().intern());
sharedLocation.setCountryCode(messageInfo.getCountryCode().intern());
sharedLocation.setRegion(messageInfo.getRegion().intern());

Simple demonstration:

String a = new String("abc").intern();
String b = new String("abc").intern();
System.out.print(a == b); // prints true

String Split Optimization

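The split() method takes a regular expression. Complex patterns can cause costly backtracking and high CPU usage; for a single literal delimiter character the JDK takes a fast path that avoids the regex engine, but every call still allocates a list and an array. On hot paths, replacing it with indexOf() and manual parsing can be more efficient.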
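
One way to do the manual parsing is sketched below. ManualSplit is a hypothetical helper for a single-character delimiter, not a drop-in replacement for String.split().

```java
import java.util.ArrayList;
import java.util.List;

class ManualSplit {
    // Splits on a single delimiter character using indexOf, with no regex involved.
    static List<String> split(String input, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, end));
            start = end + 1;
        }
        parts.add(input.substring(start)); // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,,c", ',')); // [a, b, , c]
    }
}
```

Unlike String.split(), this sketch keeps trailing empty fields ("a,b," yields three elements), so check that the difference is acceptable before swapping it in.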

Quiz

Three strings are created in different ways; determine which pairs refer to the same object:

String str1 = "abc";
String str2 = new String("abc");
String str3 = str2.intern();
System.out.println(str1 == str2);
System.out.println(str2 == str3);
System.out.println(str1 == str3);

Understanding the constant pool and interning explains the result: str1 and str3 refer to the same pooled object, whereas str2 is a separate object on the heap.
