Understanding Java String Length Limits and JVM Constraints
This article explains the theoretical and practical limits on the length of a Java String: the compiler's check on string constants, the effect of UTF‑8 encoding, the constant‑pool size limit, the runtime constructor limit, memory consumption, and the JDK 9 optimization that changed the internal storage.
1. The String.length() method returns an int, so the theoretical maximum length cannot exceed Integer.MAX_VALUE.
2. The Java compiler (javac) enforces a hard limit of its own: any string constant whose length() is 65,535 or more causes a compilation error. The relevant check, shown here as decompiled javac code, is:
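// Reports the "limit.string" error once if a String constant has length() >= 65535.
private void checkStringConstant(DiagnosticPosition var1, Object var2) {
    if (this.nerrs == 0 && var2 != null && var2 instanceof String && ((String) var2).length() >= 65535) {
        this.log.error(var1, "limit.string", new Object[0]);
        ++this.nerrs;
    }
}
3. Java stores string literals in the class file in a modified UTF‑8 encoding, in which each char (UTF‑16 code unit) occupies 1 to 3 bytes. Most Chinese characters need 3 bytes, ASCII letters need only 1 byte, and a supplementary character, being stored as a surrogate pair, needs 6 bytes (two 3‑byte halves). For example: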
// 65,534 ASCII characters – compiles successfully
String s1 = "dd...d";
// 21,845 Chinese characters "自" – compiles successfully
String s2 = "自自...自";
// One ASCII 'd' plus 21,845 Chinese "自" – compilation fails
String s3 = "d自自...自";
Explanation of the examples:
For s1, each 'd' uses 1 byte, so the literal takes 65,534 bytes, just within the 65,535‑byte limit. Its length of 65,534 also passes the compiler's length check, which rejects 65,535 or more.
For s2, each Chinese character uses 3 bytes; 21,845 × 3 = 65,535 bytes, which is exactly the limit and therefore still allowed. The character count (21,845) is far below 65,535, so compilation succeeds.
For s3, the extra ASCII 'd' adds one more byte, for 65,536 bytes in total. The character count (21,846) passes the length check, but the encoded form exceeds the byte limit, so compilation fails.
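The byte counts in these three examples can be checked with a short program. The following is a minimal sketch, not javac's own code: the class name and the helpers encodedLength and repeat are made up for illustration, and the per‑char byte counts follow the modified UTF‑8 rules of the class file format.

```java
import java.util.Arrays;

final class ModifiedUtf8Length {

    /** Number of bytes the string needs in modified UTF-8 (hypothetical helper). */
    static int encodedLength(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1; // ASCII except NUL
            } else if (c <= 0x07FF) {
                bytes += 2; // NUL and U+0080..U+07FF
            } else {
                bytes += 3; // the rest of the BMP, including each surrogate half
            }
        }
        return bytes;
    }

    /** Hypothetical helper: a string made of n copies of c. */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String s1 = repeat('d', 65_534);
        String s2 = repeat('\u81EA', 21_845); // U+81EA is the character "自"
        String s3 = "d" + s2;
        System.out.println(encodedLength(s1)); // prints 65534
        System.out.println(encodedLength(s2)); // prints 65535
        System.out.println(encodedLength(s3)); // prints 65536
    }
}
```

The strings are built at runtime here precisely because literals of this size could not all be written in source code.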
4. The class file's constant pool stores these strings as CONSTANT_Utf8_info entries with the following structure:
CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}
The length field is an unsigned 16‑bit integer (u2), so a constant‑pool string can hold at most 2^16‑1 = 65,535 bytes.
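The same 2‑byte length prefix can be observed from ordinary Java code, because DataOutputStream.writeUTF also writes modified UTF‑8 behind a 16‑bit length and throws UTFDataFormatException when the encoded form exceeds 65,535 bytes. The sketch below uses that method as a stand‑in for the class file writer, which is an assumption made for illustration; the class name and the helpers fitsInU2 and repeat are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class U2LengthLimit {

    /** True if s needs at most 65,535 bytes, i.e. its byte length fits in a u2 field. */
    static boolean fitsInU2(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s); // writes a 2-byte length, then the modified UTF-8 bytes
            return true;
        } catch (UTFDataFormatException e) {
            return false;    // encoded form is longer than 65,535 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical helper: a string made of n copies of c. */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fitsInU2(repeat('d', 65_535))); // prints true
        System.out.println(fitsInU2(repeat('d', 65_536))); // prints false
    }
}
```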
5. At runtime, a String is not subject to the constant‑pool limit. It can be created, for example, with the constructor that takes a char[], an offset, and a count. Because count is an int, it can be at most Integer.MAX_VALUE (2^31‑1), so the theoretical runtime limit is 2,147,483,647 characters. Real JVMs typically cap array sizes slightly below that value.
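As a small illustration, a string far longer than 65,535 characters can be created at runtime without any problem. This is a minimal sketch; the class name and the helper build are made up for the example, and 100,000 is an arbitrary length chosen to stay well within default heap sizes.

```java
import java.util.Arrays;

final class RuntimeStringLength {

    /** Builds a string of count copies of c via the String(char[], int, int) constructor. */
    static String build(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars, 0, count);
    }

    public static void main(String[] args) {
        String s = build('d', 100_000);
        System.out.println(s.length()); // prints 100000
    }
}
```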
6. In practice, available memory is the real constraint. Assuming each character occupies 2 bytes (UTF‑16), the largest possible string would require roughly:
(2^31‑1) × 2 bytes ≈ 4 GB of heap memory
If the JVM cannot allocate that much memory, an OutOfMemoryError is thrown.
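The arithmetic can be reproduced in code and compared with the heap the current JVM is allowed to use. This is a rough sketch: the class and method names are hypothetical, and the estimate covers only the character data, not object or array headers.

```java
final class MaxStringMemory {

    /** Bytes needed for the character data of a string of the given length at 2 bytes per char. */
    static long utf16Bytes(int length) {
        return 2L * length; // long arithmetic avoids int overflow
    }

    public static void main(String[] args) {
        long needed = utf16Bytes(Integer.MAX_VALUE);
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println(needed); // prints 4294967294
        System.out.println("Max heap of this JVM: " + maxHeap + " bytes");
        System.out.println("Could hold the character data: " + (maxHeap >= needed));
    }
}
```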
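7. Since JDK 9, the internal representation of String has been optimized (compact strings): the characters are stored in a byte array instead of a char array. A string that contains only Latin‑1 characters uses 1 byte per character, which halves memory consumption for pure ASCII text; any other string still uses 2 bytes per character.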
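The storage decision can be approximated as follows. This sketch does not inspect the JDK's internal fields; it only mirrors the rule described above, assuming compact strings are enabled (the default from JDK 9 on). The class and method names are hypothetical.

```java
final class CompactStringEstimate {

    /** True if every char is in U+0000..U+00FF, the condition for 1-byte-per-char storage. */
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Estimated size of the character data only, ignoring object and array headers. */
    static long estimatedDataBytes(String s) {
        return isLatin1(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        System.out.println(estimatedDataBytes("hello"));        // prints 5
        System.out.println(estimatedDataBytes("hello\u81EA")); // prints 12
    }
}
```

A single non‑Latin‑1 character is enough to switch the whole string to 2 bytes per character, which is why the second result is 12 rather than 7.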