String creation and char memory allocation

I have read many conflicting articles about memory allocation when a String is created. Some say that a new statement creates a String object on the heap while a string literal is created in the String pool [heap]; others say that the new statement creates one object on the heap and another object in the String pool.

To analyze this, I wrote the following program, which prints the hash code of a String's internal char array and the identity hash code of the String object itself:

    import java.lang.reflect.Field;

    public class StringAnalysis {

        private int showInternalCharArrayHashCode(String s)
                throws SecurityException, NoSuchFieldException,
                       IllegalArgumentException, IllegalAccessException {
            final Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);
            return value.get(s).hashCode();
        }

        public void printStringAnalysis(String s)
                throws SecurityException, IllegalArgumentException,
                       NoSuchFieldException, IllegalAccessException {
            System.out.println(showInternalCharArrayHashCode(s));
            System.out.println(System.identityHashCode(s));
        }

        public static void main(String[] args)
                throws SecurityException, IllegalArgumentException,
                       NoSuchFieldException, IllegalAccessException,
                       InterruptedException {
            StringAnalysis sa = new StringAnalysis();
            String s1 = new String("myTestString");
            String s2 = new String("myTestString");
            String s3 = s1.intern();
            String s4 = "myTestString";

            System.out.println("Analyse s1");
            sa.printStringAnalysis(s1);
            System.out.println("Analyse s2");
            sa.printStringAnalysis(s2);
            System.out.println("Analyse s3");
            sa.printStringAnalysis(s3);
            System.out.println("Analyse s4");
            sa.printStringAnalysis(s4);
        }
    }

This program prints the following result:

    Analyse s1
    1569228633
    778966024
    Analyse s2
    1569228633
    1021653256
    Analyse s3
    1569228633
    1794515827
    Analyse s4
    1569228633
    1794515827

From this output, it is clear that no matter how a String is created, strings with the same value use the same char array.

Now my question is: where is this char array stored? Is it on the heap, or does it go to permgen? I would also like to understand how to distinguish heap memory addresses from permgen memory addresses.

I have a big problem if it is stored in permgen, as it will eat up my precious, limited permgen space. And if the char array is stored not in permgen but on the heap, then String literals also use heap space [which is something I have never read].

3 answers

From this output, it is clear that no matter how a String is created, strings with the same value use the same char array

Not really: this happens because you start with one string literal and create multiple instances from it. In the OpenJDK (Sun / Oracle) implementation, new String(String) reuses the backing array when that array represents the entire string, and copies it only when the array is larger than the string it holds. You can see this in src.jar or here: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java#String.%3Cinit%3E%28java.lang.String%29

If you carefully construct the source strings so that they start from different character arrays, you will find that they do not share a common array.
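A minimal sketch of that experiment, under some assumptions: the class name BackingArrayCheck and its helper methods are made up for illustration, and the private field is assumed to be called value, as in the question. On newer JVMs the field may hold a byte[] or may not be reachable through reflection at all, in which case the sketch reports "unknown" instead of failing.

```java
import java.lang.reflect.Field;

final class BackingArrayCheck {

    // Returns the internal array of s, or null when it cannot be read
    // (missing field, security manager, or module encapsulation).
    static Object backingArray(String s) {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);
            return value.get(s);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }

    // "shared" if both strings point at the same internal array,
    // "separate" if they do not, "unknown" if reflection was refused.
    static String relation(String a, String b) {
        Object x = backingArray(a);
        Object y = backingArray(b);
        if (x == null || y == null) {
            return "unknown";
        }
        return x == y ? "shared" : "separate";
    }

    public static void main(String[] args) {
        String literal = "myTestString";
        String copy = new String(literal);                // built from a String
        String fresh = new String(literal.toCharArray()); // built from a new char[]

        System.out.println("literal vs copy:  " + relation(literal, copy));
        System.out.println("literal vs fresh: " + relation(literal, fresh));
    }
}
```

The String(char[]) constructor always copies its argument, so the string built from toCharArray() should never report "shared", whatever the first comparison prints on a given JVM.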

Now my question is: where is this char array stored

As far as I know, the character array for a string literal is stored on the heap (those who know more about class loading, feel free to comment). Strings loaded from files will always store their backing arrays on the heap.

I know for sure that the data structure used by intern() refers only to a String object and not to its array of characters.


From the String source:

    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

it is clear that the string created using this constructor shares the char array (value) with the original string.

It is important to note that the API does not guarantee this sharing:

Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.

For example, String.substring used to share the char array with the original string, but in recent versions of Java 1.7, String.substring makes a copy of the char array.


First of all: by definition, the literal "myTestString" is interned, and all interned String references with the same value refer to the same physical String object. Thus the String returned from intern() will be the EXACT SAME object as the literal.
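This can be checked with plain reference comparisons, without reflection. A small sketch using the same four strings as the question (the class name InternCheck and the helper sameObject are made up for illustration):

```java
final class InternCheck {

    // True only when both references point at the same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = new String("myTestString");
        String s2 = new String("myTestString");
        String s3 = s1.intern();
        String s4 = "myTestString";

        System.out.println("s1 == s2: " + sameObject(s1, s2)); // false: two separate objects
        System.out.println("s1 == s3: " + sameObject(s1, s3)); // false: intern() returns the pooled literal
        System.out.println("s3 == s4: " + sameObject(s3, s4)); // true: both are the interned literal
        System.out.println("s1.equals(s2): " + s1.equals(s2)); // true: same character sequence
    }
}
```

This matches the question's output, where s3 and s4 print the same identity hash code while s1 and s2 each print a different one.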

[Fixed] By definition, the hashCode (but not the identityHashCode) of two Strings with the same character sequence will be identical.

The hash code of a char[] array, on the other hand, is simply a mash of its address bits and has nothing to do with the contents of the array. Identical values therefore indicate that the value field refers to the same array in all of the cases above.

(Additional information: older String implementations held a pointer to a char[], an offset, a length, and a hashCode value. Newer implementations drop the offset, so the String value always starts at element 0 of the array. Other (non-Sun / non-Oracle) implementations eliminate the separate char[] array and store the String's bytes inside the main heap allocation. There is no requirement that a value field actually exist.)

[Continued] I copied the test case and added a few lines. hashCode and identityHashCode produce the same value for a given char[], but produce different values for different arrays with the same contents.

The arrays in s1 and s2 are the same almost certainly because both strings share the char[] array of the interned literal "myTestString". If the strings were separately constructed from "fresh" char[] arrays, the arrays would be different.

The main conclusion from all this is that String literals are interned, and the tested implementation "borrows" the source array when String is copied using new String(String) .

    Char array hash codes
    a1.hashCode() = 675303090
    a2.hashCode() = 367959235
    a1 identityHashCode = 675303090
    a2 identityHashCode = 367959235
    Strings from char arrays
    a1 String = ABCDE
    a1 String hash = 62061635
    a1 String value identityHashCode = 510044439
    a2 String = ABCDE
    a2 String hash = 62061635
    a2 String value identityHashCode = 1709651096
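The test code behind this output is not shown here. A rough sketch of the part that does not need reflection might look like the following; the class name ArrayHashDemo and the helper usesIdentityHash are made up for illustration, and the identity hash values will differ from run to run.

```java
import java.util.Arrays;

final class ArrayHashDemo {

    // True when the object's hashCode() is the default, identity-based one.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        char[] a1 = {'A', 'B', 'C', 'D', 'E'};
        char[] a2 = {'A', 'B', 'C', 'D', 'E'};

        System.out.println("Char array hash codes");
        // Arrays do not override hashCode(), so these match identityHashCode.
        System.out.println("a1.hashCode() = " + a1.hashCode());
        System.out.println("a2.hashCode() = " + a2.hashCode());
        System.out.println("a1 identityHashCode = " + System.identityHashCode(a1));
        System.out.println("a2 identityHashCode = " + System.identityHashCode(a2));
        System.out.println("a1 uses identity hash: " + usesIdentityHash(a1));
        // Content-based comparison has to go through java.util.Arrays.
        System.out.println("same contents: " + Arrays.equals(a1, a2));

        System.out.println("Strings from char arrays");
        String s1 = new String(a1);
        String s2 = new String(a2);
        // String.hashCode() depends only on the characters.
        System.out.println("a1 String hash = " + s1.hashCode());
        System.out.println("a2 String hash = " + s2.hashCode());
    }
}
```

Both String hash lines print 62061635, the value String.hashCode() computes for "ABCDE", which agrees with the output above.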