Is direct access to a String's backing array justified in some cases?

I am working on optimizing word processing software that uses the following class:

    class Sentence {
        private final char[] textArray;
        private final String textString;

        public Sentence(String text) {
            this.textArray = text.toCharArray();
            this.textString = text;
        }

        public String getString() {
            return textString;
        }

        public char[] getArray() {
            return textArray;
        }
    }

As you can see, there is some redundancy: the backing array of textString always holds the same characters as textArray, yet both are kept.

I hope to reduce the memory footprint of this class by getting rid of the textArray field.

There is one problem: this class is used widely throughout the codebase, so I cannot get rid of the getArray() method. My solution is to remove the textArray field and have getArray() return the backing array of textString, obtained via reflection.

The result will be something like this:

    class Sentence {
        private final String textString;

        public Sentence(String text) {
            this.textString = text;
        }

        public String getString() {
            return textString;
        }

        public char[] getArray() {
            return getBackingArrayUsingReflection(textString);
        }
    }

This seems like a viable solution, but I suspect that String's backing array is private for a reason. What are the potential problems with this approach?

+4
5 answers

One thing that will happen is that you tie your code to one specific JDK implementation. For example, Java 7 Update 6 completely changed how String uses its char[]. That is why such an approach is acceptable only if your code is very short-lived, basically throwaway code.

If you only read from the char[] and you code against OpenJDK Java 7 Update 6, you will not introduce any bugs.

On the other hand, 95% of Java programmers around the world will probably scratch their heads in bewilderment at code that reflects into the internals of String, so be careful :)
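To see the dependency on the JDK for yourself, the sketch below only inspects the declared type of String's private field without reading it. The class name StringLayoutProbe is made up for illustration, and the field name "value" is itself an OpenJDK implementation detail, which is exactly the kind of assumption the reflective approach bakes in.

```java
import java.lang.reflect.Field;

// Hypothetical helper for illustration; not part of any library.
final class StringLayoutProbe {

    // Reports the declared type of String's private "value" field on the running JVM.
    // Assumption: the field is called "value", which no specification guarantees.
    static String describeValueField() {
        try {
            Field value = String.class.getDeclaredField("value");
            return value.getType().getSimpleName();
        } catch (NoSuchFieldException | SecurityException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println(System.getProperty("java.version") + ": " + describeValueField());
    }
}
```

On OpenJDK this typically reports char[] up to Java 8 and byte[] from Java 9 onwards, so a getArray() that casts the field to char[] would stop working after a JDK upgrade.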

+4

Depending on the version of java.lang.String (Java 7 Update 5 and earlier), a backing array is used together with a start index ( offset ) and a length ( count ) of the actual string within this array. In these Java implementations, the backing array may be (substantially) longer than the actual string, and the string does not necessarily begin at the start of the array.

For example, if you use substring , the backing array of the result may be the very same array as that of the original string, only with a different start index and character count. Therefore, using reflection to return the String's backing array does not work in all cases (or rather: it leads to incorrect / unexpected behavior).

See, for example, the source at http://www.docjar.com/html/api/java/lang/String.java.html where String substring(int beginIndex, int endIndex) at line 1950 (and below) invokes the String(int offset, int count, char value[]) constructor at line 645 (and below). Here, the char[] is used directly as the backing array, and offset and count serve as the offset into the array and the string length:

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > count) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        if (beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
        }
        return ((beginIndex == 0) && (endIndex == count)) ? this :
            new String(offset + beginIndex, endIndex - beginIndex, value);
    }

    // Package private constructor which shares value array for speed.
    String(int offset, int count, char value[]) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

As Marco Topolnik points out, this is no longer the case in later versions of Java 7. You should not depend on Java implementation details, especially since they can change significantly between versions, as shown here.
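To make the offset/count problem concrete, here is a minimal model of the old layout. SharedString is a hypothetical stand-in written for this illustration, not the JDK class; it only mimics the array sharing described above so that the effect can be reproduced on any Java version.

```java
import java.util.Arrays;

// Hypothetical stand-in for the pre-7u6 String layout; not the JDK class.
final class SharedString {
    private final char[] value; // backing array, possibly shared with other instances
    private final int offset;   // index of this string's first character in value
    private final int count;    // number of characters in this string

    SharedString(String text) {
        this(0, text.length(), text.toCharArray());
    }

    // Shares the array instead of copying it, like the old package-private constructor.
    private SharedString(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(offset + beginIndex, endIndex - beginIndex, value);
    }

    // What a reflective read of the "value" field would hand back.
    char[] backingArray() {
        return value;
    }

    // What callers of getArray() actually expect: only this string's characters.
    char[] toCharArray() {
        return Arrays.copyOfRange(value, offset, offset + count);
    }
}

final class SharedStringDemo {
    public static void main(String[] args) {
        SharedString world = new SharedString("Hello World").substring(6, 11);
        System.out.println(new String(world.backingArray())); // Hello World
        System.out.println(new String(world.toCharArray()));  // World
    }
}
```

In this model a getArray() built on the raw field would return all eleven characters for a five-character string, which is the incorrect behavior the answer warns about.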

+3

If you need speed, use String.charAt(i) , which will be inlined and avoids any problems when the internals change. You can use CharSequence if you want to avoid creating a String from a StringBuilder, since both implement this interface.
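As a rough sketch of that suggestion, the method below walks the text with charAt and accepts any CharSequence. The class and method names are invented for the example; the real callers of getArray() may of course need something different.

```java
// Hypothetical example class; the names are illustrative only.
final class CharCounter {

    // Counts occurrences of target using charAt; works for String and StringBuilder alike.
    static int count(CharSequence text, char target) {
        int hits = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello World", 'o'));                    // 2
        System.out.println(count(new StringBuilder("Hello World"), 'l')); // 3
    }
}
```

No char[] is allocated or exposed here, and no String has to be built from the StringBuilder first.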

+1

For fun and games, run the following unit test:

    import java.lang.reflect.Field;
    import java.util.Arrays;

    import org.junit.Assert;
    import org.junit.Test;

    // Assumes a JDK where String keeps its characters in a private char[] field
    // named "value" (Java 8 and earlier); on newer JDKs the reflective read fails.
    public class StringTest {
        private String text;

        public StringTest() {
            super();
        }

        // Returns the String's internal array itself, not a copy.
        public char[] getBackingArray() {
            if (text == null) {
                return null;
            }
            try {
                final Field valueField = text.getClass().getDeclaredField("value");
                valueField.setAccessible(true);
                final char[] data = (char[]) valueField.get(text);
                return data;
            } catch (final Exception e) {
                e.printStackTrace();
            }
            return null;
        }

        public String getText() {
            return text;
        }

        public void setText(String text) {
            this.text = text;
        }

        @Test
        public void testStringFunManipulation() {
            final StringTest test = new StringTest();
            test.setText("Hello World");
            Assert.assertNotNull(test);
            System.out.println("Original String: " + test);
            System.out.println("Original String Hash: " + test.getText().hashCode());
            char[] data = test.getBackingArray();
            Assert.assertNotNull(data);
            System.out.println("Backing Array: " + Arrays.toString(data));
            // Writing through the backing array mutates the "immutable" String.
            data[0] = 'J';
            System.out.println("Modified String: " + test);
            // The hash was cached by the earlier hashCode() call, so it is now stale.
            System.out.println("Modified String Hash: " + test.getText().hashCode());
            System.out.println("Modified String Hash Should be: " + "Jello World".hashCode());
        }

        @Override
        public String toString() {
            return text != null ? text.toString() : "";
        }
    }

It should show you why exposing the internal, private values of a class is a poor idea.

+1

You can change the implementation of getArray as follows:

    public char[] getArray() {
        return this.textString.toCharArray();
    }
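Here is a small sketch of how the class from the question might look with this change, plus a check of what callers would observe. SentenceDemo is a made-up driver for illustration, and the class is kept package-private only to keep the example in one file.

```java
// Variant of the question's Sentence class without the textArray field.
final class Sentence {
    private final String textString;

    Sentence(String text) {
        this.textString = text;
    }

    String getString() {
        return textString;
    }

    // toCharArray() allocates a fresh copy on every call.
    char[] getArray() {
        return textString.toCharArray();
    }
}

// Hypothetical driver showing the observable behavior.
final class SentenceDemo {
    public static void main(String[] args) {
        Sentence sentence = new Sentence("Hello World");
        char[] copy = sentence.getArray();
        copy[0] = 'J';                                   // modifies only the copy
        System.out.println(sentence.getString());        // Hello World
        System.out.println(sentence.getArray() == copy); // false
    }
}
```

The trade-off is that each call now costs an allocation and a copy, and callers that relied on getArray() returning the same array instance every time would see different behavior.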
0