I wrote this article several years ago, and the intended JVM target is Java 8. Since then, the JVM has advanced in many areas, so the code in the article may need to be modified in order to work. However, the code is for demonstration purposes only, and the idea of the post IMHO is still as valid as it was when the original text was written.
The Java ecosystem is vast, and a lot of things happen all the time. It is sometimes difficult to stay up to date and not forget the things one doesn’t use every day. To check myself, I read interview-question blogs and groups from time to time. Sometimes I encounter things that I forgot, sometimes I learn something new, and sometimes I see answers that, well, IMHO are imprecise, incomplete, or incorrect.
Some time ago I encountered the question “Why is the Java String final and immutable?”. The suggested reasons were thread safety, performance, and security.
I totally agree with the first reason: it would be a nightmare if anything as heavily used as String required some sort of synchronization for safe use in multi-threaded code.
I’m not sure about the second reason. “Java Concurrency in Practice” mentions that immutable objects do have some performance advantage, but “Java Performance: The Definitive Guide” points out that this is simply not true on a modern JVM. Both sources are considered very solid, and I don’t know JVM internals well enough to decide for myself. Perhaps there is no immediate advantage to immutability: the JIT generates the same code, and the interpreter works in the same manner. However, one case where immutability is certainly advantageous is garbage collection. An immutable container can only reference objects that already existed when it was created, so it is at least as “young” as its youngest member, and I can easily imagine situations in which the GC just skips an entire container of strings in a minor collection. And of course, fewer objects to scan means less time. Also, the String class has been around from the very beginning of Java, so perhaps at some point there WAS a performance penalty for not being immutable, and String was designed with this in mind. Then the JVM was improved, but there was no reason to change the String class. So let’s say I think this point is valid.
But the third reason, that the String class was designed immutable and final for security, seems to me absolutely wrong. The example in the answer was loading a class by name: the name is a string, so if a hacker can change it just before the class loader loads the bytecode, that is a security breach. Well, IMHO the reasoning is flawed, because if a hacker can run arbitrary code in your JVM, you probably have bigger problems than the ones string immutability is “protecting” you from. On the other hand, if you want to shoot yourself in the foot, who can prevent you from doing it? After all, it is YOUR JVM.
With this feeling, I asked myself a question arising from the “security hypothesis”: does the immutable design of String really mean that the contents of a String object can’t be replaced after the string is created? The answer, as you can see below, is that of course they can.
First of all, anything can be changed via JNI. Once you have your C code called, you can break things in more ways than I can count. So let’s put this aside and modify the contents of a string using only pure Java. It’s not hard to do. Below is a proof of concept:
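import java.lang.reflect.Field;

public class StringModifier {
    public static void main(String[] str) {
        try {
            String test = "aaaa";
            String test2 = test;                            // alias of the same object
            String test3 = new String(test);                // new object sharing the backing array (Java 8)
            String test4 = new String(test.toCharArray());  // new object with its own copy of the array
            // Reach into the private backing array of the String (a char[] in Java 8)
            Field values = String.class.getDeclaredField("value");
            values.setAccessible(true);
            char[] ref = (char[]) values.get(test);
            ref[0] = 'b';
            System.out.println(test + " " + test2 + " " + test3 + " " + test4);
        } catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException ex) {
            ex.printStackTrace(); // don't swallow reflection failures silently
        }
    }
}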
The output is:
baaa baaa baaa aaaa
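So it takes only 4 lines of code to modify the contents of a string. It makes no difference whether the string is in the constant pool or allocated on the heap: all aliases “see” the change in the referenced object. Only test4, which was built from a copy of the character array, keeps the original value.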
Several questions arise from this “discovery”:
Is it possible to use this? I think sometimes yes. E.g. there are situations in which sensitive information is stored in strings. That is most definitely bad practice, exactly because the information stays around after it is no longer needed. Only the garbage collector decides when the memory of an unreferenced object is reclaimed, so your precious info may be around a little longer than you think. But some APIs require passwords as strings in method arguments, sometimes the parameters are passed from the environment, etc. When the use of a string is imposed like this, one can clean up sensitive data by modifying the string contents. There are probably other scenarios in which it is desirable to modify a string.
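For comparison, when an API does accept it, the usual alternative to scrubbing a String is to keep the secret in a char[] and overwrite the array as soon as it is no longer needed. Below is a minimal sketch of that idea; the class name SecretWiper and the method wipe are made up for illustration and are not part of any library.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only: the char[] alternative
// to keeping a secret in a String.
final class SecretWiper {

    // Overwrites every character so the secret no longer sits in this array.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        try {
            // ... hand the array to an API that accepts char[] ...
            System.out.println("length while in use: " + password.length);
        } finally {
            wipe(password); // runs even if the code above throws
        }
        System.out.println("first char after wipe is NUL: " + (password[0] == '\0'));
    }
}
```

Overwriting the array clears only this one copy; any String that was built from it elsewhere is not affected.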
Another note is about the scope of the code: as of today, AFAIK, there is a discussion about changing the way String stores its value. A lot of string usage scenarios don’t require Unicode, so the suggestion is to use one byte per character in those situations. It is quite possible that this change to the String class will be introduced in Java 9, so the above code won’t work as is. But it is probably not too hard to adapt it to the new design.
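To give a rough idea of what such an adaptation might look like, here is a sketch that handles both layouts: the char[] of Java 8 and a byte[] holding one byte per character. It assumes the field is still called value, which is an implementation detail and not an API. The names StringModifierSketch and replaceFirstChar are hypothetical. Newer JDKs (16 and later, as far as I know) refuse this kind of reflective access unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED, so the sketch reports failure instead of assuming success.

```java
import java.lang.reflect.Field;

// Hypothetical adaptation of the proof of concept; all names are illustrative.
final class StringModifierSketch {

    // Tries to overwrite the first character of s in place.
    // Returns true on success, false if the layout is unexpected
    // or the JVM refuses reflective access.
    static boolean replaceFirstChar(String s, char c) {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true); // may throw on newer JDKs unless java.lang is opened
            Object storage = value.get(s);
            if (storage instanceof char[]) { // Java 8 layout
                ((char[]) storage)[0] = c;
                return true;
            }
            if (storage instanceof byte[]) {
                byte[] bytes = (byte[]) storage;
                // Assumption: one byte per character, so only handle that case
                // and only characters that fit into a single byte.
                if (bytes.length == s.length() && c <= 0xFF) {
                    bytes[0] = (byte) c;
                    return true;
                }
            }
            return false;
        } catch (ReflectiveOperationException | RuntimeException ex) {
            return false; // field missing, access denied, or empty string
        }
    }

    public static void main(String[] args) {
        // Built from a fresh array so that no string literal is affected.
        String test = new String(new char[] {'a', 'a', 'a', 'a'});
        boolean changed = replaceFirstChar(test, 'b');
        System.out.println(changed ? "modified: " + test : "reflective access was refused");
    }
}
```

The string in main is deliberately not a literal: modifying a literal would change every use of that literal in the running JVM.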
The string can be modified, so how do those changes manifest in a multi-threaded environment? After all, the most important reason (IMHO) for string immutability is correct and predictable behaviour in multi-threaded code. Is it possible for different threads to see different values of the same string object? Well, I believe that is actually the case. If one doesn’t stay within the “normal” API, one shouldn’t expect the regular JVM guarantees to hold. I played a bit with concurrent access to and modification of strings by several threads, but wasn’t able to get a discrepancy. I believe that is just me not writing good enough tests.
That’s all for my thoughts on modifying immutable strings in Java. I hope you enjoyed reading.