What does it take to make a Java class immutable? And what does it mean, actually?

One can find a lot of publications, blog posts, articles etc. about Java class immutability, sometimes suggesting conflicting requirements for a class to be immutable. In this post I’ll try to clarify some points by looking at immutability from the JVM point of view. I intentionally leave out the importance of immutable objects to system design or code optimisations and concentrate only on the multi-threading aspects of immutability.

Let’s informally say that a class is immutable if the internal state of its objects can’t be affected after object construction has finished.
That includes the following:

  • there are no methods in the class interface (including methods inherited from a superclass) that directly or indirectly change the internal state according to their arguments
  • there are no instance fields that can be directly modified from outside
  • if the class has a field containing a reference to an object, then either the referenced object is immutable itself or the reference can’t be obtained from outside of the class
  • references to the object can’t be obtained outside the class before the constructor finishes

The Java memory model (part of the Java Language Specification) assures two things if the object is immutable:

  1. one can access the object from any thread without synchronisation
  2. objects of an immutable class don’t require safe publication (a reference is safely published by initialising it from a static initializer, by storing it into a volatile field, a final field or an AtomicReference, or by storing it into a field guarded by an intrinsic lock)
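
For reference, the safe-publication idioms listed in item 2 could look roughly like this. It is only a sketch: the Config and Holder classes and all field names are hypothetical, invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical mutable class: its field is not final, so it needs safe publication.
class Config {
    int timeout;

    Config(int timeout) { this.timeout = timeout; }
}

class Holder {
    // 1. Reference initialised from a static initializer.
    static final Config FROM_STATIC_INIT = new Config(10);

    // 2. Reference stored into a volatile field.
    volatile Config viaVolatile;

    // 3. Reference stored into a final field of a properly constructed object.
    final Config viaFinal = new Config(30);

    // 4. Reference stored into an AtomicReference.
    final AtomicReference<Config> viaAtomic = new AtomicReference<>();

    // 5. Reference stored into a field guarded by an intrinsic lock.
    private Config guarded;

    synchronized void setGuarded(Config c) { guarded = c; }

    synchronized Config getGuarded() { return guarded; }
}
```

A mutable object such as Config needs one of these idioms when it is handed to another thread; the claim of item 2 is that an immutable object does not.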

Let’s look at the first item:

To understand it, let me remind the reader why a discrepancy is possible at all. To be efficient, the JVM may keep copies of the same object for different threads (conceptually, in each thread’s local memory). Many modern architectures are NUMA, that is, different cores have different access times to different regions of memory. Threads write to their copies of the object, and at some point the JVM propagates the changes to other threads. Without the JVM and the compiler taking special care, it is possible that one thread doesn’t observe changes made by others or, even worse, observes only partial changes. The Java memory model (JMM) specifies the conditions under which copies are guaranteed to be up to date.

For a particular kind of class, called effectively immutable, the JMM guarantees that any thread observes correct values of the fields of objects of the class. An object is effectively immutable if it is fully constructed by the time any thread accesses it, and its state doesn’t change after construction.

Let’s see why this happens. After the object has been constructed by the JVM, any thread that accesses it for the first time copies the state of the object to its local memory and then works with this copy. But if the object never changes, all local copies contain identical information. That is the reason no synchronization is needed: the local copy is always up to date. Note that the JVM doesn’t do anything special; it is not even aware of the “immutability” of the object.

With this in mind, one easily understands the requirement that the reference must not escape before the constructor finishes for a class to be effectively immutable. If the constructor hasn’t finished, there is no guarantee that a newly obtained local copy doesn’t contain only partially initialized data.
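
The escaping reference can be made visible even in a single thread. In the following sketch (all class names are hypothetical), the constructor of Leaky hands out this before its field is assigned, so the registry observes the default value 0. NotLeaky avoids the problem with a private constructor and a factory method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry that remembers every object handed to it.
class Registry {
    static final List<Leaky> seen = new ArrayList<>();
    static int valueSeenDuringConstruction = -1;

    static void register(Leaky l) {
        seen.add(l);
        // Reads the field while the constructor of Leaky is still running.
        valueSeenDuringConstruction = l.value;
    }
}

class Leaky {
    final int value;

    Leaky(int value) {
        Registry.register(this); // 'this' escapes before the field is assigned
        this.value = value;
    }
}

class NotLeaky {
    final int value;

    private NotLeaky(int value) { this.value = value; }

    // The reference is handed out only after construction is complete.
    static NotLeaky create(int value) {
        return new NotLeaky(value);
    }
}
```

After new Leaky(42), Registry.valueSeenDuringConstruction holds 0 rather than 42. With several threads the situation is worse, because another thread could pick up the half-built object from the registry at any time.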

For an immutable class there is no way to modify the local copy of the data, by the definition of immutability. Hence an immutable class is always effectively immutable. So the copies of the object maintained by all threads are identical, and this comes “for free” from the JVM point of view.

An immutable object doesn’t require safe publication. This means the JVM must ensure that any thread that obtains a reference to the object sees it correctly (for example, it must refrain from certain access optimisations that it is allowed to perform on “regular” object references). The JVM does additional work for this, and it must decide whether to apply the additional guarantees and prevent some optimisations or not.

Many places on the net state that a class must be final in order to be immutable, but this is not true. For example, BigInteger is immutable and not final, as one can see from its documentation. Once again, if one thinks about the matter from the JVM “point of view”, the absence of this requirement is quite clear. When an object is constructed, the JVM always knows its exact class, so it doesn’t matter that there may be subclasses that are not immutable. At the point of construction the compiler knows whether the object is immutable or not. At other points in the code it can’t determine this for sure, e.g. when the argument of a method is declared with such an immutable but non-final class. In that case the compiler can’t assume that the object is indeed immutable and can’t perform optimisations that would be possible if the class were final. But construction is not such a case, and the compiler is able to produce code accordingly (or perhaps the JVM can take this into account).
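
The method-argument case can be illustrated with BigInteger itself. The sketch below uses hypothetical names (SneakyBigInteger, Defensive, safeCopy): it defines a mutable subclass and shows one common way for a method to protect itself, namely checking the exact class and copying the value when the argument is not a plain BigInteger.

```java
import java.math.BigInteger;

// Hypothetical mutable subclass of the non-final BigInteger.
class SneakyBigInteger extends BigInteger {
    int override; // mutable state added by the subclass

    SneakyBigInteger(String val) { super(val); }

    @Override
    public int intValue() { return override; }
}

class Defensive {
    // Returns a value whose exact class is BigInteger.
    static BigInteger safeCopy(BigInteger val) {
        return val.getClass() == BigInteger.class
                ? val
                : new BigInteger(val.toByteArray());
    }
}
```

An object created with new BigInteger(...) is immutable regardless of what subclasses exist, but a parameter of type BigInteger may refer to something like SneakyBigInteger, which is why the check uses getClass() rather than instanceof.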

An immutable class doesn’t have to be final, but what rules does one have to follow so that the compiler recognizes the class as immutable? Once again, from the JVM point of view, immutability means that there is no way to affect the state of objects of the class after construction. That is, the contents of the memory representing the object and all the objects it references can’t be changed.

To achieve this, one has to follow these rules:

  • making all non-private fields final
  • ensuring there is no method in the class that changes the object’s fields directly or indirectly
  • ensuring that referenced objects can’t be modified from outside, e.g. by making sure they are immutable themselves
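
A class that follows these three rules might look like the sketch below. The Playlist class and its members are hypothetical, and List.copyOf assumes Java 10 or later.

```java
import java.util.List;

// Hypothetical example class; the names are invented for this illustration.
class Playlist {
    public final String name;          // non-private field is final; String is immutable
    private final List<String> tracks; // unmodifiable copy, never shared with the caller

    Playlist(String name, List<String> tracks) {
        this.name = name;
        this.tracks = List.copyOf(tracks); // defensive, unmodifiable copy
    }

    // Safe to return: the list cannot be modified through this reference.
    List<String> tracks() { return tracks; }

    // "Modification" produces a new object instead of changing this one.
    Playlist renamed(String newName) { return new Playlist(newName, tracks); }
}
```

Changing the list that was passed to the constructor has no effect on the Playlist, and calling add on the list returned by tracks() throws UnsupportedOperationException.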

Let’s take a look at some examples in order to get a taste of what can’t be done if one wants to make a class immutable:

public class Mutable1 {
 // ...
 int samePackageAccessible;
 // ...
}

Even if a member is not public, the possibility of modification (by code of other classes in the same package) disqualifies the class from being immutable. Note that it is not important whether other classes of the package actually modify samePackageAccessible at the time of compilation. Such code can be added later.

public class Mutable2 {

   private int mutable;

   private void someMethod(int newVal) {
      // ...
      mutable = newVal;
      // ...
   }

   public void someOtherMethod(int arg) {
      // ...
      if (arg > 0) {        // stands for any condition
         someMethod(arg);   // stands for any value that depends on the arguments
      }
      // ...
   }

}

The field mutable is not accessible directly from outside, but it can be changed by calling someOtherMethod.

public class MutableClass {
   int  mutable;
}

public class Mutable3 {
  public final MutableClass myFinal = new MutableClass();
  // ...
}

Mutable3 is disqualified from being immutable even though all its non-private members are final, because one can change its state by modifying the object that myFinal refers to:

Mutable3 myObj = new Mutable3();
myObj.myFinal.mutable = ...

public class MutableClass {
   int mutable;
}

public class Mutable4 {
  private final MutableClass myPrivate;

  // there are no methods that modify myPrivate

  public Mutable4(MutableClass myPrivate) {
    // ...
    this.myPrivate = myPrivate;
  }

}

Mutable4 is disqualified from being immutable, because it is possible that an outside object retains a reference to the myPrivate object passed to the constructor and can modify its state.
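
The aliasing problem, and the usual way around it (a defensive copy in the constructor), can be sketched as follows. MutableValue has the same shape as MutableClass above, with constructors added for this sketch; Aliased and Copied are hypothetical names.

```java
// Same shape as MutableClass above, plus constructors added for this sketch.
class MutableValue {
    int mutable;

    MutableValue(int mutable) { this.mutable = mutable; }

    MutableValue(MutableValue other) { this.mutable = other.mutable; }
}

// Stores the caller's object directly, so the caller can still change it.
class Aliased {
    private final MutableValue myPrivate;

    Aliased(MutableValue myPrivate) { this.myPrivate = myPrivate; }

    int value() { return myPrivate.mutable; }
}

// Stores a private copy, so no outside reference to the internal state exists.
class Copied {
    private final MutableValue myPrivate;

    Copied(MutableValue myPrivate) { this.myPrivate = new MutableValue(myPrivate); }

    int value() { return myPrivate.mutable; }
}
```

If the caller keeps the MutableValue it passed in and later changes its field, an Aliased object reports the new value while a Copied object still reports the original one.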

import java.util.Random;

public class Mutable5 {
   private Random r = new Random();
   private int mutable;

   private void someMethod() {
     mutable = r.nextInt();
   }

   // ...
   public void someOtherMethod() {
     // ...
     someMethod();
     // ...
   }

}

The Mutable5 class is disqualified from being immutable even though the value of mutable doesn’t depend on any input from outside of the class. Calling someOtherMethod still changes the internal state of the object.

At this point it seems that an immutable class just can’t contain mutable fields, that all its fields are final, and that all referenced objects are immutable…

Believe it or not, an immutable object can contain mutable fields that change with time! But every access to such fields must yield the same result. If this can be assured, the class is still immutable.

Let me explain with an example. String is an immutable class. But for performance reasons it doesn’t compute its hash code at construction time. It contains the field

private int hash; // defaults to 0

and

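// simplified version of the method
public int hashCode() {
    int h = hash;   // read the field once and work with the local copy
    if (h == 0) {
        // h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
        // where s is the array of chars representing the string,
        // n is its length and ^ denotes exponentiation
        for (int i = 0; i < s.length; i++) {
            h = 31 * h + s[i];
        }
        hash = h;
    }
    return h;
}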

What is extremely important is that the computation of hash depends only on the final array of characters, which doesn’t change after the string is constructed. This way hash can take only two values: 0 and the result of a computation that depends only on unchangeable fields of the String object. Any thread that finds hash equal to 0 recomputes it and arrives at the same value, so every caller of hashCode gets the result of the same deterministic computation over internal unchangeable data of the class.

It is also important that the type of the field is not long or double. Writes to non-volatile fields of those types are not guaranteed to be atomic. If hash were a long, a data race would be possible: the thread computing the hash code writes only part of the value, and another thread reads this partial data.

As one can see, pulling off the trick of having mutable fields in an immutable class requires subtle reasoning and is very error prone. Don’t do it unless you really have to.
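
The same trick applied to a user-defined class might look like the sketch below. The Name class is hypothetical. One detail is easy to miss: the method reads the hash field exactly once into a local variable and returns that local. As far as I understand the memory model, reading the non-volatile field twice would allow the second read to return 0 even if the first one saw the computed value.

```java
// Hypothetical immutable class that caches its hash code lazily.
final class Name {
    private final String first;
    private final String last;
    private int hash; // not final; 0 means "not computed yet"

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public int hashCode() {
        int h = hash;   // read the shared field exactly once
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;   // benign race: every thread computes the same value
        }
        return h;       // return the local, never re-read the field
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Name)) return false;
        Name other = (Name) o;
        return first.equals(other.first) && last.equals(other.last);
    }
}
```

If the computed value happens to be 0, it is simply recomputed on every call, which costs time but does not affect correctness.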

So, let me summarize the post:

  • an immutable class can define public fields, as long as they are final
  • an immutable class is not required to be final
  • if there is any possibility of changing the internal state, including the state of objects referenced by objects of the class, or of calling a method that (directly or through other methods) changes the internal state, the class is not immutable. This holds even if at compile time no other code calls this method or holds a reference to an object that is part of the internal state
  • an immutable class can even contain some private mutable fields, but the class must behave as if those fields always contain the same value. Creating an immutable class with such fields is very tricky and error prone
