Sunday, August 26, 2012

Java Concurrency in Practice - Summary - Part 2

NOTE: These summaries are NOT meant to replace the book.  I highly recommend buying your own copy of the book if you haven't already read it. 

Chapter 3 - Sharing Objects
  1. Synchronization (for example, by using synchronized blocks) is not just about atomic execution of code blocks; it also affects memory visibility, i.e., it ensures that a thread can see the changes made by another thread.  Without synchronization, the Java memory model does not guarantee that a value written by one thread will be seen by another thread on a timely basis, or even at all.  For example, if proper synchronization is not used, a thread X that relies on a control variable set by thread Y may NEVER see the updates written by thread Y, in which case thread X can loop forever.
  2. If synchronization is not used, reordering of operations by the compiler, the JVM, or the processor (done to improve performance) can cause a thread to see a stale or partially updated value written by another thread.  If a thread first sets variable x to 1 and then y to 2, another thread may see y set to 2 while x is still unset.  This can lead to bugs that are very hard to debug.
  3. Always use synchronization whenever mutable data is shared across threads.
  4. Out-of-thin-air safety - A thread always sees a value of a variable that was actually written by some thread, not some random value pulled out of thin air.  Unless declared volatile, 64-bit numeric variables (long and double) do not have out-of-thin-air safety, because the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations.
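
Points 1 to 4 can be illustrated with a small sketch. The class and method names below (SynchronizedCounter, countWithThreads) are my own and are not from the book; the idea is simply that both the read and the write of a shared long go through the same lock, which gives visibility, atomicity of the increment, and protection against torn 64-bit reads.

```java
// Hypothetical example class; not taken from the book.
class SynchronizedCounter {
    private long value; // guarded by the intrinsic lock on 'this'

    // Reads must be synchronized too, otherwise a reader may see a stale value.
    synchronized long get() {
        return value;
    }

    // value++ is a read-modify-write, so it must run under the lock to be atomic.
    synchronized void increment() {
        value++;
    }

    // Demo helper: starts several threads that each increment a shared counter,
    // waits for them, and returns the final count (or -1 if interrupted).
    static long countWithThreads(int threads, int incrementsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return counter.get();
    }
}
```

If the synchronized keyword were removed from either method, the final count would no longer be guaranteed.
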
  5. Volatile variables provide a weaker form of synchronization.  
    1. Volatile variables are treated specially by the compiler and runtime (for example, they are not cached in registers), so a read of a volatile variable always returns the most recent value written by any thread.
    2. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables visible to A prior to writing the volatile variable become visible to B after reading the volatile variable.
    3. Don't overuse volatile, and don't rely on it in tricky ways.  Synchronized blocks are still necessary for compound atomic operations such as incrementing a counter.
    4. Volatile is commonly used for a control variable that determines when a thread should exit an infinite loop.
    5. Locking guarantees visibility and atomicity; volatile variables guarantee only visibility.
    6. Use volatile variables only when all the following conditions are satisfied:
      1. Writes to the variable do not depend on its current value, or it is guaranteed that only a single thread ever writes to the variable.
      2. The variable does not participate in invariants with other state variables.
      3. Locking is not required for any other reason while the variable is being accessed.
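
A minimal sketch of the control-variable use of volatile described above. StoppableWorker, requestStop and startAndStop are hypothetical names chosen for illustration. The flag satisfies the three conditions: the write does not depend on the current value, the flag takes part in no invariant, and no lock is needed for anything else.

```java
// Hypothetical example class; not taken from the book.
class StoppableWorker implements Runnable {
    // volatile ensures the write in requestStop() becomes visible to the loop in run().
    private volatile boolean stopRequested = false;

    @Override
    public void run() {
        while (!stopRequested) {
            Thread.yield(); // stand-in for real work
        }
    }

    // May be called from any thread.
    public void requestStop() {
        stopRequested = true;
    }

    // Demo helper: starts a worker, asks it to stop, and reports whether
    // the worker thread terminated within the given timeout.
    static boolean startAndStop(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }
}
```

Without volatile on the flag, the JVM would be free to hoist the read out of the loop, and the worker might never stop.
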
  6. For server applications, always specify the -server JVM command line argument even while developing and testing, since the JVM does more drastic optimizations in server mode. Some concurrency bugs arise only under these optimizations.
  7. Publishing an object means making it available to code outside of its current scope.
    1. This can be done by:
      1. storing a reference to it somewhere where other code can find it, say a public static field or in a publicly accessible HashMap.
      2. returning it from a non-private method.
      3. passing it to an alien method
        1. a method in other classes
        2. an overridable method in the same class
      4. publishing an inner class instance (this automatically exposes the enclosing instance)
    2. Sometimes we do not want to publish an object since that will break encapsulation.  An object that is published when it should not have been is said to have escaped.
    3. Do not allow the this reference to escape during construction.  This commonly happens when the constructor registers an inner class instance with external event listeners or starts a thread.  Even if the registration or thread start is the last statement in the constructor, a reference to the object may escape before it is fully constructed.  Other threads can then see the partially constructed object and react incorrectly.  Use a separate start() method to start a thread created in the constructor, or to register event listeners created in the constructor.  Alternatively, to do it in one step, use a newInstance() factory method that calls a private constructor and then calls start() before returning the newly created object.
    4. Calling an overridable instance method (one that is neither private nor final) from the constructor also allows this to escape before the object is fully constructed.
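
The factory-method approach might look roughly like the sketch below. EventSource is a hypothetical stand-in for whatever external object accepts listeners, and the other names are also my own. The private constructor only creates the listener; registration happens in newInstance() after construction has completed.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical event source standing in for any external listener registry.
class EventSource {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    void register(Runnable listener) {
        listeners.add(listener);
    }

    void fire() {
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    int listenerCount() {
        return listeners.size();
    }
}

class SafeListener {
    private final AtomicInteger eventsSeen = new AtomicInteger();
    private final Runnable listener;

    // Private constructor: builds the listener but hands it to nobody,
    // so 'this' cannot escape while the object is still being constructed.
    private SafeListener() {
        listener = this::onEvent;
    }

    private void onEvent() {
        eventsSeen.incrementAndGet();
    }

    // Factory method: registration happens only after the constructor has returned.
    static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.register(safe.listener);
        return safe;
    }

    int getEventsSeen() {
        return eventsSeen.get();
    }
}
```
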
  8. Thread confinement, i.e., making sure that data is accessed from only one thread, is one of the simplest ways to achieve thread safety.
    1. Swing UI framework & JDBC connection objects use thread confinement extensively.
    2. Thread confinement options:
      1. Ad-hoc thread confinement - programmer entirely responsible to confine object to thread - no language features used. Not recommended due to fragility.
      2. Stack confinement - Object can be reached only through local variables
        1. Primitive local variables are always stack confined, since there is no way to obtain a reference to them.
        2. Care should be taken that object references do not escape.
      3. ThreadLocal - provides get and set methods that maintain a separate copy of a value for each thread. Used as: new ThreadLocal<T>() { public T initialValue() {...}}
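
A small sketch of the ThreadLocal option, assuming each thread simply needs its own integer id. PerThreadId and its methods are hypothetical names used only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; not taken from the book.
class PerThreadId {
    private static final AtomicInteger nextId = new AtomicInteger(1);

    // Each thread gets its own copy, created lazily by initialValue()
    // the first time that thread calls get().
    private static final ThreadLocal<Integer> threadId = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return nextId.getAndIncrement();
        }
    };

    // Returns the id belonging to the calling thread.
    static int get() {
        return threadId.get();
    }

    // Demo helper: calls get() on a fresh thread and returns what that
    // thread saw (or -1 if the wait was interrupted).
    static int getFromNewThread() {
        final int[] seen = { -1 };
        Thread thread = new Thread(() -> seen[0] = get());
        thread.start();
        try {
            thread.join(); // join() makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return seen[0];
    }
}
```

Repeated calls to get() on the same thread return the same id, while a different thread sees a different one.
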
  9. Immutability - Immutable objects are always thread-safe.
    1. Even if all fields of an object are final, it may still not be immutable as some of its final fields can refer to mutable objects.
    2. Final fields provide initialization safety as they have special semantics under the Java Memory Model.  Make all fields of a class final unless they really need to be mutable.
    3. When a group of related data items must be processed atomically, consider creating an immutable holder class.  By publishing a new holder instance through a volatile reference whenever the data changes, we may be able to avoid a synchronized block.
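
The immutable holder idea could be sketched as follows. This is a simplified illustration with hypothetical names (LastResult, CachingSquarer), not the book's own example. The input and its result always travel together inside one immutable object, and the volatile reference makes each new holder visible to other threads.

```java
// Immutable holder: both fields are final and are set together in the constructor.
final class LastResult {
    private final long input;
    private final long square;

    LastResult(long input, long square) {
        this.input = input;
        this.square = square;
    }

    // Returns the cached square, or null if the cache holds a different input.
    Long squareFor(long candidate) {
        if (candidate == input) {
            return Long.valueOf(square);
        }
        return null;
    }
}

// Hypothetical example class; not taken from the book.
class CachingSquarer {
    // volatile publishes each new holder; no synchronized block is needed because
    // a reader sees either the old pair or the new pair, never a mixture.
    private volatile LastResult cache = new LastResult(0, 0);

    long square(long n) {
        Long cached = cache.squareFor(n);
        if (cached != null) {
            return cached;
        }
        long result = n * n;
        // Two threads may both compute and overwrite the cache; that is harmless here.
        cache = new LastResult(n, result);
        return result;
    }
}
```
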
  10. Safe publication
    1. Simply storing a reference to an object into a public field is not safe, as it could lead to other threads seeing the object in a partially constructed state (due to reordering).
    2. Immutable objects can be published through any mechanism; no synchronization necessary.
    3. Others must be safely published, i.e., both the reference to the object and the object's state must be made visible to other threads at the same time.  A properly constructed object can be safely published by:
      1. Initializing an object reference from a static initializer.  This is often the easiest way; static initializers are executed by the JVM at class initialization time which has JVM-internal synchronization.
      2. Storing a reference to it in a volatile field or AtomicReference.
      3. Storing a reference to it into a final field of a properly constructed object
      4. Storing a reference to it into a field that is properly guarded by a lock.  Placing it in a thread-safe collection such as Vector or a synchronizedList satisfies this requirement.
    4. Effectively immutable objects must be safely published.
      1. Objects that are not technically immutable, but whose state will not be modified after publication are called effectively immutable.  Safely published effectively immutable objects can be safely used by any thread without additional synchronization.  For example, the Date object is often used as an effectively immutable object although it is technically mutable.
    5. Mutable objects must be safely published, AND must be either thread-safe or guarded by a lock.
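
The safe publication idioms listed under point 10 can be put side by side in one sketch. Settings and SettingsRegistry are hypothetical classes of my own; Settings is effectively immutable (its field is not final, but nothing modifies it after construction), so it needs safe publication but no further synchronization.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical effectively immutable class: no setters, but the field is not final.
class Settings {
    private int timeoutSeconds;

    Settings(int timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    int getTimeoutSeconds() {
        return timeoutSeconds;
    }
}

// Hypothetical example class; not taken from the book.
class SettingsRegistry {
    // Idiom 1: initialized from a static initializer.
    static final Settings DEFAULTS = new Settings(30);

    // Idiom 2: stored in a volatile field or an AtomicReference.
    private volatile Settings current;
    private final AtomicReference<Settings> latest = new AtomicReference<>();

    // Idiom 3: stored in a final field of a properly constructed object.
    private final Settings initial;

    // Idiom 4: stored in a thread-safe collection, which locks internally.
    private final List<Settings> history = Collections.synchronizedList(new ArrayList<>());

    SettingsRegistry(Settings initial) {
        this.initial = initial;
        this.current = initial;
        this.latest.set(initial);
        this.history.add(initial);
    }

    void update(Settings settings) {
        current = settings;
        latest.set(settings);
        history.add(settings);
    }

    Settings getInitial() { return initial; }
    Settings getCurrent() { return current; }
    Settings getLatest() { return latest.get(); }
    int historySize() { return history.size(); }
}
```

A real class would normally pick just one of these mechanisms; they are combined here only for comparison.
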


Unknown said...

"Locking (for example, by using synchronized blocks) is not just about atomic execution of code blocks, it also influences memory visibility"

I think you meant "Synchronization" instead of "Locking".

Dilip Joseph said...

Yes, I did mean Synchronization. Thanks for the correction. I have updated the page.