The Ex CS Grad Student

My Experiences Interviewing for a New Job

2012-12-15T17:28:00.000-08:00

I recently interviewed for a new software engineering role. It is a terrific time to be a software engineer in the Bay Area. Lots of companies, big and tiny, are hiring.

Last time I searched for a job, I didn't really look around much. This time, I went the other extreme and interviewed at 10 companies, mostly startups. In retrospect, 10 was too many. I should have spent more effort filtering out startups after the initial conversations with the founders/engineering leads, and should have avoided the first round of technical interviews if I wasn't entirely enthusiastic and convinced about the company or the role. In multiple cases, I cancelled the second round of interviews after it became clear to me that a company would not be among my final top choices. That way, I avoided further wasting everyone's time.

I applied to a couple of startups and big companies through friends who already worked there. For most startups, I applied via a recruiter who had contacted me over LinkedIn multiple times over the past year. He and his colleagues at his firm turned out to be excellent and presented many exciting startup opportunities to me, which I otherwise would not have known about. I was initially concerned about recruiters being pushy, but these guys were super nice and professional. I also participated in developerauction.com.

Last time I applied for jobs, I used my PDF resume. This time, all I needed was my LinkedIn profile. Only one company even asked for my PDF resume. I spent a good deal of time (grudgingly) searching for my resume's LaTeX source files, which I finally found on an external backup drive. Now, it is checked in at Bitbucket; but hopefully, I will never need it again. An up-to-date LinkedIn profile is all you need these days when applying for a job.

Since I applied to too many companies this time, I did a LOT of interviews. Most of my interviews were the standard "write code for simple algorithmic problem on the whiteboard". I really don't enjoy writing code on the whiteboard - it is so unnatural - but I did it without complaining and was successful. At a couple of startups, I was asked only about my current projects and some high-level design problems - didn't write a single line of code. The best interview process was at a mid-sized mobile payment startup - 4 pair-programming interviews where we produced real working code and unit tests, 2 design interviews and one interview just to talk about my work experience.

Successfully tackling algorithmic questions on the whiteboard requires practice, especially if you, like most software engineers, don't deal with linked lists, dynamic programming, graphs, etc. on a daily basis. The last time I dealt with such problems on a daily basis was during my undergrad Algorithms class. To refresh my memory, I went back to my old favorite algorithms textbook - Introduction to Algorithms by Cormen, Leiserson and Rivest. It was a great $8 investment made 10 years ago (a lot of money for a textbook in India in those days). Here is a photo of the first page that is marked with my initials (DAnJo) and roll number. The http://www.careercup.com website and associated book Cracking the Coding Interview: 150 Programming Questions and Solutions were also very useful in preparing for the interviews.

Interviewing took a lot of time. However, it was a very enjoyable experience for me. I was able to meet many amazing engineers across multiple super-cool companies (If you are looking for interesting startups, shoot me an email and I can send you some pointers). I probably never exercised my brain so much in such a short span of time since undergrad exam crunch time.

Lastly, in case you are wondering, I am joining Facebook on Monday.

My Talk on Spark at AMPCamp 2012

2012-10-06T23:17:00.000-07:00

I recently gave a talk about how Conviva uses Spark at AMPCamp 2012. Spark has helped Conviva speed up online video data analysis by 30x. My talk video is embedded below.

Java Concurrency in Practice - Summary

2012-10-02T00:13:00.003-07:00

I recently read Java Concurrency in Practice by Brian Goetz. It is a fantastic book. Every Java programmer must read it. In order to firmly internalize the concepts described in the book, I read it a second time and took down notes summarizing each chapter. My notes were very popular when I was in school and college. Hopefully, these notes will be useful to others as well.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Java Concurrency in Practice - Summary - Part 10

2012-10-01T23:43:00.000-07:00

This is part 10 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 15 - Atomic Variables and non-blocking synchronization

Non-blocking algorithms use low-level atomic machine instructions like compare and swap (CAS) instead of locks to ensure data integrity under concurrent access. An algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread. An algorithm is called lock-free if some thread can make progress at each step.
Locking has many disadvantages. For small operations like incrementing a counter, the overhead imposed by locking is just too much.
Almost every modern processor supports atomic Compare and Swap (CAS) instructions. CAS atomically updates a memory location to a new value if the old value matches what we expect. Otherwise it does nothing. In either case, it returns the original value of the memory location.

CAS: "I think V should have the value A. If it does put B there, otherwise don't change it but tell me I was wrong."
CAS is an optimistic technique - proceeds with update and detects collisions.
A thread that loses in CAS is not suspended. It can choose to retry, do some other recovery action or do nothing.
Primary disadvantage of CAS is that the caller must deal with contention. Locking automatically deals with contention by blocking.
Rule of thumb : cost of uncontended lock acquisition and release is twice the cost of a CAS on a multi-processor system.
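
To make the optimistic retry idea above concrete, here is a minimal sketch of a counter built on a CAS loop. The class name CasCounter is made up for illustration. Note that AtomicInteger.compareAndSet returns a boolean (success or failure) rather than the original value that a raw hardware CAS returns.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical counter that increments with an explicit CAS retry loop instead of a lock. */
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    /** Reads the current value, tries to swap in value + 1, and retries if another thread won. */
    int increment() {
        while (true) {
            int expected = value.get();
            int updated = expected + 1;
            // compareAndSet succeeds only if value still equals expected.
            if (value.compareAndSet(expected, updated)) {
                return updated;
            }
            // Lost the race: the thread is not suspended, it simply loops and tries again.
        }
    }

    int get() {
        return value.get();
    }
}
```

In real code, AtomicInteger.incrementAndGet() already does this for you; the explicit loop is only here to show where the caller deals with contention.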

Atomic Variables

AtomicInteger, AtomicLong, AtomicBoolean, AtomicReference
Elements in the atomic array classes (available in Integer, Long and Reference versions) can be updated atomically.
Not good candidates for keys in hash based collections since they are mutable.
At low to moderate contention, Atomic variables provide better scalability. At high contention, locks are better.

Refer to the book for the non-blocking stack and linked list code listings.
Atomic field updaters can be used to update existing volatile variables atomically. Only to be used when there is a performance problem associated with creating multiple new Atomic* variables (like for each node of a linked list).
ABA problem - During an update, the value of a reference field can change from A to B and back to A. In this case, we do not want the update to proceed. But in regular CAS, it will. This problem usually happens when we manage our own object pools. One option to avoid this problem is to use versioned references with AtomicMarkableReference and AtomicStampedReference.
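
As a rough illustration of the ABA point, the sketch below simulates an A to B to A change in a single thread and shows that a stale update is rejected when a stamp (version number) travels with the reference. The class and method names are hypothetical; only AtomicStampedReference and its get/compareAndSet methods come from java.util.concurrent.atomic.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    /** Returns true if a stale CAS (expecting "A" at stamp 0) succeeds after an A -> B -> A change. */
    static boolean staleUpdateSucceeds() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);

        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder); // observes "A" with stamp 0
        int seenStamp = stampHolder[0];

        // Meanwhile the value changes A -> B -> A; every change bumps the stamp.
        ref.compareAndSet("A", "B", 0, 1);
        ref.compareAndSet("B", "A", 1, 2);

        // The reference is "A" again, but the stamp is now 2, so the stale update fails.
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }
}
```

With a plain AtomicReference the last compareAndSet would have succeeded, because only the reference would be compared.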

Chapter 16 - The Java Memory Model

A memory model is required because compilers can reorder instructions, store variables in caches instead of memory, etc. Hence the result of assigning a value to a variable may not be immediately visible to another thread. The Java Memory Model (JMM) specifies the minimal guarantees the JVM must make about when writes to variables become visible to other threads.
The JMM shields application developers from differences in memory models across different processor architectures.
Within a single thread, the JVM must maintain serial semantics.
Actions - reads/writes to variables, locks/unlocks, starting/joining threads.
The JMM defines a partial ordering called happens-before between actions:

Program order rule - Each action in a thread happens-before every subsequent action in that thread.
Monitor lock rule - An unlock on a monitor happens-before every subsequent lock on the same monitor lock.
Volatile variable rule - A write to a volatile variable happens-before every subsequent read of that same variable.
Thread start rule - A call to Thread.start happens-before every action in the started thread.
Thread termination rule - Any action in a thread happens-before any other thread detects that it has terminated (either by successfully returning from Thread.join or by Thread.isAlive() returning false).
Interruption rule - A thread calling interrupt() on another thread happens-before the interrupted thread detecting the interrupt.
Finalizer rule - The end of an object's constructor happens-before the start of its finalizer.
Transitivity - A happens-before B and B happens-before C implies A happens-before C.
Placing an item in a thread-safe collection happens-before another thread retrieves that item.
Counting down on a CountDownLatch happens-before a thread returns from await().
Releasing a permit to a Semaphore happens-before acquiring a permit from the same Semaphore.
A FutureTask's computation happens-before another thread successfully returns from Future.get().
Submitting a Runnable/Callable to an Executor happens-before the task begins execution.
A thread arriving at a CyclicBarrier or Exchanger happens-before the barrier action, which happens-before threads are released from the barrier.

Piggybacking - advanced technique that uses an existing happens-before ordering to ensure visibility of an object X, rather than creating a happens-before relationship for publishing X. Do not use unless absolutely necessary for performance.
Static initialization provides safe publication as static initializers are run by the JVM at class initialization time. You can use static initialization lazily by using a private static holder class.
Double-checked locking - incorrect technique to avoid synchronization during lazy initialization. First check if resource is non null without holding lock. If non-null, just use it. If null, synchronize and initialize it. The problem is that using the resource without any happens-before relationship can expose it in a partially constructed state.
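
The lazy static initialization mentioned above can be sketched as follows. ExpensiveResource is a made-up class name. The nested Holder class is not initialized until getInstance() is first called, and the JVM's class initialization gives safe publication without any explicit synchronization, so there is no need for double-checked locking.

```java
/** Hypothetical resource that is created lazily via a private static holder class. */
class ExpensiveResource {
    private ExpensiveResource() {
        // Imagine costly setup work here.
    }

    /** Holder is only initialized by the JVM the first time getInstance() touches it. */
    private static class Holder {
        static final ExpensiveResource INSTANCE = new ExpensiveResource();
    }

    static ExpensiveResource getInstance() {
        return Holder.INSTANCE;
    }
}
```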

Java Concurrency in Practice - Summary - Part 9

2012-09-30T11:24:00.000-07:00

This is part 9 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 13 - Explicit Locks

Unlike intrinsic locking, the Lock interface offers unconditional, polled, timed, and interruptible lock acquisition.
Lock implementations provide the same memory visibility guarantees as intrinsic locking. They can vary in locking semantics, scheduling algorithms, ordering guarantees and performance.
ReentrantLock has same semantics as a synchronized block.
Why use explicit locks over intrinsic locks?

Unlike intrinsic locking, a thread waiting to acquire a ReentrantLock can be interrupted.
ReentrantLock also supports timed lock acquisition.
With intrinsic locks, a deadlock is fatal.
Intrinsic locks must be released in the same code block they are acquired in. This makes non-block-structured locking impossible.
ReentrantLock is much faster than intrinsic locking in Java 5.0

Lock objects are usually released in a finally block, to make sure that they are released even if an exception is thrown.
lockInterruptibly() helps us build cancelable tasks.
tryLock() returns false if the lock cannot be acquired. Timed tryLock() is also responsive to interruption.
ReentrantLock offers two fairness options

Fair - threads acquire locks in order of requesting.
Non-fair (default) - thread can acquire lock if it is available at the time of the lock request, even if earlier threads are waiting. Non-fair locking is useful because it avoids the overhead of suspending/resuming a thread if the lock is available at time of the lock request.
Fairness is usually not needed, and has a very high performance penalty (multiple orders of magnitude).
Fair locks work best when they are held for a relatively long time or when the mean time between lock requests is large.
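
The unlock-in-finally and timed tryLock() points from this chapter can be sketched together like this. TimedLockCounter and its method names are invented for the example; the lock is created with the default (non-fair) policy.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical counter guarded by a ReentrantLock with timed acquisition. */
class TimedLockCounter {
    private final ReentrantLock lock = new ReentrantLock(); // non-fair by default
    private int count;

    /** Waits up to timeoutMillis for the lock; returns false if it gives up. */
    boolean tryIncrement(long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // could not get the lock in time, caller decides what to do
        }
        try {
            count++;
            return true;
        } finally {
            lock.unlock(); // always released, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
```

Passing true to the ReentrantLock constructor would request the fair policy instead.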

When to use intrinsic locks?

synchronized blocks have a more concise syntax. You can never forget to unlock a synchronized block.
Use ReentrantLock only when advanced features like timed, polled, interruptible lock acquisition, fairness or non-block structured locking are needed.
Harder to debug deadlock problems when using ReentrantLock because lock acquisition is not tied to a particular stack frame, and thus the stack dump is not very helpful.
synchronized is likely to have more performance improvements in the future (eg: lock coarsening) as it is part of the Java language spec.

Read-Write Lock - protected resource can be accessed by multiple readers at the same time, or by a single writer, but not both.

offers readLock() and writeLock() methods which return a Lock object that must be acquired before doing the respective operations.
More complex implementation. Hence has lower performance except in read-heavy workloads.

Lock can only be released by thread that acquired it.
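
A minimal sketch of the read-write lock usage described above, wrapping a plain HashMap. ReadMostlyMap is a hypothetical name; ReentrantReadWriteLock is the standard implementation in java.util.concurrent.locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical map wrapper: many concurrent readers, or one writer at a time. */
class ReadMostlyMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Lock readLock = rwLock.readLock();
    private final Lock writeLock = rwLock.writeLock();

    V get(K key) {
        readLock.lock(); // shared: other readers may hold it at the same time
        try {
            return map.get(key);
        } finally {
            readLock.unlock();
        }
    }

    V put(K key, V value) {
        writeLock.lock(); // exclusive: blocks readers and other writers
        try {
            return map.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }
}
```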

Chapter 14 - Building Custom Synchronizers

State-dependent classes - blocking operations can proceed only if state-precondition becomes true (for example, you cannot retrieve result of FutureTask if computation has not yet finished).
Try to use existing state-dependent classes whenever possible.
Condition queue - allows a group of threads (called wait set) to wait for a specific condition to become true.
Intrinsic condition queues - Any java object can act as a condition queue via the Object.wait(), notify() and notifyAll() functions.

Must hold intrinsic lock on an object before you can call wait(), notify() or notifyAll().
Calling Object.wait() atomically releases lock and suspends the current thread. It reacquires the lock upon waking up, just before returning from the wait() function call. wait() blocks till thread is awakened by a notification, a specified timeout expires or the thread is interrupted.
In order to use condition queues, we must first identify and document the pre-condition that makes an operation state-dependent. The state variables involved in the condition must be protected by the same lock object as the one we wait() on.
A single intrinsic condition queue can be used with more than one condition predicate. This means that when a thread is awakened by a notifyAll, the condition it was waiting on need not be true. wait() can even return spuriously without any notify(). The condition can also become false by the time wait() reacquires the lock after waking up. Hence when waking up from wait(), the condition predicate must be tested again and we must go back to waiting if it is false. Hence, call wait() in a loop: synchronized (lockObj) { while (!conditionPredicate()) { lockObj.wait(); } /* object is in the desired state now */ }

Notifications are not sticky - i.e. a thread won't know about notifications that occurred before it called wait().
In order to call notify() or notifyAll() on an object, you must hold the intrinsic lock on that object. Unlike wait(), the lock is not automatically released. The lock must be released soon afterwards, as none of the woken up threads can make progress without acquiring the lock.
Use notifyAll() instead of notify(). If multiple threads are waiting on the same condition queue for different condition predicates, calling notify() instead of notifyAll() can lead to missed signals, as only the wrong thread may be woken up.

However using notifyAll() can be very inefficient, as multiple threads are woken up and contend for the lock where only one of them can usually make progress.
notify() can be used only if

The same condition predicate is associated with the condition queue and each thread executes the same logic on returning from wait().
A notification on the condition queue enables at most one thread to proceed.

A bounded buffer implementation needs to call notify only when moving away from the empty or full states. Such conditional notifications are efficient, but make the code hard to get right. Hence, avoid unless necessary as an optimization.
A state dependent class should either fully document its waiting/notification protocols to sub-classes or prevent sub-classes from participating in them at all.
Encapsulate condition queue objects in order to avoid external code from incorrectly calling wait() or notify() on them. This often implies the usage of a private lock object instead of using the main object itself.
Explicit Condition objects - Condition

Each intrinsic lock can have only one associated condition queue. Hence multiple threads may wait on same condition queue for different condition predicates.
A Condition is associated with a single Lock object. A Condition is created by calling Lock.newCondition(). You can create multiple Condition objects per Lock.
Equivalents of wait(), notify() and notifyAll() for Condition are await(), signal() and signalAll(). Since Condition is an Object, wait() and notify() are also available. Do not confuse them.
Explicit Condition objects make it easier to use signal() instead of signalAll().
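
Here is a small sketch of a bounded buffer using two explicit Condition objects on one Lock, along the lines of the notes above. The class name and the ArrayDeque-based storage are my own choices for illustration. Each predicate gets its own Condition, so signal() wakes a thread that is waiting for that specific predicate, and await() is still called in a loop.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical bounded buffer with one Condition per condition predicate. */
class ConditionBoundedBuffer<T> {
    private final Lock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();  // predicate: size < capacity
    private final Condition notEmpty = lock.newCondition(); // predicate: size > 0
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    ConditionBoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Blocks until there is room, then adds the item. */
    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await(); // releases the lock while waiting, re-tests on wake-up
            }
            items.addLast(item);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until an item is available, then removes and returns it. */
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.removeFirst();
            notFull.signal();
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```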

Synchronizers

Both Semaphore and ReentrantLock are built on the AbstractQueuedSynchronizer (AQS) class (they delegate to private inner subclasses of it).
AQS is a framework for building locks and synchronizers.
When using AQS, there is only one point of contention.
Acquisition - state dependent operation that can block.
Release - allows some threads blocked in acquire to proceed. Non-blocking.

AQS manages a single integer of state for the synchronizer class. It can be accessed with getState(), setState() and compareAndSetState() methods. The integer can represent arbitrary semantics. For example, FutureTask uses it to represent the state (running, completed, canceled) of the task. Semaphore uses it to track the number of permits remaining.

Synchronizers track additional state variables themselves.
Synchronizers override tryAcquire, tryRelease, isHeldExclusively, tryAcquireShared and tryReleaseShared. The acquire, release, etc. methods of AQS call the appropriate try methods.
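
To show how the try methods plug into AQS, here is a sketch of a simple one-time latch, similar in spirit to the binary latch example in the book. The names SimpleLatch and Sync are hypothetical. The AQS state integer is 0 while the latch is closed and 1 once it has been opened.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

/** Hypothetical one-time latch: threads wait until some thread calls open(). */
class SimpleLatch {
    private final Sync sync = new Sync();

    /** Private AQS subclass; the latch delegates to it rather than extending AQS itself. */
    private static class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected int tryAcquireShared(int ignored) {
            // Non-negative means success; negative makes AQS queue the caller.
            return getState() == 1 ? 1 : -1;
        }

        @Override
        protected boolean tryReleaseShared(int ignored) {
            setState(1); // latch is now open
            return true; // tell AQS that waiting threads may retry their acquire
        }

        boolean isOpen() {
            return getState() == 1;
        }
    }

    void open() {
        sync.releaseShared(0);
    }

    void await() throws InterruptedException {
        sync.acquireSharedInterruptibly(0);
    }

    /** Timed variant; returns false if the latch did not open in time. */
    boolean await(long timeout, TimeUnit unit) throws InterruptedException {
        return sync.tryAcquireSharedNanos(0, unit.toNanos(timeout));
    }

    boolean isOpen() {
        return sync.isOpen();
    }
}
```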

Java Concurrency in Practice - Summary - Part 8

2012-09-27T11:17:00.000-07:00

This is part 8 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 11 - Performance and Scalability

Avoid premature optimization - first make it right, then make it fast, if not fast enough already (as indicated by actual performance measurements)
Tuning for scalability is often different from tuning for performance, and the two goals are often contradictory.
Amdahl's Law : Speedup <= 1/( F + (1-F)/N) where F is the fraction of computation that must be executed serially, and N is the number of processors.
A shared work queue adds some (often overlooked) serial processing. Result handling is another form of serialization hidden inside otherwise seemingly 100% concurrent programs.
Costs of using threads

context switches - managing shared data structures in the OS and JVM takes memory and CPU. A thread context switch can also cause a flurry of processor cache misses.
When a thread blocks on a lock, it is switched out by JVM before reaching its full scheduled CPU quantum, leading to more overhead.

Context switching costs 5000-10000 clock cycles (few microseconds). Use vmstat to find % of time program spent in the kernel. High % can indicate high context switching.
synchronized and volatile result in the use of special CPU instructions called memory barriers that involve flushing/invalidating CPU caches, stalling execution pipelines, flushing hardware write buffers, and inhibit compiler optimizations as operations cannot be reordered.
Performance of contended and uncontended synchronization are very different. synchronized is optimized for the uncontended scenario (20 to 250 clock cycles). volatile is always uncontended.
Modern JVMs can optimize away locking code that can be proven to never contend.
Modern JVMs perform escape analysis to identify thread-confined objects and avoid locking them.
Modern JVMs can do lock coarsening to merge multiple adjacent locks into a larger lock to avoid multiple lock/unlocks.
Synchronization by one thread affects performance of other threads due to traffic on the shared memory bus.
Uncontended synchronization can be handled entirely in JVM. Contended synchronization involves OS activity - OS needs to suspend the thread that loses the contention.
Blocking can be implemented by spin-waiting or by suspending the thread via the OS. Spin-waiting is preferred for short waits. JVM decides what to use based on profiling past performance.
Reducing lock contention

reduce duration for which locks are held.
reduce frequency at which locks are requested. Reduce lock granularity by lock splitting (for moderately contended locks) and lock striping (for heavily contended locks).
replace exclusive locks with coordination mechanisms that permit greater concurrency.
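
As a rough sketch of lock striping, the class below guards an array of per-bucket counters with 16 locks, where bucket N is guarded by lock N % 16. The class and method names are made up, and this is not how any JDK collection is implemented; it only illustrates the technique.

```java
/** Hypothetical striped counter table: bucket N is guarded by lock N % N_LOCKS. */
class StripedCounters {
    private static final int N_LOCKS = 16;
    private final int[] counts;
    private final Object[] locks = new Object[N_LOCKS];

    StripedCounters(int numBuckets) {
        counts = new int[numBuckets];
        for (int i = 0; i < N_LOCKS; i++) {
            locks[i] = new Object();
        }
    }

    private int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), counts.length);
    }

    /** Only the stripe that owns the key's bucket is locked. */
    void increment(Object key) {
        int bucket = bucketFor(key);
        synchronized (locks[bucket % N_LOCKS]) {
            counts[bucket]++;
        }
    }

    /** Count stored in the key's bucket (keys that share a bucket share a count). */
    int get(Object key) {
        int bucket = bucketFor(key);
        synchronized (locks[bucket % N_LOCKS]) {
            return counts[bucket];
        }
    }

    /** Whole-table operation: locks one stripe at a time, so it is not an atomic snapshot. */
    long total() {
        long sum = 0;
        for (int bucket = 0; bucket < counts.length; bucket++) {
            synchronized (locks[bucket % N_LOCKS]) {
                sum += counts[bucket];
            }
        }
        return sum;
    }
}
```

The total() method shows the trade-off: operations that need the whole collection become harder once the single lock is gone.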

Lock striping - ConcurrentHashMap uses 16 locks - bucket N is guarded by lock N % 16. Locking for exclusive access to entire collection is hard when lock striping is used.
Avoid hot fields like cached values - e.g. size is cached for a Map in order to convert an O(n) operation to an O(1) operation. Use striped counters or atomic variables.
Alternatives to exclusive locks - concurrent collections, read-write locks, immutable objects, atomic variables.
Do not use object pools. Object allocation and GC were slow in earlier versions of Java. Now object allocation is faster than a C malloc - only 10 machine instructions. Object pools also introduce synchronization overheads.
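
Amdahl's Law, stated near the top of this chapter's notes, is easy to play with in code. The helper below is just the formula Speedup <= 1/(F + (1-F)/N) written out; the class and method names are made up.

```java
/** Hypothetical helper that evaluates the Amdahl's Law bound. */
class Amdahl {
    /** Upper bound on speedup when serialFraction of the work cannot be parallelized. */
    static double maxSpeedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }
}
```

For example, with F = 0.1 and N = 10 the bound is 1/(0.1 + 0.09), about 5.26, and no matter how many processors are added it can never exceed 1/F = 10.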

Chapter 12 - Testing Concurrent Programs

Every test must wait till all the threads created by it terminate. It should then report any failures in tearDown().
Testing blocking operations needs some way to unblock a thread that has blocked as expected. This is usually done by doing the blocking operation in a new thread and interrupting it after waiting for some time. An InterruptedException is thrown if the operation blocked as expected.
Thread.getState() should not be used for concurrency control or testing. Useful only for debugging.
One approach to test producer-consumer programs is to check that everything that is put into a queue or buffer eventually comes out of it, and nothing else does.

For single producer-single consumer designs, use order sensitive checksum of elements that are added, and verify them when the element is removed. Do not use a synchronized shadow list to track the elements as that will introduce artificial serialization.
For multiple producer-consumer designs, use an order insensitive checksum that can be combined at the end of the test to verify that all enqueued elements have been dequeued.

Make sure that the checksums are not guessable by the compiler (e.g. consecutive integers), so that they are not precomputed. Use a simple random number generator like static int xorShift(int y) { y ^= (y << 6); y ^= (y >>> 21); y ^= (y << 7); return y; }
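
The checksum approach for multiple producers and consumers can be sketched as below. This is a simplified version under my own assumptions: it tests a JDK ArrayBlockingQueue of capacity 10 rather than a custom buffer, and the class name PutTakeCheck and method run are invented. Each thread keeps a local sum and folds it into a shared AtomicInteger only once at the end, so the test itself adds very little synchronization.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PutTakeCheck {
    /** Cheap pseudo-random step so the values are not predictable by the compiler. */
    static int xorShift(int y) {
        y ^= (y << 6);
        y ^= (y >>> 21);
        y ^= (y << 7);
        return y;
    }

    /** Runs `pairs` producers and consumers; true if everything put was also taken. */
    static boolean run(int pairs, int perThread) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        AtomicInteger putSum = new AtomicInteger();
        AtomicInteger takeSum = new AtomicInteger();
        // One thread per task, so blocked producers never starve the consumers.
        ExecutorService pool = Executors.newFixedThreadPool(2 * pairs);
        for (int p = 0; p < pairs; p++) {
            final int id = p;
            pool.execute(() -> { // producer
                int seed = (int) System.nanoTime() + id;
                int sum = 0;
                try {
                    for (int i = 0; i < perThread; i++) {
                        queue.put(seed);
                        sum += seed;
                        seed = xorShift(seed);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                putSum.getAndAdd(sum); // order-insensitive checksum
            });
            pool.execute(() -> { // consumer
                int sum = 0;
                try {
                    for (int i = 0; i < perThread; i++) {
                        sum += queue.take();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                takeSum.getAndAdd(sum);
            });
        }
        pool.shutdown();
        boolean finished = pool.awaitTermination(30, TimeUnit.SECONDS);
        return finished && putSum.get() == takeSum.get();
    }
}
```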

Test on multi-processor machines with fewer processors than active threads.
Generate more thread interleavings by using Thread.yield() to encourage more context switches during operations that access shared state.
Always include some basic functionality testing when doing performance testing to make sure that you are not measuring performance of broken code.
Non-fair semaphores provide better throughput, while fair semaphores provide lower variance in responsiveness.
Avoiding performance testing pitfalls

Ensure that garbage collection does not run at all during your test (check this using the -verbose:gc flag) OR ensure that garbage collection runs a number of times during the test (need to run test for a long time).
Your tests should run only after all code has been compiled; no point measuring performance of interpreted byte code. Dynamic compilation takes CPU resources. Compiled code executes much faster.

Code may be decompiled/recompiled multiple times during execution - e.g. if some previous assumption made by the JVM is invalidated, or to compile with better optimization flags based on recently gathered performance statistics.
Run program long enough (several minutes) so that compilation and interpreted execution represent a small fraction of the results and do not bias it.
Or have an unmeasured warm-up run before starting to collect performance statistics.
Run JVM with -XX:+PrintCompilation so that we know when dynamic compilation happens.

When running multiple unrelated computationally intensive tests in a single JVM, place explicit pauses between tests in order to give the JVM a chance to catch up with its background tasks. Don't do this when measuring multiple related activities, since omitting CPU required by background tasks gives unrealistic results.
In order to obtain realistic results, concurrent performance tests should approximate the thread-local computation done by a typical application. Otherwise, there will be unrealistic contention.
Make sure that compilers do not optimize away benchmarking code.

Trick to make sure that benchmarking calculation is not optimized away: if (fox.hashCode() == System.nanoTime()) System.out.print(" ");

Complementary Testing Approaches

Code Review
Static analysis tools: FindBugs has detectors for:

Inconsistent synchronization.
Invoking Thread.run (Thread.start() is what is usually invoked, not Thread.run())
Unreleased lock
Empty synchronized block
Double-checked locking
Starting a thread from a constructor
Notification errors
Condition wait errors: Object.wait() or Condition.await() should be called in a loop with the appropriate lock held after testing some state predicate.
Misuse of Lock and Condition
Sleeping or waiting while holding a lock.
Spin loops

Java Concurrency in Practice - Summary - Part 7

2012-09-27T03:32:00.001-07:00

This is part 7 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 9 - GUI Applications

Almost all GUI toolkits, including Swing, are implemented as a single-threaded subsystem. All GUI activity is confined to a single dedicated event dispatch thread. Attempts at multi-threaded GUIs suffered from deadlocks and race conditions. User actions manifest as events that bubble up from the GUI component to the application. Application initiated actions bubble down from the application code to the GUI components. Hence, GUI components are often accessed in opposite order, creating ripe conditions for deadlocks.
Tasks that execute in the event thread must complete quickly. Otherwise the UI will hang.
In Swing, GUI objects are kept consistent not by synchronization, but by thread confinement. They must NOT be accessed from any other thread.
A few Swing methods are thread-safe:

SwingUtilities.isEventDispatchThread
SwingUtilities.invokeLater - schedules a Runnable to be executed on the event thread.
SwingUtilities.invokeAndWait - callable only from a non-GUI thread. Schedules a Runnable to be executed on the GUI thread and waits for it to complete.
methods to enqueue a repaint or revalidate request on the event queue.
methods for adding/removing event listeners.

Short-running tasks can be run directly on the GUI thread. For long running tasks, use Executors.newCachedThreadPool().
Use Future, so that tasks can be easily cancelled. The task must be coded so that it is responsive to interruption.
SwingWorker class provides support for cancellation, progress indication, completion notification. So, we don't have to implement our own using FutureTask and Executor.
Data models must be thread-safe if they are to be accessed from threads other than the GUI thread.
A program that has both a presentation-domain and an application domain data model is said to have a split-model design.

presentation data model is confined to event thread. Application domain data model is thread-safe and is shared between the application and GUI threads.
presentation model registers listeners with the application model so that it can be notified of updates. Presentation model can be updated from the application model by sending a snapshot of the current state or via incremental updates.

Chapter 10 - Avoiding Liveness Hazards

Unlike database systems, the JVM does not do deadlock detection or recovery.
A program will be free of lock-ordering deadlocks if all threads acquire the needed locks in a fixed global order.

The order of locks acquired by a thread may depend on external input. Hence static analysis alone is not sufficient to avoid lock-ordering deadlocks.
An alternative is to induce an ordering on locks by using System.identityHashCode. Order lock acquisition by the hash code of the lock object.

In the extremely unlikely scenario where the hash codes of two lock objects are equal, acquire a third "tie" lock before trying to acquire the original two locks. The tie lock can be a global lock. Since hash collisions are infrequent, the tie lock won't introduce a concurrency bottleneck.
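
The hash-code ordering with a tie lock can be sketched as follows. The class names OrderedTransfer and Account, and the money-transfer scenario itself, are assumptions for illustration. Whichever direction a transfer goes, the two account locks are always taken in identityHashCode order.

```java
/** Hypothetical transfer helper that induces a global lock order via identityHashCode. */
class OrderedTransfer {
    private static final Object tieLock = new Object();

    static class Account {
        private long balance;

        Account(long balance) {
            this.balance = balance;
        }

        synchronized long getBalance() {
            return balance;
        }
    }

    static void transfer(Account from, Account to, long amount) {
        int fromHash = System.identityHashCode(from);
        int toHash = System.identityHashCode(to);
        if (fromHash < toHash) {
            synchronized (from) {
                synchronized (to) {
                    move(from, to, amount);
                }
            }
        } else if (fromHash > toHash) {
            synchronized (to) {
                synchronized (from) {
                    move(from, to, amount);
                }
            }
        } else {
            // Rare hash collision: serialize through the global tie lock first.
            synchronized (tieLock) {
                synchronized (from) {
                    synchronized (to) {
                        move(from, to, amount);
                    }
                }
            }
        }
    }

    /** Caller must hold both account locks. */
    private static void move(Account from, Account to, long amount) {
        if (from.balance < amount) {
            throw new IllegalArgumentException("insufficient funds");
        }
        from.balance -= amount;
        to.balance += amount;
    }
}
```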

If the lock objects (say bank Accounts) have a unique key, lock acquisition can be ordered by the key, and there is no need for the tie-lock.
Multiple locks may not always be acquired in the same method. Hence, it is not easy to detect lock-ordering deadlocks. Watch out for invocation of alien methods while holding a lock.

Calling a method with no locks held is called an open call. Liveness of a program can be more easily analyzed if all calls are open.
Use synchronized blocks within methods to guard shared state, instead of making the entire method synchronized.

In cases where loss of atomicity of the synchronized method is unacceptable, we need to construct application level protocols. For example, when shutting down a service, lock for just long enough to mark the service as shutting down, and wait for existing tasks to complete without holding the lock. Since the service is marked as shutting down, no new tasks will start.

In addition to deadlocking waiting for locks, threads can also deadlock waiting for resources like database connections.
If you must acquire multiple locks, lock ordering must be part of your design. Minimize number of locks needed. Document ordering policy.
Timed locks offered by the Lock interface are another option for detecting and recovering from deadlocks. The timed tryLock() method returns false if the timeout expires. It can return false even if no deadlock occurred and the thread just took a long time for some other reason.
JVM prints out deadlock information in thread dumps. To trigger a thread dump, send SIGQUIT (kill -3) to the JVM. Explicit Lock objects are not clearly shown in a thread dump.
Starvation - a thread is perpetually denied access to needed resources.

CPU cycle starvation can be caused by inappropriate use of thread priorities, or by executing infinite loops with locks held.
Avoid setting thread priorities as they are platform-dependent and can cause liveness issues. Set lower priorities only for truly background tasks, that can improve the responsiveness of foreground tasks.

Livelock - thread is not blocked, but cannot make progress because it keeps retrying an operation that will always fail. For example, when a code bug is triggered when processing a particular input, and that input is re-queued for processing by over-eager error handling code. An unrecoverable error is mistakenly being treated as a recoverable one. Solution for some forms of livelock is to introduce randomness into the retry.
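
Combining two ideas from this chapter, polled lock acquisition and randomness in the retry, one way to acquire two locks without risking a lock-ordering deadlock looks roughly like this. BackoffLocking and withBothLocks are hypothetical names, and the back-off range (under one millisecond) is an arbitrary choice.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

class BackoffLocking {
    /** Runs action while holding both locks; returns false if the timeout elapses first. */
    static boolean withBothLocks(Lock first, Lock second, long timeoutMillis, Runnable action)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (first.tryLock()) {
                try {
                    if (second.tryLock()) {
                        try {
                            action.run();
                            return true;
                        } finally {
                            second.unlock();
                        }
                    }
                } finally {
                    first.unlock(); // back out completely if the second lock was busy
                }
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up; the caller can retry later or report failure
            }
            // Random pause so two competing threads do not retry in lock step (livelock).
            TimeUnit.MICROSECONDS.sleep(ThreadLocalRandom.current().nextInt(1, 1000));
        }
    }
}
```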

Java Concurrency in Practice - Summary - Part 6

2012-09-22T08:54:00.003-07:00

This is part 6 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 8 - Applying Thread Pools

In the Executor framework, there is an implicit coupling between tasks and execution policies. Not all tasks are compatible with all execution policies.
If a task depends on the results of other tasks, then the execution policy must be carefully managed to avoid liveness problems. Deadlocks can happen if the thread pool is bounded, i.e. thread starvation deadlock.

A task that submits another task to the same executor and then waits for its result will always deadlock if the executor is Executors.newSingleThreadExecutor().
Other resources like JDBC connections may also be a bottleneck.
Document any pool sizing or configuration constraints.
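A small demonstration of thread starvation deadlock, written as a sketch with made-up names (StarvationDeadlockDemo, innerTaskStarves). The outer task occupies the only pool thread and waits for an inner task that can never start. A timed get() is used so that the demo terminates. With an unbounded get() it would hang forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class StarvationDeadlockDemo {
    /** Returns true if the inner task could not run because the only pool thread was busy. */
    static boolean innerTaskStarves() {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = exec.submit(() -> {
                // Submitted to the same single-threaded executor that is running us.
                Future<String> inner = exec.submit(() -> "inner result");
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true;    // inner task is still stuck in the queue
                }
            });
            return outer.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            exec.shutdownNow();
        }
    }
}
```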

Tasks that rely on thread confinement for thread-safety will not work well with thread pools.
Responsiveness of time-sensitive tasks may be bad if we use a single thread executor or if we submit several long running tasks to a small thread pool. Use timed resource waits instead of unbounded waits.
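Tasks that use ThreadLocal need care with the standard Executor implementations, as executors are free to reuse or replace threads. ThreadLocal is safe in pool threads only when the value's lifetime is bounded by that of a single task. Do not use ThreadLocal to communicate values between tasks.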
For compute-intensive tasks, an Ncpu-processor system achieves optimum utilization with a thread pool of Ncpu + 1 threads. For tasks that include I/O or other blocking operations, use a larger thread pool since not all threads will be schedulable at all times.
ThreadPoolExecutor is the base class of the executors returned by Executors.newCachedThreadPool, newFixedThreadPool and newScheduledThreadPool. It is highly configurable.
We can specify the type of BlockingQueue that holds tasks awaiting execution.

unbounded LinkedBlockingQueue is the default for newFixedThreadPool and newSingleThreadExecutor.
Another option is to use a bounded LinkedBlockingQueue, ArrayBlockingQueue or PriorityBlockingQueue.
SynchronousQueue - not really a queue. It is a mechanism for managing handoffs between threads. To accept a handoff, another thread must already be waiting; if none is waiting and the pool's maximum size has not been reached, a new thread is created. Otherwise, the task is rejected. A handoff is more efficient as the Runnable never has to be placed in a queue. newCachedThreadPool uses a SynchronousQueue.

newCachedThreadPool is a good default choice for an Executor.
Saturation Policy for a ThreadPoolExecutor can be modified by calling setRejectedExecutionHandler().

abort - causes execute() to throw the unchecked RejectedExecutionException. Caller catches this exception and implements its own overflow handling. This is the default.
discard - silently discard the newly submitted task.
discard-oldest - discards the task that would otherwise be executed next and tries to resubmit the new task.
caller-runs - Tries to slow down the flow of new task submission by pushing some of the work to the caller. It executes the newly submitted task not in a pool thread, but in the thread that calls execute().
There is no predefined saturation policy to make execute() block when the work queue is full. However, this can be achieved using a Semaphore to bound the task injection rate.
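A sketch of the Semaphore approach, loosely modeled on the idea described above. BoundedExecutor and submitTask are illustrative names, not part of java.util.concurrent. The semaphore bound should cover the pool size plus the number of queued tasks you are willing to allow.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

class BoundedExecutor {
    private final Executor exec;
    private final Semaphore permits;

    BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.permits = new Semaphore(bound);
    }

    /** Blocks the caller until a permit is free, then hands the task to the executor. */
    void submitTask(Runnable command) throws InterruptedException {
        permits.acquire();
        try {
            exec.execute(() -> {
                try {
                    command.run();
                } finally {
                    permits.release();   // free the slot when the task finishes
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release();           // task never ran, so return the permit
            throw e;
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```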

Thread Factories - whenever a thread pool needs to create a thread, it calls ThreadFactory.newThread(). The default thread factory creates a new non-daemon thread with no special configuration. Use a custom thread factory to specify an UncaughtExceptionHandler for pool threads, to instantiate a custom Thread subclass that does debug logging, or to give pool threads more meaningful names.
Most ThreadPoolExecutor options can be changed after construction via setters. Executors.unconfigurableExecutorService wraps an existing ExecutorService to ensure that its configuration cannot be changed further. newSingleThreadExecutor() returns such a wrapped Executor rather than a raw ThreadPoolExecutor. This is because newSingleThreadExecutor is implemented as a thread pool with one thread, and no one should be able to increase the pool size.
ThreadPoolExecutor was designed for extension.

beforeExecute and afterExecute hooks are called in the thread that executes the task. Used for logging, timing, monitoring, statistics gathering. Use ThreadLocal to share values between beforeExecute and afterExecute.
afterExecute is not called if task completes with an Error (regular exception is okay)
If beforeExecute throws a RuntimeException, the task is not executed and afterExecute() is not called.
terminated hook is called after the thread pool has shutdown - all tasks have finished and all worker threads have shut down. Useful for releasing resources allocated by the Executor, notification, logging, finalize statistics gathering.
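A sketch of the extension hooks in use, assuming a fixed-size pool. TimingThreadPool and completedTimedTasks are names chosen for this example. A ThreadLocal carries the start time from beforeExecute to afterExecute, which works because both hooks run in the thread that executes the task.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class TimingThreadPool extends ThreadPoolExecutor {
    private final ThreadLocal<Long> startTime = new ThreadLocal<>();
    private final AtomicLong numTasks = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    TimingThreadPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startTime.set(System.nanoTime());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            totalNanos.addAndGet(System.nanoTime() - startTime.get());
            numTasks.incrementAndGet();
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    protected void terminated() {
        try {
            long n = numTasks.get();
            long avg = (n == 0) ? 0 : totalNanos.get() / n;
            System.out.printf("tasks=%d, average time=%dns%n", n, avg);
        } finally {
            super.terminated();
        }
    }

    long completedTimedTasks() {
        return numTasks.get();
    }
}
```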

Parallelizing recursive algorithms

Sequential loops are suitable for parallelization when each iteration is independent of the others, and the work done in each iteration is significant enough to offset the cost of task creation.
Sequential loops within recursive algorithms can be parallelized. Easier if iteration does not need value of recursive iterations it invokes.
In order to wait for all results, create a new Executor, schedule the parallel tasks, call executor.shutdown() and then awaitTermination().
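A sketch of this pattern on a small tree. The Node type, the compute() stand-in for expensive work and the pool size of 4 are all assumptions made for the example. The traversal itself stays sequential, while the per-node work is submitted as tasks. shutdown() followed by awaitTermination() waits for all of them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelTreeWalk {
    static final class Node {
        final int value;
        final List<Node> children;

        Node(int value, Node... children) {
            this.value = value;
            this.children = Arrays.asList(children);
        }
    }

    // Stand-in for an expensive, independent per-node computation.
    static int compute(int value) {
        return value * value;
    }

    private static void walk(ExecutorService exec, Node node, Queue<Integer> results) {
        exec.execute(() -> results.add(compute(node.value)));
        for (Node child : node.children) {
            walk(exec, child, results);   // recursion does not need the task results
        }
    }

    static int sumOfSquares(Node root) throws InterruptedException {
        ExecutorService exec = Executors.newFixedThreadPool(4);
        Queue<Integer> results = new ConcurrentLinkedQueue<>();
        walk(exec, root, results);
        exec.shutdown();                                      // no more tasks will be added
        exec.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        int sum = 0;
        for (int r : results) {
            sum += r;
        }
        return sum;
    }
}
```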

Java Concurrency in Practice - Summary - Part 5

2012-09-21T11:21:00.001-07:00

This is part 5 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 7 - Cancellation & Shutdown

Java does not provide any mechanism for safely forcing a thread to stop what it is doing. Instead, we need to rely on interruption where a thread can request another thread to stop. The thread to be stopped may choose to ignore the request or can terminate after optionally performing cleanup operations.
Don't use the deprecated Thread.stop() and suspend() methods.
A task is cancelable if some external code can move it to a completed state before its normal completion. A task that supports cancellation must specify its cancellation policy:

how external code can request cancellation
responsiveness guarantees to cancellation requests
what happens on cancellation (such as cleanup operations)

One cooperative mechanism for terminating a task is to use a cancellation requested flag which the task periodically checks. If the flag is set by some external code, then the task terminates early. Remember to make the cancellation requested flag volatile. Otherwise changes made by the external code may never become visible to the task.

The thread will exit only when it checks the cancellation flag. Hence there is no guarantee on if and when the check will be made.
Can take a very long time to take effect (if at all) if the thread to be cancelled is stuck at a blocking operation.
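A sketch of the flag-based approach, similar in spirit to a prime generator example. The class name and fields here are illustrative. The loop checks the volatile flag once per iteration, so cancellation takes effect only between iterations.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

class PrimeGenerator implements Runnable {
    private final List<BigInteger> primes = new ArrayList<>();   // guarded by this
    private volatile boolean cancelled;   // volatile so the worker sees the update

    @Override
    public void run() {
        BigInteger p = BigInteger.ONE;
        while (!cancelled) {              // flag is polled once per iteration
            p = p.nextProbablePrime();
            synchronized (this) {
                primes.add(p);
            }
        }
    }

    void cancel() {
        cancelled = true;
    }

    synchronized List<BigInteger> get() {
        return new ArrayList<>(primes);
    }
}
```

If the loop body contained a blocking call such as BlockingQueue.put(), the flag might never be checked, which is the weakness noted above.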

Interruption is usually the most sensible way to implement cancellation.
Each thread has a boolean interrupted status, which is set to true when a thread is interrupted.

interrupt() interrupts the target thread
isInterrupted() returns the interrupted status of the target thread
interrupted() clears the interrupted status of the current thread and returns its previous value. This is the only way to clear the interrupted status of a thread. Poor choice of function name.

Blocking library calls try to detect when a thread has been interrupted and return early. They clear the interrupted status and throw an InterruptedException. There is no guarantee about how quickly a blocking method will detect interruption. In practice, it happens fairly quickly. When a thread that is not blocked is interrupted, its interrupted status is set. It is up to the activity being cancelled to poll the interrupted status and respond appropriately.

A task does not need to immediately stop on detecting the interrupted status. It can postpone acting on the interruption till a more opportune moment. This can prevent internal data structures from being corrupted when interrupted in the middle of some critical operations.

There is a distinction between how tasks and threads react to interruption. An interrupt on a worker thread in a thread pool can cancel the current task as well as shut down the worker thread. Hence, guest code that doesn't own a thread must preserve the interrupted status of the thread after acting on the interrupt, so that the owner can appropriately deal with it later.
A thread should be interrupted only by its owner. Because each thread has its own interruption policy, you should not interrupt a thread unless you know what interruption means to that thread.
Responding to interruption

Propagate the InterruptedException, OR
Restore the interrupted status by calling Thread.currentThread().interrupt(). This is the only feasible solution in cases like Runnable.run() which does not allow checked exceptions to be thrown.
Only code that implements a thread's interruption policy may swallow an interruption request. General purpose task and library code must never swallow interruption requests.

Activities that do not support cancellation but still call interruptible blocking methods must call them in a loop, retrying when interruption is detected. The interruption status should be saved locally and restored before returning. Restoring the interrupted status immediately can result in an infinite loop.
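A sketch of the retry loop described above. UninterruptibleTake is a made-up helper name. The interruption is remembered in a local variable and the status is restored only in the finally block, because restoring it inside the loop would make the next take() fail immediately.

```java
import java.util.concurrent.BlockingQueue;

class UninterruptibleTake {
    /** Takes the next element, retrying if interrupted, and preserves the interrupt. */
    static <T> T takeUninterruptibly(BlockingQueue<T> queue) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return queue.take();
                } catch (InterruptedException e) {
                    interrupted = true;   // remember it and fall through to retry
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();   // restore just before returning
            }
        }
    }
}
```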
Cancellation via Future

Future.cancel(boolean mayInterruptIfRunning) - if mayInterruptIfRunning is true and the task is running in some thread, then that thread is interrupted. If false, cancel() only prevents the task from running if it has not started yet.
Standard Executor implementations implement a thread interruption policy that allows tasks to be canceled through interruption. Hence, it is ok to call Future.cancel(true) which interrupts the thread.
You should not interrupt a pool thread directly when attempting to cancel a task because you won't know what task is running at the time the interrupt is delivered. Cancel only through the task's Future.
When Future.get() throws an InterruptedException or TimeoutException and you know that the result is no longer required, cancel the task by calling Future.cancel().
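A sketch of cancelling through the Future after a timed get(). TimedRun and timedRun are illustrative names, and the executor is passed in by the caller. Calling cancel() in the finally block is harmless when the task has already completed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedRun {
    /** Returns true if the task finished in time, false if it timed out and was cancelled. */
    static boolean timedRun(ExecutorService exec, Runnable r, long timeout, TimeUnit unit)
            throws InterruptedException {
        Future<?> task = exec.submit(r);
        try {
            task.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;                 // the task is cancelled in the finally block
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            task.cancel(true);            // interrupts the task if it is still running
        }
    }
}
```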

Dealing with non-interruptible blocking

synchronous socket I/O in java.io - read/write in InputStream and OutputStream are not responsive to interruption, but closing the underlying socket causes any threads blocked in read/write to throw a SocketException.
Synchronous I/O in java.nio - Interrupting a thread waiting on an InterruptibleChannel causes it to throw ClosedByInterruptException and close the channel. Closing an InterruptibleChannel causes threads blocked on channel operations to throw AsynchronousCloseException.
Asynchronous IO with Selector - A thread blocked in Selector.select() returns prematurely if close() or wakeup() is called.
Lock acquisition - A thread waiting for an intrinsic lock cannot be interrupted. Explicit Lock class offers the lockInterruptibly method.
To perform non-standard cancellation (like closing a socket), override newTaskFor() in ThreadPoolExecutor to return a custom Future. In the book's example, a CancellableTask interface extends Callable and creates a FutureTask whose cancel() method is overridden to close the socket or perform any other non-standard cancellation steps.

Stopping a thread-based service

A thread pool owns the worker threads, and should be responsible for stopping them. It should provide lifecycle methods that can be used by the application to shut down the pool, which in turn shuts down the worker threads.

ExecutorService provides shutdown() and shutdownNow(). shutdownNow() returns the list of tasks that had not started, so they can be logged or saved for future processing. The returned Runnable objects may not be the same as what was submitted - they may be wrapped.

shutdownNow() provides no way of knowing the state of tasks in progress at shutdown time, unless the tasks themselves do checkpointing. Another option is to override execute() of AbstractExecutorService and pass in a wrapper Runnable that records tasks cancelled at shutdown. There is a race condition that may cause a completed task to be marked as cancelled. So tasks must be idempotent.

Poison Pill - a special object placed on the work queue is another way to convince a producer-consumer service to shut down. Applicable only when the number of consumers and producers is known, and when the queue is unbounded.
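A sketch of the poison pill idea with one producer, one consumer and an unbounded queue. All names here (PoisonPillDemo, POISON, produceAndConsume) are made up for the example. The pill is a distinct String instance that is recognised by identity, so it can never be confused with a real item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PoisonPillDemo {
    // A unique instance compared with ==, so no real item can equal it by accident.
    private static final String POISON = new String("POISON");

    static List<String> produceAndConsume(List<String> items) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> consumed = new ArrayList<>();   // touched only by the consumer until join()

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String item = queue.take();
                    if (item == POISON) {
                        break;                       // shut down after draining earlier items
                    }
                    consumed.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        for (String item : items) {
            queue.put(item);
        }
        queue.put(POISON);                           // last item the producer ever enqueues
        consumer.join();                             // join() makes the consumed list visible here
        return consumed;
    }
}
```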
Handling abnormal thread termination

Leading cause of premature thread death is RuntimeException. If nothing special is done, the exception bubbles all the way up the stack and the thread is killed after printing a stack trace to the console (which no one may be watching).
If you are writing a worker thread class for a thread pool or executor service, be sure to catch Throwable and then notify the executor service of premature thread death.
Thread API provides an UncaughtExceptionHandler facility - when a thread exits due to an uncaught exception, the JVM reports this to an application provided UncaughtExceptionHandler. Use Thread.setUncaughtExceptionHandler() to set the handler for a specific thread, or the static Thread.setDefaultUncaughtExceptionHandler() to set a default handler for all threads.
In long-running applications, always use uncaught exception handlers for all threads that at least log the exception.
To set an UncaughtExceptionHandler for pool threads, provide a ThreadFactory to the ThreadPoolExecutor. Exceptions thrown from tasks make it to the UncaughtExceptionHandler only for tasks submitted with execute(). For those submitted with submit(), the exceptions are rethrown when calling Future.get()
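A sketch contrasting execute() and submit(). HandlerDemo and its run() method are hypothetical. The thread factory installs a handler on every pool thread. The exception from the execute() task reaches that handler, while the exception from the submit() task surfaces only as an ExecutionException from Future.get().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class HandlerDemo {
    /** Returns {message seen by the handler, message seen via Future.get()}. */
    static String[] run() throws InterruptedException {
        AtomicReference<String> seenByHandler = new AtomicReference<>();
        CountDownLatch handled = new CountDownLatch(1);

        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "worker");
            t.setUncaughtExceptionHandler((thread, ex) -> {
                seenByHandler.set(ex.getMessage());
                handled.countDown();
            });
            return t;
        };

        ExecutorService exec = Executors.newFixedThreadPool(1, factory);
        try {
            exec.execute(() -> { throw new IllegalStateException("from execute"); });
            handled.await(5, TimeUnit.SECONDS);

            // The cast is needed because this lambda fits both Runnable and Callable.
            Future<?> f = exec.submit((Runnable) () -> {
                throw new IllegalStateException("from submit");
            });
            String seenViaFuture = null;
            try {
                f.get();
            } catch (ExecutionException e) {
                seenViaFuture = e.getCause().getMessage();
            }
            return new String[] { seenByHandler.get(), seenViaFuture };
        } finally {
            exec.shutdown();
        }
    }
}
```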

JVM Shutdown

orderly shutdown - when the last nondaemon thread exits, System.exit(), Ctrl-C
abrupt shutdown - Runtime.halt(), SIGKILL the JVM
In an orderly shutdown, the JVM first starts all shutdown hooks registered with Runtime.addShutdownHook(). Order of shutdown hook execution is not guaranteed.
After all shutdown hooks have completed, the JVM may choose to run finalizers if runFinalizersOnExit is true.
JVM makes no attempt to stop or interrupt any application threads that are still running.
Shut-down hooks can run concurrently with other application threads. So, they must be thread-safe. Since JVM is shutting down, the application state may be messy. Hence the shut-down hooks must be coded extremely defensively.
Shut-down hooks should not use services that can be shutdown by the application or by other shutdown hooks. One option is to use a single shutdown hook per application that executes various shutdown operations in sequence.

Daemon threads - existence of a Daemon thread does not prevent JVM from shutting down. Internal JVM threads like GC thread are daemon threads. When JVM exits, finally blocks of any existing daemon threads are not run.

Should be used sparingly, for activities that can be safely abandoned at any time without any cleanup.
Do not use daemon threads for any tasks that perform I/O.
Generally used for housekeeping tasks like a background thread to remove expired cache entries.

Finalizers

GC treats objects that have a non-trivial finalize() method specially. finalize() is called after the collector finds the object unreachable, but before its memory is actually reclaimed.
Finalizers can run concurrently with application threads. Hence, they must be thread-safe.
No guarantee about if or when they will run.
High performance cost.
Usually, finally blocks and explicit close statements are sufficient to release resources, instead of using finalize().
finalizers may be needed for objects that hold resources acquired by native methods.
Avoid finalizers

Java Concurrency in Practice - Summary - Part 4

2012-09-19T10:07:00.001-07:00

This is part 4 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 6 - Task Execution

Individual client requests are a natural task boundary choice for server applications.
Creating a new thread per task is usually NOT a good idea.

Overheads of thread creation and teardown can add up to be quite significant.
Active threads consume system resources like memory even when they are idle, and also increase CPU contention.
Creating too many threads can result in an OutOfMemoryError.

Primary abstraction for task execution in Java is the Executor framework, not Threads.

The Executor interface has a single method: void execute(Runnable).

Executors are the easiest way to implement a producer-consumer design.
Decouples task submission from execution. Execution policy is separated from task submission.
Always use Executor instead of new Thread(runnable).start()
Different types of Thread Pools can be created using static factory methods of the Executors class:

newFixedThreadPool - creates new threads up to a maximum specified size. Tries to keep pool size constant by adding new threads if some die due to exceptions.
newCachedThreadPool - No bounds on number of threads. Number of threads increase/decrease based on load.
newSingleThreadExecutor - Used to process tasks sequentially in order imposed by task queue (FIFO, LIFO, priority order). Replaces thread if it dies unexpectedly.
newScheduledThreadPool - Fixed size thread pool that supports delayed and periodic task execution.

Use instead of Timer.

Timer creates only a single thread. If one task takes too long, it affects the timing accuracy of subsequent TimerTasks.
An unchecked exception thrown by a TimerTask terminates the Timer thread. The Timer thread is not resurrected, and the Timer is simply cancelled.
Scheduled thread pools do not have the above two limitations. However, Timers can schedule based on absolute time, while scheduled thread pools only support relative time.

Executor lifecycle has 3 states - running, shutting down, terminated.
ExecutorService interface (that extends Executor) offers methods like shutdown(), shutdownNow(), isShutdown(), isTerminated(), awaitTermination() to control Executor life cycle.

shutdown() - graceful shutdown. Allow all running and previously submitted tasks to complete. No new tasks are accepted.
shutdownNow() - attempts to cancel running tasks, and does not start any queued tasks.
Tasks submitted to executor after shutdown are passed to a rejection handler, which may silently drop the task or throw a RejectedExecutionException.
awaitTermination() is usually called immediately after calling shutdown().

ExecutorService.submit(Callable) returns a Future. A Future represents the lifecycle of a task, and provides methods to monitor/control it.
CompletionService combines the functionality of an Executor and a BlockingQueue. Submit a bunch of Callables to the Executor, and then wait for the results to be available using take() and poll().

An ExecutorCompletionService is a wrapper around an Executor - new ExecutorCompletionService(executor).
Tasks are submitted to the completion service, and not directly to the Executor.
Multiple completion services can share an Executor.
Keep track of the number of tasks submitted in order to determine the number of times to call take().
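A sketch of the CompletionService workflow. CompletionDemo, sumOfLengths and the pool size of 3 are assumptions for the example. Tasks go to the completion service rather than to the executor, and take() is called once per submitted task.

```java
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionDemo {
    static int sumOfLengths(List<String> words) throws InterruptedException, ExecutionException {
        ExecutorService exec = Executors.newFixedThreadPool(3);
        try {
            CompletionService<Integer> completion = new ExecutorCompletionService<>(exec);
            for (String word : words) {
                completion.submit(word::length);       // submit to the service, not the executor
            }
            int sum = 0;
            for (int i = 0; i < words.size(); i++) {   // one take() per submitted task
                sum += completion.take().get();        // results arrive in completion order
            }
            return sum;
        } finally {
            exec.shutdown();
        }
    }
}
```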

Future.get() supports a version that throws a TimeoutException if the result is not available within the specified time delay.

If a TimeoutException happens, then call cancel on the Future to cancel the task. If the task is written to be cancelable, then it can be terminated to avoid consuming unnecessary resources.

ExecutorService.invokeAll - takes a collection of tasks and returns a collection of Futures. Timed version of invokeAll returns when all tasks have completed, the calling thread is interrupted or if the timeout expires. Use Future.isCancelled() to determine if a particular task completed or was interrupted/cancelled.

Java Concurrency in Practice - Summary - Part 3

2012-09-19T05:50:00.002-07:00

This is part 3 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 4 - Composing objects

We often create thread-safe objects by composing together other thread-safe objects. In order to design a thread-safe class, we need to identify the object's state variables, establish the invariants that constrain them and the post conditions associated with its methods, and then establish a policy for managing concurrent access to them.

An object's state includes the state of other objects referenced from its state variables. For example, a LinkedList's state includes all the link node objects.

Synchronization policy - how an object uses immutability, thread-confinement, locking (how and which variables are guarded by which locks) to coordinate access to its state variable so that the invariants or post conditions are not violated.
Encapsulation enables us to determine that a class is thread-safe without having to examine the entire program, because all paths that use the data can be identified and made thread-safe.

Collection wrappers like synchronizedList make the underlying non-thread-safe ArrayList thread-safe by using encapsulation.

Java monitor pattern - object encapsulates all state and guards it with its own intrinsic lock.

Pro: Simplicity.
Con: External code can lock on the object's intrinsic lock and cause liveness problems that may require examining the whole program to debug. This problem is avoided if a separate private object is used as the lock.
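A sketch of the private-lock variant mentioned in the con above. PrivateLockCounter is a made-up class. Because the lock object is private, no external code can acquire it and interfere with the class's liveness.

```java
class PrivateLockCounter {
    private final Object lock = new Object();   // not reachable from outside the class
    private long value;                         // guarded by lock

    long increment() {
        synchronized (lock) {
            if (value == Long.MAX_VALUE) {
                throw new IllegalStateException("counter overflow");
            }
            return ++value;
        }
    }

    long get() {
        synchronized (lock) {
            return value;
        }
    }
}
```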

A composite object made of thread-safe components need not be thread-safe. This is usually the case when there are constraints on the state of the components.
A state variable can be published if it is thread-safe, does not participate in any invariants that constrain its value and has no prohibited state transitions for any of its operations.
Re-use existing Java thread-safe libraries whenever possible. To add functionality to an existing thread-safe class:

The best option is to modify the source code of the class if available.

Pro: details about synchronization policy are confined to one file, and are thus easier to maintain.

If modifying source code is not possible, the next best option is to extend the class, assuming it was designed for extension.

Con: Fragile. If base class changes synchronization policy (say which locks are used), then the extended class will silently fail.

Extension or source modification is not possible for collections wrapped in Collections.synchronized* wrappers, since the underlying wrapped class is unknown. The solution is to use client-side locking - guard client code with the lock specified by the object's synchronization policy. For Vector, the lock is the object itself.
Another option is Composition. In a wrapper object, maintain a private internal copy of the object (say List) whose functionality we wish to extend. Add the new functionality as a synchronized method of the wrapper. Add synchronized methods for existing functionality of the wrapped object that simply delegate to the underlying object.

Pro: Less fragile
Con: Minor overhead due to double locking.
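A sketch of the composition approach for a put-if-absent operation. PutIfAbsentList is an illustrative name and only a few delegating methods are shown. The wrapper keeps a private copy of the list and guards every access with its own intrinsic lock, so it does not depend on the wrapped list's synchronization policy.

```java
import java.util.ArrayList;
import java.util.List;

class PutIfAbsentList<T> {
    private final List<T> list;   // private copy, only ever accessed under this object's lock

    PutIfAbsentList(List<T> initial) {
        this.list = new ArrayList<>(initial);
    }

    /** New functionality: check and add as one atomic step. */
    synchronized boolean putIfAbsent(T x) {
        boolean absent = !list.contains(x);
        if (absent) {
            list.add(x);
        }
        return absent;
    }

    // Existing functionality simply delegates, under the same lock.
    synchronized boolean contains(T x) {
        return list.contains(x);
    }

    synchronized int size() {
        return list.size();
    }
}
```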

Document a class's thread-safety guarantees for users of the class; Document its synchronization policy for the class's maintainers.

Use @GuardedBy annotations to document the locks used to guard different state variables.
Since documentation for commonly used Java libraries is vague, we often have to guess whether a class is thread-safe or not. For example, since a JDBC DataSource represents a pool of reusable database connections to be shared across multiple threads, we can assume that it is thread-safe. However, individual JDBC Connection objects are intended to be used by a single thread, and are most likely not thread-safe.

Chapter 5: Building Blocks

Java offers a wide variety of thread-safe classes that can serve as building blocks for large concurrent programs.
External locking is still required for thread-safe classes in order to provide a 'reasonable' behavior.
Compound actions and iteration are examples where additional synchronization is required. For example, helper methods getLast() and deleteLast() on a thread-safe List such as Vector each read the size and then act on the last index. If deleteLast() removes the last element after getLast() has determined the last index to be L but before it accesses the element at L, getLast() throws an ArrayIndexOutOfBoundsException. Note that the data in the List is never corrupted, we just get unexpected behavior.

Java Iterators throw an unchecked ConcurrentModificationException if they detect that the underlying collection has changed during iteration. This is not reliable as the counters used to track whether the underlying collection has changed are not thread-safe.
Iterators can sometimes be hidden. For example, an iterator is used if a collection object is passed to a log statement which tries to get its string representation.
Unreliable iteration can be solved by client-side locking. When using the synchronized wrappers provided by Java, we can use the underlying collection as the lock for composite actions. Con: Decreases scalability.
Another option to avoid concurrent modification exception is to clone the collection as a local copy. The lock on the collection must be held while cloning. Con: Can be very expensive to clone large collections.
To avoid client-side locking during iteration, use the Concurrent collections offered by Java. This improves scalability as multiple threads can now access the collections simultaneously without blocking

ConcurrentHashMap

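Synchronized maps (Hashtable and Collections.synchronizedMap) use a single lock to synchronize all their operations. A plain HashMap is not synchronized at all.
ConcurrentHashMap uses lock-striping. Supports concurrent non-blocking access by an arbitrary number of readers, and concurrent updates by a limited number of writers.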
Iterators do not throw ConcurrentModificationException.
size() and isEmpty() are approximate. These methods are not useful in concurrent environments anyway.
Cannot lock the entire map for synchronized access, needed in rare cases where multiple map entries need to be added atomically.
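Cannot use client-side locking to add new atomic operations. Common compound operations such as put-if-absent, remove-if-equal and replace-if-equal are already provided as atomic methods by the ConcurrentMap interface, which ConcurrentHashMap implements.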

CopyOnWriteArrayList

Thread-safety derived from the immutability of underlying list. Mutability is provided by creating and republishing a new copy of the list on every change. Iterators point to the list at the time the iterator was created.
Copying large lists can be expensive. Hence mainly useful when iteration is a more common operation than modification, for example when using a list of event listeners.

Some more concurrent collections - CopyOnWriteArraySet, ConcurrentLinkedQueue, ConcurrentSkipListMap - concurrent replacement for synchronized SortedMap, ConcurrentSkipListSet - concurrent replacement for synchronized SortedSets
Java offers multiple Queue implementations (esp. BlockingQueue) that can be very useful to implement producer-consumer designs. Queues can be blocking or non-blocking. For non-blocking queues, retrieval operations return null if the queue is empty.
BlockingQueue

blocking methods: take, put (put blocks only if the queue is bounded and full; take blocks whenever the queue is empty)
non-blocking methods: offer, poll
Use bounded blocking queues for reliable resource management. Otherwise, if consumers are slow, producers can keep adding to the queue till the JVM runs out of heap space. Do this early in design; hard to retrofit later.
We can use offer() to check if the item will be accepted by the queue. If the queue is full, the item can be discarded in application specific ways (for example, simply drop it or save it to local disk for later usage).
BlockingQueue implementations - contain sufficient internal synchronization to safely publish objects from a producer thread to a consumer thread

LinkedBlockingQueue
ArrayBlockingQueue
PriorityBlockingQueue
SynchronousQueue - Not really a queue as it does not maintain storage space for elements. Just maintains a list of queued threads waiting to enqueue or dequeue an element. Directly transfers item from producer to consumer - more efficient. Direct handoff also informs producer that consumer has taken responsibility for the item. take() and put() will block if no thread is waiting to participate in the handoff. Mainly used when there are always enough consumer threads.

Deque and BlockingDeque - allows efficient insertion and removal from head and tail.

Enables Work Stealing designs - In producer-consumer design, there is a single queue that is shared across all threads. This causes lots of contention. In work stealing, each consumer has its own deque, from the head of which it consumes items. If its deque is empty, it can steal objects from the tail of some other consumer's deque. Most of the time, a consumer takes objects from the head of its own deque, thereby avoiding contention. Even when it steals, there is little contention as it steals from the tail rather than the head. Work stealing is well-suited for applications where producers are also consumers.

Interruption

Thread.interrupt() interrupts a thread.
Interruption is a cooperative mechanism, i.e., one thread cannot force another to stop what it is doing. Thread.interrupt() merely requests a thread to stop at a convenient stopping point.
If a method is marked to throw an InterruptedException, it means that the method is blocking and that it will attempt to stop blocking early if interrupted.
No language specification about how to deal with interrupts. Most natural option is to cancel whatever the thread is currently doing. Blocking methods that are interruptible make it easy to cancel long-running tasks when necessary.
One common option to handle the InterruptedException is to propagate it to your caller. This can be done by not catching it at all, or by catching and rethrowing it after performing some local cleanup.
In cases where you cannot throw an InterruptedException (for example, inside Runnable.run()), you must catch the InterruptedException and restore the interrupted status by calling Thread.currentThread().interrupt(). This allows code higher up in the call stack to see that the thread was interrupted.
Never catch an InterruptedException and ignore it - except when extending Thread (and therefore controlling all code higher up in the call stack).

Synchronizers - an object that coordinates the control flow of threads based on its state. BlockingQueue is one example of a synchronizer. Latch, FutureTask, Semaphores and Barriers are other examples of synchronizers.

Latch - a synchronizer that can block threads until it reaches its terminal state. A latch acts as a gate. Once open, it remains open forever.

Eg usage: Ensure that a computation cannot proceed until the resources needed by it have been initialized, Wait until all players in a multi-player game have finished their moves.
CountDownLatch - initialized with a positive integer. Threads call await(), which blocks till the counter becomes 0. Other threads call countDown() which decreases the count.
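A sketch using two latches as a starting gate and an ending gate. LatchDemo and runAll are made-up names. Worker threads wait on the start gate so they all begin together, and the caller waits on the end gate until every worker has counted down.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    /** Runs the task in nThreads threads and returns how many completed it. */
    static int runAll(int nThreads, Runnable task) throws InterruptedException {
        CountDownLatch startGate = new CountDownLatch(1);
        CountDownLatch endGate = new CountDownLatch(nThreads);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < nThreads; i++) {
            new Thread(() -> {
                try {
                    startGate.await();         // block until the gate is opened
                    task.run();
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    endGate.countDown();       // always report that this worker is done
                }
            }).start();
        }

        startGate.countDown();                 // open the gate for all workers at once
        endGate.await();                       // wait until the count reaches zero
        return completed.get();
    }
}
```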

FutureTask - mainly used to represent long running or async computation (for example, by the Executor framework)

The computation is encapsulated in a Callable (result-bearing equivalent of Runnable).
FutureTask.get() returns result immediately if computation is done, or if exception is thrown or if cancelled; otherwise blocks till done. Result obtained from get() is safely published.
Once complete, it stays in completed state forever.
Future.get() throws an ExecutionException if Callable.call() throws an exception; the original exception is available via getCause(). Check for all known exception types when handling it. Other exceptions are generally rethrown.

Semaphores

Counting semaphores are used to control the number of threads that can simultaneously access a resource. A thread wishing to use the resource must acquire() a virtual permit and release() it when done. acquire() blocks if no permits are available.
A binary semaphore is a mutex with non-reentrant locking, unlike the intrinsic java object lock which is reentrant.
Can be used to turn any collection into a bounded blocking collection.
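
As a sketch of the last point, a semaphore can bound a synchronized set. This follows the general shape of the bounded set example in the book, but the details below are my own reconstruction and should be treated as an assumption:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Semaphore;

final class BoundedHashSet<T> {
    private final Set<T> set = Collections.synchronizedSet(new HashSet<T>());
    private final Semaphore permits;

    BoundedHashSet(int bound) {
        permits = new Semaphore(bound); // one permit per free slot
    }

    // Blocks while the set is full.
    boolean add(T item) throws InterruptedException {
        permits.acquire();
        boolean added = false;
        try {
            added = set.add(item);
            return added;
        } finally {
            if (!added) {
                permits.release(); // duplicate or failure: give the slot back
            }
        }
    }

    boolean remove(Object item) {
        boolean removed = set.remove(item);
        if (removed) {
            permits.release(); // a slot became free
        }
        return removed;
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The semaphore tracks free slots while the synchronized set handles its own thread safety, so the two concerns stay separate.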

Barriers

Similar to latches, but all threads must come together at the barrier point at the same time in order to proceed.
CyclicBarrier allows a fixed number of threads to rendezvous repeatedly. Useful in parallel iterative algorithms.
If a thread blocked on await() is interrupted or its await() times out, the barrier is considered broken and all other threads waiting on it receive a BrokenBarrierException.
When barrier is successfully passed, await() returns with a unique arrival index per thread, which can be used for leader election amongst the threads.
Also supports barrier action - a Runnable to be executed when barrier is successfully passed but before threads are released.
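
A rough sketch of a CyclicBarrier driving a round-based computation. BarrierDemo is a hypothetical name, the per-round work is elided, and the barrier action simply counts completed rounds:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class BarrierDemo {
    // Runs `parties` threads through `rounds` barrier steps and returns
    // how many times the barrier action ran.
    static int run(int parties, int rounds) {
        AtomicInteger completedRounds = new AtomicInteger();
        // The barrier action runs once per round, before threads are released.
        CyclicBarrier barrier =
                new CyclicBarrier(parties, completedRounds::incrementAndGet);
        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        // ... compute this thread's share of round r ...
                        barrier.await(); // rendezvous with the other parties
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (BrokenBarrierException e) {
                    // Another party was interrupted or timed out; give up.
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completedRounds.get();
    }
}
```

Unlike a latch, the same barrier object is reused for every round, which is what makes it handy for iterative algorithms.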

For building an efficient, scalable result cache, use a ConcurrentHashMap<A, Future<V>> together with putIfAbsent().
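
Below is a sketch of such a cache, modeled on my recollection of the Memoizer example in the book. The Function-based constructor and the exception handling are my own simplifications, so treat them as assumptions:

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

final class Memoizer<A, V> {
    private final ConcurrentMap<A, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<A, V> computation;

    Memoizer(Function<A, V> computation) {
        this.computation = computation;
    }

    V get(A arg) throws InterruptedException {
        while (true) {
            Future<V> future = cache.get(arg);
            if (future == null) {
                FutureTask<V> task = new FutureTask<V>(() -> computation.apply(arg));
                // Atomic check-then-act: only one thread installs the task.
                future = cache.putIfAbsent(arg, task);
                if (future == null) {
                    future = task;
                    task.run(); // this thread performs the computation
                }
            }
            try {
                return future.get(); // other threads block here until done
            } catch (CancellationException e) {
                cache.remove(arg, future); // drop the cancelled entry and retry
            } catch (ExecutionException e) {
                throw new IllegalStateException("computation failed", e.getCause());
            }
        }
    }
}
```

Caching the Future rather than the value means that a second thread asking for the same key waits for the in-progress computation instead of starting a duplicate one.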

Java Concurrency in Practice - Summary - Part 2

2012-08-26T22:56:00.002-07:00

This is part 2 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 3 - Sharing Objects

Synchronization (for example, by using synchronized blocks) is not just about atomic execution of code blocks; it also influences memory visibility, i.e., it ensures that a thread can see the changes made in another thread. Without synchronization, the Java memory model does not guarantee that a value written by a thread will be seen by another thread on a timely basis or even at all. For example, if proper synchronization is not used, a thread X that relies on a control variable that is set in thread Y may NEVER see any updates to it that are written in thread Y. In most cases, thread X will incorrectly loop forever.
If synchronization is not used, the reordering of operations that multi-processor CPUs perform to improve performance can cause a thread to see an incorrect or partial value written by another thread. Without synchronization, the data can be stale. If a thread first sets variable x to 1 and then y to 2, another thread may see y set to 2 while x is still unset. This can lead to bugs that are very hard to debug.
Always use synchronization whenever data is shared across threads.
Out-of-thin-air safety - A thread always sees the value of a variable that was written by some thread; not some random value pulled out of thin air. Unless declared as volatile, 64-bit numeric variables (long and double) do not have out-of-thin-air safety, because the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations.
Volatile variables provide a weaker form of synchronization.

Volatile variables are specially treated by the compiler (for eg: not cached in registers). So a read of a volatile variable always returns the latest value written by some thread.
When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables visible to A prior to writing the volatile variable become visible to B after reading the volatile variable.
Don't overuse volatile or use it in tricky ways. Synchronized blocks are still necessary for atomic operations.
Volatile is commonly used for a control variable that determines when a thread should exit an infinite loop.
Locking guarantees visibility and atomicity; volatile variables guarantee only visibility.
Use volatile variables only when all the following conditions are satisfied:

Writes to the variable do not depend on its current value, or if it is guaranteed that only a single thread writes to the variable.
The variable does not participate in invariants with other state variables.
Locking is not required for any other reason while the variable is being accessed.
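
The classic case that satisfies all three conditions is a stop flag. A minimal sketch, with StoppableWorker as a hypothetical class name:

```java
final class StoppableWorker implements Runnable {
    // Written by the controlling thread, read by the worker. The write does
    // not depend on the current value and no other state is tied to it.
    private volatile boolean stopRequested = false;

    // Touched only by the worker thread while it runs.
    private long iterations = 0;

    void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        while (!stopRequested) { // always sees the latest write
            iterations++;        // stand-in for one unit of work
        }
    }

    long iterations() {
        return iterations; // call only after joining the worker thread
    }
}
```

Without volatile, the JVM would be free to hoist the read of stopRequested out of the loop and the worker might never stop.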

For server applications, always specify the -server JVM command line argument even while developing and testing, since the JVM does more drastic optimizations in server mode. Some concurrency bugs arise only under these optimizations.
Publishing an object means making it available to code outside of its current scope.

This can be done by:

storing a reference to it somewhere where other code can find it, say a public static field or in a publicly accessible HashMap.
returning it from a non-private method.
passing it to an alien method

a method in other classes
an overridable method in the same class

publishing an inner class instance (this automatically exposes the enclosing instance)

Sometimes we do not want to publish an object since that will break encapsulation. An object that is published when it should not have been is said to have escaped.
Do not allow the this reference to escape during construction. This commonly happens when the constructor registers some inner class with external event listeners or starts a thread. Even if this is the last statement in the constructor, it is possible that a reference to the object may escape before it is fully constructed. Other threads can see the partially constructed object and react incorrectly. Use a separate start() method to start a thread created in the constructor, or to register event listeners created in the constructor. Alternatively, to do it in one step, use a newInstance() factory method that calls the constructor and then automatically calls start() before returning the newly created object.
Calling an overridable instance method from the constructor also allows this to escape before being fully constructed.
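
A sketch of the factory-method approach. EventSource and SafeListener are hypothetical classes made up for illustration. The constructor only creates the listener; registration happens after construction has finished:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event source that listeners register with.
final class EventSource {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    synchronized void register(Consumer<String> listener) {
        listeners.add(listener);
    }

    synchronized void fire(String event) {
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }
}

final class SafeListener {
    private final List<String> seen = new ArrayList<>();
    private final Consumer<String> listener;

    private SafeListener() {
        // The listener captures `this`, but it is not handed to anyone yet.
        listener = this::onEvent;
    }

    // Factory method: construct fully first, then publish the listener.
    static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.register(safe.listener);
        return safe;
    }

    private synchronized void onEvent(String event) {
        seen.add(event);
    }

    synchronized int eventCount() {
        return seen.size();
    }
}
```

Because the constructor is private, callers cannot accidentally bypass the factory and register a half-built object.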

Thread confinement, i.e., make sure data is accessed only from one thread, is the easiest way to achieve thread safety.

Swing UI framework & JDBC connection objects use thread confinement extensively.
Thread confinement options:

Ad-hoc thread confinement - programmer entirely responsible to confine object to thread - no language features used. Not recommended due to fragility.
Stack confinement - Object can be reached only through local variables

Primitive types are always stack confined.
Care should be taken that object references do not escape.

ThreadLocal - provides get and set methods that maintain a separate copy of a value for each thread. Used as: new ThreadLocal<T>() { public T initialValue() {...}}
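
A small sketch of ThreadLocal in action, using a hypothetical PerThreadCounter whose count is independent in every thread:

```java
final class PerThreadCounter {
    private final ThreadLocal<int[]> count = new ThreadLocal<int[]>() {
        @Override
        protected int[] initialValue() {
            return new int[1]; // each thread lazily gets its own array
        }
    };

    // Increments and returns the calling thread's private counter.
    int increment() {
        return ++count.get()[0];
    }
}
```

Two calls from one thread return 1 and then 2, while the first call from any other thread returns 1 again, since no state is shared.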

Immutability - Immutable objects are always thread-safe.

Even if all fields of an object are final, it may still not be immutable as some of its final fields can refer to mutable objects.
Final fields provide initialization safety as they have special semantics under the Java Memory Model. Make all fields of a class final unless they really need to be mutable
When a group of related data items must be processed atomically, consider creating an immutable holder class. When using an immutable holder class, we may be able to avoid a synchronized block.
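
Here is a sketch of the immutable holder idea, loosely based on the factorization cache example I remember from the book. The names OneValueCache and FactorCache and the null handling are my own choices:

```java
import java.math.BigInteger;
import java.util.Arrays;

// Immutable holder: a number and its factors always travel together.
final class OneValueCache {
    private final BigInteger lastNumber;
    private final BigInteger[] lastFactors;

    OneValueCache(BigInteger number, BigInteger[] factors) {
        lastNumber = number;
        // Defensive copy so the caller cannot mutate our state later.
        lastFactors = (factors == null) ? null : Arrays.copyOf(factors, factors.length);
    }

    BigInteger[] getFactors(BigInteger number) {
        if (lastNumber == null || lastFactors == null || !lastNumber.equals(number)) {
            return null; // cache miss
        }
        return Arrays.copyOf(lastFactors, lastFactors.length);
    }
}

final class FactorCache {
    // A single volatile write swaps both values at once; no lock needed.
    private volatile OneValueCache cache = new OneValueCache(null, null);

    BigInteger[] lookup(BigInteger number) {
        return cache.getFactors(number);
    }

    void store(BigInteger number, BigInteger[] factors) {
        cache = new OneValueCache(number, factors);
    }
}
```

A reader either sees the old holder or the new one, never a number paired with some other number's factors.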

Safe publication

Simply storing a reference to an object into a public field is not safe, as it could lead to other threads seeing the object in a partially constructed state (due to reordering).
Immutable objects can be published through any mechanism; no synchronization necessary.
Others must be safely published, i.e., both the reference to the object and the object's state must be made visible to other threads at the same time. A properly constructed object can be safely published by:

Initializing an object reference from a static initializer. This is often the easiest way; static initializers are executed by the JVM at class initialization time which has JVM-internal synchronization.
Storing a reference to it in a volatile field or AtomicReference.
Storing a reference to it into a final field of a properly constructed object
Storing a reference to it into a field that is properly guarded by a lock, for example by placing it in a thread-safe collection such as Vector or synchronizedList.

Effectively immutable objects must be safely published.

Objects that are not technically immutable, but whose state will not be modified after publication are called effectively immutable. Safely published effectively immutable objects can be safely used by any thread without additional synchronization. For example, the Date object is often used as an effectively immutable object although it is technically mutable.

Mutable objects must be safely published, AND must be either thread-safe or guarded by a lock.
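
A brief sketch of two of the safe publication idioms above, using Date as the effectively immutable object. SessionInfo and its members are hypothetical names:

```java
import java.util.Date;

final class SessionInfo {
    // Idiom 1: static initializer. The JVM's class-initialization
    // synchronization publishes this object safely.
    static final Date SERVER_START = new Date();

    // Idiom 2: volatile field. A reader that sees the reference also sees
    // the Date's state as of the write.
    private volatile Date lastLogin;

    void recordLogin(Date when) {
        // Copy so nobody can modify the Date after publication; it is
        // then effectively immutable.
        lastLogin = new Date(when.getTime());
    }

    Date lastLogin() {
        Date current = lastLogin;
        return (current == null) ? null : new Date(current.getTime());
    }
}
```

The defensive copies are what keep the Date effectively immutable here; without them a caller could still mutate the published object.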

Java Concurrency in Practice - Summary - Part 1

2012-08-26T00:53:00.000-07:00

This is part 1 of my notes from reading Java Concurrency in Practice.

NOTE: These summaries are NOT meant to replace the book. I highly recommend buying your own copy of the book if you haven't already read it.

Chapter 1 - Introduction

Writing correct concurrent programs is very hard.
Threads are the easiest way to effectively use multi-processor systems, which are now ubiquitous.
When writing multi-threaded programs, we must pay attention to the following:

Safety - Nothing bad ever happens, i.e. program correctness is guaranteed irrespective of interleaved execution.
Liveness - Something good eventually happens. For eg: no deadlock.
Performance - Something good happens fast enough. For eg: no excessive context switches.

Many Java frameworks (GUI toolkits, RMI, Timers, etc) internally use threads. So your code must be thread-safe even if you do not explicitly use threads.

Chapter 2 - Thread Safety

Writing concurrent programs is all about correctly managing access to shared, mutable state. Threads are just one kind of mechanism.

An object's state = any data that can affect its externally visible behavior.
An object's mutable state needs to be protected from uncontrolled concurrent access from multiple threads.

A class is thread-safe if it continues to behave correctly when accessed from multiple threads, with no additional synchronization or coordination required of the calling code. In the absence of formal specifications (i.e., invariants constraining an object's fields, postconditions defining the effect of operations on the object etc), we assume that the single-threaded behavior of a class is its correct behavior (after verification, of course!).

It is much easier to design a class to be thread-safe than to retrofit thread-safety into it later.
It is easier to make a class thread-safe if its state is private. In other words, follow good OO practices.
Thread-safe classes encapsulate any needed synchronization so that calling code need not provide their own.

Stateless objects are always thread-safe.
The most common race condition is associated with check-then-act sequences. Lazy initialization of expensive objects is a common place where check-then-act is used.
Race condition != data race. Data race happens when a thread writes a variable without synchronization and another thread tries to read it - the reading thread may see partial or completely incorrect data.
If all you need is a thread-safe counter, just use java.util.concurrent.atomic.AtomicLong. If multiple pieces of state are involved, this is not sufficient - further synchronization is necessary.
synchronized block - Java's built-in locking mechanism for enforcing atomicity

synchronized block is associated with an object that serves as the lock, and a block of code to be guarded.
Every java object can act as a lock for a synchronized block. These built-in locks are called intrinsic or monitor locks.
Intrinsic locks are mutexes; i.e., only one thread can own it at a time.
Intrinsic locks are reentrant - a thread can immediately acquire a lock that it is currently holding.

Each mutable variable that is read/written from multiple threads must be guarded by synchronization with the SAME lock object EVERY TIME it is read/written.

Use @GuardedBy("lockobject") annotation on each mutable shared variable to document the locking strategy.

For every invariant that involves more than one variable, all the variables involved in the invariant must be guarded by the same lock.
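
To make the last few points concrete, here is a sketch of a class with a two-variable invariant (lower <= upper) guarded by one intrinsic lock, plus an independent counter for which AtomicLong is enough. GuardedRange is a hypothetical example, and I note the guard in comments instead of using the @GuardedBy annotation to avoid needing an extra jar:

```java
import java.util.concurrent.atomic.AtomicLong;

final class GuardedRange {
    // Independent piece of state: an atomic variable is sufficient.
    private final AtomicLong updates = new AtomicLong();

    // Both fields are guarded by "this". Invariant: lower <= upper.
    private int lower = 0;
    private int upper = 0;

    // Both variables change under the same lock, so no thread can ever
    // observe a state where the invariant is broken.
    synchronized void set(int newLower, int newUpper) {
        if (newLower > newUpper) {
            throw new IllegalArgumentException("lower must be <= upper");
        }
        lower = newLower;
        upper = newUpper;
        updates.incrementAndGet();
    }

    // Reads take the same lock as writes.
    synchronized boolean contains(int value) {
        return lower <= value && value <= upper;
    }

    long updateCount() {
        return updates.get();
    }
}
```

Using two AtomicIntegers for lower and upper would not work, because the check and the two writes need to happen as one atomic step.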

Accessing S3 data in Spark

2012-07-10T23:16:00.000-07:00

Before running a Spark job on EC2, the input data is typically copied from S3 to a local HDFS cluster. The Spark jobs read the data from HDFS instead of directly from S3. When I tried making the Spark job read directly from S3 by specifying a path of the form s3n://AWS_KEY_ID:AWS_SECRET_KEY@BUCKETNAME/mydata, I kept getting the following exception:

org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/mydata' - ResponseCode=403, ResponseMessage=Forbidden

My AWS_SECRET_KEY contained a "/", which I did correctly escape with a %2F. But I still kept getting the exception. I was able to work around the problem by applying the following patch to Spark.

When I run my Spark program, I pass in the access key id and secret key from the command line as follows

scala  -Djava.library.path=/root/mesos/build/src/.libs -Dmesos.master=master@MASTER_HOST:5050 -DawsAccessKeyId=MyAccessKeyId -DawsSecretAccessKey=MySecretKey ...

The path to the data is simply specified as s3n://BUCKETNAME/mydata. There is no need to specify the AWS key id and secret key in the URI anymore.

There probably is some better way to do this. If someone knows how, please do leave a comment.

tcpdump tutorial

2012-06-21T01:21:00.000-07:00

Today, I needed to use tcpdump after a very long time. Hence, I did the most natural thing: GOOGLE tcpdump tutorial. I was pleasantly surprised to find that the 3rd link in the search results was a tcpdump tutorial I had written myself 6 years ago while I was a Teaching Assistant for the undergrad networking class at UC Berkeley : http://inst.eecs.berkeley.edu/~ee122/fa06/projects/tcpdump-6up.pdf

Looks like this tiny tutorial I wrote has had more impact than my PhD thesis :-)

Dog-pile Effect : Squid versus Apache Traffic Server

2012-06-21T01:14:00.002-07:00

When using a forward web proxy cache, it is possible to encounter the dog-pile effect. When a new page suddenly becomes very popular or when a popular page expires from the cache, the proxy will receive a large number of requests for the page at the same time. There are two ways to handle this:

Since the page is not already in the cache, the proxy forwards each request to the origin server, OR
The proxy forwards the first request to the origin server, and queues the others till the response to the first request fills the cache.

Option 1 leads to the dog-pile effect. The origin server is rapidly bombarded with a large number of requests. This is usually problematic. The server slows down and the requests keep piling up.

Option 2 is called connection collapsing or collapsed forwarding. Squid supports this feature - http://www.squid-cache.org/Doc/config/collapsed_forwarding/. However, Apache Traffic Server currently does not support it (thanks to the super-responsive folks on the traffic-server IRC channel for confirming this). It used to be supported, but was removed since the implementation was buggy: http://mail-archives.apache.org/mod_mbox/trafficserver-commits/201102.mbox/%3C20110209171030.3247A23889BF@eris.apache.org%3E

Someone please disrupt the furniture industry!

2012-03-18T22:50:00.000-07:00

Many things are cheaper online than in brick and mortar stores. Not so for furniture. Based on my (limited) experience buying furniture for my new home over the last 2 months, online furniture stores are way more expensive than your local furniture showroom. http://www.homelement.com/Living-Room/Living-Room-Sets/Lambeth-Sofa-Set-Homelegance-p-25911.html lists a sofa + loveseat combo for $1838. I got the identical sofa, loveseat and coffee table (listed for $309) for just $1325 after tax with free delivery and setup from a local store! That is a savings of over $800!

I visited three stores before I bought the sofa set. If I had bought it from the first store I visited, my savings would have been just $400. This is the strategy I followed.

The first store is where you pick what you like. You peruse through the catalogs of various manufacturers and find what you like. Then you ask for a price quote. The salesman will consult his secret price list to determine what the furniture costs him, and then typically quotes you double that amount. Negotiate a little bit to get the price down.

Then, armed with that price, you go to the next store. Almost every store carries the same set of catalogs, and even gets its items from the same warehouses. This time, you waste no time - directly ask the salesman what his best price for item X on page Y of manufacturer Z's catalog is. Use the price from the first store as the starting point and negotiate the price down. Since the prices are often marked up 100% or more, there is a LOT of room to negotiate. Then you go to the next store and do the same. If you have nothing better to do, you can keep doing this. However, after the first 3-4 stores, the price stops going down further, and there is no point wasting another valuable weekend day on furniture shopping. At that point, you close the deal. But wait - offer to pay cash if you can and you will most likely get a further discount.

This strategy saves you many 100s of dollars, but takes a lot of time. Visiting three furniture stores can take a whole Saturday. It would have been great if buying furniture was like shopping for a car online. You just pick the items you want from a multi-manufacturer online catalog and then different local dealers bid for your business. That's it - no driving from store to store. That would have been ideal. But that's not the way it is today.

I hope some startup is working to disrupt the furniture industry and make this seamless online furniture shopping experience possible. I only know of one startup in the furniture space - dealdecor.com, which is trying to become the Groupon of furniture. Dealdecor offers one particular piece of furniture every few weeks for half price. That is great, if the item of the month is what you are looking for. It most likely won't be, and you have no option but to visit the local retailers.

The US furniture market is a few 100 billion dollars a year (http://www.marketsize.com/blog/index.php/category/furniture/). There are many challenges to disrupting this industry - resistance from the entrenched local dealers (now, that's a surprise!), finding a good way for shoppers to try out furniture before buying, shipping bulky items, etc. Hopefully, someone will find a way to address these challenges and seize this huge opportunity. As for me, I am done with buying furniture for a while.

Accessing recursive Apache Hive partitions in CDH3

2011-12-04T20:09:00.001-08:00

In this post, I describe the minor Hadoop (0.20.2-cdh3u2) patches required to access data deep inside a multi-level directory structure using hive 0.7. Consider the following directory structure:

We want to issue hive queries involving individual days as well as whole months. For accessing individual days, we define one hive partition per day. For example, we define a partition 2011_01_02 with LOCATION Logs/2011_01/02. To access the whole month of 2011_01, we define a partition 2011_01 with LOCATION Logs/2011_01. However, if you query the 2011_01 partition, you will get no results. This is because hadoop 0.20.2 does not support recursive directory listing.

In order to get this monthly query working, you must first apply the following patch (based on MAPREDUCE-1501, which did not make it into hadoop 0.20.2) to hadoop 0.20.2.cdh3u2. After applying the patch, compile hadoop and point the HADOOP_HOME on the machine running the hive client to the patched hadoop jars. You do NOT have to replace the hadoop jars on the hadoop cluster; the recursive directory listing feature is only needed by the hive client.

In addition to the patched jars, you should also add the following lines to your hive-site.xml:

<property>
  <name>mapred.input.dir.recursive</name>
  <value>true</value>
</property>

After this, querying the 2011_01 partition will work fine.

Quickly find and open files

2011-11-24T13:50:00.001-08:00

I frequently need to find a file that is located deep within the current directory and operate on it -- like opening it in vim or svn diffing it. I can never remember the exact path to the file, and sometimes can't even remember the full name. All I know is that the file is somewhere within the current directory and its sub-directories. So, I end up running the UNIX find command, and then copy-pasting the returned file path into the command of interest. This wastes time. So I wrote a small python script to make it easier.

Copy the script f.py (located at the end of the post) into some directory that is on your PATH. Suppose you are looking for the file that starts with Foo, you just run:

$ f.py Foo*
1) ./subdir1/subdir2/Foo1.java
2) ./subdir1/Foo.java
Enter file number:

Enter the number of the file you are interested in. That will bring up the following menu of operations you can perform on the selected file.

Process ./subdir1/subdir2/Foo1.java
1. vim
2. emacs
3. svn add
4. svn diff
5. open (OSX only)
Enter choice (Default is 1):

If the pattern you specify matches only a single file, the script directly jumps to the operation selection menu. Hope this will save you some key-strokes.

#!/usr/bin/python
# This program is used to easily locate a file matching 
# the user specified pattern within the current directory
# and to quickly perform some common operations (like
# opening it in vim) against it.
import subprocess
import sys
import os

def processFile(fileName):
    """
    Show the user the possible actions with the specified file,
    and prompt the user to select a particular action.

    """

    fileName = fileName.strip()
    print "Process %s" % fileName
    print "1. vim"
    print "2. emacs"
    print "3. svn add"
    print "4. svn diff"
    print "5. open (OSX only)"

    choice = raw_input("Enter choice (Default is 1):").strip()
    
    if choice == "1" or choice == "":
        cmd = "vim %s" % fileName
    elif choice == "2":
        cmd = "emacs %s" % fileName
    elif choice == "3":
        cmd = "svn add %s" % fileName
    elif choice == "4":
        cmd = "svn diff %s" % fileName
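    elif choice == "5":
        cmd = "open %s" % fileName
    else:
        # Any other input would leave cmd undefined, so bail out instead.
        print "Invalid choice: %s" % choice
        return
    print cmd
    os.system(cmd)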


def listFiles(fileNames):
    """ 
    Show the list of files and prompt user to select one 
    """

    fileIndex = 1
    for fileName in fileNames:
        print "%d) %s" % (fileIndex, fileName.strip())
        fileIndex += 1
    choice = raw_input("Enter file number:")
    chosenFileName = fileNames[int(choice)-1].strip()
    processFile(chosenFileName)


if __name__ == "__main__":

    if len(sys.argv) < 2:
        print "Usage: f.py FILE_PATTERN_OF_INTEREST"
        sys.exit(-1)

    pattern = sys.argv[1]
    proc = subprocess.Popen("find . -name \"%s\" | grep -v svn" % pattern, 
        shell=True, stdout=subprocess.PIPE)
    lines = proc.stdout.readlines()
    if len(lines) == 0:
        print "No matching files found. Note you can use wild cards like *"
    elif len(lines) == 1:
        processFile(lines[0])
    else:
        listFiles(lines)

Automatically create Eclipse projects for your Scala projects using sbt

2011-11-09T21:54:00.000-08:00

In my previous post, I described the bare minimum you need to know to get started using sbt. In this post, I will describe how sbt can be used to automatically create the .project and .classpath files you need for loading your project into the Eclipse IDE. Using sbt to create your Eclipse project files ensures that they are always in sync with your build definition. And of course, it saves a lot of the clicks and keystrokes needed to manually specify classpaths in Eclipse.

Step 1
Install the Eclipse plugin for Scala, if you don't already have it installed.
Step 2
Add the following lines to myproject/project/plugins/build.sbt. This tells sbt that you want to use the sbteclipse plugin. Note that this build.sbt is different from the build.sbt at the top level of your project, i.e. in myproject/ directory.

resolvers += Classpaths.typesafeResolver

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse" % "1.4.0")

Step 3
From the myproject/ directory, type

sbt "eclipse create-src"

This step will create the .project and .classpath files required by Eclipse inside the myproject/ directory.
Step 4
Import the project into Eclipse. Follow the menu File > Import > General > Existing Projects into Workspace and browse to myproject.

Quick Start to using Scala Simple Build Tool (sbt)

2011-11-08T22:37:00.000-08:00

After you finish a basic Hello World program in Scala, and want to start your first real Scala project, you will need to choose a build tool. While you can use tools like ant or maven, Simple Build Tool (sbt in short) is a very popular option among Scala programmers. In this post, I describe the bare minimum you need to know to quickly get started on using sbt.

Step 1: Install sbt

On a Mac it is as simple as

sudo port install sbt

or, if you are using homebrew,

brew install sbt

For other operating systems, please see the official Getting Started Setup page.

Step 2: Create your project's directory structure

myproject/
    src/
        main/
            scala/
            java/
            resources/
        test/
            scala/
            java/
            resources/

myproject is your project's top-level directory. resources contains non-code files that are packaged up together with your project, like image or data files.

Step 3: Start writing your project's code
Let's just use a Hello World program. Create myproject/src/main/scala/HelloWorld.scala that contains the following code:

import org.slf4j._

object HelloWorld {
    def main(args: Array[String]) = {
        val logger:Logger = LoggerFactory.getLogger("MyLogger");
        logger.info("Hello World");
    }
}

We use the slf4j logging library instead of a simple println in order to demonstrate how external dependencies are specified in sbt.

Step 4: Create a build definition file
Put the following lines in myproject/build.sbt. Note that the blank lines below ARE absolutely necessary.

name := "Hello World"

version := "1.0"

scalaVersion := "2.9.1"

libraryDependencies ++= Seq(
  "org.slf4j" % "slf4j-api" % "1.6.4",
  "org.slf4j" % "slf4j-simple" % "1.6.4"
)

The libraryDependencies setting specifies the managed dependencies, i.e., the dependencies which are automatically downloaded for you from the Maven repositories. A dependency is specified as groupId % artifactId % revision. Conventions for groupId, artifactId and revision are discussed at http://maven.apache.org/guides/mini/guide-naming-conventions.html. The automatically downloaded dependencies are usually stored in ~/.ivy2/cache.

You don't have to use maven if you don't want to. Instead you can use unmanaged dependencies. Just put the appropriate jars in myproject/lib, and don't specify them in the libraryDependencies setting.

Step 5: Compile, run and package your program

cd myproject
sbt run

The first time you run sbt, it will download all the dependencies (sometimes including the scala version specified in the scalaVersion setting). This means that it will take some time. Subsequent runs will be much faster.

If you just want to compile, type:

sbt compile

You can set up sbt to automatically compile your program as soon as any source file changes. To do so, type:

sbt ~compile

Continuous compilation is a great time-saver.

To package your program for distribution as a jar, type:

sbt package

A jar containing all your compiled classes and resources from myproject/src/main/resources will be found in myproject/target/scala-YOUR_SCALA_VERSION_HERE. Note that dependencies are NOT included in the jar. To include all dependencies into the output jar, you will need to use the assembly plugin. See the next step for pointers to info about plugins.

Step 6: Read the official Getting Started Guide
This post only aims to get you quickly started on sbt. There are tons of features it does not cover -- multiple projects and plugins, just to name two very important ones. To learn how to use these features and to understand the fundamental principles behind sbt (which is in fact just a Scala Domain Specific Language), please read the Official Getting Started Guide. It is long and sometimes too deep, but very useful indeed.

JQueryMobile app with Facebook integration and a Ruby on Rails backend

2011-07-10T15:45:00.000-07:00

I am currently playing around with mobile app development. I want to develop a mobile app that:

Works across multiple devices (iPhone, Android, etc)
Integrates with Facebook
Has a Ruby on Rails backend to store app-specific data

I have implemented a toy jQueryMobile app to demonstrate the implementation of the above requirements. The toy app simply asks you to login with your Facebook credentials and lists your friends. You can try it out by visiting http://jqmfbror.heroku.com using your mobile or desktop web browser. The full source code for app is available at https://github.com/dilip/jqmfbror.

Facebook authentication happens at the Ruby on Rails backend using the Omniauth gem. This means that the backend can uniquely identify repeat visitors and associate app-specific data with them. The backend can use the Facebook access token generated on login to retrieve the user's Facebook data, and pass it on for display on the frontend. Alternatively, the backend can simply pass the Facebook access token to the frontend, which then uses JSONP to retrieve the user's Facebook data. The latter approach is more efficient as it decreases the load on the backend. The toy app uses this approach.

The app also uses MongoDb (instead of mysql) and Heroku (for super-easy hosting of Ruby on Rails applications).

Infographic Resume Creator

2011-06-23T22:07:00.000-07:00

A few months ago, I came across some real cool infographic resumes - http://blog.chumbonus.com/infographic-resumes/. Since I didn't have the Photoshop/Illustrator skills to make my own, I decided to make a web app to automatically create a basic infographic resume from my LinkedIn profile. Thus http://inforesume.heroku.com was born. All you need to do is log in with LinkedIn. Do check it out.

If you want a more fancy infographic resume, http://vizualize.me/ will probably be useful. It's a startup announced on Hacker News just yesterday.

Craigslist Car Finder

2011-05-23T22:02:00.000-07:00

I bought my first car in 2006 through Craigslist. Searching for a used car on Craigslist was a pain -- there was too much junk. I found myself wasting 2-3 hours every day sifting through Craigslist postings looking for good deals. So I wrote a script to automatically parse Craigslist car ads and show me only the ones that may be of interest to me. This script helped me and at least four of my friends find a great deal on Craigslist. Hope it is useful to you too.

The script has the following features:

Allows you to look for only specific car models.
Allows you to ignore coupes.
Allows you to specify price and year range.
Allows you to ignore manual transmission cars.
Compares the price of the vehicle with the Edmunds True Market Value price and highlights good deals.
Highlights cars with low miles.

Here is the script: craigslist.tar.gz.

In 2006, PERL was the only scripting language I knew. It was an absolute horror to do object oriented programming in PERL. I wish I knew Python and Ruby back then.

If you find the script useful or find bugs, please do send me an email.

dealwall.me : Procrastinating about taxes

2011-04-02T15:01:00.000-07:00

Last weekend, I was supposed to do my taxes. But, I found a good way to procrastinate -- build a web application. The result was dealwall.me, a site where you can boast about the Groupon deals you bagged. This weekend, I am procrastinating by blogging about it.

My wife and I are huge fans of Groupon. Through Groupon, we have tried numerous great restaurants, taken flying lessons, taken swimming lessons, biked the bay, etc. We thought it would be cool to blog about our Groupon adventures. Why make a static page, when you can make a web app! And thus dealwall.me was born. Please check out our deal wall at http://dealwall.me/4d91637b4570a36150000001.

"Build" is not really the right word to describe my activities over the weekend. It was more like cobbling things together. There are so many fantastic technologies and libraries available today that putting together a simple web application is quick, simple and frustration-free. In the rest of this article, I will describe the technologies/libraries that helped me put together dealwall.me over a weekend.

Where should I store my code?

Before writing any code, I needed a version control system. I went with Mercurial hosted on Bitbucket. The main reason I went with Mercurial/Bitbucket instead of git/github is that Bitbucket offers free private repositories. I am extremely happy with Mercurial/Bitbucket, and do not miss git at all.

What web framework should I use?

I went with Ruby on Rails. I had played with Ruby on Rails a couple of years ago when I was a grad student. These days, I do a lot of programming in Python/Django at my day job. I like Ruby on Rails a lot more than Django. Ruby on Rails code just looks much nicer to me, and it has better libraries, which mitigate a lot of programming frustration.

Where do I store data?

A year ago, the answer was obvious - a relational database like MySQL. Today, there are many NoSQL choices, which are very apt for web apps like the one I was cobbling together. I went with mongoDB, mainly because of the very positive experience I had with it at work. It was super simple to install and use.

I did consider building a CouchApp using CouchDB. It probably would have been a good fit for my simple app. However, I wanted to finish the app over the weekend, and the learning curve associated with CouchDB would not have made that possible (based on my experience twiddling with CouchApps a few months ago).

What object relational mapper should I use?

If I were using SQL with Rails, the most obvious choice would have been Rails' own ActiveRecord. But I was using MongoDB. I chose the lightweight Mongomatic instead of the more heavyweight Mongoid and MongoMapper ORMs. Mongomatic is a very thin layer on top of the MongoDB Ruby driver. Since this was my first mongoDB + Ruby on Rails app, I wanted to have full control over how and when data is stored and accessed. Mongomatic is very simple to use, and works great for my app's very simple model -- a wall which has a list of deals within it.
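
To make the "wall which has a list of deals within it" idea concrete, here is a minimal sketch of what such a document could look like, written as plain Python dictionaries rather than Ruby/Mongomatic code. The field names (owner, deals, title, url, bought_on) and the sample values are invented for illustration and are not the app's actual schema.

```python
import json
from datetime import date


def new_wall(owner):
    """Create an empty wall document. Field names are hypothetical."""
    return {"owner": owner, "deals": []}


def add_deal(wall, title, url, bought_on):
    """Embed a deal inside the wall document instead of a separate table."""
    wall["deals"].append({
        "title": title,
        "url": url,
        "bought_on": bought_on.isoformat(),
    })
    return wall


# Made-up sample data.
wall = new_wall("example-user")
add_deal(wall, "Intro flying lesson", "http://example.com/deals/flying",
         date(2011, 3, 5))
add_deal(wall, "Dinner for two", "http://example.com/deals/dinner",
         date(2011, 3, 19))

# The whole wall, deals included, is one JSON-like document.
print(json.dumps(wall, indent=2))
```

Because the deals live inside the wall, showing a wall needs just one document lookup and no joins, which is a big part of why a document store suits an app this simple.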

How do I make my web app look pretty?

I am not good at making websites look professional. So I needed all the help I could get. That's where Blueprint, Compass and Fancy-Buttons provide a lot of support. Blueprint is a CSS framework that provides a grid on which you can easily align various page elements. Along with nice-looking default typography and sane style defaults, Blueprint gives a jumpstart towards building a professional looking website.

Compass is a stylesheet authoring framework that makes it easy to organize your CSS. No more CSS files containing hundreds of lines. Instead, Compass allows you to break up your CSS into smaller re-usable parts called plugins, and to compose them into bigger files just like you use #include in C++. Fancy-buttons is a compass plugin that provides neat looking CSS buttons. See http://codefastdieyoung.com/2011/03/want-to-move-fast-just-do-this-part-1-design for an excellent article on how to quickly design a professional looking web app.

How do I make my app social?

Super simple. Copy some JavaScript from the Facebook dev site to embed Facebook Like buttons and the new Facebook commenting system. I also added a ShareThis widget to enable easy sharing through email, Twitter, and other social-networking sites.

Where do I host my app?

The code has been written. The design looks reasonable. Everything has been checked into Bitbucket. The next task is to host the app somewhere so that it is accessible to the world. I first considered using the free Amazon EC2 instance. Since I did not have time to set up the webserver and other software infrastructure needed to host a Ruby on Rails app, I chose a shortcut - Heroku. Heroku is probably the easiest way to deploy a Ruby on Rails app. It is a real gem -- it was extremely simple to sign up and get my first app running. The whole process took less than an hour. Deploying an app is as simple as one command -- hg push git+ssh://git@heroku.com:.git. Since Heroku relies on git, I had to install hg-git using the instructions at http://smith-stubbs.com/notes/2010/04/30/deploying-to-heroku-with-mercurial.

By default, Heroku offers postgres as the data store. Since I was using mongoDB, I installed the mongoHQ add-on. It took just one click. I chose the free 16MB plan. If my app takes off, that won't be sufficient. But I will worry about upgrading to the $5/month plan if and when my app takes off.

Where do I get my domain?

I got mine from GoDaddy, $8.99 for dealwall.me. This was the only part where I had to shell out real money. I usually buy my domains through my hosting account at bluehost. However, they don't sell .me domains.

How do I make money?

Yes, I have a plan to make money. Of course, this assumes that the app takes off :-). I have signed up as a Groupon affiliate. All links associated with the deals users post on their walls are linked back to Groupon through my affiliate link. I also display the Groupon daily deal widget on the side of each wall. Anytime someone buys a Groupon deal through dealwall.me, I make money.
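
As a rough illustration of routing deal links through an affiliate link, the sketch below tags a URL with an affiliate identifier using Python's standard urllib.parse. The parameter name affiliate_id, the placeholder ID, and the example URL are all made up. Real affiliate programs define their own link format, so treat this purely as a sketch of the idea.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

AFFILIATE_PARAM = "affiliate_id"  # hypothetical parameter name
AFFILIATE_ID = "my-affiliate-id"  # placeholder, not a real ID


def with_affiliate(url, affiliate_id=AFFILIATE_ID):
    """Return url with the affiliate parameter set exactly once."""
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any stale affiliate tag.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != AFFILIATE_PARAM]
    query.append((AFFILIATE_PARAM, affiliate_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = with_affiliate("http://example.com/deals/dinner?city=sf")
print(tagged)
```

Rewriting the link when the wall is rendered, rather than when the deal is stored, would let the affiliate ID change later without touching saved data.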

How do I get users for my app?

So, if I want to make money, I need users (obviously!). Right now, I have just asked a few friends to try the app out. I am planning to publicize the app on Facebook, by "Liking" it and also by sharing my deal wall. However, I am going to wait for a week or two. Currently, the Facebook activity streams of everyone I know are inundated with messages about India's victory in the Cricket World Cup. Notifications from my little app have absolutely no chance of getting noticed.

This is a lesson I learnt when publishing my first Android app -- frync. You need to publish at the right time. When an app is published, it stays on the "Just In" list for at least a few hours. I should have published the app just after Christmas, when everyone would have been playing with the new smartphones they had received as gifts. The app would have been noticed much more and would have received more installs, just by virtue of being visible at the right time.

What next?

Doing my taxes, of course. As soon as I finish writing this blog entry... and if I don't get distracted into more web app building.

The app is running fine at dealwall.me. It is still hard to use. Since Groupon (obviously) does not offer an API to retrieve a particular user's deals, users have to log in to Groupon and then paste the HTML source of their All Groupons page into my app. I am still thinking about how to make this process easier. Don't understand what I am talking about? Please try creating your wall at dealwall.me. If you have any ideas about how to make it better, please leave a comment.