SAAPP: State of the Art in tools
As part of Sun ONE Studio, LockLint performs a static analysis of the use of mutex and multiple readers/single writer locks, and looks for inconsistent use of these locking techniques. In looking for consistent use of locks, LockLint detects the most common cause of data races and deadlocks: the failure to hold the appropriate lock while accessing a shared variable. For information about how to use LockLint, see Chapter 5 and Appendix A of Analyzing Program Performance With Sun WorkShop for Sun WorkShop 5.0, available on http://docs.sun.com. (Search on LockLint.) This release of LockLint is available on the Solaris[tm] operating environment (SPARC[tm] Platform Edition) versions 7, 8, and 9.
The lock covers technique [DIS91] is a relaxation of the happens-before approach for programs that make heavy use of locks. By extending the lock covers technique discussed earlier and at the same time discarding the underlying happens-before apparatus Savage et al. present a tool called Eraser based on their lockset algorithm for dynamically detecting data races in lock-based multithreaded programs on binary rewriting techniques [SBN+97]. This has severe limitations due to inadvertent interactions with the operating system, more difficult scheduling control, less general programs (e.g. only user-mode), etc.
Based on the algorithms and implementation of the Eraser tool by Savage et al. [SBN+97] Compaq provides a runtime debugging and analysing tool for multithreaded programs called Visual Threads [HAR00]. It can be used to automatically diagnose common problems associated with multithreading including deadlock, protection of shared data (Tru64 UNIX only), and thread usage errors. It can also be used to monitor the thread-related performance of the application, helping you to identify bottlenecks or locking granularity problems.
In in VeriSoft [GHJ98] dynamically on-the-fly system, Godefroid et al. developed a system similar to what we plan. VeriSoft systematically explores the state spaces of systems composed of concurrent processes executing arbitrary code without storing any intermediate states in memory. The key to make this approach tractable is to use a new search algorithm built on existing state space exploration techniques known as partial order methods [GOD96]. Although their system, like ours, does not need the construction of a formal model of the software system to be examined, VeriSoft requires the careful (informal) modelling of the environment. In addition, it seems the communication primitives are limited to specific shared variables, semaphores and FIFO buffers, and the systems to be analysed must be available in source code.
In the RecPlay tool [ROB00], Ronsse and Bosschere combines execution replay with automatic on-the-fly data race detection. RecPlay detects the first feasible data race in an execution as long as the program only uses the Solaris (or POSIX) synchronization primitives. Using a binary rewriting technique called JiTI (Just in Time Instrumentation) [ROB98], RecPlay tries to limit both the memory consumption and the intrusion caused by the monitoring code. In brief, RecPlay works in three phases. In the first phase, all synchronisation events and a limited amount of general execution information is traced. In a second phase, the information from the first phase is used to guide a second execution. An equivalent re-execution is only possible in the absence of data races. If data race is detected, a third phase is used to track the error back to the source code. Although RecPlay tries to minimize the intrusion, the simulator approach we suggest will have zero intrusion.