Bounded Analysis of Concurrent Programs Through Sequentialization
Gennaro Parlato is an associate professor at the University of Southampton, and member of its Academic Centre of Excellence for Cybersecurity Research. He has a broad interest in software verification, and has worked both on theoretical aspects as well as on practical tools for analyzing heaps, concurrency, security policies, etc.
Developing correct, scalable, and efficient concurrent programs is a complex and difficult task, due to the large number of possible concurrent executions that must be considered. Modern multi-core processors with weak memory models and lock-free algorithms make this task even harder, as they introduce additional executions that confound the developers’ reasoning. Due to these complex interactions, concurrent programs often contain bugs that are difficult to find, reproduce, and fix. Stress testing is known to be very ineffective in detecting rare concurrency bugs as all possible executions of the programs have to be explored explicitly. Consequently, testing by itself is often inadequate for concurrent programs, and needs to be complemented by automated analysis tools that enable detection of errors in a systematic and symbolic way.
Sequentialization is a technique for the analysis of concurrent programs that exploits mature analysis techniques and tools that were originally designed for sequential programs. It can be implemented as a code-to-code translation from the concurrent program into a corresponding non-deterministic sequential program that simulates all the targeted executions of the original program. The sequential program contains both the mapping of the threads in the form of functions, and an encoding of the scheduler, where non-determinism is used to capture thread interleavings.
In the first part of this series of lectures, I will cover some of the sequentializations proposed in literature, and show that the existing sequentializations based on bounding the number of execution contexts, execution rounds, or delays from a deterministic task-scheduler rely on three key features for scalable concurrent program analyses: (1) reduction to the sequential program model, (2) compositional reasoning to avoid expensive task-product constructions, and (3) parameterized exploration bounds.
In the second part, I will cover the design and the implementation of two sequentialization schemes targeted at using Bounded Model Checking tools as backends. These code-to-code translations are carefully designed to introduce very small memory overheads and very few sources of nondeterminism, so that they produce tight SAT/SMT formulae, and are thus very effective in practice.
Finally, I will present a new swarm verification approach based on a parametrizable code-to-code translation to generate a set of simpler program instances, each capturing a reduced set of the original program’s interleavings that can then be checked independently in parallel. This approach does not depend on the tool that is chosen for the final analysis, is compatible with weak memory models, and amplifies the effectiveness of existing tools, making them find bugs faster and with fewer resources.