Lab 4: Modeling and Verification using UPPAAL
See also the lab introduction slides which can be found here.
The goal of this assignment is to provide initial hands-on experience with formal modeling and verification of real-time systems. This is done using the Timed Automata model and the model checker UPPAAL.
Working in Groups
Solve this assignment in your groups. All students participating in the group shall be able to describe all parts of the solution.
The report must consist of answers to all parts of this assignment. Submit the report via Student Portal no later than Wednesday, 14.10., 23:59 .
UPPAAL is a tool for modeling, simulating and verifying (via model checking) timed automata. Unless you already used it earlier, you probably want to get familiar with it first: UPPAAL tutorial.
Please do not submit any report for the tutorial. The tutorial is there to help you, it is not part of the assignment.
You might also check out the home page of UPPAAL for further documentation and help.
In order to start UPPAAL on the Solaris machines in the department, use:
Working At Home
To work at home you need to download the tool from the UPPAAL website: . It is available for Windows, Linux and Mac OS X. When downloading please do us a favor and enter your real name and affiliation. To run UPPAAL you will also need a Java Runtime Environment 6 or newer which you can get for example from java.sun.com.
Part 1: Verification Warm-Up
The main purpose of this part is to familiarize yourself with the UPPAAL tool and with its basic model checking abilities.
Model the following three automata as three (separate) processes in Uppaal. The global variable i is of type integer.
To get a feeling for the behaviour of the system, try to simulate it. Then answer the following questions using verification:
- Can the value of the variable i become strictly greater than 200?
- Is it true that no matter what happens (always) is the value of i smaller than or equal to 100?
- Is it true that no matter what happens (always) is the value of i strictly smaller than 100?
- Is there a run of the system in which i never exceeds 42?
- Is it true that no matter what happens (always) is the value of i greater than or equal to 0?
What to report:
The answers to the questions together with the queries used for determining the answer.
Part 2: Scheduling
In this part, we will model a variant of the well-studied multiprocessor scheduling problem, the so called "task graph scheduling on heterogeneous multiprocessors". Given is a set of jobs which have to be assigned to a set of multiprocessors with the goal of minimizing the execution time of the schedule. Let's assume that jobs can't be preempted and also do not migrate between the processors while they are running. We consider two extensions of the basic model:
- Jobs have dependency constraints. Some jobs may only be ready for execution after other jobs finished. Our schedule should honor these constraints.
- Each job has a specific execution time on each processor. This means that the execution times on different processors can be different. Imagine for example a system with two processors: one that operates with a high clock rate but does not provide floating point (FP) instructions, and one with a lower clock rate but with native FP instructions. Some tasks that don't need FP operations will execute faster on the first processor. Others may need to emulate FP instructions and may thus finish earlier on the second processor even with a lower clock rate.
Formally, each job J is modeled as a tuple (C_1, ..., C_M) containing the execution times of that job on each of the M processors. Additionally, each job has a list of job names it needs to wait for before it can execute. Consider the following example:
The example contains five jobs A to E with dependency constraints and different execution times on the two available processors. The dependencies can be visualized as a directed acyclic graph (DAG):
There are many different ways in which these jobs can be scheduled, for example:
The left schedule takes 10 time units to complete, the right one only 8. It is easy to see that the right schedule must be optimal, since jobs A and C must execute after one another. This results in an overall execution time of at least 3+5 which are the shortest possible execution times of both jobs. In fact, this is the only optimal schedule for the example set. (Why?) In general, it is difficult to find an optimal schedule.
We will now use UPPAAL to model this problem in order to find optimal schedules and to prove their optimality.
Simple UPPAAL model
As a first approach, we simplify the problem and ignore the dependency constraints for now. The UPPAAL model will contain two different types of timed automata:
- One automaton for each job
- One automaton for each CPU
Jobs communicate with the CPUs via channels in order to "use" them. The CPU automaton (template) looks as follows:
Initially, the CPU is idle and waits for a job to request its use. When this happens, the CPU resets local clock x and waits in the InUse location until the required time passed. It then sends a signal back to the job that it is done. Note that the channels "use" and "done" and also the time bound "C" will be template parameters that we can later instantiate with the concrete variables for each individual CPU. Thus, the "Parameters:" field of the template should contain:
- chan &use, chan &done, int &C
Using that, we can instantiate two CPUs in the "System declarations" as follows:
- CPU1 = CPU(CPU1_use, CPU1_done, CPU1_C);
- CPU2 = CPU(CPU2_use, CPU2_done, CPU2_C);
In order for this to work, the global declarations need to contain the channels and variables:
- int CPU1_C;
- chan CPU1_use, CPU1_done;
- int CPU2_C;
- chan CPU2_use, CPU2_done;
Enter the template "CPU" into UPPAAL together with the global and system declarations mentioned above. Based on this, your first task now is to model the "Job" automaton (template). Create a new template "Job" that takes two integer parameters C1 and C2 for the execution times of that job on both processors. Your automaton should non-deterministically try to use one of the CPUs by synchronizing on the appropriate channel. When it succeeds in doing so, it needs to tell the CPU how long the execution time will be (via writing the value to the global variable CPU1_C or CPU2_C), and wait for completion.
Your template needs to be instantiated in the "System declarations" just as we instantiated the CPUs above. Further, you need to define the system as a composition of all automata, like this:
- system JobA, JobB, JobC, JobD, JobE, CPU1, CPU2;
Before we can try out the model, we need to complete it with an important detail. We would like to measure the overall execution time of the schedule. Thus, we need to measure global time, and need a way to notice when all tasks are completed. For this, add a global integer counter "done_counter" and a global clock "y" to the global declarations. Let your job automaton increment the counter when its done. We can now finally express the property we want: What is the minimum value of clock "y" with "done_counter" being equal to the number of jobs (5 in our example)?
In order to answer this question, use the verifier with an appropriate reachability query. Setting the "Diagnostic Trace" option to "Fastest" will produce a trace in the simulator that corresponds to the minimal overall execution time. Is it smaller/greater than the optimal schedule from above? Why/why not?
Extended UPPAAL model
Now, we want to express the dependencies between the jobs. Extend your model with a mechanism such that jobs respect the dependencies, i.e., they can only acquire a CPU when all preceding jobs are finished. You should only change the "Job" template for that purpose, leaving the "CPU" template unchanged. Using global variables may be helpful, e.g., flags.
When you are done, try again the above example. Does the generated trace correspond to the optimal schedule above? Further, generate queries to the verifier that directly prove the optimality of the result. (I.e., without the "Diagnostic Trace" feature.) Is one query enough?
What to report:
- The "Job" template for the extended model, including a short description of what you did and why.
- All queries you used, i.e., for generating the trace as well as for proving optimality of the result. Please include a short description as well.
- Minimal execution time for the following problem instance:
Part 3: A Deadlock
In this last part of the assignment we will see how to model concurrent systems and how to detect subtle bugs in their design. As a basis for this, we look at the producer-consumer-buffer model which you implemented in the Ada lab.
Let's revisit the problem: Three processes are working together where one is the Buffer that stores values produced by the Producer and later consumed by the Consumer. All three processes communicate via synchronous channels, i.e., they are blocked as long as the other party is not ready. However, they can use a timeout for the waiting. The Buffer has limited storage capacity. The Consumer should finish consuming the values as soon as a certain sum is reached. It is then its duty to tell the other two processes to stop working.
Some of the details which you had to implement in Ada we abstract away now. In particular, we assume that all values for the Buffer are the same (20, let's say) and that the Buffer size is relatively small (3 items, let's say). This greatly reduces the state space, since now the Buffer only needs to store one counter. The "bad" effects will be observable even in this simplified setting. To summarize, pseudo-code of the three processes you implemented in Ada stripped down to only the key functionality may look as follows:
// Buffer while true: select when (items > 0) => accept get; items--; or when (items < 3) => accept put; items++; or accept finish; break; end select; end while; // Producer while true: select delay random(0..3); Buffer.put; or accept finish; break; end select; end while; // Consumer while true: delay random(0..3); if sum < 100: Buffer.get; sum := sum + 20; else: break; end while; Buffer.finish; Producer.finish;
We would like to model the above pseudo-code in UPPAAL. As an example, the Producer may look as follows:
Look at how we modeled different aspects of the code like the random delay or the choice between delay and accepting the signal. Note that we prefixed the channel name in order to distinguish it from the channel for the Buffer. Enter the above Producer model in UPPAAL and create two more timed automata for the Buffer and the Consumer. Try to stay as close as possible to the pseudo-code. You may need to declare local variables (like sum, items, clocks) and global ones (like channels). The "last" location of each automaton should be called "done" for later reference. Also, make sure that you model enough locations, i.e., each semicolon in the above pseudo-code should correspond to some dedicated location in the timed automaton. Note for example the "waitput" location in the above automaton for the Producer.
First, simulate the system for a number of steps and observe how it behaves. You may see that if everything went fine, all three processes are in their "done" location.
A desired property of such a concurrent system is that it should never deadlock. We would like to verify this property which is usually done with the following query:
- A not deadlock
Since all processes being in the "done" location formally constitutes a deadlock (no process can move further), we need to change the query in order to allow this special state. Change the formula appropriately and then try to verify it in UPPAAL with "Diagnostic Trace" set to "Shortest".
You will notice that the system contains a "real" deadlock, apart from the artificial one you excluded (all processes in "done"). Examine the problem in the simulator on the trace that the verifier produced as a counter example to the query. The problem seems to be that the Consumer first terminates the Buffer and then the Producer afterwards. This can lead to the situation that the Consumer is waiting for the Producer in order to terminate it, the Producer is waiting for the Buffer to accept a new item, but the Buffer is already terminated. (In Ada this would not have been a deadlock but an uncaught runtime exception, which is equally bad.)
An obvious solution seems to be to first terminate the Producer and then the Buffer. Do the change to the Consumer automaton and run the verification again. Is there still a deadlock? If so, explain what happens and suggest a fix. After your fix, is the system now deadlock free?
What to report:
- The two timed automata for Consumer and Buffer
- The extended query to exclude the "artificial" deadlock state
- An explanation to the second deadlock
- Your suggested fix to that problem