Cost Analysis of Nondeterministic Probabilistic Programs

Guide to the Artifact Accompanying the Paper

Overview

This artifact contains our implementation of the PUCS/PLCS synthesis algorithm as described in the paper. The algorithm takes a nondeterministic probabilistic program, together with linear invariants at each point of the program, and finds polynomial upper/lower bounds for its expected accumulated cost.

Assumptions

We assume that the input program terminates. More specifically, we assume that it satisfies the concentration property. See the paper for a formal treatment of this point.

Moreover, our algorithm is sound but not necessarily complete. So, it might fail on custom instances (either because the program does not satisfy the concentration property, or because it does not have a PUCS/PLCS with the provided parameters).

Files

Dependencies and Prerequisites

Our code is written in C++ and Matlab. C++ is used for creating the Control-Flow Graphs of input programs and Matlab is used for the main PUCS/PLCS synthesis algorithm (which involves symbolic computations and an optimization problem). Moreover, we use Bash scripts for connecting the different components. Hence, we suggest that you run the artifact in a Linux environment or the provided virtual machine.

Complete List of Prerequisites

We are also providing another tool for invariant generation. This is an orthogonal problem and not part of our contributions in the paper, but we thought it might be useful to include it, given that it helps when one wants to test our tool with custom inputs. This tool depends on the Stanford Invariant Generator (StInG) and can only be executed on Linux machines (see below).

Note: If running outside the virtual machine, make sure that all .sh files are allowed to be executed as programs.

File Structure

There are several folders in the artifact. Here is a description of each:

Correspondence between the Paper and the Artifact

The file “examples.txt” contains a description of the correspondence between the example programs in the artifact and those in the paper.

Compilation

To compile the codes, open a terminal in the “Tool” folder and run this command:

./compile.sh

You need to have a C++-11-compliant version of g++ installed.

Obtaining the Results Reported in the Paper

To obtain all the results reported in the paper, simply open a terminal in the “Tool” folder and run the following command:

./main-experiment.sh

When the execution ends, all obtained results will be available in the “Outputs” folder. These results match Tables 2 and 3 in our paper and the Figures provided in the Appendix. However, please note that while our algorithm’s output (PUCS/PLCS) exactly matches the results reported in the paper, simulation results might fluctuate due to the inherent randomness of the process.

Warning

Running the main experiment takes a long time (Took almost a day on our virtual machine, but only 4 hours outside the vm). This is not due to our algorithm, but happens because the experiment includes many simulations of the example programs (used in both Table 2 and the Figures in the Appendix). Moreover, we are including a trial version of Matlab in the virtual machine (because Matlab is proprietary software). This trial version does not use multithreading and is significantly slower than a full-fledged Matlab.

Disabling the Creation of Plots

In our experiments, while Table 2 requires some simulation, most of the simulation is done to obtain the Figures (Plots) in the Appendix. You can disable the latter, and save a lot of time, by adding a -nosim flag to the command above.  Hence, in order to obtain the results reported in Tables 2 and 3, you can use the following command:

./main-experiment.sh -nosim

Note: You can also run the experiment on a single example program or your own custom input. See below for more details.

Matlab License

We have pre-installed a trial version of Matlab on the Virtual Machine. If you are running the experiments on your own machine, or if Matlab asks for a license, you can obtain a free 30-day trial license at https://www.mathworks.com/campaigns/products/trials.html.

Note: The trial version of Matlab installed on the virtual machine outputs low-quality jpg plots. The simulation results are always between our upper and lower bounds (see the fig files), but due to the low quality picture, it might sometimes seem that the simulation result is a little bit above the upper bound (or a little bit below the lower bound).

Running on a Single Example

We have numbered the example programs from 1 to 10. See “examples.txt” for the correspondence between these numbers and the paper. To run the experiment on a single example program, e.g. program number x, you can use the following command:

./run-example.sh x

For example

./run-example.sh 1

runs the experiment on the first example only. As before, you can also use the -nosim flag to disable the plot, e.g.

./run-example.sh 1 -nosim

(The order matters, -nosim must come at the very end)

Note: Running the experiment is most time-consuming on example 10, followed by examples 6 and 9. Hence, we suggest that you try other examples first.

Running on a Custom Input

To run the PUCS/PLCS Synthesis algorithm on a custom program, follow these steps:

  1. Create an input file that contains a program, its invariants and the distributions used for its sampling variables. See the example programs in the “Inputs” folder. Basically, the file should begin with the distributions. This is followed by two # signs (and anything between these signs is considered to be a comment). Then, the main program starts after the second #. We assume this input file is called “input.program.invariants”.
  2. Obtain the CFG by running the following command (in the “Tool” directory):
    ./CFG/cfg cfg.txt < input.program.invariants
    This creates a file named “cfg.txt” that contains the control flow graph of the program. Copy this file into the “Custom” folder.
  3. Create a configuration file named “config.txt” in the “Custom” folder. The configuration file contains the values of d and K as in the paper, and the initial valuations for which PUCS/PLCS values should be computed. See “Custom/readme.txt” and the example configuration files in the “Inputs” folder.
  4. Run the Matlab Script by opening a terminal in the “Custom” folder and running this command:
    matlab -nodesktop -nosplash -r "run custom.m"
  5. The output will be written to log.txt

Invariant Generation

We have included an additional tool that can be used to generate invariants for custom programs. Note that invariant generation is an orthogonal problem and this additional tool is not part of our contribution in the paper. We are including it in order to make it easier to generate custom inputs for our tool. This tool translates the program to an instance for the Stanford Invariant Generator (StInG) and then calls StInG to obtain the invariants. We have used the same tool for generating invariants for our example problems. However, note that invariant generation is not done in polynomial time and might take a while. Moreover, StInG is only available for Linux machines. If you like to use this tool on a different OS, please consult the compilation guide of StInG.

The grammar supported by this tool is a subset of our main grammar and is provided in “InvariantGeneration/grammar.txt”. To generate an invariant for a custom program, write the program in this grammar (see “InvariantGeneration/examples”). Note that the names of program variables should begin with p_ and the names of sampling variables with r_.

Let’s assume that the program is named “x.program”, then you should also create an additional file named “x.program.rvars” which contains the supports of probability distributions used for each sampling variable (see “InvariantGeneration/examples”). Then, run the following command in the terminal:

./parser x.program

The output will be written to “x.program.finaloutput”.

Note: The invariant generator can only find invariants that can be deduced from the structure of the program (see StInG documentation). It is always a good idea to provide some initial invariants, especially at the beginning point of the program.