We are pleased to inform you that your artifact met or exceeded the expectations of your paper. You can include a badge in your final paper; you can find the badge here: http://pldi14-aec.cs.brown.edu/aec-badge-pldi.pdf

In addition, you may use an extra page in the final version of your PLDI paper, at no cost to you. This gives you space to include the badge in your paper, as well as a description of your artifact.

There will also be an AEC session/event at the PLDI conference, in which we hope you will participate. We will provide more details about this event in the future.

In case you are curious: out of roughly 50 accepted papers, 20 submitted artifacts. Of these, 12 were found to meet or exceed expectations. The artifacts were reviewed by an outstanding team --- http://pldi14-aec.cs.brown.edu/committee/index.html --- who poured hours of effort into each review and then into discussions, which often resulted in revisiting the artifact and further discussion. Their work demonstrated a deep interest in doing the process well.

The reviews of your artifact are attached below. You may, if you wish, include portions of the reviews in your paper as you see fit. We sincerely hope you will use the reviewers' comments to revise the artifacts and, more importantly, to make the artifacts PUBLIC. For instance, we recommend uploading your revised artifact to the ACM DL's paper supplement section, or putting it in some other permanent location and referring to it in the final version of your paper.

We thank you for submitting your artifact, and we very much hope that you will continue to submit artifacts in the future.

Sincerely yours,

Eric Eide
Shriram Krishnamurthi
Jan Vitek
on behalf of the rest of the AEC

----------------------- REVIEW 1 ---------------------
PAPER: 3
TITLE: Resource Limits for Haskell
AUTHORS: Edward Yang and David Mazieres

OVERALL SCORE: 2 (greatly exceeded expectations)

----------- BRIEF PAPER SUMMARY AND CONTRIBUTIONS -----------
A mechanism is proposed for dynamically monitoring and restricting the memory footprint of sub-computations, called *resource containers*. Threads are tainted by the containers for objects to which they retain references. Threads are destroyed when any of their associated containers is exceeded. Threads can avoid excessive taint by copying objects instead of referring to them.

----------- ARTIFACT SUMMARY -----------
The resource container system is implemented as an extension to GHC providing a minimal low-level API together with a monad as a higher-level interface to the system. There are a few example programs using both interfaces and an evaluation of the tool's accuracy.

----------- ARTIFACT PACKAGING AND REPRODUCIBILITY -----------
The artifact was provided pre-built in a VM image, which was extremely pleasant. The directions made it easy to build and run examples. Running the full accuracy evaluation was also turnkey. Just great!

----------- ARTIFACT IMPLEMENTATION AND USABILITY -----------
The higher-level API to the RC system is very straightforward and makes it easy to handle exceptional cases. The implementation, based on GHC's block-structured heap, is elegant and effective.

----------- DETAILED EVALUATION AND SCORE JUSTIFICATION -----------
I read the provided examples to get a more detailed idea of how the APIs work and how they're used in semi-realistic programs. I also read over the more real-world case study (an open-source HTTP server) and found that the simplicity of the bug fix lived up to the claims in the paper.
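
To give a flavor of the high-level interface, the examples have roughly the shape below. This is my own sketch rather than code from the artifact: the names (getCurrentRC, newRC, withRC1) are the ones the tutorial programs use, but the exact signatures, and the idea that an exceeded limit surfaces as an ordinary catchable exception in the calling thread, are my assumptions.

import Control.RLimits               -- the artifact's high-level interface
import Control.Exception (SomeException, try)

-- Sketch only: run a computation against a fresh container with a small limit
-- and report whether it finished or was cut off. Signatures and the exception
-- behaviour are guesses, not taken from the artifact's documentation.
limited :: String -> IO ()
limited input = do
  parent <- getCurrentRC             -- container of the calling thread
  rc     <- newRC 100 parent         -- child container with a limit of 100
  r <- try (withRC1 rc input $ \input' ->
              print (sum [1 .. read input' :: Integer]))  -- work charged to rc
  case (r :: Either SomeException ()) of
    Left e   -> putStrLn ("computation stopped: " ++ show e)
    Right () -> putStrLn "computation finished"

main :: IO ()
main = limited =<< getLine
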
I compiled and ran the examples (including Happstack) to verify that they did what was claimed. I also attempted to run the "fast" version of the accuracy data collection, but I (ironically) ran into a Haskell OOM error. Perhaps this was a peculiarity of my VirtualBox setup or some missing flags to the RTS.

The APIs seem to match exactly what was described in the paper and work without much hassle -- adding containers to existing Haskell code seems to be straightforward and accurate.

----------- COMMENTS FOR IMPROVING THE ARTIFACT -----------
I ended up slightly confused about the necessity of the 'rcKill' calls in some of the examples. Is this just necessary for benchmarking purposes, or is there some more practical need for explicit kills? I didn't see this explained in the paper.

This is probably a different paper, but it would be useful to see how more introspective information could be generated for debugging purposes when containers are exceeded. Who should be blamed for allocating too much memory? Always attributing costs to the creator of thunks is a good first cut, but it might violate intuition in some cases, suggesting that debugging tools built on this system could be fruitful.

----------------------- REVIEW 2 ---------------------
PAPER: 3
TITLE: Resource Limits for Haskell
AUTHORS: Edward Yang and David Mazieres

OVERALL SCORE: 1 (exceeded expectations)

----------- BRIEF PAPER SUMMARY AND CONTRIBUTIONS -----------
This paper presents a resource limits system for Haskell based on resource containers, which allows programs to enforce local restrictions on space usage for different portions of a program. To manage the interaction of resource containers with lazy evaluation, the authors reuse the existing cost semantics used by the Haskell profiler. Unlike previous resource limits systems, this system supports both revocable references (as a special datatype) and killing threads as reclamation strategies.

The paper first provides a high-level overview of the system's design and explains some of the design decisions made by the authors. Then, it presents two formal semantics: a big-step semantics describing the interaction between resource containers and core features of Haskell, and a small-step semantics describing the interaction between resource containers and exceptions. The paper then describes the implementation of resource limits in GHC and provides an empirical evaluation showing that (1) resource limits correctly limit memory usage and (2) the time overhead of resource limits is low, but not low enough to enable them by default.

----------- ARTIFACT SUMMARY -----------
The artifact is a virtual machine image that contains a version of GHC that supports resource limits, the user-level resource limits library, benchmarks and scripts to run the "accuracy" experiments, a version of Happstack that exhibits the bug discussed in section 5.2 (as well as a fix for that bug), and the code for the prisoner's dilemma example from section 5.2.

----------- ARTIFACT PACKAGING AND REPRODUCIBILITY -----------
The artifact was well packaged. Unfortunately, the scripts to run some of the experiments (some of the prisoner's dilemma and nofib benchmarks) were missing, which made it harder or impossible to reproduce those experiments.

----------- ARTIFACT IMPLEMENTATION AND USABILITY -----------
The instructions and tutorial were detailed and helpful. The resource limits API is slightly different from the description in the paper, but the differences are well documented and none of them are significant.
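
For reference, the tutorial examples (and the listener experiments I describe below) follow roughly this shape. This is my own reconstruction, not code copied from the artifact; the signatures of newRC, listenRC and withRC1 are inferred from the examples and may not match the library exactly.

import Control.RLimits

main :: IO ()
main = do
  parent <- getCurrentRC                   -- container of the main thread
  rc     <- newRC 200 parent               -- child container of size 200
  -- Listener: as I understand it, the action runs once the container's
  -- remaining space falls to the given threshold.
  listenRC rc 100 (putStrLn "100 left")
  n <- readLn
  withRC1 rc n $ \n' ->                    -- charge the value and the work to rc
    print (sum [1 .. n' :: Integer])
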
----------- DETAILED EVALUATION AND SCORE JUSTIFICATION -----------
To evaluate this artifact, I did the following:
- I ran the examples from the tutorial and some small examples of my own.
- I reproduced the "accuracy" experiment.
- I reproduced the Happstack infinite-header bug.
- I reproduced the prisoner's dilemma experiment.

The examples from the tutorial all ran as expected. These examples showcased enough of the API that I was able to write my own examples without problems. Documentation would have been a nice addition, but the examples were enough.

I had a few surprises when trying out additional examples. First, I tried the variant of 2.hs for which the README says it will not attribute costs properly. Unless I misunderstood something, that example did seem to attribute costs to the right resource container (as confirmed by a resource container listener). I tried another similar example, more closely based on section 4.6, and it also seemed to attribute costs properly. Maybe I was not using the right optimization level to trigger the bug. Of course, if that cost attribution limitation does not actually show up, this is not a bad thing!

In a few of my examples, listeners sometimes behaved unpredictably. For example, a listener for 150 in an RC of size 200 would never run, but listeners for smaller values would. To reproduce, edit 3.hs (provided in the artifact) to produce a message at 150 instead of 100: 100 or 120 (or anything lower) results in a message, but 150 does not. I have also observed listeners firing before any computation should have occurred within an RC (the computation was blocking on input, so I don't think it was lazy evaluation shuffling the evaluation order). I included a program exhibiting that behavior in the "additional comments" field.

I reproduced the "accuracy" experiment. Apart from "suml" using a lot less memory than the paper reports (it barely uses more than 1x the limit) and GHC getting close to 2x the limit at the 600M end of the plot, the results are very close to those in the paper. Memory usage never grows beyond twice the limit.

I reproduced the Happstack infinite-header bug. I was impressed by how simple (and non-intrusive) resource limits made the fix. As predicted, without resource limits I was able to get Happstack to consume all available memory. With resource limits, the server stopped almost immediately. I also ran the Happstack benchmark. The number of connections per second I observed was much lower than in the paper (the maximum I observed was about 10), which is probably due to the virtual machine. The overhead of enabling RCs was comparable to (in fact, a bit lower than) that reported in the paper.

I ran the prisoner's dilemma experiment. Since the scripts to generate the plot from figure 8 were missing, I had to rely on the logs produced by the program. The "RCMVar no longer available" events in the log seem to match up with the peaks in figure 8, so the results look plausible. I found the implementation of the prisoner's dilemma program to be very clear, and a nice showcase of resource limits.

I was not able to reproduce the "nofib" experiment from the paper, as the benchmarks were not included in the artifact.

----------- COMMENTS FOR IMPROVING THE ARTIFACT -----------
The "accuracy" experiment produces its results in a PDF file, but the artifact does not include a PDF reader, or an SSH client or server to copy the files to a different machine.
I ended up installing the latter (thanks for providing the root password!), but providing it as part of the artifact would have been nice.

----------- ADDITIONAL COMMENTS TO THE AUTHORS -----------

import Control.RLimits

rint :: String -> Int
rint = read

main = do
    cur <- getCurrentRC
    rc <- newRC 400 cur
    -- this gets printed before the input is even read
    listenRC rc 50 (putStrLn "50 left")
    x <- getLine
    withRC1 rc x $ \x' -> do
        print (foldr (+) (rint x') [1.. (rint x')])

----------------------- REVIEW 3 ---------------------
PAPER: 3
TITLE: Resource Limits for Haskell
AUTHORS: Edward Yang and David Mazieres

OVERALL SCORE: 0 (met expectations)

----------- BRIEF PAPER SUMMARY AND CONTRIBUTIONS -----------
This paper describes how to add fine-grained resource limits to Haskell; in this case, the authors considered memory usage. Current approaches for establishing resource limits are fairly coarse-grained, operating either at the OS level or through Haskell APIs, so limits can only be applied at a coarse grain, such as at the level of processes. The authors present a modification to the Haskell runtime, as well as a new resource container API that allows programmers to set memory limits on particular sections of code. The paper includes a semantics for the STG language with these new resource limits and an evaluation of how well their system enforces the desired limits.

----------- ARTIFACT SUMMARY -----------
The artifact consists of a GitHub repo containing source code for several examples that were used in the paper. The README includes links to the paper, a virtual hard disk image, and the GitHub repo for the rlimits support library. The hard disk contains the modified Haskell pre-installed, the rlimits library, several rlimits examples, and the source code from the artifact's repository.

----------- ARTIFACT PACKAGING AND REPRODUCIBILITY -----------
I was able to get the virtual machine running and building the provided software. For the most part, everything ran as expected.

----------- ARTIFACT IMPLEMENTATION AND USABILITY -----------
I appreciate that the VM image includes the modifications to GHC already installed. The script to generate Figure 6 from the paper was a nice touch as well. Distributing the artifact via GitHub provides an easy way for me to quickly browse the files and see, for example, what changes were needed in GHC.

----------- DETAILED EVALUATION AND SCORE JUSTIFICATION -----------
I did most of my review from within the virtual machine. I did make an attempt to build the modified GHC for myself, but decided to focus on the VM instead.

I ran the script to generate data.pdf. Mine had a lot fewer lines on it than the one in the paper; it looks like FAST mode was enabled in the Makefile by default. I had a spike in GHC's memory usage on the line near 600M. How deterministic are these results supposed to be? It was still under the black limit line, so maybe Haskell's garbage collector did something different this time.

The Prisoners program ran without any trouble, but I was not able to build PrisonersRaw. The error was:

PrisonersRaw.hs:74:18:
    Couldn't match expected type `() -> IO RC' with actual type `IO RC'
    The function `getCurrentRC' is applied to one argument,
    but its type `IO RC' has none
    In a stmt of a 'do' block: parent_rc <- getCurrentRC ()
    In the expression:
      do { let players = ...;
           scores <- mapM (const (newIORef 0)) players;
           parent_rc <- getCurrentRC ();
           forM_ (zip3 [1 .. ] players scores) $ \ (i, p, s) -> do { ...
           } }
make: *** [PrisonersRaw] Error 1

I wasn't sure exactly how to interpret the results from Prisoners, but I'm guessing Figure 8 was extracted from the +RTS -S output?

Next, I built and ran each of the tutorial programs. When running 4.hs, I had a couple of MVar-related errors, but otherwise everything worked well.

To me, one strength of this artifact is that the code was provided through GitHub (though I had to hunt a little to find the GHC changes). This greatly reduces barriers to building on this work in future research.

----------- COMMENTS FOR IMPROVING THE ARTIFACT -----------
The obvious one is to make sure all the examples provided build and run (though perhaps the errors were my fault).

I didn't see a link to the modified GHC repo in the pldi14-rlimits-aec repo. I was able to find it based on the version in the VM, but it would have been nice to have a link to it, along with a mention of which branch contains the changes.
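
For what it's worth, the PrisonersRaw failure above looks like a stale call site rather than anything deeper: the error says getCurrentRC has type IO RC and takes no argument, which matches how the tutorial programs call it. A sketch of the presumed one-line fix, which I did not verify against the artifact:

import Control.RLimits

-- Presumed fix for PrisonersRaw.hs:74 (untested): drop the unit argument,
-- since getCurrentRC already has type IO RC.
main :: IO ()
main = do
  parent_rc <- getCurrentRC          -- was: parent_rc <- getCurrentRC ()
  parent_rc `seq` putStrLn "obtained the current resource container"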