Welcome to JConsume home page!

JConsume 2.0 is a memory consumption analyzer for Java bytecode. It computes parametric expressions that over approximate the maximum number of simultaneously live objects ever created by a method, where by live we mean irreclaimable by any garbage collector. Computing the number of live objects is a key underlying step in all client analysis techniques aiming at computing dynamic-memory requirements.

JConsume implements a new compositional quantitative analysis aimed at inferring non-linear upper-bounds on the number of irreclaimable objects that may be stored in the heap at any point in the execution of Java-like programs, including polymorphism and implicit memory management. The analysis is based on the compositional construction of live objects summaries for every method under analysis. More concretely, it over-approximates both a) the maximum amount of fresh objects which are simultaneously alive during the execution of the method and b) the number of created objects that may remain alive after the method finishes its execution.

Summaries are built up using method local information (e.g., their own allocations) and precomputed summaries of called methods. Since the behavior of a method can vary a lot with the arguments it is called with, summaries are parametric in order to provide bounds that can depend on the arguments.


Here! (github)

How it works?

Let us consider the statements in map that allocate memory: the new in line 1, and 3 method invocations, namely in lines 2, 5 and 6, with the last two inside a loop. To compute a summary for method map we will use inter- mediate results to make the formulas easier to read. These intermediate results are attached to lines of the source corresponding to statements that allocate memory.  Lines 1 and 2 are the simplest cases. We can associate the following simple summary to any new:  MaxLive1 = MaxLivenew = 1 Esc1 = Escnew = 1

Similarly for the iterator: MaxLive2 = MaxLiveiterator = 1 ; Esc2 = Esciterator = 1

The call to the method apply at line 5 is inside a loop. Therefore, we resort to  an invariant to characterize the iteration space corresponding to that statement:  Inv5 =1ilist size  where list size is a formal variable representing the length of the list (i.e., list.size). Objects escaping from apply will live in map’s scope. Clearly, such number contributes to MaxLive of map. Thus, the amount of escaping objects from apply at each iteration must be accumulated as it contributes to the total amount of live objects in the scope of map. Besides, objects not escaping from apply are also alive during (part of) the execution of map. Therefore, they should also be counted in MaxLive of map. However, since the number of those objects may be different at each iteration, it would be sufficient to count only the maximum such number over all iterations to obtain an over-approximation. 

That is, MaxLive5 =|_|(Inv5)(MaxLiveapply Escapply) + SC.Summate(Escapply, Inv5, list size)  = |_|1ilist size (3 1) + SC.Summate(1, 1 i list size, list size)  = 2 + list size

Notice that we subtract the number of escaping objects from the overall num- ber of live objects of apply since this amount is indeed accounted for in the SC.Summate.

Let the symbolic calculator SC provide an operation SC.UpBound which computes the max of an expression over a domain. In this case, we could write:

MaxLive5 = SC.UpBound(MaxLiveapply Escapply, Inv5, list size) + SC.Summate(Escapply, Inv5, list size)

Let us look now at Esc5. All objects escaping from apply are added to the  returned list. Consequently, they also escape from map.

Therefore: Esc5 = SC.Summate(Escapply, Inv5, list size) = SC.Summate(1, 1 i list size, list size) = list size
To compute the intermediate result for line 6, we make use of the summary of method called at this line, that is add. Its summary is: MaxLiveadd = 1 ; Escadd = 1
Since the call is in the loop, we proceed as for apply. The invariant at line 6 is:  Inv6 =1ilist size MaxLive and Esc at at line 6 is computed as follows:  

MaxLive6 = SC.UpBound MaxLiveadd Escadd, Inv6, list size + SC.Summate(Escadd, Inv6, list size) = 0 + list size = list size
Esc6 = SC.Summate(Escadd, Inv6, list size) = list size 

Finally, to obtain the summary of map we compose the previous results as fol- lows. MaxLive for map is the sum of the corresponding Esc for all lines plus the the maximum MaxLive of all lines but discounting the corresponding Esc:

MaxLivemap = MaxLivel Escl + Escl l∈{1,2,5,6} l∈{1,2,5,6} = ⊔ {1 1, 1 1, list size + 2 list size, list size list size} +(1+1+list size+list size) =2+(2+2×list size)=2×list size+4

Esc of map should count all objects that (may) escape from map. From all objects accounted for in Escl, for l ∈ {1, 2, 5, 6}, only the iterator allocated at line 2 is captured by map. All other objects escape. Thus, we get:

Escmap = Esc1 + Esc5 + Esc6 = 1 + list size + list size = 2×list size+1 

Buw how are the invariants obtained? They are generated using a tool called Daikon that generates likely pre- and post-conditions of method. We add calls to dummy methods right next to relevant lines inside a loop. The parameters of these dummy methods must be the variables which will have a part in the invariant. They are obtained with an enriched live variable analysis. Moreover, one has to distinguish between variables that are involved in the invariant, and what we may call inductive variables, i.e., the variables that change values with each loop iteration.

For example, consider the following code:
void m(int p){
          for(int i=1; i < p; i++){
              Integer temp = new Integer(i);

Suppose we want to analyze how many times m2 is executed. The live variables at that program point are temp, i and p. So we insert a call to a dummy method with those parameters:

void m(int p){
          for(int i=1; i < p; i++){
              Integer temp = new Integer(i);

When daikon generates preconditions for the method, it will include as a result that 1≤i≤p. So the parametric invariant is that the parameter is p, there exist some variable temp and some variable i such that 1≤i≤p. However, we must only count on i, not on temp, as temp isn't inductive. The notation used for the invariant results in this: [p] -> {[i] -> ∃ temp : 1≤i and i≤p} (you might think of it as a parametric set and what's happening is that we want to count the parametric number of elements of the set).

The Java code is converted into an intermediate representation equivalent to the one described in section 4 of the paper. More specifically, it is a languange independent representation of the callgraph, enriched with:

a) Likely invariants of all objects created and methods called

b) Which objects escape each method (it's context sensitive).

After this, we build the memory consumption only using the intermediate represention. The process is described in this paper (). This makes it easier for us to analyze programs written in other languages, as the cost reduces to write the intermediate representation.

For more details, check the located in the github repository.



Diego Garbervetsky (Departamento de Computación, FCEyN, UBA and CONICET)

Samuel Hym (LIFL, Université Lille 1 and CNRS)

Víctor Braberman (Departamento de Computación, FCEyN, UBA and CONICET)

Sergio Yovine (Departamento de Computación, FCEyN, UBA and CONICET)

Sven Stork (Departamento de Computación, FCEyN, UBA and CONICET)

Guillermo Mosse (Departamento de Matematica and Departamento de Computación, FCEyN, UBA)