Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Overview

The purpose of the PCG Analysis is to provide clients with the following:

  • The PCG data structure representing the state of ownership and borrowing of Rust places at arbitrary program points within a Rust function
  • For any pair of consecutive program points, an ordered list of actions that describe the transformation of the PCG of to the PCG of .

PCG Analysis Algorithm

Key Concepts:

  • The PCG Analysis Algorithm operates on the MIR Body of a Rust function and returns a PCG Analysis of the function.
  • A PCG Analysis contains PCG Data for every reachable1 MIR location in the Body2.
  • The PCG Data is a tuple of PCG States and PCG Action Data.

The PCG States of a MIR location is a list of the PCGs computed at that location, concretely:

The PCG Action Data of a MIR location contains a list of PCG Actions for each evaluation phase in the location (i.e. the actions performed at that phase).

The PCG Dataflow Analysis

The PCG Analysis Algorithm is implemented as a MIR dataflow analysis using PcgDomainData as the domain. PcgDomainData contains a PCGData value and other relevant metadata (e.g the associated basic block). Notably, the analysis only analyzes the statements in each basic block one time. Conceptually, this property is ensured because the final state at a loop head is computed upon entering it in the analysis (without having first seen the body).

We note that the behaviour of the join operation on PCGDomainData requires tracking of what blocks have been previously joined (this is basically a consequence of the interface of the MIR dataflow analysis). The PCGDomainData join operation joins the PCG of block into the PCG at block as follows:

  1. If no block has ever been joined into , then set =
  2. If the edge from to is not a back edge3 of a loop, then is joined into using the algorithm defined here

Because the join does not modify the PCG for back edges, the analysis can be completed without ever having to re-analyse the statements within a block.

Our implementation should also be checking that the PCG generated at the loop head is valid w.r.t the state at the back edge here, but this is not happening yet.

PCG Data Structure

The PCG data structure represents the state of ownership and borrowing of Rust places at an arbitrary program point.

It consists of three components:

  • The Owned PCG, which describes the state of owned places
  • The Borrow State, which describes borrowed memory (borrowed places and lifetime projections) and borrow relations, and also some auxillary data structures
  • The Place Capabilities which is a map from places to capabilities

Owned PCG

The Owned PCG is a forest, where the root of each tree in the forest is a MIR Local.

Each tree in the forest is defined by a set of place expansions, which describe how unpacked all of the owned places are. For each expansion in the set, the base is a node in the forest and are its children. We note that each expansion can be similarly interpreted as a hyperedge

Each local in the body of a MIR function is either unallocated or allocated in the Owned PCG. A local is allocated iff there exists a corresponding root node in the Owned PCG, otherwise it is unallocated.

The operation allocate in the Owned PCG requires that is not allocated, and adds a new root for . The deallocate operation removes the forest rooted at .

Borrow State

The Borrow State is a 3-tuple containing a Latest Map describing the latest locations of each place; a set of Validity Conditions that describes the set of paths leading to the current block; and a Borrows Graph, a directed acyclic hypergraph which describes borrowed places, sets of borrows, and their relation to one another. The Borrows Graph is represented as a set of PCG hyperedges.

Because a borrow created within a block exists only for executions that visit that block, we label new borrows using the validity conditions of the block in which they were created.

Place Capabilities

Place capabilities is a partial map from places to capabilities.

We may want to change the domain to be maybe-labelled places instead.


  1. A MIR location reachable iff its basic block is reachable from the start block in the CFG without traversing unwind or fake edges (the latter kind do not correspond to the actual control flow of the function). The original reason for only considering reachable edges was to improve performance; removing this constraint (and instead considering all locations) would be simple to change in the implementation.

  2. The PCG analysis algorithm is implemented as a MIR dataflow analysis defined by the rust compiler crate. In the implementation, the PCG Data is computed for every reachable MIR location during the algorithm execution itself, but only the PCG Data for the entry state of each basic block is stored. The PCG Data for an arbitrary location within a block is re-computed by applying the dataflow analysis transfer function from the entry state.

  3. In the join implementation, we an edge from to is a back edge if dominates in the CFG

Capabilities

There are four types of capabilites:

Exclusive (E)

Places with this capability can be read from, written to, or mutably borrowed.

We do not differentiate between locals bound with let bindings and let mut bindings: a variable bound with let would still have this capability if it could be written to if it was mutably borrowed.

Read (R)

Places with this capability can be read from. Shared borrows can also be created to this place. Shared references with this capability can be dereferenced.

A place with capability E is downgraded to R if a shared borrow is created to a place that is a pre- or postfix of .

When a shared reference is dereferenced, the capability to is downgraded to R. Any place projecting from a shared references will have capability R.

Write (W)

The place can be written to.

When an exclusive reference is dereferenced, the capability to is downgraded to W.

ShallowExclusive (e)

This is used for a rather specific and uncommon situation. When converting a raw pointer *mut T into a Box<T>, there is an intermediate state where the memory for the box is allocated on the heap but the box does not yet hold a value. We use ShallowExclusive to represent this state.

Writing to a Box-typed place p via e.g. *p = 5 requires that p have capability e.

Choosing Place Labels

When the PCG analysis labels a place, all references to the current version of the place (i.e. not within already labelled places) are labelled with a label corresponding to the last time the place could have been updated.

To accomplish this, the PCG uses a data structure called the Latest Map to track, for each place, the most recent location where it could have been updated.

We could consider instead to label locations at the point it is required (as is done when labelling lifetime projection). Our current approach involves a lot of bookkeeping.

The Latest Map

The Latest Map is a partial map from places to labels.

The function is defined as follows:

We note that for each place , either does not contain any place that is either a prefix or postfix of , or it contains exactly one such place. This is ensured by construction.

Definitions

Types

Type Contains

A type contains a type , iff:

  1. , or
  2. is an ADT and contains a field and contains
  3. and contains

Types Containing Lifetimes

A type contains a lifetime iff contains the type for some type . A lifetime is nested in a type iff contains a type and contains . We extend these concept to places: a place contains a lifetime iff contains ; is nested in iff is nested in . A lifetime projection is nested if is nested in .


PCG Evaluation Phase

The PCG Evaluation Phases are:

  • PreOperands
  • PostOperands
  • PreMain
  • PostMain

For every MIR location, a seperate PCG is generated for each phase. They represent the following:

  • PreOperands - A reorganization of the initial state1 to prepare to apply the effects of evaluating the operands in the statement
  • PostOperands - The result of applying the operand effects on the PreOperands state
  • PreMain - A reorganization of the PostOperands state to prepare to apply any other effects of the statement
  • PostMain - The result of applying any other effects of the statement to the PreMain state.

Program Point

A program point represents a point within the execution of a Rust function in a way that is more fine-grained than a MIR location (each MIR location has multiple program points which to different stages of evaluation of a statement). Concretely, a program point is either:

  • The start of a basic block
  • A pair of a MIR location and a PCG evaluation phase

Borrows and Blocking

Blocking

A place blocks a place at a location if a usage of at would invalidate a live borrow contained in the origins of at .

Borrow Liveness

A borrow is live at location if a usage of at would invalidate the borrow.

Directly Borrowed

A place is directly borrowed by a borrow if is exactly the borrowed place (not e.g. a pre- or postfix of the place).


  1. The initial state is either the PostMain of the previous location within the basic block. Or if this is the first statement within the block, it is the entry state of the block (i.e. the result from joining the final states of incoming basic blocks).

PCG Nodes

We probably don't need so many label types, but we have them in the implementation currently.
In the implementation we currently refer to lifetime projections as "region projections" and labelled places as "old" places.

Associated Place

The associated place of a PCG node is defined by the partial function :

Where is the base of the lifetime projection as defined here.

Local PCG Nodes

A PCG node is a local node if it has an associated place .

PCG Hyperedges

A PCG Hyperedge is a directed edge of the form , where elements of and are PCG nodes. The set are referred to as the source nodes of the edge, and are the target nodes. Hyperedges in the PCG are labelled with validity conditions1.

We represent a PCG hyperedge as a tuple of an edge kind and validity conditions.

Edge Kinds

An edge kind is a description an edge, including its source and target nodes, as well as other associated metadata. The metadata is described by the type of the edge, the various types are presented below:

Unpack Edges

Unpack edges are used to represent the unpack of an owned place in order to access one of its fields. For example, writing to a field x.f requires expanding x.

An unpack edge connects an owned place to one of it's expansions . Each place in is guaranteed to be owned.

For example, if x is an owned place with fields x.f and x.g, the edge {x} -> {x.f, x.g} is a valid unpack edge.

Unpack edges are not generated for dereferences of reference-typed places. Borrow PCG Expansion Edges are used in such cases. Unpack edges are used in derefences of Box-typed places.

In the implementation, we don't have an explicit data type representing unpack edges. Rather, the unpack edges are conceptually represented as the interior edges in the Owned PCG.

Validity conditions are not currently associated with unpack edges in the implementation.

Borrow PCG Expansion Edge

Borrow PCG Expansion Edges conceptually take one of three forms:

  • Dereference Expansion: The dereference of a reference-typed place
  • Place Expansion: The expansion of a borrowed place to access one of its fields
  • Lifetime Projection Expansion: The expansion of the lifetime projections of an owned or borrowed place

Dereference Expansion

The source nodes of a dereference expansion consist of:

  • A maybe-labelled place , and:
  • A lifetime projection

Where and have the same associated place .

The target node of a dereference expansion is a maybe-labelled place with associated place .

Place Expansion

The source node of a place expansion is a maybe-labelled place with associated place , where is a borrowed place and is not a reference.

The target nodes of a place expansion is a set of maybe-labelled places where the associated places of are an expansion of .

Lifetime Projection Expansion

The source node of a lifetime projection expansion is a lifetime projection where is a maybe-labelled place with associated place of .

The target nodes of a lifetime projection expansion is a set of lifetime projections where the base of each place is a maybe-labelled place. The associated places of the bases of are an expansion of .

It might make sense to differentiate lifetime projection expansions of owned and borrowed places, since they differ in terms of how placeholder labels should be included. Namely, for owned places there is no need to connect the expansion nodes to the placeholder of the base node (the owned place may never be repacked, or could be repacked with entirely unrelated lifetime projections)

Borrow Edges

Borrow edges are used to represent direct borrows in the program. We define two types:

  • Local Borrow Edges: A borrow created in the MIR Body, e.g. from a statement let y = &mut x;
  • Remote Borrow Edges: A borrow created by the caller of this function where the assigned place of the borrow is an input to this function.

Remote Borrows are named as such because (unlike local borrows), the borrowed place does not have a name in the MIR body (since it was created by the caller).

Local Borrows

The source node of a local borrow is a maybe-labelled place . The target node of a local borrow is a lifetime projection where is a maybe-labelled place.

Remote Borrows

The source node of a remote borrow is a remote place . The target node of a remote borrow is a lifetime projection where is a maybe-labelled place.

Abstraction Edge

An abstraction edge represents the flow of borrows introduced due to a function call or loop.

Function Call Abstraction Edge

The source node of a function call abstraction edge is a lifetime projection where is a maybe-labelled place.

The target node of a function call abstraction edge is a lifetime projection where is a maybe-labelled place.

Loop Abstraction Edge

The source node of a loop abstraction edge is a PCG node.

The target node of a loop abstraction edge is a local PCG node.

Borrow-Flow Edge

A borrow-flow edge represents the flow of borrows between a lifetime projection and a local lifetime projection . This edge is used when the relationship between the blocked and target node is known concretely, but does not correspond to an expansion or a borrow.

Borrow-Flow Edges are labelled with a Borrow-Flow Edge Kind with associated metadata, enumerated as follows:

Aggregate

Metadata:

  • : Field Index
  • : Lifetime Projection Index

An aggregate kind is used for borrows flowing into an aggregate type (i.e. struct, tuple). The metadata indicates that the blocked lifetime projection flows into the lifetime projection of the field of the blocking lifetime projection.

Introduced in the following two cases:

  1. Collapsing an owned place:
    • edges flow from the lifetime projections of the labelled places of the base to the lifetime projections of the current base
  2. Assigning an aggregate (e.g. x = Aggregate(a, b)):
    • edges flow from the lifetime projections of the labelled places in the rvalue to the lifetime projections of x

is probably not necessary. We probably don't even need for case 1 (field index can be inferred from the expansion place), so perhaps a different edge kind could be used in that case.

Reference to Constant

Connects a region projection from a constant to some PCG node. For example let x: &'x C = c; where c is a constant of type &'c C, then an edge {c↓'c} -> {x↓'x} of this kind is created. This is called ConstRef in the implementation.

This seems quite similar to "Borrow Outlives", perhaps we should merge them?

Borrow Outlives

For a borrow e.g. let x: &mut y;, the PCG analysis inserts edges of this kind to connect the (potentially snapshotted) lifetime projections of y to the lifetime projections of x.

Initial Borrows

To construct the initial PCG state, the PCG analysis adds an edge of this kind between every lifetime projection in each remote place to the corresponding lifetime projection of its corresponding argument.

For example, if is the local corresponding to a function argument and contains a lifetime projection , an edge will appear in the graph.

Connects the lifetime projections of remote places to the lifetime projections of

Copy

For a copy let x = y;, the PCG analysis inserts edges of this kind to connect the lifetime projections of y to the lifetime projections of x.

In the implementation this is currently called CopyRef.

Move

For a move let x = move y;, the PCG analysis inserts edges of this kind to connect the (potentially snapshotted) lifetime projections of y to the lifetime projections of x.

Future

These edges are introduced to describe the flow of borrows to the lifetime projections of a place that is currently blocked. When they are created, the target node is a placeholder lifetime projection of a blocked place.

Perhaps this should be its own edge type?

  1. Currently not in the owned portion of the PCG, but this should happen eventually.

Places

A place is a tuple of a local and a projection.

Place Expansion

This is missing some cases

A set of places is a place expansion iff there exists a base such that:

  • is an enum type and and is a variant of
  • is a struct or tuple type and is the set of places obtained by projecting with each of the fields in the type of
  • is a reference-typed field and
  • is an array or slice and (TODO: more cases)

If there is such a , then that is unique, and is an expansion of .

Owned Places

A place is owned iff it does not project from the dereference of a reference-typed place.

Place Liveness

A place is live at a location iff there exists a location and a control flow-path from to where a place conflicting with is used at and there are no assignments of any places conflicting with along c.

Place Prefix

A place is a prefix of a place iff:

  • and have the same local, and
  • The projection of is a prefix of the projection of

Note that is a prefix of itself.

A place is a strict prefix of iff is a prefix of and .

Lifetime Projections

Lifetime projections take the following form

Lifetime Projection Lifetime

The lifetime of a lifetime projection is conceptually the lifetime on the right of the , i.e:

Lifetime Projection Base

The base of a lifetime projection is conceptually the part on the left of the , i.e. defined as follows:

Validity Conditions

We associate borrow PCG edges with validity conditions to identify the control-flow conditions under which they are valid. Consider the following program:

fn main(){
    // BB0
    let mut x = 1;
    let mut y = 2;
    let mut z = 3;
    let mut r = &mut x;
    if rand() {
        // BB1
        r = &mut y;
    }
    // BB2
    if rand2() {
        // BB3
        r = &mut z;
    }
    // BB4
    *r = 5;
}

We represent control flow using a primitive called a branch choice of the form . A branch choice represents the flow of control from one basic block to another. In the above code there are two possible branch choices from bb0: and and one branch choice from bb1 (to bb2).

For any given basic block in the program, an execution path leading to is an ordered list of branch choices. For example, one path of execution to bb4 can be described by: .

Validity conditions conceptually define a predicate on execution paths. For example, the validity conditions describing the control flow where r borrows from z at bb3 require that the path contain .

We represent validity conditions as a partial function mapping relevant blocks to sets of allowed target blocks.

An execution path is valid for validity conditions iff, for each , either:

  • is not a relevant block in , or
  • is in the set of allowed target blocks of in

TODO: Explain how validity conditions are initially determined when creating a new edge at a basic block (inherent VCs). Describe the join informally.

Formal Definition

Validity conditions is a map describing control-flow conditions. For each block , is a strict subset of the real successors of .

The join of validity conditions and is defined as:

Validity conditions are valid for a path iff:

The intuition behind the first case of the join is that if whether a borrow is live does not depend on what successor is taken from some basic block , then it is not necessary to store a predicate about .

This is kind of an optimization, actually removing this case for the join would not have any difference on whether the resulting validity conditions are valid for any block. In the implementation this is useful so we can have a canonical representation (and also reduces the size of the data structure).

MIR Definitions

Here we describe definitions of MIR concepts that are relevant to the PCG.

It's possible that these definitions will become outdated as the MIR is not stable. If there is any discrepency between the descriptions here and those from official Rust sources (e.g. the dev guide), this page should be updated accordingly.

MIR Dataflow Analysis

At a high level, a MIR dataflow analysis is defined by the following elements:

  • A domain
  • A join operation
  • An empty state
  • A transfer function

Performing the dataflow analysis on a MIR body computes a value of type for every location in . The analysis is performed (conceptually) as follows1:

  • The analysis defines a map that maps locations in to elements of .
  • Each location in is initialized to
  • The operation analyze(b) updates as follows:
    • For
  • The analysis defines a worklist
  • While is not empty:
    • Pop from
    • Perform
    • Let be the entry of the last location in in
    • For each successor of :
      • Let
      • Let
      • If :
        • Add to
  • is the result

I'm not sure of the order things are popped from . Any ordering should yield the same but some blocks may be analyzed more frequently than necessary. We should check the rustc implementation.


  1. The current analysis implementation (defined in the rust compiler) is more efficient than what we describe because it tracks state per basic block rather than per-location, as the states for any location in a block can be computed by repeated application of the transfer function to the entry state.

Analysing Statements

The PCG Analysis computes four states for each MIR statement, corresponding to the PCG evaluation phases:

  • PreOperands
  • PostOperands
  • PreMain
  • PostMain

The analysis for each statement proceeds in two steps:

  • Step 1: Place Conditions are computed for each statement phase
  • Step 2: PCG Actions are performed for each statement phase

Determining Place Conditions

A place condition is a predicate on the PCG related to a particular MIR place.

We define the following place conditions:

  • Capability: Place must have capability
  • RemoveCapability: Capability for place should be removed1
  • AllocateOrDeallocate: Storage for local is allocated (e.g. via the StorageLive instruction)
  • Unalloc: Storage for local is not allocated (e.g. via the StorageDead instruction)
  • ExpandTwoPhase: Place is the borrowed place of a two-phase borrow
  • Return: The RETURN place has Exclusive capability

ExpandTwoPhase may not be necessary. AllocateOrDeallocate is a confusing name, also it's not clear if it's any different than just having Write permission.

During this step of the analysis, place conditions are computed for each phase. The determination of place conditions is based on the MIR statement; the state of the PCG is not relevant.

The conditions computed for each phase are as follows:

  • PreOperands: Pre-conditions on the PCG for the operands in the statement to be evaluated
  • PostOperands: Post-conditions on the PCG after the operands in the statement has been evaluated
  • PreMain: Pre-conditions on the PCG for the main effect of the statement to be applied
  • PostMain: Post-conditions on the PCG after the main effect of the statement has been applied

As an example, the MIR statement: let y = move x would have the following place conditions:

  • PreOperands: {x: E}
  • PostOperands: {x: W}
  • PreMain: {y: W}
  • PostMain: {y: E}

The rules describing how these place conditions are generated for a statement are described here.

Performing PCG Actions

After the place conditions for each phase are computed, the analyses proceeds by performing the actions for each phase. At a high-level, this proceeds as follows:

PreOperands

  1. The Borrow PCG graph is minimized by repeatedly removing every effective leaf edge2 for which their target PCG nodes of are not live at the current MIR location. A Borrow PCG RemoveEdge action is generated for each removed edge. See more details here.

TODO: Precisely define liveness.

  1. The place capabilities required by the PreOperand place conditions are obtained.

PostOperands

No actions occur at this stage. Capabilities for every moved-out operand are set to Write.

PreMain

The place capabilities required by the PreMain place conditions are obtained.

Then, the behaviour depends on the type of statement:

  1. StorageDead(v)
    1. The analysis performs the MakePlaceOld(v, StorageDead) action.
  2. Assign(p r)
    1. If p is a reference-typed value and contained in the borrows graph and the capability for p is R:
      1. The analysis performs the RestoreCapability(p, E) action
    2. If :
      1. The analysis performs the action
    3. All edges in the borrow PCG blocked by any of the lifetime projections in p are removed

PostMain

  1. For every operand move p in the statement:
    1. The analysis performs the MakePlaceOld(p, MoveOut) action.
  2. If the statement is a function call p = call f(..):
    1. The entry for p in the latest map is updated to the current location
    2. Function call abstraction edges are described using the rules defined here
  3. If the statement takes the form Assign(p, r):
    1. The entry for p in the latest map is updated to the current location
    2. p is given exclusive permission
    3. If takes the form Aggregate(o_1, ..., o_n):
      1. For every
        1. Let be the associated place of
        2. For all
          1. If outlives :
            1. Add an Aggregate BorrowFlow edge , with associated field index and lifetime projection index .
    4. If takes the form use c, c is a reference-typed constant with lifetime , and is a reference-typed place with lifetime , then:
      1. Create a new ConstRef borrowedge of the form
    5. If takes the form move p_f or cast(_, move p_f):
      1. For all :
        1. Let be the i'th lifetime projection in p
        2. Let be the i'th lifetime projection in p_f
        3. Let be the location of p_f in the latest map
        4. Add a Move edge
    6. If takes the form copy p_f or cast(_, copy p_f):
      1. For all :
        1. Let be the i'th lifetime projection in p
        2. Let be the i'th lifetime projection in p_f
        3. Add a CopyRef edge
    7. If takes the form &p or &mut p:
      1. Handle the borrow as described here

  1. This is only used for mutably borrowed places

  2. The set of effective leaf edges are the leaf edges in the graph obtained by removing all edges to placeholder lifetime projections.

Rules for Determining Place Conditions

Place conditions are computed in terms of triples, where a triple is a pair (pre, post) where pre is a place condition and post is either a place condition or None.

The place conditions for each phase are determined by two sets of triples: the operand triples and the main triples. The place conditions for the PreOperands phase is the set of conditions in the pres of the operand triples. The PostOperands phase is the set of conditions in the posts of the operand triples. PreMain and PostMain are defined accordingly.

Determining Operand Triples for a Statement

  1. For each operand in the statement:
    1. If takes the form copy p:
      • Add (p: R, None) to the operand triples
    2. If takes the form move p:
      • Add (p: E, p: W) to the operand triples
  2. For each rvalue in the statement:
    1. If takes the form &p:
      • Add (p: R, p: R) to the operand triples
    2. If takes the form &mut p:
      • If the borrow is a two-phase borrow:
        • Add (ExpandTwoPhase p, p: R) to the operand triples
      • Otherwise, add (p: E, RemoveCapability p) to the operand triples
    3. If takes the form *mut p:
      • Add (p: E, None) to the operand triples
    4. If takes the form *const p:
      • Add (p: R, None) to the operand triples
    5. If takes the form len(p), discriminant(p) or CopyForDeref(p):
      • Add (p: R, None) to the operand triples

Determining Main Triples for a Statement

The rule depends on the statement type:

  1. Assign(p, r)
    1. If r takes the form &fake q:
      • Add (p: W, None) to the main triples
    2. If r takes the form ShallowInitBox o t
      • Add (p: W, p: e) to the main triples
    3. Otherwise, add (p: W, p: E) to the main triples
  2. FakeRead(_, p)
    1. Add (p: R, None) to the main triples
  3. SetDiscriminant(p, ..)
    1. Add (p: E, None) to the main triples
  4. Deinit(p)
    1. Add (p: E, p: w) to the main triples
  5. StorageLive(v)
    1. Add (Unalloc v, AllocateOrDeallocate v) to the main triples
  6. StorageDead(v)
    1. Add (AllocateOrDeallocate v, Unalloc v) to the main triples
  7. Retag(_, p)
    1. Add (p: E, None) to the main triples

Determining Main Triples for a Terminator

The rule depends on the terminator type:

  1. Return
    1. Add (Return, _0: w) to the main triples
  2. Drop(p)
    1. Add (p: W, None) to the main triples
  3. Call(p, _)
    1. Add (p: W, p: E) to the main triples
  4. Yield(p, _)
    1. Add (p: W, p: E) to the main triples

Rules for the Creation of Borrows

Mutable Borrows

Consider the stmt p = &mut q, at a program point , where has type , and has type , and is a type containing lifetimes .

At the end of the PreOperands phase, the PCG is guaranteed to be in a state where, for each the lifetime projection is in the graph. During the Operands phase, each lifetime projection is labelled with the current program point to become . At the end of the PreMain phase, for each , the lifetime projection is guaranteed not to be in the graph. During the Main phase, these projections are added.

Subsequently, the labelled lifetime projections under are connected with BorrowFlow edges to the new lifetime projections under . Namely, for each if outlives , then a BorrowFlow edge is added.

Subsequently, we introduce Future nodes and edges to account for nested references as follows. For each :

  1. Insert the node into the graph
  2. If any Future edges originate from the labelled projection , redirect them such that they originate from .
  3. Insert a Future edge
  4. Insert a Future edge

Owned PCG Operations

Collapse

The operation modifies place expansions and set of place capabilities such that becomes a leaf in the forest corresponding to . Stated more formally it modifies to ensure that contains an expansion containing , and is not the base of any expansion in . Capabilities in are updated to account for the removal of expansions from .

collapse returns the set of Owned PCG Actions corresponding to the removed expansions.

This logic is very similar to the collapse defined on the (combined) PCG defined here. This is used in contexts where the Borrow PCG is not available (such as the join on owned PCGs).

We should investigate making a common operation.

The algorithm is implemented as follows:

  1. Let be the subset of place expansions in such that for each in , the base place is a prefix of .
  2. Let be an ordered list of the expansions in sorted in order of descending length of the projections of their base place
  3. Let be the list of operations to return
  4. For each in
    1. Let be the capabilities of in
    2. Let be the minimum common capabiility of .
    3. Let be the base of
    4. Remove capabilities to the places in from
    5. Assign capability to in
    6. Remove from
    7. Add to
  5. return

Join Operation

The join algorithm on PCGs takes as input PCGs and and mutates to join in .

We define the join in this manner because this is similar to how the implementation works.

The algorithm proceeds in the follow steps:

  1. The Owned PCG of is joined into the Owned PCG of (this may also change the capabilities of )
  2. The capabilities of are joined into the capabilities of .
  3. The Borrow State of is joined into the Borrow State of

We now describe each step in detail:

Owned PCG Join

Let be the owned PCG of and the PCG of .

  1. For each local in the MIR body:
    1. If is allocated in both and :
      1. Join the place expansions rooted at into
    2. Otherwise, if is allocated in , it should be deallocated in

The algorithm joins a set of place expansions into a set of place expansions , and makes corresponding changes to capabilities .

  1. Let be an ordered list of the expansions in sorted in order of ascending length of the projections of their base place
  2. For each in :
    1. Let be the base of .
    2. If there exists a where the base of is :
      1. If
        1. Perform collapse(, , )
        2. Otherwise do nothing (go back to the top of the loop)
    3. Otherwise, if contains an expansion and :
      1. Add to
      2. If :
        1. Assign capability to in
        2. Otherwise, remove capability to in
      3. For each :
        1. If :
          1. Assign capability to in
          2. Otherwise, remove capability to in

Place Capabilitiies Join

The algorithm join() is defined as:

  1. For each p: c in :
    1. If :
      1. Remove capability to in
    2. Otherwise:
      1. If is defined:
        1. Assign capability to in
      2. Otherwise, remove capability to in

Borrow State Join

  1. The borrow graphs are joined
  2. The latest maps are joined
  3. The validity conditions are joined

Borrow PCG Join

Joining into , where is the block for and is the block for .

Update the validity conditions for each edge in to require the branch condition .

Update the validity

  1. If is a loop head perform the loop join algorithm as described here.
  2. Otherwise:
    1. For each edge in :
      1. If there exists an edge of the same kind in
        1. Join the validity conditions associated with in to the validity conditions associated with in
      2. Otherwise, add to
    2. For all placeholder lifetime projections in :
      1. Label all lifetime projection nodes of the form in with

Loops

The loop head is the basic block with an incoming back edge. We define as the location of the first statement of the loop head.

The pre-loop block is the block prior to the loop head. We assume that there is always a unique pre-loop block. The pre-loop state is the state after evaluating the terminator of the pre-loop block.

The following operations are performed when we join the pre-loop block with the loop head. Note that at this point we've already computed .

We construct the state for the loop head as follows:

Step 1 - Identify Relevant Loop Nodes

We identify the following sets of places:

  • : the places used inside the loop that are live and initialized at .

  • : the subset of that are directly borrowed by a borrow live at

  • : the subset of that contain lifetime projections

  • : Places used in the loop that may be relevant for the invariant:

is the union of and the associated lifetime projections of .

are the associated lifetime projections of .

Step 2 - Expand Pre-State Graph to Include Relevant Loop Nodes

The nodes in will need to appear in but may not be present in (for example, it's possible that the loop could borrow from a subplace that requires unpacking). We construct a graph by unpacking such that contains all places in .

Step 3 - Identify Borrow Roots

The borrow roots of a place are the most immediate places that could be borrowing from and later become accessible, and excluding places already in

We defined the borrow roots using the function :

  • Initialize a list to contain all lifetime projections in
  • while is not empty:
    • Pop from
    • For each edge blocked by in :
      • If the edge is an unpack edge, add all of its blocked nodes to
      • Otherwise, for each blocked node in :
        • If or , do nothing
        • If is live at , add to
        • If is a root in , add to
        • Otherwise, add to
  • The resulting roots are the associated places of

We then identify , the most immediate nodes that could be borrowing from and later become accessible (excluding nodes already in ). is the union of the roots for each place in .

Step 4 - Construct Abstraction Graph And Compute Blocked Lifetime Projections

We construct an abstraction graph that describes the blocking relations potentially introduced in the loop from places in to nodes in and from nodes in to nodes in .

connect() function:

We begin by define a helper function which adds edges to based on being blocked by in the loop:

  • Identify : the top-level lifetime projection in
    • Insert a loop abstraction edge into
  • For each :
    • Identify the lifetime projections in that may mutate borrows in
      • If is nonempty:
        • Introduce a placeholder node
        • Add a borrowflow hyperedge
        • Add a future hyperedges:
    • Identify the lifetime projections in that borrows in may flow into
      • If is nonempty:
        • Add a borrowflow hyperedge

Algorithm:

  • For each blocker place :

    • For each , perform
    • For each
      • If blocks at , then perform
  • Subsequently, ensure that is well-formed by adding unpack edges where appropriate. For example, if (*x).f is in the graph, there should also be an expansion edge from *x to (*x).f.

  • We identify the set of lifetime projections that will need to be renamed (indicating they will be expanded in the loop and remain non-accessible at the loop head). is the set of non-leaf lifetime projection nodes in (leaf nodes are accessible at the head).

  • Label all lifetime projections in with location , add connections to their Future nodes as necessary.

Step 5 - Label Blocked Lifetime Projections in Pre-State

The resulting graph for the loop head will require new labels on lifetime projections modified in the loop. We begin by constructing an intermediate graph by labelling each lifetime projection in with and remove capabilities to all places in in .

Step 6 - Identify Pre-State Subgraph to Replace With Abstraction Graph

We then identify the subgraph that will be removed from and replaced with the loop abstraction .

  • Let = .
  • We construct by including all nodes in and all edges where there exists where is on a path between and in .

Step 7 - Replace Pre-State Subgraph with Abstraction Graph

The graph for the loop head is defined as

Step 8 (Optional) - Confirm Invariant is Correct

To confirm that the resulting graph is correct, for any back edge into the state at with state , performing the loop join operation on and should yield .


    Function Calls

    Consider a function call . Each is an operand of the form move p | copy p | const c. Let be the set of moved operand places, and let denote the program point where was last modified.

    The function call abstraction is created in the PostMain evalution stage, after all of the operands have been evaluated. Therefore, all places in are labelled, as they have been moved-out by this point.

    Creating the Function Call Abstraction Graph

    The PCG constructs a function call abstraction graph to reflect the effect of the call.

    We begin by determining how may manipulate the borrows in by constructing the bipartite graph as follows:

    Let be the set of lifetime projections in . For any place , we define as the place .

    Let be the set .

    Let be the set .

    Add all nodes in to

    For each if is nested in , and outlives , we add to a Function Call Abstraction Edge .

    The function call abstraction graph is constructed by adding to edges such that for every lifetime in , we define the set , and add a BorrowFlow edge .

    Adding the Function Call Abstraction Graph to the PCG

    We add an abstraction graph to a PCG to create a new PCG as follows.

    First, we initialize as . For each projection in , we modify as follows:

    1. For each endpoint in whose target is , redirect the target to .
    2. For each Future edge in , remove the edge, and for each leaf in with an ancestor {n} \rightarrow p' \downarrow r' to .
    3. Remove from .

    Misc (To Update)

    Unpacking Places

    An unpack operation introduces an Unpack edge into the graph, and is associated with a source place and an expansion . An unpack operation can either be mutable or read-only.

    An unpack operation is a dereference unpack for lifetime iff the type of the source place p is &'r mut p or &'r p.

    Applying an unpack operation introduces an Unpack edge as follows:

    1. If the unpack operation is a dereference operation, the edge is added to the graph.
    2. Otherwise, the edge is added to the graph.

    A mutable unpack operation for capabiliy requires that the source place have capability . When the operation is a applied, the capability for is removed and the capability of all places in is set to .

    A read-only unpack operation requires that have capability and assigns capability to all places in .

    Updating Lifetime Projections

    Assume the source place has type with lifetimes . If the unpack operation is mutable, then we label each lifetime projection with location .

    The source lifetime projection for a lifetime is if the unpack is mutable, and otherwise. The target lifetime projections is the set .

    For each lifetime , if the set of target lifetime projections associated with is nonempty:

    1. For each in the set of target projections, add a BorrowFlow edge where is the source projection1.
    1. If the unpack is mutable and the source place of the expansion is either a reference or a borrowed place: a. Add a Future edge b. For each in the set of target projections, add a Future edge c. If any Future edges originate from the source projection , redirect them such that they originate from .

    Origin Containg Loan

    An origin contains a loan created at location iff:

    Polonius: Read directly from output facts

    NLL: is live at and outlives


    1. TODO: Currently we actually introduce an unpack edge in the implementation, but we should change this.

    PCG Operations

    Obtaining Capability to a Place

    The obtain(p, o) operation reorganizes a PCG state to a new state in where the PCG node for a place is present in the graph. The parameter o is an Obtain Type which describes the reason for the expansion. The reason is either:

    • To obtain a specified capability to the place
    • To access the place because it is the borrowed place of a two-phase borrow

    An Obtain Type is associated with a result capability which is either the specified capability (in the first case), or Read (in the case of access for two-phase borrows).

    The "two-phase" borrow case is likely unnecessary: we can use the borrow-checker to detect if the place is also the borrowed place of a two-phase borrow reserved at the current location. In fact the current implementation make similar queries as part of the expand step.

    Note that a place node for is not necessarily present in the graph before this occurs.

    This operation is implemented as PcgVisitor::obtain (see https://github.com/viperproject/pcg/blob/main/src/pcg/visitor/obtain.rs)

    This proceeds as follows:

    Step 1 - Upgrading Closest Ancestor From R to E

    This step is included to handle a relatively uncommon case (see the Rationale section below).

    If the obtain operation is called with permission or and the closest ancestor to , that is, the longest prefix of for which there exists a node in the graph, has capability, we upgrade 's capability to in exchange for removing capability to all pre- and postfix places of in the graph (excluding itself).

    This is sound because if we need to obtain non-read capability to place, and there are any ancestors of place in the graph with R capability, one such ancestor originally had E capability was subsequently downgraded. This function finds such an ancestor (if one exists), and performs the capability exchange.

    Perhaps it would be better to explicitly track downgrades in the analysis so that they can be upgraded later? This will make the soundness argument more convincing.

    Rationale

    It's possible that we want to obtain exclusive or write permission to a field that we currently only have read access for. For example, consider the following case:

    • There is an existing shared borrow of (*c).f1
    • Therefore we have read permission to *c, (*c).f1, and (*c).f2
    • Then, we want to create a mutable borrow of (*c).f2
    • This requires obtaining exclusive permission to (*c).f2

    We can upgrade capability of (*c).f2 from R to E by downgrading all other pre-and postfix places of (*c).f2 to None (in this case c and *c). In the example, (*c).f2 is actually the closest read ancestor, but this is not always the case (e.g. if we wanted to obtain (*c).f2.f3 instead)

    Step 2 - Collapse

    Then, if a node for exists in the graph and 's capability is not at least as strong as , collapse the subgraph rooted at (and obtain capability for ) by performing the collapse(p, c) operation.

    collapse

    The collapse(p) operation is implemented as follows:

    • For each such that is a prefix of (from longest to shortest) and there is a node for in the graph:
      • perform the Collapse Repack Operation on .
      • For each lifetime in the type of :
        • Create a new lifetime projection node
        • For each lifetime projection node where is an expansion of :
          • Label
          • Create a new BorrowFlow edge

    Step 3 - Labelling Lifetime Projections

    At this point, if is , we know that a subsequent operation will mutate . As a result, if there exists a lifetime projection node for (for example, if stores a borrow that has since been reborrowed), it will no longer be tied to the current value of . So, we label .

    Step 4 - Expand

    At this point there should be a node for some prefix of in the graph such that .

    We expand the graph to (and obtain the capability for ) by performing the expandTo(p, o) operation.

    expandTo

    The expandTo operation is implemented as follows:

    • For each strict prefix of (from shortest to longest):
      • If expanding one level adds new edges to the graph, then
        • We expand the lifetime projections of one level

    The operation to expand a place one level is the expandPlaceOneLevel operation, and the operation to expand the lifetime projections one level is expandLifetimeProjectionsOneLevel.

    expandLifetimeProjectionsOneLevel

    expandLifetimeProjectionsOneLevel is defined with three parameters:

    • : The place to expand
    • : The target expansion of
    • : The Obtain Type

    The operation is implemented as follows:

    • Let be the expansion of using
    • For each lifetime projection of :
      • Let be the set of lifetime projections in with lifetime
      • If is nonempty1:
        • We identify the base lifetime projection as follows:
          • Let be the current snapshot location
          • If is not an obtain for capability R:
          • Otherwise, if is blocked by a two-phase borrow live at :
            • Let be the reservation location of the conflicting borrow
          • Otherwise,
        • Create a new Borrow PCG Expansion hyperedge

    Step 5 - Labelling Lifetime Projections for Projections

    Finally, if we are obtaining or capability, we can again assume that the intention is to mutate . If there exist any lifetime projection nodes for dereferences of 's fields (or 's fields' fields, and so on...), we encounter the same problem that was described in Step 3. To address this, we label any lifetime projections for dereferences of places for which is a prefix (*p.f, for example).


    1. This could happen e.g. expanding an x : Option<&'a T> to a x@None

    Collapsing Owned Places

    At the outset of each program point, the collapse_owned_places operation eagerly collapses the Owned PCG.

    This operation is implemented as PcgVisitor::collapse_owned_places (see https://github.com/viperproject/pcg/blob/main/src/pcg/visitor/pack.rs

    It is implemented as follows:

    • For each place for which there exists a place node (from longest to shortest):
      • If no expansion of is blocked by a borrow and every expansion of has the same capability:
        • perform the collapse(p) operation
        • if has no projection and has capability, upgrade 's capability to

    Activating Two-Phase Borrows

    activateTwophaseBorrowCreatedAt

    reserveLocation is a function from borrow edges to the MIR location at which the borrow edge was created.

    The activateTwophaseBorrowCreatedAt operation takes a single parameter:

    • , a MIR location

    The operation is implemented as follows:

    • If there exists a borrow edge in the graph such that l = reserveLocation(e):
      • If there exists a place node for in the graph:
        • Restore capability to
      • If is not owned:
        • Downgrade the capability of every ancestor of to None
    • TODO Logic is bad

    Packing Old and Dead Borrow Leaves

    The packOldAndDeadBorrowLeaves operation removes leaf nodes in the Borrow PCG that are old or dead (according to the borrow checker).

    • : the current MIR location

    When analysing a particular location, this operation is performed before the effect of the statement.

    Note that the liveness calculation is performed based on what happened at the end of the previous statement.

    For example when evaluating:

    bb0[1]: let x = &mut y;
    bb0[2]: *x = 2;
    bb0[3]: ... // x is dead
    

    we do not remove the *x -> y edge until bb0[3]. This ensures that the edge appears in the graph at the end of bb0[2] (rather than never at all).

    This operation is implemented as PcgVisitor::pack_old_and_dead_borrow_leaves (see https://github.com/viperproject/pcg/blob/main/src/pcg/visitor/pack.rs

    We must first introduce some auxiliary operations:

    isDead

    isDead(n, l) is true if and only if the borrow checker considers the node to be dead at MIR location

    removeEdgeAndPerformAssociatedStateUpdates

    removeEdgeAndPerformAssociatedStateUpdates is defined with one parameter:

    • : the edge to remove

    It proceeds as follows:

    • For each current place node that would be unblocked by removing :
      • If does not have capability, and is mutable:
        • Update to map to where is the current MIR location
    • Remove from the graph
    • For each current place node that is unblocked by removing :
      • Let be if projects a shared reference and otherwise
      • If has no capability or capability, upgrade its capability to
      • Unlabel each region projection of
    • If is a Borrow PCG Expansion edge:
      • If is a Dereference Expansion where is a current place with no lifetime projections:
        • unlabel
      • If has a source node where is a current place:
        • For each place node in the expansion of , label each region projection of with , where is the current MIR location
    • If is a Borrow edge; where is the current MIR location; the target of the borrow is a current place ; and has non-zero capability:
      • weaken 's capability to

    Main Loop

    packOldAndDeadBorrowLeaves proceeds as follows:

    • Until the PCG remains unchanged across the following steps:
      • Let be be the set of edges such that either:
        • is an Borrow PCG Expansion edge and either:
          • for each , either:
            • , where is the current MIR location
            • or is old
          • or , any pair of place nodes in have the same capability, and for all such that , has the same label as and is an exact prefix of
          • or and for all such that , is an exact prefix of ; and have the same label; and and have the same label.
        • or for each , where or , either:
          • is old
          • or 's associated place is not a function argument and either:
            • has a non-empty projection and is not blocked by an edge
            • or , where is the current MIR location
      • For each in :
        • perform removeEdgeAndPerformAssociatedStateUpdates(e)

    Repack Operations

    Repack operations describe actions on owned places.

    RegainLoanedCapability

    Fields:

    • - Place
    • - Capability

    This operation is used to indicate that is no longer borrowed, and can therefore be restored to capability .

    In principle I think should always be exclusive capability

    Applying RegainLoanedCapability

    The PCG applies this operation by setting the capability of to .

    DerefShallowInit

    Fields:

    • - From Place
    • - To Place

    This operation is used to indicate that a (which is a shallow-initialized box) was dereferenced.

    Applying DerefShallowInit

    Let be the expansion of obtained by expanding towards .

    The PCG applies this operation by adding read capability to the places in

    Why do we use this logic?

    Collapse

    Fields:

    • - Place
    • - Collapse Guide
    • - Capability

    This operation indicates that the expansion of should be packed (using guide ) with resulting capability .

    Applying Collapse

    Let be the expansion of towards guide place .

    Preconditions:

    • Each place in has a capability

    Let be the minimum capability of the places in .

    • Capability for each place in is removed.
    • Capability for is set to .
    • The Unpack edge from is removed

    The current implementation guarantees there is only one unpack edge from . In the future this may change.

    Expand

    Fields:

    • - Place
    • - Expand Guide
    • - Capability

    This operation indicates that should be expanded (using guide ) such that each place in the expansion has capability .

    Applying Expand

    Let be the expansion of towards guide place .

    Preconditions:

    • has capability
    • The unpack edge is added
    • Capability for every place in is set to
    • If is Read capability, the capability of is set to Read
    • Otherwise, if is not Read, capability of is removed

    Note that reference-typed places will never be expanded.

    Borrow PCG Actions

    LabelLifetimeProjection

    Fields

    • predicate - A predicate describing lifetime projections that should be labelled
    • label - The label to apply (current, FUTURE or a label )

    Applying LabelLifetimeProjection

    Replaces the label associated with lifetime projections in the borrow PCG matching predicate. If label is current, then the label of each matching lifetime projection is removed.

    Weaken

    Fields:

    • - Place
    • - From capability
    • - (Optional) To capability

    Used to reduce the capability of a place. In general the is Exclusive, for example in the following cases:

    • Before writing to , capability should be reduced to Write
    • When a two-phase borrow is activated, capabilities to places conflicting with the borrowed place should be removed

    Applying Weaken

    Preconditions:

    • Place has capability

    If is defined, the capability of is set to . Otherwise, capability to is removed.

    RestoreCapability

    Fields:

    • - Place
    • - Capability

    Instructs that the capability to the place should be restored to the given capability, e.g. after a borrow expires, the borrowed place should be restored to exclusive capability.

    Applying RestoreCapability

    The capability of is set to .

    LabelPlace

    Note this is called MakePlaceOld in the implementation.

    Fields

    • - Place
    • reason - Why the place is to be made labelled

    The purpose of this action is to label current versions of (and potentially prefixes and postfixes of ) with the label corresponding to the last time they were updated. The Latest Map (described here) determines what label is to be used.

    There are six reasons defined:

    • StorageDead
    • MoveOut
    • ReAssign
    • LabelSharedDerefProjections
    • Collapse

    Applying LabelPlace

    The behaviour of this action depends on reason:

    ReAssign, StorageDead, MoveOut

    The places to be labelled are:

    • All postfixes of
    • All prefixes of prior to the first dereference of a reference.

    For example, if is (*x).f, then *((*x).f), (*x).f, and *x will be labelled.

    Collapse

    The place is labelled (but none of its prefixes or postfixes).

    LabelSharedDerefProjections

    All strict postfixes of are labelled.

    RemoveEdge

    Removes an edge from the graph. If the removal of the edge causes any place nodes to be removed from the graph, the capability of those places are removed.

    AddEdge

    Inserts an edge into the graph. This does not change the capabilities.

    SetLatest

    Updates the latest label of place to label in the Latest Map.