Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Overview

The purpose of the PCG Analysis is to provide clients with the following:

  • The PCG data structure representing the state of ownership and borrowing of Rust places at arbitrary program points within a Rust function
  • For any pair of consecutive program points, an ordered list of actions that describe the transformation of the PCG of to the PCG of .

PCG Analysis Algorithm

Key Concepts:

  • The PCG Analysis Algorithm operates on the MIR Body of a Rust function and returns a PCG Analysis of the function.
  • A PCG Analysis contains PCG Data for every reachable1 MIR location in the Body2.
  • The PCG Data is a tuple of PCG States and PCG Action Data.

The PCG States of a MIR location is a list of the PCGs computed at that location, concretely:

The PCG Action Data of a MIR location contains a list of PCG Actions for each evaluation phase in the location (i.e. the actions performed at that phase).

The PCG Dataflow Analysis

Work in progress

The PCG Analysis Algorithm is implemented as a MIR dataflow analysis using PcgDomainData as the domain. PcgDomainData contains a PCGData value and other relevant metadata (e.g the associated basic block). Notably, the analysis only analyzes each basic block one time. Conceptually, this property is ensured because the final state at a loop head is computed upon entering it in the analysis (without having first seen the body).

We note that the behaviour of the join operation on PCGDomainData requires careful tracking of what blocks have been previously joined (this is basically a consequence of the interface of the MIR dataflow analysis). We define (joining the state computed at into ) as follows:

  • Let be the associated blocks of and respectively
  • If no block has ever been joined into , then \mathit{join}(s', s)

In the implementation the join operation associated with the PcgDomainData, we have if is the state of a loop head and is the state of a back edge; this ensures that loop heads are only considered once3.

Our implementation should also be checking that the PCG generated at the loop head is valid w.r.t the state at the back edge here, but this is not happening yet.

We note that to correspond

PCG Data Structure

The PCG data structure represents the state of ownership and borrowing of Rust places at an arbitrary program point.

Perhaps this describes too much implementation-specific details?

In our implementation, it consists of three components:

  • The Owned PCG, which describes the state of owned places
  • The Borrow State, which describes borrowed memory (borrowed places and lifetime projections) and borrow relations, and also some auxillary data structures
  • The Place Capabilities which is a map from places to capabilities

Owned PCG

The Owned PCG is a forest, where the root of each tree in the forest is a MIR Local.

Each tree in the forest is defined by a set of place expansions, which describe how unpacked all of the owned places are. For each expansion in the set, the base is a node in the forest and are its children. We note that each expansion can be similarly interpreted as a hyperedge

Each local in the body of a MIR function is either unallocated or allocated in the Owned PCG. A local is allocated iff there exists a corresponding root node in the Owned PCG, otherwise it is unallocated.

The operation allocate in the Owned PCG requires that is not allocated, and adds a new root for . The deallocate operation removes the forest rooted at .

Borrow State

The Borrow State is a 3-tuple containing a Latest Map describing the latest locations of each place; a set of Validity Conditions that describes the set of paths leading to the current block; and a Borrows Graph, a directed acyclic hypergraph which describes borrowed places, sets of borrows, and their relation to one another. The Borrows Graph is represented as a set of PCG hyperedges.

Because a borrow created within a block exists only for executions that visit that block, we label new borrows using the validity conditions of the block in which they were created.


  1. A MIR location reachable iff its basic block is reachable from the start block in the CFG without traversing unwind or fake edges (the latter kind do not correspond to the actual control flow of the function). The original reason for only considering reachable edges was to improve performance; removing this constraint (and instead considering all locations) would be simple to change in the implementation.

  2. The PCG analysis algorithm is implemented as a MIR dataflow analysis defined by the rust compiler crate. In the implementation, the PCG Data is computed for every reachable MIR location during the algorithm execution itself, but only the PCG Data for the entry state of each basic block is stored. The PCG Data for an arbitrary location within a block is re-computed by applying the dataflow analysis transfer function from the entry state.

  3. We should confirm that this also holds in the implementation as this previously was not the case. However, visiting the same stmt multiple times should only affect performance, not correctness.