What do concurrent traces have to say about determinism?

Let's talk about execution traces of concurrent programs, and which ones we consider "the same."

In concurrent programming models, the language's constructs let you specify computation as indexed by independent execution units ("thread", "channel", "actor" -- at this level of detail, we can use these terms interchangeably) and specify how those units communicate or interact. This means you have one artifact, the static program, that gives rise to a set of traces, or possible executions -- records of which parts of the program were evaluated in what order.

We might say that if this set of program traces has only one element, then the program is sequential. But the accuracy of this statement depends on how exactly we represent a program trace! If we are forced to give a total ordering on all computation steps -- identifying a trace with an "interleaving of instructions" -- then yes, the only programs for which every trace is "the same" are those that are deterministic and sequential.

But note that this representation precludes deterministic parallel programs, a notable use case for concurrent programming (see Lindsey Kuper's excellent post on the relationship between parallelism and concurrency). You could have two cores simultaneously computing independent things, each of which is totally deterministic -- but you cannot give a total ordering on the steps of execution. You could conceivably do so with respect to a global clock, but such an approach seems uninformative for reasoning about the program you wrote irrespective of the hardware it's running on.

Instead, we can represent an execution trace as a partial ordering of evaluation steps. In other words, we can treat two totally-ordered traces as the same whenever their ordering choices differ only for independent computational components.

This equivalence relation is what the literature on CLF refers to as "concurrent equality":

let x1 = e1 in let x2 = e2 in t
is concurrently equivalent to
let x2 = e2 in let x1 = e1 in t

iff e2 doesn't use x1 and e1 doesn't use x2.
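
To make the side condition concrete, here's a minimal Python sketch of the check, representing an expression (for these purposes) by nothing more than its set of free variables. The function name and representation are mine, not CLF's:

    def concurrently_equal(x1, fv_e1, x2, fv_e2):
        """May `let x1 = e1 in let x2 = e2 in t` be reordered to
        `let x2 = e2 in let x1 = e1 in t`? Only if neither expression
        mentions the other's bound variable."""
        return x1 not in fv_e2 and x2 not in fv_e1

    # Independent bindings commute...
    assert concurrently_equal("x1", {"a"}, "x2", {"b"})
    # ...but a dataflow dependency forbids reordering.
    assert not concurrently_equal("x1", {"a"}, "x2", {"x1", "b"})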

Explicating a full trace in terms of its concurrent structure reveals the aforementioned partial order, and means we can visualize the trace as a DAG.

For example, here's a CLF program to sort a list represented as a mapping from locations to elements, by swapping any two adjacent elements that aren't in order, bubble-sort-style:

[CLF program listing]
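
Since the listing is missing above, here's a rough Python stand-in for the behavior it describes: a single "swap" rewrite rule, fired at a nondeterministically chosen applicable location until none remains. The representation (a dict from locations 0..n-1 to elements) and the names are mine, not the original CLF code:

    import random

    def bubble(cells, rng=random.Random()):
        """Sort a dict {location: element} by repeatedly firing the swap
        rule somewhere it applies; return the result and the trace, i.e.
        the sequence of swaps fired."""
        cells, trace = dict(cells), []
        while True:
            # Every position where the rule is enabled: adjacent, out of order.
            enabled = [i for i in range(len(cells) - 1) if cells[i] > cells[i + 1]]
            if not enabled:
                return cells, trace
            i = rng.choice(enabled)  # the scheduler's nondeterministic choice
            cells[i], cells[i + 1] = cells[i + 1], cells[i]
            trace.append((i, i + 1))
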
When we run it on the described 3-element input, a possible trace of the output has the following structure:

[figure: a possible trace of the 3-element sort]
There's another possible trace, one that swaps the second and third array positions first, rather than the first and second. Or, viewing the swaps pictorially, there are two orderings for swaps:

[figure: the two swap orderings for the 3-element sort]
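
Both traces fall out of the sketch above if we exhaustively explore the scheduler's choices. I'm assuming here that the 3-element input was the fully reversed list [3, 2, 1], which fits the description; the original input isn't preserved:

    def all_traces(cells):
        """Enumerate every complete trace by branching on each enabled swap."""
        enabled = [i for i in range(len(cells) - 1) if cells[i] > cells[i + 1]]
        if not enabled:
            return [[]]
        traces = []
        for i in enabled:
            nxt = dict(cells)
            nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
            traces += [[(i, i + 1)] + t for t in all_traces(nxt)]
        return traces

    print(all_traces({0: 3, 1: 2, 2: 1}))
    # [[(0, 1), (1, 2), (0, 1)], [(1, 2), (0, 1), (1, 2)]]
    # Exactly two traces, and within each one, every swap enables the next.
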
We can identify these traces as distinct because of the different dependencies they introduce; each ordering between two swaps is necessary and non-arbitrary. Note: the program is (observably) deterministic; both executions take the same input to the same output.
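
To make "different dependencies" concrete: in this rewriting system, two swaps must be ordered exactly when they touch a common location (adjacent swaps that don't overlap commute). A small helper, again in terms of the hypothetical sketch above, recovers that partial order from a totally ordered trace:

    def dependencies(trace):
        """Edges of the trace's dependency DAG: step a must precede step b
        whenever a fired first and the two swaps share a location."""
        return [(a, b) for a in range(len(trace))
                       for b in range(a + 1, len(trace))
                       if set(trace[a]) & set(trace[b])]

    # Both 3-element traces come out as total chains: no two steps are
    # independent, so there is no concurrent structure to exploit.
    print(dependencies([(0, 1), (1, 2), (0, 1)]))  # [(0, 1), (0, 2), (1, 2)]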

On the other hand, here are the possible traces for sorting the four-element array [4,3,2,1]:

[figure: the two traces for sorting [4,3,2,1]]
Again, there are two of them. But that number depends on the syntax I've written them down in. If I required only one swap per line -- one "at a time" -- then I'd have to make an arbitrary choice about which of two independent swaps to do first (e.g. (4,3) or (2,1)?).
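
The enumeration sketch makes that dependence on syntax vivid: under the one-swap-at-a-time representation, this input yields sixteen totally ordered traces, a count inflated by those arbitrary serialization choices.

    # Hypothetical: counting interleavings with the all_traces sketch above.
    print(len(all_traces({0: 4, 1: 3, 2: 2, 3: 1})))  # 16
    # Many of these differ only in how independent swaps, like (0, 1) and
    # (2, 3) at the start, happen to be serialized.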

In this sense, the 4-element sort traces have concurrent structure where the 3-element traces do not. Remember that both programs are deterministic, and both have multiple possible interleavings of computation units ("scheduler nondeterminism"?). But the difference between these programs is, I think, not well articulated by existing concurrency terminology. Or perhaps it's appropriate to use the term parallelism in this setting? In the first set of traces, there is nondeterminism but no parallelism; in the second set there is both; in Tamara (the play whose scenes run simultaneously in different rooms) there is parallelism but no (essential) nondeterminism. Parallelism still feels like the wrong word, though, because nothing here is inherently about "running computation on different cores simultaneously"; it's just a lack of dependency between actions.

Anyway -- what I find so compelling about CLF's notion of concurrency is that it isn't something we talk about as a property of a behavior; it's something we can talk about as a property of a "static artifact" -- the script to a play, or an investigative report of people's independent/interactive behavior -- something to be analyzed after the fact of that behavior.

It's my hope that someday we can stop thinking of concurrency as something only related to low-level system processes and instead use it to form compelling mental and programming models of multi-agent behavior.

--

Edit: Section 7.2 of this ebook about multiagent systems contains a discussion strikingly similar to the one I've laid out here.

Comments

  1. You don't have to think in terms of interleavings of traces. Pratt's "true concurrency" uses cubical sets so that a square represents the identification of two interleavings going from one corner to its diagonal opposite. Please don't confuse this with parallelism!

    1. Thanks for the pointer. Is there a specific reference you have in mind for "true concurrency?" It looks like this site from LIX is related to what you're saying:

      http://www.lix.polytechnique.fr/~goubault/link002.html

      What I feel I'm missing is a clear vocabulary for distinguishing programs whose trace sets have these different properties. I agree that "interleavings" seems like the wrong level of abstraction on which to discuss that difference.

    2. Rooting around a little more I found

      Vaughan Pratt
      "Modeling Concurrency with Geometry"
      http://boole.stanford.edu/pub/cg.pdf

      (posting for my own future reference and for the benefit of others interested in Bob's remarks)

  2. A bit orthogonal to what you're talking about, but you seem to be saying that a program is deterministic if it always takes the same input to the same output, i.e. the function it implements is deterministic. That might make sense in a logic programming context, or a deterministic vs. nondeterministic context, but it doesn't allow for randomized algorithms to implement deterministic functions. Even in parallel algorithms, there are interesting uses of randomization to implement deterministic functions.

    But maybe I'm overthinking this because I just finished teaching algorithms and I put a question on the final exam about randomized quicksort :).

    1. Hi there - I think that using "deterministic" to mean "observably deterministic" by default is standard. My very point in this post is that within that class of programs, subtler distinctions can be made.

    2. Sure, which is why I said it was an orthogonal point. It just struck me based on my gut intuition of contrasting "deterministic" with "randomized", rather than with "nondeterministic".

    3. My example in this post is a randomized (bubble) sorting algorithm, so we must be talking past each other. I don't think it's an orthogonal point; I think it's a rephrasing of the point I was already making. :)

  3. Robert has already pointed out that there are better representations for concurrent computations than traces. I don't know much about geometric models for true concurrency, but I like event structures and the way they explicitly handle conflicting and independent events.

    1. I'm not sure what Event Structures give you that the CLF trace representation doesn't. In that vocabulary, is there a way to state the difference between the kinds of nondeterminism in the 3-element sort vs the 4-element sort? "causal nondeterminism" vs "independence nondeterminism" or something?

