Thursday, December 19, 2013

Learning typed functional programming: obstacles & inroads

Yesterday, there was a discussion on a mailing list I'm on about a perception of gender diversity problems in the communities around functional programming, type theory, and programming language theory, even relative to other areas of computer science. After some speculation about education barriers and community approachability, I decided to conduct an informal survey on Twitter.

You can read several of the collected responses (filtered for sarcasm/criticisms of languages) on Storify. I left "typed", "functional", and the combination thereof intentionally ambiguous because I was interested in how people interpreted those words as well as their reactions to whatever programming constructs they associated with them.

Because I promised not to argue with anyone who replied, a bunch of responses have been tumbling around in my head about how different my experiences have been from the ones described here. In general, I agree about the cultural tendencies and have observed plenty of that myself.

But I think what a lot of the non-people-focused responses seem to be telling me is that we're doing a terrible job of advertising, explaining, and demonstrating what typed-functional languages do, and especially what types are good for.

The reason I'm excited about types now, 8 or 9 years after using a Hindley-Milner type-inferred language for the first time, is that there's a correspondence with logic that gives rise to all kinds of useful and fascinating research. (By the way, this is why that one response saying "I have a logic brain but not maths" kind of broke my heart!) But if I think back to why learning ML felt like a godsend after a few years of Java, C, and C++, I remember a few different things:

- SML/NJ had a REPL. I could experiment with tiny pieces of the language and the code I was trying to write before putting them all together in a big file with a top-level entry point.

- Signatures (declarations of new types and functions) were separate from modules (implementations). I could think about the interface I wanted to program to without actually writing the code to do it. (There's a small sketch of this after the list.)

- Algebraic/inductive datatypes and pattern matching. Ok, so this is secretly very related to Curry-Howard and the things that excite me about types now, but at the time the ability to write

datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree

...instead of implementing a collection of functions to do pointer or field manipulation, and then immediately being able to write traversal & search functions (again without manipulating pointers or fields), just felt like a huge practical advantage (there's a worked sketch after this list). An entire data structure definition in one line! These building blocks -- type variables, recursive definitions, disjunction (|), and tupling (*) -- felt like they could make Lego-like structures which were so much more tangible to me than the way I would encode the same thing in C or Java. (Later I would learn that this is the difference between positive and negative types, both of which have their uses, but half of which are missing or at least highly impoverished in most OO settings.)

- Highly related to the above point, the ability to "make illegal states unrepresentable", & thus not have to write code to handle ill-formed cases when deconstructing data passed in to a function.

- The thing Tim said:


Over a couple of years, I found that I was less drawn to the "safety net" features of type systems and more to their utility as design tools. In other words, I found myself discovering that types let you carve out linguistic forms that map directly onto your problem rather than maintaining in your head whatever subset of the language you're using actually corresponds to the problem domain. A super basic example of this is just the ability to create a finite enumeration of, say, colors, rather than using integers, a subset of which you're using to represent different colors. But this goes way beyond base types. I think of every program I write as coming with its own DSL, and I want the language I'm using to support me programming in that DSL as directly as possible, with a type-checker to let me know when I've stepped outside my design constraints. Dependent types especially have a lot to offer in that arena.
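To make the signatures point above concrete, here's a minimal SML sketch (STACK and Stack are an invented example, not code from any real project):

(* the interface: new types and functions, with no implementation *)
signature STACK =
sig
  type 'a stack
  val empty : 'a stack
  val push  : 'a -> 'a stack -> 'a stack
  val pop   : 'a stack -> ('a * 'a stack) option
end

(* one possible implementation, checked against the interface *)
structure Stack : STACK =
struct
  type 'a stack = 'a list
  val empty = nil
  fun push x s = x :: s
  fun pop nil      = NONE
    | pop (x :: s) = SOME (x, s)
end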
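And to ground the datatypes point: starting from that one-line tree definition, traversal and search functions are each just a couple of clauses, and the compiler checks that no case has been forgotten (inorder and member are my own example names):

datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree

(* collect elements left-to-right; both cases of the type are covered *)
fun inorder Leaf = []
  | inorder (Node (x, l, r)) = inorder l @ (x :: inorder r)

(* search a binary search tree of ints; no pointers, no null checks *)
fun member (x : int, Leaf) = false
  | member (x, Node (y, l, r)) =
      x = y orelse member (x, if x < y then l else r)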

Anyway, the point of this is not to tell folks who have struggled with learning type systems that you're wrong or that you just didn't get it. I hear "this was too hard for me" followed by "but I can write web code in JavaScript or a raytracer in C", and that boggles me, because the latter is so much harder for me. So I thought it might be elucidating to share my experience, which seems very different from yours.

I'm also directing some frustration at educators (including myself) for being apparently so bad at getting these kinds of things across, and at adapting typed languages & research projects to the needs of people who aren't already entrenched in them, that any programmer could think they are not smart enough to use them. Personally, I don't feel smart enough to do without them.

Wednesday, December 11, 2013

Post-proposal reading list

I passed my thesis proposal!

As I've spoken to more and more people about my work, I'm learning about a bunch of exciting related systems, and now I have a big pile of reading to do! Here's what's currently on my stack:

The Oz Project: Bob Harper and Roger Dannenberg brought to my attention a defunct CMU interactive fiction project called Oz, whose efforts had some brief extra-academic fame through the Façade game. I've printed out the following papers, which seem most relevant to my work:

Hap: A Reactive, Adaptive Architecture for Agents
A. Bryan Loyall and Joseph Bates. Technical Report CMU-CS-91-147, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, June 1991.

An Architecture for Action, Emotion, and Social Behavior
Joseph Bates, A. Bryan Loyall, and W. Scott Reilly. Technical Report CMU-CS-92-144, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 1992. Also appearing in Artificial Social Systems: Fourth European Workshop on Modeling Autonomous Agents in a Multi-Agent World, Springer-Verlag, Berlin, 1994.


Adam Smith's work: I met Mike Treanor when he interviewed at CMU, and he pointed me in the direction of a few of his colleagues at UCSC working on logic programming for game prototyping, specifically Adam Smith. It turns out that Adam's advisor was Michael Mateas, one of the members of the Oz project. I'm reading:

Bits of his dissertation, Mechanizing Exploratory Game Design

Answer Set Programming for Procedural Content Generation: A Design Space Approach

LUDOCORE: A Logical Game Engine for Modeling Videogames

Towards Knowledge-Oriented Creativity Support in Game Design


Kim Dung Dang's work: My external committee member Gwenn Bosser recommended to me the recent thesis work of Kim Dung Dang, who is also using linear logic for games, specifically interactive storytelling applications. I'm reading:

Kim Dung Dang, Ronan Champagnat, Michel Augeraud: A Methodology to Validate Interactive Storytelling Scenarios in Linear Logic. T. Edutainment 10: 53-82 (2013)

Kim Dung Dang, Steve Hoffmann, Ronan Champagnat, Ulrike Spierling: How Authors Benefit from Linear Logic in the Authoring Process of Interactive Storyworlds. ICIDS 2011: 249-260

Saturday, August 31, 2013

Some classes of effectful programs

I'm going to do something unusual for this blog and talk about functional programming for a bit.

Jessica Kerr wrote a blog post about the concept of idempotence as it's used in math and programming, and I decided I wanted to lay the ideas out in terms more familiar to me.

So, consider a pure mini-ML with functions, products, numbers, lists, polymorphism, and let-binding, but no references (or other effects -- for the duration of this post I'm going to limit my notion of "effect" to references into a store). This is Language 1 (L1). Now consider adding references and sequencing to this language (i.e. new expressions are ref v, r := v, !r, and (e; e)). This is Language 2 (L2).

The operational semantics of Language 2 includes a store mapping references to values, for which I'll use the metavariable S. In addition to giving types T -> T' to L2 functions, we can give them store specifications S -o S' (using the linear implication symbol -o suggestively but not especially formally) to suggest how they might modify it; e.g. a function f(r) = r := 3 has store spec (arg L * (L |-> V) -o (L |-> 3)) (saying that if the location L referred to by the argument had value V before the function was called, it has value 3 after the function is called).
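Since L2 is essentially a fragment of SML, the example function (and each of the new expression forms) can be written and run directly:

val r : int ref = ref 0    (* ref v: allocate a location *)
fun f r = r := 3           (* the example above; f : int ref -> unit *)
val v = (f r; !r)          (* sequencing and dereference; v = 3 *)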

Here are some classes of Language 2 programs:

1. Effectless. 

This is the subset of Language 2 programs that are also Language 1 programs.

f : 'a -> 'b 
meeting spec S -o S (keeps the store the same)
e.g.: f(x) = x

This is sometimes called "referentially transparent".

2. Observably effectless.

f : 'a -> 'b
meeting spec S -o S' (might change the store)
where f(x) = (f(x); f(x))

but always returns the same output for the same input.

e.g.:
let r = ref nil in 
f(x) = (r := (x::!r); x) 

The function might have some internal effect, but the caller can't observe that based on the returned values. I'd argue this might be a more useful class for the term "referentially transparent", since "effectless" is a much better term for class 1, but honestly I don't find the former term informative at all and would rather do away with it.
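Here's that example as real SML; the only wrinkle is that the value restriction forces a monomorphic type on the ref, so I've fixed the element type to int:

local
  val r : int list ref = ref nil
in
  (* logs every argument into r, but the caller only ever sees x back *)
  fun f (x : int) = (r := x :: !r; x)
end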

3. Effectively idempotent.

f : 'a -> unit
meeting spec S -o S'
where f(x) ~ (f(x); f(x))

...where ~ is a notion of equivalence on the store; i.e., if we were to write the store explicitly and label its state transitions with function calls:

S --f(x)--o S' --f(x)--o ... --f(x)--o S'

Calling f(x) once can change the store, but calling it an arbitrary number of subsequent times won't induce subsequent changes.

In this case, we don't care about the return value (I made it unit type to emphasize this point); what matters is the effect of calling the function. The example I gave for "observably effectless" is not effectively idempotent.

I believe this is the concept that programmers who work across multiple APIs care about when they talk about "idempotence" -- the side-effects of such functions might be sending an email or posting to a user's timeline, which is definitely an effect on the world, but probably not one that you want to repeat for every impatient click on a "submit" button.
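A tiny SML sketch of this class, with a flag standing in for "the email went out" (my own example):

val sent : bool ref = ref false

(* the first call moves the store from S to S'; every later call maps
   S' back to S', so submit () ~ (submit (); submit ()) *)
fun submit () = if !sent then () else sent := true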

4. Mathematically idempotent

This is a property that doesn't have anything to do with effects -- we can talk about it for L1 programs or L2 programs, it doesn't matter. We don't care whether or not it has an effect on the store.

f : 'a -> 'a
f(x) = f(f(x))
e.g.: f(x) = 3
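A slightly less degenerate example: clamping a number into a range, since clamping twice gives the same result as clamping once.

(* clamp (clamp x) = clamp x, for any x *)
fun clamp x = Int.max (0, Int.min (100, x))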

What class 4 has to do with class 3

We can compile L2 programs to L1 programs by reifying the store and location set in L2's semantics as a value in L1 (anything that can represent a mapping from locations to values). 

If we're interested in the class 3 functions of L2, we just need to consider programs of the form

(f(x); f(x))
for
f : 'a -> unit.

If st is our distinguished store type,
f becomes f' : 'a * st -> st
and (f(x); f(x)) in a store reified as s becomes
f' (x, f'(x, s))

Now the property defining class 3 can be translated as:
f'(x, s) = f'(x, f'(x, s))

...which looks, modulo the extra argument, just like the definition in class 4.
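Carrying the translation out on the class 3 flag example from earlier, with the store reified as a single boolean (a sketch only; a real compilation would thread a whole map from locations to values):

type st = bool    (* the reified store: just the one flag *)

(* submit : unit -> unit with effects becomes submit' : unit * st -> st *)
fun submit' ((), s) = if s then s else true

(* the class 3 property now reads exactly like class 4, modulo the extra
   argument: submit' ((), submit' ((), s)) = submit' ((), s) *)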

Tuesday, April 30, 2013

Paper draft: Linear Logic Programming for Narrative Generation

Gwenn Bosser, João Ferreira, and I have just submitted a paper to LPNMR '13!

Linear Logic Programming for Narrative Generation

Abstract:
In this paper, we explore the use of Linear Logic programming for story generation. We use the language Celf to represent narrative knowledge, and its own querying mechanism to generate story instances, through a number of proof terms. Each proof term obtained is used, through a resource-flow analysis, to build a directed graph where nodes are narrative actions and edges represent inferred causality relationships. Such graphs represent narrative plots structured by narrative causality. Building on previous work evidencing the suitability of Linear Logic as a conceptual model of action and change for narratives, we explore the conditions under which these representations can be operationalized through Linear Logic Programming techniques. This approach is a candidate technique for narrative generation which unifies declarative representations and generation via query and deduction mechanisms.

Monday, March 25, 2013

Modeling gameplay in Celf, Part 3

(This is another iteration of the example I developed in Part 1 and Part 2, but aside from the incremental understanding of the code built up there, I think this post is relatively self-contained. Celf-contained, if you will.)

When I took a simple choice-based ("CYOA") game with a few bits of inventorial state and tried to add handles onto the rules so as to specify a specific sequence of player choices, something interesting happened: I had to make new decisions about which parts of the game the player could control, and how. For instance, whether they win or get eaten by a grue depends on a prior choice to take the lamp from the den or not; they cannot control their fate after that point. This makes clear that "getting the lamp" and "opening the door" are player-facing game controls, whereas "getting eaten by a grue" is a choice made by the game. We wound up enumerating those actions as follows.

'start : action.
'opendoor : action.
'getlamp : action.
'getkey : action.
'starttoden : action.
'starttocellar : action.
'dentocellar : action.
'cellartodoor : action.
'cellartoden : action.

It's tempting, then, to give the player a generalized, combinatorial command language, rather than a finite set of available actions, like so:

'startat : room -> action.
'open : object -> action.
'get : object -> action.
'moveto : room -> action.

For this version (which includes a few other small syntactic changes) the game rules look like this:

start_to_den    : cur_act ('startat den) * at_start -o {at_den * tick}.
start_to_cellar : cur_act ('startat cellar) * at_start -o {at_cellar * tick}.

den_to_cellar : at_den * cur_act ('moveto cellar) -o {at_cellar * tick}.
den_to_lamp   : at_den * cur_act ('get lamp) * ~got lamp -o {at_lamp}.
den_to_key    : at_den * cur_act ('get key) * ~got key -o {at_key}.
get_key       : at_key -o {got key * at_den * tick}.
get_lamp      : at_lamp -o {got lamp * at_den * tick}.

cellar_to_den  : at_cellar * cur_act ('moveto den) -o {at_den * tick}.
cellar_to_door : at_cellar * cur_act ('open door) -o {at_door}.

open_door_without_key : at_door * ~got key -o {at_cellar * ~got key * tick}.
open_door_with_key    : at_door * got key -o {at_dark}.

dark_with_lamp    : at_dark * got lamp -o {at_win}.
dark_without_lamp : at_dark * ~got lamp -o {at_lose}.

Then I started to wonder if I could recover the "fuzz testing" abilities from the original, branching-choice version of the game: could I still use Celf's logic programming engine to randomly "play" the game?

So I replaced this rule, which pulls a next action from a sequential table

next_act : tick * cur N * nth_act N A -o {cur_act A * cur (s N)}.

with this one:

player : tick * cur N -o {cur (s N) * (Pi a:action.cur_act a)}.

...and wasn't optimistic. The Pi a:action part within the forward-chaining monad generates a template cur_act in the context that can be instantiated with any action. Naïvely, what I thought would happen is that forward chaining would instantiate cur_act at non-applicable actions all over the place, meaning that queries on end states would most of the time fail (the game would reach stuck states).

Thinking about this more, in terms of focusing behavior and by analogy with A -> B, a rule generating Pi x:A.B ought to keep the Pi in focus, forcing a choice of A (e.g. action). But since the proposition in question is actually a type, and depends upon the particular derivation of it, I suspect (as suggested, but glossed over, in Frank's course notes) that it's generating a fresh unification variable that will remain unresolved until further constraints are introduced. In this sense, it sort of gives Pi a more positive character than ->.

 The upshot is that my #query * * * 50 init -o {report END NSTEPS} generates 50 pretty solutions, some winning and some losing, a shorter example of which looks something like this:


Solution: \X1. {
    let {[X2, [X3, [X4, [X5, X6]]]]} = X1 in 
    let {[X7, X8]} = player [X6, X5] in 
    let {[X9, X10]} = start_to_den [X8 !('startat !den), X4] in 
    let {[X11, X12]} = player [X10, X7] in 
    let {X13} = den_to_key [X9, [X12 !('get !key), X2]] in 
    let {[X14, [X15, X16]]} = get_key X13 in 
    let {[X17, X18]} = player [X16, X11] in 
    let {X19} = den_to_lamp [X15, [X18 !('get !lamp), X3]] in 
    let {[X20, [X21, X22]]} = get_lamp X19 in 
    let {[X23, X24]} = player [X22, X17] in 
    let {[X25, X26]} = den_to_cellar [X21, X24 !('moveto !cellar)] in 
    let {[X27, X28]} = player [X26, X23] in 
    let {X29} = cellar_to_door [X25, X28 !('open !door)] in 
    let {X30} = open_door_with_key [X29, X14] in 
    let {X31} = dark_with_lamp [X30, X20] in 
    let {X32} = report_win [X31, X27] in X32}
 #END = w
 #NSTEPS = s !(s !(s !(s !(s !z))))


...which demonstrates (thanks to the ' syntactic markers) how the "player AI" chose to instantiate the universal quantification in a goal-directed way to satisfy the rule, with exactly the same random-but-constrained character as before.

So I think that's pretty neat.

 (Code here.)

Wednesday, March 20, 2013

Modeling gameplay in Celf, Part 2: Simulating Interactivity

Where I last left off, I gave a toy example of using Celf to specify a game's causal structure, summarized by the following ruleset:
start_to_den_or_cellar : start -o {den & cellar}.
den_to_cellar_lamp_or_key : den -o 
        {cellar
         & (nolamp -o {getlamp}) 
         & (nokey -o {getkey})}.
get_lamp : getlamp -o {gotlamp * den}.
get_key : getkey -o {gotkey * den}.
cellar_to_den_or_door : cellar -o {den & opendoor}.
open_door_without_key : opendoor * nokey -o {cellar * nokey}.
open_door_with_key : opendoor * gotkey -o {dark}.
dark_with_lamp : dark * gotlamp -o {win}.
dark_without_lamp : dark * nolamp -o {lose}.
init : type = {nokey * nolamp * start}.
As presented, this is effectively "half a game", with no delineation between player choice and game logic -- a fact that allowed us to randomly sample the space of all possible play sequences by querying any final gameplay state -- but that's ultimately uninformative about interactivity.

The key idea of the transformation I'll outline in this post is that we can delineate the boundary between game and player using two atoms treated as "signals", one corresponding to a choice or query from the player (think "event handling") and one corresponding to the end of the game's internal computation, returning control to the player (or, in a more complex setting, to a renderer or printing mechanism).

Each rule that handles a player control will have an extra guard premise cur_act(A) for an action A corresponding to that rule, and each rule that returns control to the player will simply issue a tick, which will be handled by a rule serving as proxy between game and player.

(The final, complete code can be found in twine-interact.clf on Github.)

As a first pass, we need to disentangle all of the &'d-together choices and present them as separate rules (each of which will get its own guard).
start_to_den : start -o {den}. 
start_to_cellar : start -o {cellar}.
den_to_cellar : den -o {cellar}.
den_to_lamp : den * nolamp -o {getlamp}.
den_to_key : den * nokey -o {getkey}.
get_lamp : getlamp -o {gotlamp * den}.
get_key : getkey -o {gotkey * den}.
cellar_to_den : cellar -o {den}.
cellar_to_door : cellar -o {opendoor}.
open_door_without_key : opendoor * nokey -o {cellar * nokey}.
open_door_with_key : opendoor * gotkey -o {dark}.
dark_with_lamp : dark * gotlamp -o {win}.
dark_without_lamp : dark * nolamp -o {lose}.
Next, we'll specify the allowable player actions. (The ' symbol is just an identifier convention I'm using to distinguish actions; it has no special meaning in Celf.) The rules that have been modified gain a cur_act guard premise, so they only fire when their corresponding action is current. Omitting added type declarations, this is all that's changed:
start_to_den : cur_act 'starttoden * start -o {den}.
start_to_cellar : cur_act 'starttocellar * start -o {cellar}.
den_to_cellar : cur_act 'dentocellar * den -o {cellar}.
den_to_lamp : cur_act 'getlamp * den * nolamp -o {getlamp}.
den_to_key : cur_act 'getkey * den * nokey -o {getkey}.
get_lamp : getlamp -o {gotlamp * den}.
get_key : getkey -o {gotkey * den}.
cellar_to_den : cur_act 'cellartoden * cellar -o {den}.
cellar_to_door : cur_act 'cellartodoor * cellar -o {opendoor}.
open_door_without_key : opendoor * nokey -o {cellar * nokey}.
open_door_with_key : opendoor * gotkey -o {dark}.
dark_with_lamp : dark * gotlamp -o {win}.
dark_without_lamp : dark * nolamp -o {lose}.
The job of cur_act is to serve as a sort of "time-varying value" determined by a sequence of actions that can be specified separately. We can encode that sequence as a relation between a natural number N and an action A, saying that the Nth action is A. Here's an example of an action sequence:

% 'starttoden, 'getlamp, 'getkey, 'dentocellar, 'cellartodoor.
act0 : nth_act z 'starttoden.
act1 : nth_act (s z) 'getlamp.
act2 : nth_act (s (s z)) 'getkey.
act3 : nth_act (s (s (s z))) 'dentocellar.
act4 : nth_act (s (s (s (s z)))) 'cellartodoor.
Now we need to connect cur_act to this table. Let's try to be modular by putting it in a single rule:
next_act : cur N * nth_act N A -o {cur_act A * cur (s N)}.
However, this approach poses a problem -- it's not keeping our code synchronized the way we intend. Imagine initializing the context with cur z -- this rule could fire until we'd loaded every action into the context without actually running the game code!

My solution is to add another guard premise. While cur_act passes control from player to game, this one, which I call tick, should just pass control back in the other direction, where the next_act rule serves as a proxy for the player.
next_act : tick * cur N * nth_act N A -o {cur_act A * cur (s N)}.

Now we need to do one more global pass on the game logic, issuing a tick at the end of every control sequence. For movement rules, this is just immediately after receiving the action. For rules that involve more complicated checks and have multiple cases (like open_door), this occurs at the end of each branch:
start_to_den : cur_act 'starttoden * start -o {den * tick}.
start_to_cellar : cur_act 'starttocellar * start -o {cellar * tick}.
den_to_cellar : cur_act 'dentocellar * den -o {cellar * tick}.
den_to_lamp : cur_act 'getlamp * den * nolamp -o {getlamp}.
den_to_key : cur_act 'getkey * den * nokey -o {getkey}.
get_lamp : getlamp -o {gotlamp * den * tick}.
get_key : getkey -o {gotkey * den * tick}.
cellar_to_den : cur_act 'cellartoden * cellar -o {den * tick}.
cellar_to_door : cur_act 'cellartodoor * cellar -o {opendoor}.
open_door_without_key : opendoor * nokey -o {cellar * nokey * tick}.
open_door_with_key : opendoor * gotkey -o {dark}.
dark_with_lamp : dark * gotlamp -o {win}.
dark_without_lamp : dark * nolamp -o {lose}.

For fun, we can load up our reporting mechanism with the number of steps we took, then load up the initial context and do a query:
report_win : win * cur N -o {report w N}.
report_loss : lose * cur N -o {report l N}.
init : type = {nokey * nolamp * start * cur z * tick}.
#query * * * 1 init -o {report END NSTEPS}.

Yielding the solution:
Solution: \X1. {
    let {[X2, [X3, [X4, [X5, X6]]]]} = X1 in
    let {[X7, X8]} = next_act [X6, [X5, act0]] in
    let {[X9, X10]} = start_to_den [X7, X4] in
    let {[X11, X12]} = next_act [X10, [X8, act1]] in
    let {X13} = den_to_lamp [X11, [X9, X3]] in
    let {[X14, [X15, X16]]} = get_lamp X13 in
    let {[X17, X18]} = next_act [X16, [X12, act2]] in
    let {X19} = den_to_key [X17, [X15, X2]] in
    let {[X20, [X21, X22]]} = get_key X19 in
    let {[X23, X24]} = next_act [X22, [X18, act3]] in
    let {[X25, X26]} = den_to_cellar [X23, X21] in
    let {[X27, X28]} = next_act [X26, [X24, act4]] in
    let {X29} = cellar_to_door [X27, X25] in
    let {X30} = open_door_with_key [X29, X20] in
    let {X31} = dark_with_lamp [X30, X14] in
    let {X32} = report_win [X31, X28] in X32}
 #END = w
 #NSTEPS = s !(s !(s !(s !(s !z))))

The point of all this was (1) to demonstrate how to simulate tightly-controlled interaction in a parametric way (while this was a global transformation of the original program, the only part that depended on the specific action sequence was the table representing that sequence); and (2) to show that the very specific, structural use of the "control-passing" atomic formulae suggests, as "design patterns" do, a possibly useful extension to the language -- specifically, one that would let us express the programmer's intent of separate modules or processes, and let us check that that intent is preserved.

Sunday, March 17, 2013

How to create the PL culture I'd like to believe we deserve

I herein interrupt my (ir)regular schedule to post about something sociological rather than metalogical.

I considered relegating this content to my personal blog, but honestly I think these words need to fall on the ears of exactly the folks who are mired enough in the technical community to follow research blogs. This is a post about "PL culture". What I mean by that is the characteristics, defined by the perceptions of its constituents, of "the PL community" -- or any subset of active programming languages researchers who find themselves in each others' company, perhaps at a conference or within a research group at a particular university.

In particular, I want to understand how PL might be failing as a community, and by that I mean either (1) fostering attitudes that make people within it feel othered/alienated/estranged or (2) serving as a barrier to the potential people and ideas it could include but doesn't. I want to cast some attention on the experiences of anyone who might ever sit around a table with a research group or stand around in a hallway at a conference and think, I'm never going to succeed in this community, because everyone here is so much more ____ than me.

On one level, this is probably recognizable to almost everyone at some point in their career: fill in the blank with "intelligent", "prolific", "knowledgeable", and you've got textbook impostor syndrome. What they don't tell you about impostor syndrome is that even if you're not an impostor in terms of your ability to work hard and think cleverly, you really might not belong in the sense that the prevalent culture is working against you, is engineered for the success of people who want different things than you or whose backgrounds and experiences more closely match those of the people around them.

Let me take a moment to be clear where I'm coming from: I consider PL my home, my academic family that raised me (is still very much raising me) as a computer scientist and thinker. I've just been part of CMU's graduate recruiting process for the 5th year in a row, meeting with potential PL grad students and talking to them about what it's like to be one. Because of this regularly-scheduled checkpoint in which I describe my experiences out loud, I've been able to listen to my own sentiments evolve over time. And a fact that has stayed true that entire time is that PL is my home.

So to realize slowly, over the years, that there are a number of ways I feel dissonant with that culture entails a bit of a heartbreak. Of course, in differentiating myself from a collective mindshare, I'm inevitably going to feel, well, different; what worries me is that the cultural memes within academic PL are not on my side when it comes to continuing to support and include me in light of that difference.

And I've discovered that I'm not alone. I asked Twitter in what ways they felt PL culture could be alienating, and got a number of really interesting responses, many quite different from mine. Sadly, several people who shared with me (privately) their similar feelings wound up departing from academia, PL, or both. I found some disturbing patterns especially in the treatment that women (trans and cis) and gender-nonconforming students encountered, in terms of "accidental" estrangement and worse. But even those not visibly different reported feeling invisible or excluded due to an atypical background, research interests, or argumentation habits.

I'm going to assume we all agree that this is a problem, first of all. I know that isn't the case, but feel free to check out now if you disagree with, for example, the gender imbalance in the field being a problem, because I'm not here to make a concerted argument on that point. All I will say is this: I want anyone eager to learn about and do PL to be pulled in and educated and embraced and listened to so that ultimately, they productively crank out awesome research that we can all learn from, and then we all crank out even more awesome research. End of story. If you're on board with that, read on.

Of course, it's significantly easier to say "yes, we want to include everyone!" than it is to actually be a fertile community for them to not only feel welcome but to thrive and lead. In light of that, I want to try to present some concrete advice that you can apply whenever socializing and collaborating with other PL enthusiasts or potential PL enthusiasts.

How to create the community we all deserve


1. Don't dismiss non-STEM fields out of hand. Computer science has a pop-cultural reputation of being Only For Extra Smart People who clearly spend the largest portion of their brain cycles thinking about intensely difficult problems and thus have the most superior brains. This not only winds up setting an ego bar for anyone considering entering the field (if they don't already see themselves as a whizz kid, why should they try to hang out with them?), it's also dangerously myopic. I'm a huge fan of Off the Beaten Track as an attempt to try to cross disciplines with ours, but I have a hard time seeing it as a success so far, with a) very few crossed-with disciplines falling outside "other subfields of CS" or occasionally hard sciences and b) not much influence yet on mainline PL research. How are we going to grow our ideas if we can't communicate and cross-germinate them with external ones? When is the criterion of a language's expressiveness going to go beyond its ability to implement similar languages?

2. Inclusive language, y'all. This is a dead horse that keeps rising from the grave, and it's definitely not unique to (or at its worst in) PL, but it bears repeating. In formal settings like conferences I feel like we usually do well, but once the tone gets more "casual", as in the ad-hoc social communities that grow around research groups, it can get pretty frustrating. Every male-as-default pronoun/name/hypothetical character is a grunch, i.e. a split-second, solar-plexus-hitting reminder that a woman is Not The Assumed Audience. As Lindsey pointed out wrt the Haskell Symposium incident linked above:
You know what hurts about that sentence? The word "they" and the word "us". Usually, I like to think that when PL people speak of "us", that I'm included in that "us". But apparently there are PL people for whom "us" doesn't mean "PL people", but rather, "PL guys".
This of course goes for the heteronormative, racial, and ableist assumptions that tend to creep into our language and hypothetical examples as well, if we're not extra vigilant (to which point 5 below is also relevant).

3. Curb the interruption & other forms of one-upmanship. This is tough sometimes because we're all just so excitable and want to jump in right away with whatever we're thinking exactly when we think it. But the degree to which it happens sometimes makes me think that it's a lack of caring more than a lack of trying, or maybe even just ignorance to the fact that it's an issue. Consider whether you have a habit of interpreting a midsentence pause-to-collect-thoughts as a cue to make your point more loudly, and consider making a concerted effort to change that behavior.

4. Understand and discuss atypical brain function. One way to put this point is: stop valuing your colleagues on the basis of how "smart" you think they are, through e.g. how quickly they can solve a problem you put forth or how long it takes them to grasp a point from a paper or talk. Another thing I'm saying with this is that depression in academia is super common, yet we never talk about it; compounding situations like PTSD are less common yet can be totally crippling in combination with depression and the concomitant taboo/lack of sympathy for anyone who's not at least high-functioning with their atypicality. In fact perhaps we just expect that everyone in academia is "a little bit crazy", which means that a) we have some uniform idea of what that means and how it affects everyone (everyone responds to stress with workaholism, right?) and b) we don't talk about it at all, or what we could be doing to help each other, because we just think it's an inevitable part of the ride.

5. Recognize nonnormative family structure & other aspects of life outside academia. Basically, this is the idea that not everyone has followed-so-far and plans to continue following the Default Life Script. This article on geek/programmer culture, which has a similar aim for the software industry that I do in this post for PL, touches on this in one of its conversational suggestions to "talk about topics that are unique or important to you". Putting that forth as unilateral advice seems to me to ignore that it constitutes a risk some can't afford. But if you're at a stable point in your career where you can afford to take social risks like that, opening up could be hugely helpful to folks with similar stories who're at more vulnerable, formative stages: they'd know that Someone Like Them actually made it and wound up in a career that perhaps they could see themselves pursuing. Or even just that they have a potential ally to talk to if anyone gives them crap.

6. Get over the idea that studying logic makes you immune to fallacious or incomplete arguments. Valid argumentation requires not only clarity of thought but also empathy, humility, and introspection to overcome your biases and allow new information to influence your beliefs. We can't have productive discourse by coming to our conclusions alone, nor by assuming that whatever dogmatic statements we make (e.g. about specific programming idioms being more "natural") lie above critical examination.

7. Give positive feedback. This point arises from a specific suggestion from Chung-chieh Shan:
[T]hank people more often: asking a good question, engaging in discussion that (even if contentious) helped you clarify your stance and work, kicking a dinner group into action -- such helpful actions take work, we should express our (professionally relevant) gratitude for them, and I think it would help explain to many of us why we belong.
Another example of often-thankless activity I thought of is writing down an expository piece of "research folklore" previously conveyed by oral tradition; I've frequently wished we had better venues for solutions to open exposition problems.

Aside from thanking people, if you advise or mentor students much earlier in their career than you, positive feedback can also come in the form of remembering to say out loud what you appreciate about their work. Whatever is obvious to you about their talents and strengths is very likely not obvious to them.


These are all things that you-as-an-individual can do; there are also broader systemic points, such as the cost of conferences, and the fact that someone whose path diverged from academia several years ago faces a barrier to entry that leaves them little hope of making their way back in. I'd like to discuss what we can do about those as well, but I think one could easily write a separate, less-PL-specific post on such matters.

I'm saying all these things because honestly, I love being in PL and I don't want to leave. I want to be happy while I'm here, I want to make my fellow colleagues happy, and I want to show folks who don't yet know what we're about that we're lovely people who think about interesting problems.

If you have more suggestions for this list, please feel free to submit them in comments.

Thanks especially to Chung-chieh Shan, Wm. Caylee Hogg, Philippa Cowderoy, and Lindsey Kuper, who contributed a lot of really important points to this discussion as well as helpful feedback to a first draft.