
coinductive winnowing

I implemented a version of the winnow algorithm for online learning in SML, because I was trying to understand learning algorithms in terms of coinductive data.

The idea behind a learning algorithm, as I understand it (as someone who was just taught about them yesterday), is that given a stream of tagged inputs (like images marked with the locations of faces), you should get better and better at predicting the tags -- specifically, the total number of mistakes you make should be bounded by the input size.

Alternatively, you can think of the "tags" as a function from input to answer (for simplicity, from bitstring to bit) that the learning algorithm is trying to approximate. So the "learner" gets to take this function, which I call the "reference function", as an input, and compare its predictions to the actual result, which it can use to update some priors.

Roughly, it has the shape:
1. Given an input, make a prediction
2. Check the answer with the reference function
3. If we're wrong, refine our approximation
4. GOTO 1
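As a rough sketch of that shape over a finite list of inputs (the names `step` and `run`, and the idea of passing `predict` and `refine` in as arguments, are assumptions made for illustration and are not taken from the complete code):

```sml
(* Hypothetical sketch of the four steps. The "model" is whatever the
   learner uses to make predictions; answers must support equality. *)
fun step predict refine reference (model, x) =
  let
    val guess = predict model x            (* 1. make a prediction *)
    val truth = reference x                (* 2. check with the reference function *)
  in
    if guess = truth
    then (model, true)                     (* right: keep the model as-is *)
    else (refine model x truth, false)     (* 3. wrong: refine the approximation *)
  end

(* 4. GOTO 1: repeat for every input, counting mistakes along the way. *)
fun run predict refine reference model inputs =
  List.foldl
    (fn (x, (m, mistakes)) =>
       let val (m', ok) = step predict refine reference (m, x)
       in (m', if ok then mistakes else mistakes + 1) end)
    (model, 0)
    inputs
```

Folding over a list only works for a finite batch of inputs; it is meant to pin down the loop, not to model an open-ended stream.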

The winnow algorithm has a particularly simple-to-implement instantiation of steps 1 and 3. In a sentence: we keep around a vector of weights, one for each position of our bit-vector input, make our prediction based on the weighted sum of the bits of the input, and, if we are wrong, either halve or double the weights on the positive bits (depending on which direction we were wrong in).
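A minimal sketch of those two steps might look like the following. It is not the code linked from this post: the function names are made up, and it assumes real-valued weights that all start at 1.0 and a threshold of half the input length, none of which is fixed by the description above.

```sml
datatype ans = YES | NO
datatype bit = ZERO | ONE

(* Weighted sum: add up the weights at positions where the input bit is ONE. *)
fun score (weights : real vector, x : bit vector) : real =
  let
    fun go (i, acc : real) =
      if i >= Vector.length x then acc
      else go (i + 1, case Vector.sub (x, i) of
                          ONE => acc + Vector.sub (weights, i)
                        | ZERO => acc)
  in
    go (0, 0.0)
  end

(* Step 1. ASSUMPTION: the threshold is half the input length. *)
fun predict (weights, x) =
  if score (weights, x) > real (Vector.length x) / 2.0 then YES else NO

(* Multiply the weights at the ONE positions of x by factor; leave the rest. *)
fun scale (factor : real) (weights, x) =
  Vector.tabulate
    (Vector.length weights,
     fn i => case Vector.sub (x, i) of
                 ONE => Vector.sub (weights, i) * factor
               | ZERO => Vector.sub (weights, i))

(* Step 3: only a wrong guess changes the weights. *)
fun update (weights, x, guess, truth) =
  case (guess, truth) of
      (NO, YES) => scale 2.0 (weights, x)   (* guessed too low: double *)
    | (YES, NO) => scale 0.5 (weights, x)   (* guessed too high: halve *)
    | _ => weights
```

Weights at ZERO positions are never touched, because those positions contributed nothing to the sum that produced the wrong answer.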

The correctness of the algorithm, while interesting, is not the point -- my intention by posting this on a PL blog is just to show how you can code up an online algorithm with coinductive data, since I couldn't seem to find something like this when I searched.

The first question I had was: what is the type of a learning algorithm? It should be something that first takes a function from input to output (the thing we're approximating), and returns a function from input to output... along with another learner, the better approximation. In code, if we have

type 'a input = 'a vector
datatype ans = YES | NO
datatype bit = ZERO | ONE
datatype 'a learner = Susp of 'a input -> (ans * 'a learner)


then we want

winnow : (bit input -> ans) -> bit learner

The learner datatype looks sort of like a stream in that it consists of a single constructor that wraps a suspended computation, but that suspended computation isn't just a thunk -- it actually takes an input. If you think of that input as a stream itself (as I blithely alluded to in the intro), then the learner is sort of like a stream transformer that moves in lock-step with its input.
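One way a value of this type could be built is by closing over some internal state and handing back a fresh learner at each step. The sketch below assumes a hypothetical helper called `unfold`, which is not part of the complete code; a winnow-style learner would supply its own prediction rule, update rule, and initial weights as the arguments.

```sml
type 'a input = 'a vector
datatype ans = YES | NO
datatype 'a learner = Susp of 'a input -> (ans * 'a learner)

(* Hypothetical helper: build a learner from some state 's, a prediction
   rule, an update rule, and the reference function. *)
fun unfold (predict : 's * 'a input -> ans)
           (update : 's * 'a input * ans * ans -> 's)
           (reference : 'a input -> ans)
           (state : 's) : 'a learner =
  Susp (fn x =>
    let
      val guess = predict (state, x)
      val truth = reference x
      (* Only a wrong guess changes the state. *)
      val state' =
        if guess = truth then state
        else update (state, x, guess, truth)
    in
      (* Answer now; the rest of the learner carries the new state. *)
      (guess, unfold predict update reference state')
    end)
```

Nothing runs until the learner is handed an input, so each call does one round of predict, check, and refine, and the next learner exists only as a closure over the updated state.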

With winnow written, you can define some function chi that accepts certain bit strings, go to the REPL and do

- val (Susp f) = winnow chi;
val f = fn : bit input -> ans * bit learner

Then repeatedly call

- val (x, Susp f) = f y;
val x = NO : ans
val f = fn : bit input -> ans * bit learner

for various instantiations of y to see learning in action.
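If retyping that binding gets tedious, the same unwrapping can be done over a whole list of inputs. This is only a sketch: `feed` and `stubborn` are made-up names, and `stubborn` is a stand-in learner that never learns, included just so the block runs on its own.

```sml
type 'a input = 'a vector
datatype ans = YES | NO
datatype bit = ZERO | ONE
datatype 'a learner = Susp of 'a input -> (ans * 'a learner)

(* Thread a learner through a finite list of inputs, collecting each
   answer and returning whatever learner is left at the end. *)
fun feed l [] = ([], l)
  | feed (Susp f) (y :: ys) =
      let
        val (x, l') = f y              (* same step as at the REPL *)
        val (xs, l'') = feed l' ys
      in
        (x :: xs, l'')
      end

(* Stand-in learner for illustration only: always gives the same answer. *)
fun stubborn a = Susp (fn _ => (a, stubborn a))

val y = Vector.fromList [ONE, ZERO, ONE]
val (answers, rest) = feed (stubborn NO) [y, y, y]
```

With a real learner in place of `stubborn NO`, the returned `rest` can be fed more inputs later, which keeps the lock-step behaviour of the one-at-a-time version.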

Complete code is here.


  1. I had forgotten about this update rule; it's great to be reminded of it. (I work with perceptron updates or stochastic gradient ascent a lot; winnow is intuitively very related.)

    The definition of a learning algorithm is a little more annoying to formalize if there's noise in the system. The learner will never be 100% correct and may not even be as close as it could be, but is presumably still a "learner". But your definition seems to work pretty well if we want to avoid saying something horribly general such as we'd need to in order to allow for the learners that don't learn...

