Thursday, December 29, 2016

How to Lose at Mancala (Consistently!)

Over winter break, I spent a few days fiddling with a Mancala AI. It came together surprisingly quickly! Here's a little journaling on that project.

Background: Mancala is a fun, simple board game with a surprisingly high skill ceiling. I've never been very good at the game, but I've had fun playing it anyway. After a recent series of ruinous defeats at the hands of my own family I started thinking about what optimal game strategies might look like.

(Side note: It turns out Mancala is a solved game (i.e. it's known how to play "perfectly") for the most common numbers of stones or houses. I haven't looked at any of the relevant publications, but my impression is that they relied mostly on endgame databases and exhaustive search. That's no fun, so I'm going to ignore those results and try to build this project from the ground up. The goal is to write a program that can beat me at Mancala, and hopefully to learn a few things about the game from it.)

If you don't know the rules of the game, here's a quick overview of the variant I'll be writing about. I'll be calling the game pieces "stones", and the little areas they sit in "houses".

I opted to do this all in C because I had a feeling I'd need speed here and because I wanted more practice with the language.

You can find the Github repo here.

The Basics

My goal here was to start by writing a C framework and basic shell for playing Mancala. The idea is that if this is made modular enough, then the game logic can be tested through manual play, and down the road the code that prompts a user for a move can be swapped out for code that asks an AI the equivalent question.

Boards are represented through a struct:

typedef struct {
     char p1_store;
     char p1[6];
     char p2[6];
     char p2_store;
} board;

This struct is of course effectively just a char array, but it's still nice to have pointers into its various sections, both for convenience and for readability. The board will never have more than 4*6*2=48 stones on it, so there's no drawback to the limited range of a char. And in fact using chars rather than larger data types offers an advantage: their small size makes the struct easier to copy quickly.

There's just one gotcha: we haven't addressed the question of how to map this data structure to the physical board. There are three things we want to do with the board: print it, reference specific houses on it, and play moves on it. In printing it and referencing moves, it's convenient to think of reading off the state of the board left-to-right. But when you're playing out a move, stones progress counter-clockwise -- that is to say, left-to-right along the bottom and right-to-left along the top.

Thus, while it's possible to lay out memory so as to make either of these operations natural, the two are mutually exclusive. We have to choose which to bias the design towards.

The main consideration I used is that if we make move calculation easy but printing and indexing more difficult, then that introduces complexity to many parts of the program. Complexity just about inevitably leads to bugs.

On the other hand, if we make move calculation more difficult but printing and indexing easy, then we've encapsulated the complexity within whatever function handles playing a move -- a small function which'll be easy to rigorously test. The rest of the program avoids having to deal with this complexity as long as it trusts the validity of the move function.

The positioning of p1_store and p2_store (the "endzones" where players are trying to collect stones) was purely to simplify the logic of play_move, since nothing else really cares too much about where in the struct those elements are placed.

The "main loop" of the game is a function, play_game, which does exactly what it says on the tin. play_game takes three arguments: a pointer to a board struct, and two function pointers. These function pointers are meant to indicate two functions which, if passed the game state, will pick moves for players 1 and 2 respectively. Those moves are enacted on the board via play_move, which returns a MOVE_RESULT enum that is then used to provide helpful feedback and update the game state.

I'd never really played with function pointers before in C. They're really cool -- they feel like something that got smuggled into the standard, secretly backported from higher-order languages. If C also had a clean way to curry functions, I'm not sure I'd ever want for anything more.

The use of function pointers in play_game allowed me to start out with players 1 and 2 both played from user input. This manual operation was a good chance to verify that the basic guts of the mancala engine were working properly, and I caught a few bugs in the move function right away. Like I mentioned earlier, that was expected, and the whole setup here was designed to make those expected bugs as easy as possible to root out.

Their real utility, though, came when I decided to add my first shot at an AI to the game. All that's required to swap out a human for an AI is to write a C function with an identical signiture which implemented the AI desired and to change one line in main to pass that function instead of get_move. How's that for modularity?

The next step is to write an AI to play the game. That's where "losing consistently" comes in: playing yourself is great, but playing something else is better, and if the thing you've made to play against can beat you, then hey, that's all the better. My goal from the start is to make an AI that's good enough that it can beat me consistently.

Losing Consistently

The Idea

My first strategy is simple: a naive version of Monte Carlo tree search (MCTS). This is a solid go-to strategy for lots of kinds of games. All you need is to be able to simulate playing games out to their conclusions, logging who won as you go. If you have a working engine but don't have any real theory for the game and don't have a bulletproof way of evaluating positions mid-game, MCTS is a good way to get a basic AI working practically "for free". In spite of having such low requirements, MCTS is an incredibly effective strategy -- it's done wonders for many games, notably including Go. In fact, the AlphaGo system was built by using neural nets to provide heuristics for a (thoroughly non-naive) implementation of a similar algorithm.

In short, this is a strategy that gets results, especially in games with emergent mechanics. A full implementation of Monte Carlo tree search involves weighting different moves during the search to provide a basic heuristic for focusing on the most promising candidate moves. We'll start out without this: hence the "naive", since our heuristic will be the random function.

This naive approach probably won't work quite as well as a full MCTS implementation would, but it's much simpler to implement, so it'll make for a good starting point. Down the road, it'll also be interesting to compare this algorithm's performance to that of full MCTS.

Random Movements

Incidentally, here's something not a lot of people get right: how do you pick a random number between 0 and n in C? Most people would just use (rand() % (n+1)), but this introduces a slight amount of skew towards some numbers, which is unacceptable. Random number generation is at the heart of our algorithm. We need to avoid generator skew, or the algorithm will suffer.

I chose to encapsulate the complexity of this problem inside a function called pick_random_move. This function picks a move at random, while taking care to adjust for skew and to check that the move is legal. The function:

char pick_random_move(char side[]) {
    int divisor;
    int move;

    divisor = RAND_MAX/6;
    do {
        do {
            move = rand() / divisor;
        } while (move > 5);
    } while (side[move] == 0);

    return move;
} then polled by the NMCTS engine to get random moves. Notice that to switch from a naive heuristic to any other, all we'd have to do is swap out this function. More modularity at work. It's a pity we can't easily make this function an argument to the MCTS algorithm which could get curried in before passing the algorithm driver to play_game, but that's C for you.

Quick note about how the function works: RAND_MAX is a predefined macro giving the upper limit on the value of rand(). We want to map these values to values between 0 and 5, without skew. However this would only be possible if RAND_MAX is divisible by 6, and we have no guarantee that that is the case. So, we instead create an imperfect map from values of rand() to numbers between 0 and 6. The way we do this (integer division) guarantees that values 0-5 have the same probability of occurring, but has some skew in how often 6 shows up. We deal with that by just throwing out 6es completely -- repolling the RNG whenever we get one -- and returning on 0 through 5. And just like that, we have a reliable random number generator!

Where does this get us?

The Github repo already contains an implementation of the naive Monte Carlo tree search. It runs out a configurable number of simulations (defined by a preprocessor macro, NAIVE_MCTS_NUM_PATHS, which defaults to 200000) and picks the one which had the highest margin of wins over losses.

In each simulation, all moves are chosen randomly. That's all this thing does -- and yet it works. In aggregate, that strategy is already enough to identify which branches are promising and which are not. My suspicion is that this works because Mancala positions have an extremely low branching factor. This makes branch pruning less important than it would be in e.g. computer Go.

There's one more thing to address: how moves are chosen. There's a few different options here. You could go by the ratio of wins to losses, the ratio of wins to total games played, the integer difference between number of wins and losses observed... In full MCTS, the ratio of wins to games played is typically used, which makes sense since those ratios are also used to set the weights for the random move generation function. However, since we're using naive MCTS, complete with even move probability weighting, I opted instead to go for integer difference, just because it makes for simpler code and gives more or less the same choices on average.

The Github repo currently includes code for setting up both human and NMCTS players. It's even possible to pit the AI against itself (and it plays a pretty good game!). The code is decently well-commented, at least in my opinion, and has a bunch of optional, preprocessor-controlled debug print statements that also help to narrate what's going on.

Next Steps

The naive MCTS is not yet completely bug-free, and occasionally picks illegal moves. Hopefully I'll have that issue figured out soon. After that, my next goal is going to be to try and add a more fully-fledged version of MCTS to the program, mostly just to see if it performs tangibly better. It'll be especially fun pitting different versions of the AI against each other and seeing if one has an edge over another.

After that's done, I'm tempted to experiment a little more with different guiding heuristics. It might be interesting to use a decent AI to build up a big corpus of games, then use that corpus to train some sort of statistical model for guessing candidate moves, and use those guesses to help navigate the search space more effectively. I'm not sure how well that will work -- the original algorithm does after all work from random move choices, so as far as data derived from it goes, one is tempted to invoke 'garbage in, garbage out' -- but it'll be interesting regardless. I'm hesitant to make predictions but I think it'll work pretty well.

In any case, I've already achieved my goal of losing consistently, so I know I'm on the right track.

Saturday, December 24, 2016

Academic Computer Science Needs to Get Its Shit Together

The fact is, our beloved field of computer science has reached an embarrassing low. Among programmers in all but some collegiate circles, calling something "academic" is oblique shorthand for calling it overwrought, obscure, inflexible, and/or too fragile to be useful in the real world. And there's a good reason for this: more often than not, the products of academia in computer science meet all those criteria.

But why? Shouldn't universities be where our best and brightest gather? Isn't the whole idea of academia to draw great minds together so they can collaborate and educate?

The short answer: Well, yes, but that idea doesn't work so well when you're competing for talent against an industry that can afford to triple your salary offers. With few exceptions, industry gets who they want and academia gets stuck with the dregs -- and you can't do much with dregs.

What's the long answer? Glad you asked. Buckle up.

Once upon a time, academia really was (I am given to understand) a heavyweight player in computer science. MIT and University of Illinois have their places in the annals of Unix history right next to Bell and GE, and in fact it was academia where Unix first really gained traction. Same with the internet -- there's this map that's been floating around recently:

There's a lot that's interesting about this picture, but as far as our discussion is concerned, let's note the breakdown of who owns which nodes: there's a few gov't agencies and a few companies represented, but a solid majority of the systems are academic. Universities broke a lot of ground in both developing and implementing the technologies that underlie the internet.

Of course, hardware and networking isn't everything, and there are other areas where academia had a lot to offer. For instance, it's hard to overemphasize the eminent (and eminently quotable) Edsger Dijkstra's influence on programming language design, distributed systems, or really any number of other subjects. Or take Donald Knuth, who wrote the book on computer science, then called it "volume one" and set back to work writing volumes two through five (six and seven still pending!). Or Martin Hellman, who advised Ralph Merkel's groundbreaking PhD work on public-key cryptosystems. Hellman later recruited Whitfield Diffie to his lab and together they built on that work, eventually leading to the landmark discovery of what is now called Diffie-Hellman key exchange.

This was all real ground being broken, real problems being solved, challenging work being done well -- all by academics. If these were once academia's exports, when did it become so maligned? Where did things go wrong?

Well, there's a case to be made that in spite of all the above, things might not have gone wrong so much as stayed wrong. Knuth and Dijkstra, for instance, were both outspokenly critical of their field.

Knuth said of computer science during his time as a student that "...the standard of available publications was not that high. A lot of the papers coming out were quite simply wrong." He made it clear practically in the same breath that one of his main goals with The Art of Computer Programming "was to put straight a story that had been very badly told."

Dijkstra, for his part, held practically every language of his time in the highest contempt. COBOL and BASIC "cripple the mind" and leave students "mentally mutilated beyond hope of regeneration," respectively. FORTRAN he described as "the infantile disorder," while PL/I was "the fatal disease." He also had some critical words for those in his field who refused to acknowledge some of its more uncomfortable truths.And in an abstract moment he opined, "Write a paper promising salvation, make it a 'structured' something or a 'virtual' something, or 'abstract', 'distributed' or 'higher-order' or 'applicative' and you can almost be certain of having started a new cult." Some things, it seems, never change.

A professor of mine once commented, in one of the last meetings of his class, that the structured programming practices he'd been teaching us were important because "my generation has already written all the easy programs. The hard ones are up to you guys." He was of course talking about software development -- the bricklaying of computer science -- but I suspect that a similar quip would apply on the theoretical side of things.

Obviously the field is still new, but by and large it really does seem to be true: The easy, useful results have been discovered. The easy, useful definitions have been made. The easy, useful algorithms have been found. Having reached this point, academics now have two choices: either we take on the stuff that's not easy, or we take on the stuff that's not useful. A quick flip through arXiv suggests that most researchers have opted for the latter option.

The sad fact is, of course, that modern academic culture does nothing to discourage this -- in fact, "publish or perish" actually encourages professors to focus on cranking out useless but simple results. Meanwhile, profound guiding problems like P vs NP or even P vs PSPACE go all but untouched. The culture is such that the average academic who's fool enough to really throw themself at such a problem ends up reduced to a "cautionary tale" in a survey paper.

Make no mistake: there are major, important unsolved problems in computer science. Hell, there are enough that Wikipedia's got a whole list. Breakthroughs on any of these would instantly make the reputations of the researchers involved. But those with the expertise to take these issues on are, more often than not, actively discouraged from doing so. How is a field supposed to produce anything of value when this is its culture?

There's a quote I recall reading, but which I can't seem to dig up. It was from one of the leading researchers on the topic of proving program correctness, given some time before the turn of the millennium, and his observation was that while the field had advanced significantly in the years he'd spent studying it, he was slowly coming to realize that the problems they were trying to solve, nobody else was really all that concerned about.

I know of a few kernel programmers and security buffs who'd take issue with that claim, but aside from them and aside from a few other very specialized contexts it turns out that yeah, nobody really cares too much. In the same vein, very few people care about some equivalence result in complexity theory between two games they've never heard of. Same goes doubly for the underachieving grad student's favorite recourse: survey papers directed at the above.

That story is a pretty common one in theoretical computer science, and honestly it's understandable -- the field is new, and it's only fair to give it some time to get its bearings. Not every result is going to have immediate practical applications, and we'd be doing everyone a disservice if we expected otherwise. But while that might excuse some seemingly useless research, it's no excuse for the sheer mediocrity and apathy that pervade the field.

There are important results waiting to be found. It seems apt to compare modern complexity theory to ancient Greek mathematics -- perhaps, if we really want to push the analogy, even going far enough to equate Knuth with Euclid -- and if this comparison holds, then mind-boggling amounts of valuable theory have yet to be discovered. My own background leaves me tempted to bring up the influence these results could have on computer security, but really, it's hard to think of a subject they wouldn't influence. In fact, there are enough connections between computer science and advanced mathematics that results found here could easily filter back and offer unexpected insights on long-standing problems in that domain as well. John Conway's work gives a number of examples of what that might look like.

Speaking of computer security: one exciting thing we learned post-Snowden that, while we've massively dropped the ball on endpoint security, we've managed to build cryptosystems that actually work. Modern cryptography is one of the few things our clearance-sporting buddies in Fort Meade don't seem to be able to crack. And yet we don't have solid proofs for such basic questions as: do one-way functions really exist? Or, are public-key cryptosystems really possible? On the more practical side, we also have a tremendous amount of work to do as far as post-quantum cryptography and cryptanalysis are concerned.

In short, there's important work to be done, and (since abstract arguments are notoriously hard to monetize) you can bet industry isn't going to do it.

If academic computer science wants to be taken seriously, it needs to get its priorities straight. It needs to stop discouraging work on problems that matter, stop encouraging work on problems that don't, and make an honest effort to equal the landmark achievements of previous decades. There's still plenty of room in the history books.