Computers are very good at the game of Go



MIT Speed Go tournament, Jan 2012

There’s an attraction between computer programmers and the Asian game of Go. I think there’s a lot to like about the game — it has very simple rules, high complexity (it’s “deeper” than chess) and pleasing symmetry and aesthetics. I think the real reason programmers are so drawn to it might be a little more self-involved, though: being good at things that computers aren’t good at tends to make programmers happy, and computers are terrible at Go.1

Or at least, that’s the folk knowledge that’s been true for most of my life. Computers have always been worse than an average amateur with a few years of experience, and incomparably bad to true professionals of the game. Someone I was talking to brought up the ineptitude of computers at Go a few days ago, talking about new ideas for CAPTCHAs: “just make the human solve Go problems”, they said, and you’re done; computers can’t do that, right?

So, I’ve enjoyed this feeling of technological superiority to computers as much as anyone, and it hurts me a little to say this, but here I go: the idea that computers are bad at Go is not remotely true anymore. Computers are excellent at Go now. To illustrate this, there’s some history we should go into:

Back in 1997 — the year that Deep Blue beat chess world champion Garry Kasparov for the first time — Darren Cook asked Computer Go enthusiasts for predictions on when computers will get to shodan (a strong amateur level) and when they’ll beat World Champion players. John Tromp, an academic researcher and approximately shodan-level amateur, noticed the optimism of the guesses and wondered aloud whether the bets would continue to be optimistic if money were on the line, culminating in:

John Tromp: “I would happily bet that I won’t be beaten in a 10 game match before the year 2011.”

Darren Cook took the bet for $1000, and waited until 2010 before conducting the match against the “Many Faces of Go” program: Tromp won by 4 games to 0. That seemed to settle things, but the challenge was repeated last month, from January 13th-18th 2012, and things happened rather differently. This time Tromp was playing a rising star of a program named Zen19, which won first place at the Computer Go Olympiad in 2009 and 2011. The results are in, and:

Zen19 won by 3 games to 1. (For more reading on the challenge, see David Ormerod’s page or Darren Cook’s.)

Beating an amateur shodan-level player is shocking by itself — that’s my strength too, and I wasn’t expecting a computer to be able to beat me anytime soon — but that isn’t nearly the limit of Zen19′s accomplishments: its progress on the KGS Go server looks something like this:

Year KGS Rank
2009 1-dan
2010 3-dan
2011 4-dan
2012 5-dan

To put the 5-dan rank in perspective: amongst the players who played American Go Association rated games in 2011, there were only 105 players that are 6-dan and above.2 This suggests that there are only around 100 active tournament players in the US who are significantly stronger than Zen19. I’m sure I’ll never become that strong myself.3

Being able to gain four dan-level ranks in three years is incredible, and there’s no principled reason to expect that Zen19 will stop improving — it seems to have aligned itself on a path where it just continues getting better the more CPU time you throw at it, which is very reminiscent of the story with computer chess. Even more reminiscent (and frustrating!) is the technique used to get it to 5-dan. Before I explain how it works, I’ll explain how an older Go program worked, using my favorite example: NeuroGo.

NeuroGo dates back to 1996, and has what seems like a very plausible design: it’s a hierarchical set of neural networks, containing a set of “Experts” that each get a chance to look at the board and evaluate moves. It’s also possible for an expert to override the other evaluators — for example, a “Life and Death Expert” module could work out whether there’s a way to “kill” a large group, and an “Opening Expert” could play the first moves of the game where balance and global position are most important. This seems to provide a nice balance to the tension of different priorities when considering what to play.

Zen19, on the other hand, incorporates almost no knowledge about how to make strong Go moves! It’s implemented using Monte Carlo Tree Search, as are all of the recent strong Go programs. Monte Carlo methods involve, at the most basic level, choosing between moves by generating many thousands of random games that stem from each possible move and picking the move that leads to the games where you have the highest score; you wouldn’t expect such a random technique to work for a game as deep as Go, but it does. This makes me sad because while I wasn’t foolish enough to believe that humans would always be better at Go than computers, I did think that the process of making a computer that is very good at Go might be equivalent to the process of acquiring a powerful understanding of how human cognition works; that the failure of brute-force solutions to Go would mean that we’d need a way to approximate how humans approach Go before we’d start to be able to beat strong human players reliably by implementing that same approach in silico.

Update: “bitti” comments below that Zen19 is choosing which move to play based on maximizing win probability, not maximizing score.

I think that programs like Zen19 have actually learned even less about how to play good Go than computer chess programs have learned about how to play good chess; at least the chess programs contain heuristics about how to play positionally and how to value different pieces (in the absence of overriding information like a path to checkmate that involves sacrificing them). This lack of inbuilt Go knowledge shows up in Zen19′s games — it regularly makes moves that look obviously bad, breaking proverbs about good play and stone connectivity, leaving you scratching your head at how it’s making decisions. You can read a commented version of one of its wins against Tromp at GoGameGuru, or you could even play against it yourself on KGS.

Update: Matthew Woodcraft comments below that Zen19 does contain significant domain knowledge about Go. It’s hard to know exactly how much; it’s closed-source.

Zen19 is beating extremely strong amateurs, but it hasn’t beaten professionals in games with no handicap yet. That said, now that we know that Zen19 is using Monte Carlo strategies, the reason why it seems to be getting stronger as it’s fed more CPU time is revealed: these strategies are the most obviously parallelizable algorithms out there, and for all we know this exact version of Zen19 could end up becoming World Champion if a few more orders of magnitude of CPU time were made available to it.

Which would feel like a shame, because I was really looking forward to seeing us figure out how brains work.


1: I think this is probably why I’ve never been interested in puzzles like Sudoku; I can’t escape the feeling of “I could write a Perl script that does this for me”. If I wouldn’t put up with such manual labor in my work life, why should I put up with it for fun?

2: Here is the query I used to come up with the 105 number.

3: In fact, KGS ranks are stronger than the same-numbered AGA rank, so the correct number of active players in the US who are stronger than Zen19 may be even smaller. Being stronger than US players isn’t the same as being stronger than professional players, though — there are many players that are much stronger than amateur 5-dan in Asia, because there are high-value tournaments and incentives to dedicating your life to mastering Go there that don’t exist elsewhere.

Comments

  1. My understanding of how neurophysics has seen how brain methods fire off.. it is very much like how the zen19 works but a lot messier. While we think we have one conscious decision making structure, we just have a high level virtual machine sitting on top of a ton of cheap parallel processors (lets say the VM is a 32 bit land but the processors are 8 bit and have been wired in various ways to look 32 bit). For games like go and chess the visual processing parts of the brain make multiple guesses about how things will look if you make a move. One gets chosen by semi-random draw and sent up to the vm. The vm makes a move and then if the game works that choice gets weighted higher.. if that choice looses then the feedback tries to weigh it lower.

    However since the brain has to use various wiring methods, it may or may not work. Some people reach a level where they can’t improve anymore and others don’t. From a programming standpoint this is a highly inefficient way to do things. We would rip out the 8 bit processors and use 32 or even 64 bit. We would rip out bad programming and put in completely new programming.

    I think for computers to approach learning like a human, the programmers and engineers have to be stuck with various limitations that can’t be changed out. Give them a billion 8 bit processors and circuits.. have them have to get them to interconnect and work. and then have them have an OS that is written in a different bit range and “thinks” it is talking to silicon that is that.

    Anyway, thanks for that info.. I did not know about Go getting hit by silicon so quickly.

    Reply
  2. I do wonder to what extent the random-looking play of programs like Zen19 works in its favor. By acting completely differently than human players, Zen19 may gain some advantages over human players expecting some form of apparent strategy.

    Reply
  3. Zen is a great deal more than a Monte Carlo Tree Search algorithm with almost no inbuilt Go knowledge. If that’s all it was, it wouldn’t have such good results against other programs.

    I don’t think there’s an enormous difference between the amount of Go knowledge in current top Go programs and the amount of Chess knowledge in current top Chess programs.

    Reply
  4. Thanks, Matthew! I’ll add a correction. Do you know if anyone’s made a study of what kind of Go knowledge Zen19 contains — is there any documentation about whether it includes fuseki/joseki databases, for example?

    > If that’s all it was, it wouldn’t have such good results against other programs.

    This doesn’t seem to explain things like MoGo (also Monte Carlo) becoming the world champion program in its very first year in the Computer Olympiad, so I’m still not quite convinced that the use of Monte Carlo isn’t the proximal cause of these programs’ extreme strength. (Do you think MoGo contains a great amount of domain knowledge too?)

    Reply
  5. MCTS was a great advance, certainly, but of course there’s been a lot of hard work put in since then and much of that can be described as Go knowledge.

    MoGo’s design was published, and if I remember right the domain knowledge was a handful of patterns and a few heuristics (like bonuses for playing on the third or fourth line, and playing close to the previous move).

    Zen isn’t published, but it had hard-coded Go knowledge in the playouts from the beginning. It’s known to use many patterns and have specialist code for handling ladders, seki, and semeai.

    Its authors mentioned recently that it has an opening book for 9×9 but not for 19×19, and that it does have some kind of joseki patterns.

    Reply
  6. The thing is of course that games like chess or go are interesting to us because they are difficult for us, not because they are easy.

    If these kind of games really relied mostly on typical human cognition we would all be good at them and they would not be very intersting to us. The fact that only a few people can get really good, and only after years of dedicated study, shows that they do not rely on typical human cognitive capabilities, and it is thus not surprising if they fail to teach us anything significant about human reasoning.

    Reply
  7. One of the things to note is that the Monte Carlo method is very interesting in that unlike alpha-beta pruning when you deepen the tree it doesn’t get linearly better.

    There’s also a very significant psychological factor to consider — in Go you’re playing hundreds of moves, so instead of making that one superb move in the match you’ll want to make very many good ones. This can easily be demonstrated with “If I make 5.5 points with every move and my opponent makes 5 points, I will win the match”. It really is that simple.

    When we combine these two things we end up naturally with a property that serves as an advantage: playing speed. When your opponent is playing significantly and constantly faster than you there is a subconscious need to play as fast or faster — probably ending up with poorer play.

    Zen19 and other bots have very fast playing times compared to anything outside the Internet and still pretty fast when considering Internet times. In fact, I believe either Many Faces or Zen’s creator has said that when he increased thinking times the bot’s rank fell two dans(!)

    Of course, I’m not trying to take away from the incredible achievement it has been to get computers to even dan-level, let alone two-to-four-dan level. It’s simply that ranks do not exist in a vacuum. A European two-dan can reasonably give two stones to a Japanese two-dan, these days maybe even two. Internet ranks, of course, are worth the bits they’re written on — not much.

    So where does that leave us? Sitting in the sidelines, eating popcorn while AI after AI starts to challenge high-level pros.

    Golly gee!

    Reply
  8. It’s obvious now that the solution to human-level AI is just brute force. All we need is enough processing power to simulate random universes that stem from quantum causality, pick the most likely universe containing the desired outcome, rinse, and repeat.

    You’re welcome.

    Reply
  9. Zen19 is not as strong as its KGS rank might make it look.

    Go matches are usually decided by the worst mistakes rather than by the most brilliant moves. For human players, the amount and seriousness of their mistakes is closely related to the amount of available thinking time. In a blitz game with 10 seconds per move, humans make many oversights which are way below their usual level. Computer programs do not make silly mistakes under time pressure.

    It probably won’t surprise you that Zen19 plays primarily blitz games. I’d guess its actual tournament performance (1.5 hours per player + byoyomi) to be around European 3dan.

    Reply
  10. tasuki, I think you’re incorrect. Zen19S (S for Slow) plays 20:00 main time + 30 seconds x5 byo-yomi — nowhere near your ten seconds per move — which is considered a long game (at least on KGS), and it’s a solid 5d at that time duration too. It’s not any stronger at blitz than at slow games, by this metric.

    It’s true that it’s not 90:00 + byo-yomi, but I doubt the move from 20 to 90 loses it the 3-4 stones that you’re suggesting. I maintain that Zen19S is probably stronger than almost all European tournament players, right now.

    Reply
  11. Nice article.

    But you got at least two important points wrong:

    > Monte Carlo methods involve, at the most basic level, choosing between moves by generating many thousands of random games that stem from each possible move and picking the move that leads to the games where you have the highest score

    While it’s true that early CrazyStone and MoGo versions worked this way, current MCTS Programs chose the path which leads to the highest winning probability. This distinction is important, because just this change in goal can make a strength difference of serveral stones (using the Go Rules but trying to achieve the highest score is a different Game anyway).

    Therefore for a modern MCTS program it doesn’t matter if it’s winning by 100 points or just 0.5.

    > Zen19, on the other hand, incorporates almost no knowledge about how to make strong Go moves!

    While MCTS is the basic technique there is a lot of freedom on how the “random” moves will actually chosen for the playouts. Using some knowledge in the playouts will make the playouts slower, but the added value to the playouts might overcompensate that. What kind knowledge to add and if it’s worth the slowdown is a delicate point. Zen has “heavy” playouts but it still has to do several thousands playouts per move to get a reasonable strength. A more simple MCTS with “light” playouts is much weaker but it can do several orders of magnitudes more playouts and might still be stronger then a traditional knowledge bases program.

    But MCTS seems to be the framework in which all knowledge has to be incooperated. Just trying to see it as a separate module and combine it with some “knowldege” modules regularily hurts the endresult and effectiveness of MCTS.

    Also your conclusion is at least debatable. I agree to Mr. Smoogen, the “Zen” way might be much nearer to the way the brain works. The old expert systems which are supposed to model the human thinking had to fail badly, because human intuition in science is mostly wrong. Especially regarding in how the brain works. That doesn’t imply that there are some good intuitions, but finding these few good ones is the hard part. Just putting a whole bunch of heuristics together doesn’t reveal the few good ones.

    While MCTS provides a good framework to add Go knowledge I think the other main reason for the recent improvements is the much more systematic way in how Go research is done after the advent MCTS. E.g. well 9 of 10 seemingly reasonable added heuristic make a program actually weaker than before. But to prove if a program has gotten weaker or stronger there might be several thousands games necessary to get a statistical valid result. That’s exactly what happens on the CGOS (Computer Go Server) infrastructure.

    One of the results is that the “knowledge” wich should be added to the playouts is not necessarily the same which is important during actual play. More important is, that the moves chosen in a playout actually make the knowledge of the result of the playout more valuable. Not very intuitive, right? But perfectly logical in hindsight…

    Reply
  12. “I maintain that Zen19S is probably stronger than almost all European tournament players, right now.”

    Lol wot?! Zen is for sure very impressive, but don’t get carried away. I agree with tasuki it’s about European 3 dan, same as me, and I’m the 388th best player in the European Rating List.

    Reply
  13. Uberdude: Oops, my mistake — I thought KGS dan ranks were equal to or stronger than EGF dan ranks (as they are against AGA), but in fact they’re weaker. So tasuki’s estimate of EGF 3d==KGS 5d is reasonable, at least for now.

    Reply
  14. thanks for the good read. came across this site while googling. and lo! there’s a picture of me here. hahaha

    Reply
  15. I saw you link to this on HN. You might want to update the point about further progress–today it definitely looks like progress has slowed somewhat.

    In the computer go section on L19, I linked to some discussion of playouts from the computer go list: it seems apparent that most programs are doing fairly heavy playouts that incorporate significant domain knowledge. Though I do wonder if the “360 million” playouts quotes from the New Yorker mean that Crazy Stone bucks that trend.

    Reply
  16. Pingback: AI | Hartator - Tech and Business Blog

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>