Title: A Matchbox Game-Learning Machine (1991) [pdf]
Site: gwern.net

CHAPTER EIGHT

A Matchbox Game-Learning Machine

I knew little of chess, but as only a few pieces were on the board, it was obvious that the game was near its close. . . . [Moxon's] face was ghastly white, and his eyes glittered like diamonds. Of his antagonist I had only a back view, but that was sufficient; I should not have cared to see his face.

The quotation is from Ambrose Bierce's classic robot story, "Moxon's Master" (reprinted in Groff Conklin's excellent science-fiction anthology, Thinking Machines). The inventor Moxon has constructed a chess-playing robot. Moxon wins a game. The robot strangles him.

Bierce's story reflects a growing fear. Will computers someday get out of hand and develop a will of their own? Let it not be thought that this question is asked today only by those who do not understand computers. Before his death Norbert Wiener anticipated with increasing apprehension the day when complex government decisions would be turned over to sophisticated game-theory machines. Before we know it, Wiener warned, the machines may shove us over the brink into a suicidal war.

The greatest threat of unpredictable behavior comes from the learning machines: computers that improve with experience. Such machines do not do what they have been told to do but what they have learned to do. They quickly reach a point at which the programmer no longer knows what kinds of circuits his machine contains.
Inside most of these computers are randomizing devices. If the device is based on the random decay of atoms in a sample of radioactive material, the machine's behavior is not (most physicists believe) predictable even in principle.

Much of the current research on learning machines has to do with computers that steadily improve their ability to play games. Some of the work is secret: war is a game. The first significant machine of this type was an IBM 704 computer programmed by Arthur L. Samuel of the IBM research department at Poughkeepsie, New York. In 1959 Samuel set up the computer so that it not only played a fair game of checkers but also was capable of looking over its past games and modifying its strategy in the light of this experience. At first Samuel found it easy to beat his machine. Instead of strangling him, the machine improved rapidly, soon reaching the point at which it could clobber its inventor in every game.

So far as I know, no similar program has yet been designed for chess, although there have been several ingenious programs for nonlearning chess machines. A few years ago the Russian chess grandmaster Mikhail Botvinnik was quoted as saying that the day would come when a computer would play master chess. "This is of course nonsense," wrote the American chess expert Edward Lasker in an article on chess machines in the Fall 1961 issue of a magazine called The American Chess Quarterly. But it was Lasker who was talking nonsense. A chess computer has three enormous advantages over a human opponent: (1) it never makes a careless mistake; (2) it can analyze moves ahead at a speed much faster than a human player can; (3) it can improve its skill without limit. There is every reason to expect that a chess-learning machine, after playing thousands of games with experts, will someday develop the skill of a master. It is even possible to program a chess machine to play continuously and furiously against itself.
Its speed would enable it to acquire in a short time an experience far beyond that of any human player.

The Unexpected Hanging

It is not necessary for the reader who would like to experiment with game-learning machines to buy an electronic computer. It is only necessary to obtain a supply of empty matchboxes and colored beads. This method of building a simple learning machine is the happy invention of Donald Michie, a biologist at the University of Edinburgh. Writing on "Trial and Error" in Penguin Science Survey 1961, Vol. 2, Michie describes a ticktacktoe learning machine called MENACE (Matchbox Educable Naughts And Crosses Engine) that he constructed with three hundred matchboxes.

MENACE is delightfully simple in operation. On each box is pasted a drawing of a possible ticktacktoe position. The machine always makes the first move, so only patterns that confront the machine on odd moves are required. Inside each box are small glass beads of various colors, each color indicating a possible machine play. A V-shaped cardboard fence is glued to the bottom of each box, so that when one shakes the box and tilts it, the beads roll into the V. Chance determines the color of the bead that rolls into the V's corner. First-move boxes contain four beads of each color, third-move boxes contain three beads of each color, fifth-move boxes have two beads of each color, seventh-move boxes have single beads of each color.

The robot's move is determined by shaking and tilting a box, opening the drawer and noting the color of the "apical" bead (the bead in the V's apex). Boxes involved in a game are left open until the game ends. If the machine wins, it is rewarded by adding three beads of the apical color to each open box. If the game is a draw, the reward is one bead per box. If the machine loses, it is punished by extracting the apical bead from each open box.
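The bead bookkeeping just described can be sketched in a few lines of Python. This is only an illustrative sketch, not Michie's own procedure written out: the names Matchbox, draw, reinforce and INITIAL_BEADS are hypothetical, and the handling of a box that has lost all its beads is an assumption, since the text does not say what happens in that case.

```python
import random
from collections import Counter

# Starting beads of each color, keyed by the move number a box serves
# (first, third, fifth, seventh), as given in the text.
INITIAL_BEADS = {1: 4, 3: 3, 5: 2, 7: 1}


class Matchbox:
    """One box: a bag of colored beads, one color per possible machine play."""

    def __init__(self, moves, move_number):
        self.beads = Counter({m: INITIAL_BEADS[move_number] for m in moves})

    def draw(self, rng):
        """Pick the 'apical' bead at random, weighted by the bead counts."""
        colors = list(self.beads.elements())
        if not colors:
            # Assumption: an emptied box simply has no move to offer.
            return None
        return rng.choice(colors)


def reinforce(history, outcome):
    """Reward or punish every (box, apical bead) pair left open in a game.

    A win adds three beads of the apical color, a draw adds one,
    and a loss removes the apical bead.
    """
    delta = {"win": 3, "draw": 1, "loss": -1}[outcome]
    for box, bead in history:
        box.beads[bead] = max(0, box.beads[bead] + delta)
```

A game would keep a list of (box, bead) pairs as it goes and pass that list to reinforce once the result is known. Because draw is weighted by the current bead counts, colors that have been rewarded come up more often in later games.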
This system of reward and punishment closely parallels the way in which animals and even humans are taught and disciplined. It is obvious that the more games MENACE plays, the more it will tend to adopt winning lines of play and shun losing lines. This makes it a legitimate learning machine, although of an extremely simple sort. It does not make (as does Samuel's checker machine) any self-analysis of past plays that causes it to devise new strategies.

Michie's first tournament with MENACE consisted of 220 games over a two-day period. At first the machine was easily trounced. After seventeen games the machine had abandoned all openings except the corner opening. After the twentieth game it was drawing consistently, so Michie began trying unsound variations in the hope of trapping it in a defeat. This paid off until the machine learned to cope with all such variations. When Michie withdrew from the contest after losing eight out of ten games, MENACE had become a master player.

Since few readers are likely to attempt building a learning machine that requires three hundred matchboxes, I have designed hexapawn, a much simpler game that requires only twenty-four boxes. The game is easily analyzed (indeed, it is trivial), but the reader is urged not to analyze it. It is much more fun to build the machine, then learn to play the game while the machine is also learning.

Hexapawn is played on a 3 × 3 board, with three chess pawns on each side as shown in Figure 43. Dimes and pennies can be used instead of actual chess pieces. Only two types of move are allowed: (1) a pawn may advance straight forward one square to an empty square; (2) a pawn may capture an enemy pawn by moving one square diagonally, left or right, to a square occupied by the enemy. The captured piece is removed from the board. These are the same as pawn moves in chess, except that no double move, en passant capture or promotion of pawns is permitted.
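The two move types can be made concrete with a short Python sketch. The board encoding here is an assumption made for illustration and does not come from the text: three strings of three characters, with 'W' pawns starting on the bottom row and moving up, 'B' pawns starting on the top row and moving down, and '.' for an empty square. The function name legal_moves is likewise hypothetical.

```python
def legal_moves(board, player):
    """List the moves open to `player` as ((from_row, from_col), (to_row, to_col)).

    board: three strings of three characters, each 'W', 'B', or '.'.
    Assumed orientation: 'W' advances toward row 0, 'B' toward row 2.
    """
    step = -1 if player == "W" else 1
    enemy = "B" if player == "W" else "W"
    moves = []
    for r in range(3):
        for c in range(3):
            if board[r][c] != player:
                continue
            nr = r + step
            if not 0 <= nr < 3:
                continue
            # (1) Advance straight forward one square to an empty square.
            if board[nr][c] == ".":
                moves.append(((r, c), (nr, c)))
            # (2) Capture one square diagonally forward, left or right.
            for nc in (c - 1, c + 1):
                if 0 <= nc < 3 and board[nr][nc] == enemy:
                    moves.append(((r, c), (nr, nc)))
    return moves
```

From the starting position ["BBB", "...", "WWW"] each side has exactly three moves, one straight advance per pawn, because no enemy pawn is yet within diagonal reach.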
The game is won in any of three ways:

1. By advancing a pawn to the third row.
2. By capturing all enemy pieces.
3. By achieving a position in which the enemy cannot move.

Figure 43. The game of hexapawn

Players alternate moves, moving one piece at a time. A draw clearly is impossible, but it is not immediately apparent whether the first or second player has the advantage.

To construct HER (Hexapawn Educable Robot) you need twenty-four empty matchboxes and a supply of colored beads. Small candies that come in different colors (jujubes, for example) or colored popping corn also work nicely. Each matchbox bears one of the diagrams in Figure 44. The robot always makes the second move. Patterns marked "2" represent the two positions open to HER on the second move. You have a choice between a center or an end opening, but only the left end is considered because an opening on the right would obviously lead to identical (although mirror-reflected) lines of play. Patterns marked "4" show the eleven positions that can confront HER on the fourth (its second) move