-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathddis-thesis-EN.html
executable file
·379 lines (335 loc) · 35.9 KB
/
ddis-thesis-EN.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
_______________________________________________________________________________________
____________________________________
_________Building an adapting Poker
Agent
Christian K undig
of Z urich, Switzerland
Student-ID: 04-706-743
ch.kuendig@kuendig.chMaster Thesis April 6, 2010
Advisor: Prof. Abraham Bernstein
Prof. Abraham Bernstein, PhD
Department of Informatics
University of Zurich
http://www.i .uzh.ch/ddis
__________________________________________________________________________________________________________________________________________________________________________________
Acknowledgements
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nullam a tellus. Aliquam commodo
dui non ipsum. Duis mollis nisi id turpis. Donec quis ipsum. Curabitur sed nibh. Morbi suscipit
justo quis orci. Ut massa tortor, ultricies vitae, lacinia eu, facilisis eu, nisl. Nulla mattis urna
sed metus imperdiet ornare. Praesent sodales. Etiam laoreet. Mauris quam magna, sagittis et,
pharetra eget, congue vitae, arcu. Fusce sollicitudin justo. Suspendisse lectus. Sed lobortis dolor
quis lectus scelerisque ornare. Integer purus. Phasellus vel elit at nibh sagittis lobortis. Aliquam
iaculis malesuada eros. Mauris metus.
___________________________________________________________________________________________________________________________________________________________________________
Abstract
The game of poker offers a clean well-defined domain in which to investigate some truly funda-
mental issues in computing science, such as how to handle deliberate misinformation, and how
to make intelligent guesses based on partial knowledge. In the taxonomy of games, poker lies at
the opposite end of the spectrum from well-studied board games like checkers and chess. Poker
is a multi-player game with stochasticity (random events occurring over a known probability dis-
tribution), imperfect information (some information is hidden from each player), and partially
observable outcomes (some information might never be known). Consequently, the use of decep-
tion, opponent modeling, and coping with uncertainty are indispensable elements of high-level
strategic play. Traditional methods for computer game-playing are incapable of handling these
properties. [Billings, 2006]
___________________________________________________________________________________________________________________________________________________________________________
Zusammenfassung
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nullam a tellus. Aliquam commodo
dui non ipsum. Duis mollis nisi id turpis. Donec quis ipsum. Curabitur sed nibh. Morbi suscipit
justo quis orci. Ut massa tortor, ultricies vitae, lacinia eu, facilisis eu, nisl. Nulla mattis urna
sed metus imperdiet ornare. Praesent sodales. Etiam laoreet. Mauris quam magna, sagittis et,
pharetra eget, congue vitae, arcu. Fusce sollicitudin justo. Suspendisse lectus. Sed lobortis dolor
quis lectus scelerisque ornare. Integer purus. Phasellus vel elit at nibh sagittis lobortis. Aliquam
iaculis malesuada eros. Mauris metus
___________________________________________________________________________________________________________________________________________________________________________
Table of Contents
Table of Contents ix
1 Introduction 1
1.1 About Poker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Motivation 3
2.1 Poker and A.I. Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Texas Hold’em Poker Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.1 Betting structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.2 Play of the hand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.3 The showdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3.1 Expert System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3.2 Simulation-Based Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.3 Game-Theoretic Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Game Tree Search 13
3.1 Deterministic games with perfect information . . . . . . . . . . . . . . . . . . . . . . 13
3.1.1 Optimal strategies: The Minimax Algorithm . . . . . . . . . . . . . . . . . . 14
3.2 Stochastic Games with imperfect information . . . . . . . . . . . . . . . . . . . . . . 16
3.2.1 Incorporating imperfect information in the game tree search: Miximax Search 17
3.2.2 Miximax and Miximix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.3 Calculation example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Multiplayer games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4 Opponent Modelling 23
4.1 Optimal versus Maximal Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Challenges in Opponent Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3 Opponent Modelling Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3.1 Expert Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3.2 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3.3 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3.4 Decision Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.5 Alternative opponent models . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.4 Choosing the parameters and attributes . . . . . . . . . . . . . . . . . . . . . . . . . 34
x TABLE OF CONTENTS_________________________________________________________________________________
5 Building the opponent model 37
5.1 Aquiring a Training Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Updateable Data Mining Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2.1 Online Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.2 Hoeffding Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.3 Hand Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.4 Quadratic Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.5.1 Performance in Action Prediction: Naive Bayes . . . . . . . . . . . . . . . . 45
5.5.2 Performance in Action Prediction: Online Backpropagation . . . . . . . . . 46
5.5.3 Performance in Action Prediction: Hoeffding Tree . . . . . . . . . . . . . . . 47
5.5.4 Performance in Hand Prediction: Naive Bayes . . . . . . . . . . . . . . . . . 48
5.5.5 Performance in Hand Prediction: Online Backpropagation . . . . . . . . . . 49
5.5.6 Performance in Hand Prediction: Hoeffding Tree . . . . . . . . . . . . . . . 50
5.5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6 Implementation 53
6.1 The Action Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.1.1 Reverse Mapping of Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.2 Building the Game Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.3 Other Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.3.1 Implementaton of Pre-Flop Play . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.3.2 Multi-Threaded Tree search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
7 Evaluation and Benchmarking 61
7.1 Benchmark Tournament . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.1.1 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.1.2 Benchmark Opponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.1.4 Results for Opponent Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
7.2 Insights from playing against humans . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.2.1 Pre-Flop Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
7.2.2 Post-Flop Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
8 Limitations and Future Work 71
9 Conclusions 73
A This is an Appendix 75
A.1 Poker Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
A.2 Software Manual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
List of Figures 79
List of Tables 81
List of Listings 83
Bibliography 85
1
Introduction
Game playing always presented an attractive tasks for artificial intelligence research. Already
in the early 1950ies, in the wake of the first programmable computers, games have drawn the
attention of researchers. Chess had been tackled by early computer scientists like Konrad Zuse,
Claude Shannon (the inventor of information theory), Norbert Wiener (the creator of modern
control theory) or Alan Turing [Friedel, 2002].
Also in the fifties, the first academic research of poker was conducted. Since computers were
still way to slow to provide any assistance to analyze a full fledged poker game, [Kuhn, 1950]in-
vented his own simple version of poker. For this very simple version, consisting only of a deck
of three cards, two players with one hole card each, one betting stage and no community cards,
Kuhn was able to calculate a complete equilibrium solution. He demonstrated that there are
many game theoretic optimal strategies for the first player in this game, but only one for the sec-
ond player, and that, when played optimally, the first player should expect to lose at a rate of − 1_
18
per hand.
Since then, there has been steady progress in the standard of play strength of artificial intel-
ligent algorithms, to the point that machines have surpassed humans in Checkers and Othello,
have defeated human champions in Chess and Backgammon, and are competitive in many other
games. Few exceptions remain, one being Go, where computers notoriously perform weak, and
poker, where tremendous progress has been made in the last decade, but a wealth of unsolved
problems still remain.
As [Russell et al., 2003] mentions, game are interesting because they often are too hard to solve.
Chess has an average branching factor of about 35, with games often going to 50 moves per player,
so the search tree has about 35100 (or 10154) nodes. Games, like the real world, therefore require
the ability to make some decision even when calculating the optimal decision is infeasible. Games
also penalize inefficiency severely. Whereas an implementation of a algorithm is half as efficient
will simply cost twice as much to run to completion, a game program that is half as efficient in
using its available time probably will be beaten into the ground.
In poker, the actual game tree does not branch as much as in chess, but the addition of chance
imperfect information pose challenges of similar amplitude. Kuhn Poker remains one of the few
simple examples, where a complete solution has been found, with even simple (compared to other
poker variants) variants like Heads-Up Limit Texas Hold’em still remaining a challenge.
2 Chapter 1. Introduction_________________________________________________________________________________
1.1 About Poker
Poker is a well-known card game which gained a lot of popularity in the last years, thanks to
plenty of online sites, offering to play the game online with real cash.
Poker proceeds in stages, in each stage, cards are dealt (some of which are dealt face down
only for one player to see and some of which are dealt face up for all players to see). Players
then wager against each other that their hand of cards is the best, or will be the best if the hand is
played to the end.
To remain active in the game, and keep a chance to win all the wagered chips, a player must
match the amount of money bet by each of their opponents. If one player makes a bet that no
opponent matches, then the game ends immediately with that player winning the game and all
the bets placed and matched so far (the pot) - regardless of the cards they actually had. In the
other case, if all bets are matched, play continues into the next stage, until there are no stages left
in the game, and all active players reveal their cards in a showdown. The player with the highest
ranked hand at that point then wins the pot.
Because each player only knows their own cards, and the common board cards, but lacks
information about their opponents’ hole cards, poker is an example of a domain of imperfect
information.
While for most of the last century limit poker was popular, the focus of a lot of players has
shifted to No-Limit variations of the game. Unlike with limit poker, in No-Limit Poker the pos-
sible wagers a player can do are only limited by the amount of chips he owns. This thesis will
mainly focus on the No-Limit variant.
2
Motivation
Perhaps chess is the wrong game for the times. Poker is now everywhere, as amateurs
dream of winning millions and being on television for playing a card game whose
complexities can be detailed on a single piece of paper. But while chess is a 100
percent information game - both players are aware of all the data all the time- and
therefore directly susceptible to computing power, poker has hidden cards and
variable stakes, creating critical roles for chance, bluffing, and risk management.
These might seem to be aspects of poker based entirely on human psychology and
therefore invulnerable to computer incursion. A machine can trivially calculate
the odds of every hand, but what to make of an opponent with poor odds making
a large bet?
The Chess Master and the Computer by Garry Kasparov, 2009
2.1 Poker and A.I. Research
Poker is a complex game involving uncertainty, chance and the requirement for strategic and
game-theoretic reasoning. Playing this game well, requires the use of intricate strategies as well
as a fond knowledge of the stochastic foundations of the game.
Novice players might see poker as a pure game of chance and look for strategies that play
each hand according to its chance of being the best hand in play. This completely misses the fact
that the presence of imperfect information allow the usage of advanced strategies, to hide and
misrepresent information about a players own card, and trying to win a pot without holding the
strongest hand.
The two features of stochasticity and imperfect information have made poker an interesting
field for A.I. research. Thanks to a recent boom of poker and computer poker research as well, reg-
ular computer poker competitions ( [University of Alberta, 2010] and human-computer matches
([University of Alberta, 2007],[University of Alberta, 2008]) are conduected.
There has been a recent flurry of research into developing strong programs for playing poker.
Just as chess was once seen as an important challenge problem for AI, poker is now starting to
be seen in the same way. Research has progressed to a point where Heads-Up Limit Hold’em
programs have shown performances capable of winning against some of the best professional
poker players.
Thanks to various characteristics, poker is a appropriate testbed to apply concepts of artificial
intelligence. The most important ones, according to [Schauenberg, 2006] thereby are:
4 Chapter 2. Motivation_________________________________________________________________________________
∙ decision-theory and probabilistic reasoning - The randomness from the deal of cards
and the lack of knowledge of the opponents cards make poker a noisy and un-
certain domain. This forces a player to be adept at both decision-theory and
probabilistic reasoning.
∙ risk assessment - Poker is played in stages containing both the deal of additional
cards and wagering. To be able to handle this type of domain of chance, a suc-
cessful player must be able to assess risks.
∙ opponent modeling - In poker each hand represents a repeated against the same
adaptive agent. This type of environment allows a player to use opponent model-
ing to learn their opponents playing strategies and attempt to exploit this knowl-
edge to increase their profit.
Most of these aspects can also be found in other fields of artificial intelligence, with the chance
of carrying over insights gained by computer poke into other realms, such as:
∙ user modeling - Opponent modeling is a form of user modeling. As such, its a recurring
problem of modern A.I. research.
In typical user modeling research, a observer tries to find patterns in the behavior of an
observed entity. The identified patterns can then be used to customize an application to
take advantage of the identified user preferences.
∙ policy-making or negotiation agents - Game theorists long ago realized poker could be used to
illustrate fundamental principles of game theory that have subsequently been applied to a
variety of fields including law, politics and economics.
The development of automated decision-making in poker could lead to advancements in
creating tools for use in these domains.
The bulk of the research into computer poker, including that from the University of Alberta
- currently one of the driving forces in this field, has been on Heads-Up Limit Texas Hold’em.
In this game, the players only ever have at most three possible actions (fold, call, or raise). In
No-Limit Texas Hold’em on the other hand, players may bet any amount up to the amount of
chips remaining in their stack. This rule change significantly alters the optimal strategies, and
also poses new research problems when developing a computer program for playing the game.
For this reason and the fact that No-Limit Poker has become much more popular in the last
few years his thesis will focus on the No-Limit variant - though most approaches can also be and
have been used in Limit Poker.
2.2 Texas Hold'em Poker Rules
2.2.1 Betting structures
Texas Hold’em is normally played with small and big blinds and the possibility for antes. These
forced contributions by the players next to dealer (blinds) or everyone (antes) provide a starting
pot to win.
A dealer button is used to represent the player in the dealer position; the dealer button rotates
clockwise after each hand, changing the position of the dealer and blinds. The small blind is
posted by the player to the left of the dealer and is usually equal to half of the big blind. The
big blind, posted by the player to the left of the small blind, is equal to the minimum bet. In
2.2 Texas Hold'em Poker Rules 5
tournament poker, the blind/ante structure periodically increases as the tournament progresses,
so players are forced to amass more and more chips to stay in the tournament.
When only playing with two players, e.g. at the end of a tournament, special ’heads-up’ rules
are enforced and the blinds are posted differently. In this case, the person with the dealer button
posts the small blind, while the other player off the button places the big blind. Since the big blind
has option, the dealer acts first before the flop. After the flop, the dealer acts last (as normal) and
continues to do so for the remainder of the hand.
Theres no one rule of how to set up the betting in all games of poker. The four most common
various of Texas Hold’em are limit, spread limit, pot limit and no limit.
∙ Limit Hold’em has historically been the most popular form of Hold’em found in casino live
action games in the United States. In limit Hold’em, bets and raises during the first two
rounds of betting (pre-flop and flop) must be equal to the big blind; this amount is called
the small bet. In the next two rounds of betting (turn and river), bets and raises must be
equal to twice the big blind; this amount is called the big bet.
∙ Spread limit is often played in home games. In a spread-limit game, a player can bet any
amount within some range for instance 1−5. Basically, it means the minimum any player
can bet is 1,andthemostanyonecanbetorraiseatonetimeis5. If someone wishes to re-raise,
they must raise at least the amount of the previous raise. This minimal raise rule also holds
true for pot-limit and no-limit.
∙ In pot-limit Hold’em, the maximum raise is the current size of the pot (including the amount
needed to call). Other then that, all bets and raises are allowed.
∙ No-Limit Hold’em is the form most commonly found in tournament poker and is the game
played in the main event of the World Series of Poker. In no-limit Hold’em, players may bet
or raise any amount over the minimum raise up to all of the chips the player has at the table
(called an all-in bet). The minimum raise is equal to the big blind.
2.2.2 Play of the hand
At the begin of each hand of Texas Hold’em, all players are dealt two cards face down, starting
with the player left of the dealer (the small blind). Texas Hold’em is played with a standard 52-
card deck containing no jokers. These two hidden cards are the players hole cards. They are the
only cards each player receives individually and they will only be revealed if a showdown occurs
or a player shows them voluntarily after finishing the hand.
The game begins with a pre-flop betting round, beginning with the player left of the big blind
and continuing clockwise. In a betting round, each player is asked to act, either checking (passing
on the action, if no call is required), betting (placing a wager), calling (even another players bet)
or raise (call a bet, and further bet an additional amount).
A round of betting continues until every player has folded, put in all of their chips, or matched
the amount put in by all other active players. The blinds are considered ”live” in the pre-flop
betting round, meaning that they contribute to the amount that the blind player must contribute,
and that, if all players call around to the player in the big blind position, that player may either
check or raise (or call/fold).
After the pre-flop betting round, assuming there remain at least two players taking part in the
hand, the dealer deals a flop, three face-up community cards. The flop is followed by a second
betting round. This and all subsequent betting rounds begin with the player to the dealer’s left
and continue clockwise.
6 Chapter 2. Motivation_________________________________________________________________________________
After the flop betting round ends, a single community card (called the turn or fourth street) is
dealt, followed by a third betting round. A final single community card (called the river or fifth
street) is then dealt, followed by a fourth betting round. If at this point of the game, more than one
players remain, each shows his hands (or mucks, if he surrenders the pot) and the pot is awarded
to the player with the strongest hand.
2.2.3 The showdown
If a player bets and all other players fold, then the remaining player is awarded the pot and is not
required to show his hole cards. If two or more players remain after the final stage, a showdown
occurs. On the showdown, each player plays the best poker hand they can make from the seven
cards comprising his two hole cards and the five community cards. A player may use both of his
own two hole cards, only one, or none at all, to form his final five-card hand. If a player decided
to surrender the hand (since another player might already have shown a stronger hand) he can
muck his hand without showing, losing any right to the pot.
If the best hand is shared by more than one player, then the pot is split equally among them,
with any extra chips going to the first players after the button in clockwise order.
2.3 Related Work
There are various ways to tackle the problems posed by Poker. This section will summarize the
most common approaches used in the past and present. The methods involving game tree search
will be thoroughly explained in the next chapter.
2.3.1 Expert System
In a heuristic-based approach to playing poker a rule-based expert system is used to make a poker
betting decision. To make a decision, the expert system first abstracts a complicated game state
into some simplified scenario that has one or more heuristics (e.g. opponent behavior, a play-
ers hole cards and/or the public board texture) associated with it and then it uses that heuristic
information to make a decision.
A heuristic based approach is appealing for novices of the game since it’s possible to formu-
late simple rules to base ones play upon. There is a plethora of poker literature focusing on this
approach by explaining the heuristics a player should spend attention to, and how to react to com-
mon game scenarios given these heuristics. Examples of this approach are hand-ranges to play
or fold in the pre-flop stage or strategies to disguise ones hand. As these rule sets are inherently
easy to reuse, they offer a natural starting point for building poker-playing programs.
There are different ways to compile the heuristics which govern these betting strategies. As
mentioned, they could be compiled by a domain expert (e.g. a professional poker player or book
author), or another way would be to derive and refine computationally through trial and error.
Like most rule-based artificial intelligence algorithms, these rule-sets unfortunately quickly be-
come too large to further develop and test. This also holds true for poker, where a situation rarely
occurs twice - and the required style of play can differ a lot between different opponents, so it’s
hard to find rules covering all possibilities that might occur during a game.
Nevertheless, due to the perceived simplicity, this approach was used for various implemen-
tations of poker-playing program. As reported, [Waterman, 1970] attempted to learn production
rules to define a betting system for 5-card draw poker. His program reportedly achieved roughly
the level of play of an experienced human player. [Smith, 1983] later revisited the same problem