%matplotlib inline
The game of Risk
Risk in Python
Genetic Algorithms
Using a genetic algorithm to play Risk
Each player is assigned a mission. By completing the mission, the player wins the game.
Missions are, for example: conquering a given set of continents, occupying a given number of territories, or eliminating a particular player.
We built a framework that handles all aspects of the Risk game. It consists of five classes:
- Board, which handles the armies on the game board,
- Cards, which handles the reinforcement cards of a player,
- Mission, which describes the mission of a player,
- Player, which makes decisions on how to play, and
- Game, which handles all other aspects of the game.
When initialized, the board randomly distributes the territories amongst the players.
import random
random.seed(42)
from board import Board
b = Board.create(n_players=4)
We can get a graphical representation of the board using plot_board():
b.plot_board()
We can easily manipulate the armies on the board:
b.set_owner(territory_id=0, player_id=5)
b.set_armies(territory_id=0, n=15)
b.plot_board()
The board is aware of the layout of the territories:
for territory_id, player_id, armies in b.neighbors(territory_id=0):
    print('Territory', territory_id, 'owned by player', player_id, 'is occupied by', armies, 'armies')
And can handle attack moves:
b.attack(from_territory=0, to_territory=21, n_armies=3)
b.plot_board()
We can get all available missions using the missions function:
from missions import missions
all_missions = missions(n_players=4)
for m in all_missions:
    print(m.description)
Each mission is aware of the player it is assigned to:
mission = all_missions[0]
mission
mission.assign_to(player_id=0)
mission
...and can evaluate whether the mission has been achieved yet:
mission.evaluate(board=b)
There is a special case when a player's mission is to eliminate himself:
mission = all_missions[-1]
mission
mission.assign_to(player_id=3)
mission
A player object is required to have four methods:
- reinforce()
- attack()
- fortify()
- turn_in_cards()
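A minimal stub showing this interface might look as follows. Note that the class name, method signatures, and return conventions here are assumptions for illustration; the real definitions live in the framework's player.py.

```python
class DoNothingPlayer(object):
    """Hypothetical sketch of the player interface: a player that always passes."""

    def reinforce(self, board):
        # Decide where to place a reinforcement army; None means "pass".
        return None

    def attack(self, board):
        # Return an attack move, or None to end the attack phase.
        return None

    def fortify(self, board):
        # Return a fortification move, or None to skip fortifying.
        return None

    def turn_in_cards(self, cards):
        # Return the set of cards to turn in, or None to keep them.
        return None
```

Any object with these four methods can be dropped into a game, which is what makes it easy to experiment with different strategies.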
Let's go through a whole game.
We'll use four RandomPlayer instances, which take a random decision at every step of their turn.
random.seed(42)
import game, player
risk_game = game.Game.create([player.RandomPlayer() for i in range(4)])
risk_game.plot()
Now the players may place armies until they each have 30 armies:
risk_game.initialize_single_army()
risk_game.plot()
Calling initialize_armies() will have them place all armies:
risk_game.initialize_armies()
risk_game.plot()
Now the first player may play his turn.
risk_game.reinforce(risk_game.current_player)
risk_game.plot()
Then the attack phase:
risk_game.attack(risk_game.current_player)
risk_game.attack(risk_game.current_player)
risk_game.plot()
And finally the fortification phase:
risk_game.fortify(risk_game.current_player)
risk_game.next_turn()
risk_game.plot()
We can do this more quickly by using the play_turn() method:
risk_game.play_turn()
risk_game.plot()
Now let's fast-forward 10 turns:
for i in range(10):
    risk_game.play_turn()
risk_game.plot()
And to the end of the game:
while not risk_game.has_ended():
    risk_game.play_turn()
risk_game.plot()
A machine learning algorithm based on evolution and natural selection.
Imagine we are trying to solve a puzzle. The solution of the puzzle is a string of 16 bits, e.g. 0110 0010 1101 0001. We can evaluate a candidate solution using a function that returns the number of correct bits. For example, 0110 1101 1000 1010 would yield 7.
import random

def random_solution():
    return ' '.join([
        ''.join(['0' if random.random() < 0.5 else '1' for i in range(4)])
        for i in range(4)
    ])

def compare_solution(y):
    x = '0110 0010 1101 0001'
    return sum((
        1 if u == v and u != ' ' else 0
        for u, v in zip(x, y)
    ))

for i in range(6):
    rs = random_solution()
    print(rs, compare_solution(rs))
We can randomly generate a few solutions:
1110 1010 0010 0101 $\rightarrow$ 9
0101 1100 1001 0011 $\rightarrow$ 9
1101 1111 0111 1111 $\rightarrow$ 5
0110 0101 0010 1010 $\rightarrow$ 6
To get further we can combine the best solutions (the parents) by splitting them up and pasting them together to form children.
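This combination step can be sketched as a single-point crossover on the bit strings. The helper below is illustrative, not the framework's implementation:

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: the head of one parent glued to the tail of the other.

    Works on 16-bit strings formatted like '0110 0010 1101 0001'.
    """
    a = parent_a.replace(' ', '')
    b = parent_b.replace(' ', '')
    point = random.randint(1, len(a) - 1)  # cut somewhere strictly inside the string
    child = a[:point] + b[point:]
    # Re-insert the spaces between nibbles for readability.
    return ' '.join(child[i:i + 4] for i in range(0, len(child), 4))

random.seed(0)
child = crossover('1110 1010 0010 0101', '0101 1100 1001 0011')
```

Every bit of the child comes from one of its two parents, so good partial solutions can be recombined into better ones.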
Another way to improve a solution is to randomly mutate one (or more) bits:
1110 1010 1001 0011 $\rightarrow$ 12

becomes, with one bit flipped (shown in bold):

**1**010 1010 1001 0011 $\rightarrow$ 11
1110 1010 1**1**01 0011 $\rightarrow$ 13
1110 1010 1001 0**1**11 $\rightarrow$ 11
1110 1010 1001 00**0**1 $\rightarrow$ 13

We can keep combining and mutating until we have found a satisfactory solution.
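Putting the pieces together, a minimal genetic algorithm for this puzzle might look like the sketch below. The function names and parameters (population size, mutation rate, selection scheme) are my own choices for illustration, not those of the framework:

```python
import random

TARGET = '0110001011010001'  # the 16-bit solution, spaces dropped

def fitness(candidate):
    """Number of bits that match the target."""
    return sum(u == v for u, v in zip(candidate, TARGET))

def mutate(candidate, rate=1.0 / 16):
    """Flip each bit independently with probability `rate`."""
    return ''.join(
        ('1' if bit == '0' else '0') if random.random() < rate else bit
        for bit in candidate
    )

def crossover(a, b):
    """Single-point crossover of two bit strings."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def evolve(pop_size=20, generations=100):
    # Start from a random population of bit strings.
    population = [
        ''.join(random.choice('01') for _ in range(16))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the best half as parents (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        if fitness(parents[0]) == 16:
            break  # perfect solution found
        # Combine and mutate to refill the population.
        children = [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

random.seed(42)
best = evolve()
```

Because the best individuals survive each generation unchanged, the top fitness never decreases, and crossover plus mutation supplies the variation needed to climb toward the target.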
Suppose we have four Risk players: [p1, p2, p3, p4]
Which is the best?
We could have them play a game:
game(p1, p2, p3, p4)
$\rightarrow$ p3
And a few more:
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p4
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p2
game(p1, p2, p3, p4)
$\rightarrow$ p2
What if we have 8 players?
[p1, p2, ..., p7, p8]
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p5, p6, p7, p8)
$\rightarrow$ p7
Is p1 better than p5?
We could play games with all combinations:
game(p1, p2, p3, p4)
game(p1, p2, p3, p5)
...
game(p5, p6, p7, p8)
That is $\binom{8}{4} = 70$ games.
Now what if we have 100 players?
[p1, p2, ..., p99, p100]
Playing a single game in every combination would require millions of games.
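These counts are just binomial coefficients: every unordered group of 4 players out of n plays one game, which we can check with math.comb:

```python
from math import comb

# Each game seats 4 of the n players, so the number of distinct
# line-ups is "n choose 4".
games_for_8 = comb(8, 4)      # 70 games for 8 players
games_for_100 = comb(100, 4)  # 3,921,225 games for 100 players
print(games_for_8, games_for_100)
```

Nearly four million games per full round-robin is clearly infeasible, which is why we need a rating system instead.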
TrueSkill is a Bayesian ranking algorithm developed by Microsoft Research
source: Vincent Warmerdam - koaning.io
An implementation is readily available:
> pip install trueskill
import trueskill
Every new player gets a default score of 25:
a = trueskill.Rating()
b = trueskill.Rating()
print(a)
print(b)
After a game we can calculate the new ratings:
print('Before:', [(a, ), (b, )])
print('After: ', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
(a, ), (b, ) = trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2])
print('Before:', [(a, ), (b, )])
print('A wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
print('B wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[2, 1]))
TrueSkill also handles multiple players in a tie:
a = trueskill.Rating()
b = trueskill.Rating()
c = trueskill.Rating()
d = trueskill.Rating()
trueskill.rate(rating_groups=[[a], [b, c, d]], ranks=[1, 2])
As said before, a player needs to implement:
- reinforce()
- attack()
- fortify()
- turn_in_cards()
Let's define a rank for each candidate territory as a weighted sum of its features: $\text{rank} = w_\text{territory\_ratio} \cdot \text{territory\_ratio} + w_\text{mission} \cdot \text{mission}$.
Let's say $w_\text{territory\_ratio} = 1$ and $w_\text{mission} = 2$; then
$$\begin{array}{cccc} \textbf{territory} & \text{territory\_ratio} & \text{mission} & \textbf{rank} \\ 10 & 0.0 & 1 & 2.0 \\ 18 & 0.6 & 1 & 2.6 \\ 24 & 0.0 & 1 & 2.0 \\ 33 & 0.5 & 0 & 0.5 \\ 40 & 0.3 & 0 & 0.3 \\ 42 & 0.2 & 1 & 2.2 \end{array}$$
So we pick territory 18.
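Assuming the rank is the weighted sum described above, the table can be reproduced in a few lines (the variable names are mine, for illustration):

```python
W_TERRITORY_RATIO = 1
W_MISSION = 2

candidates = {
    # territory_id: (territory_ratio, mission)
    10: (0.0, 1),
    18: (0.6, 1),
    24: (0.0, 1),
    33: (0.5, 0),
    40: (0.3, 0),
    42: (0.2, 1),
}

def rank(territory_ratio, mission):
    # Weighted sum of the two features.
    return W_TERRITORY_RATIO * territory_ratio + W_MISSION * mission

best = max(candidates, key=lambda t: rank(*candidates[t]))
# best == 18, matching the table
```

Changing the weights changes which territory wins, and that is exactly the knob the genetic algorithm gets to turn.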
Using a GA!
Conclusion: territory ratio is ~15 times more important than the mission!
So why ~15? What is it normalized to?
- Played Risk in Python
- Built a genetic representation of a Risk player
- Used a genetic algorithm to find better players
Watch our blog: blog.godatadriven.com