%matplotlib inline
- The game of Risk
- Risk in Python
- Genetic Algorithms
- Using a genetic algorithm to play Risk



Each player is assigned a mission; by completing the mission the player wins the game.
Missions are, for example:
- conquer a given set of continents,
- conquer a given number of territories, or
- eliminate a given player.
We built a framework that handles all aspects of the Risk game. It consists of five classes:
- Board, which handles the armies on the game board,
- Cards, which handles the reinforcement cards of a player,
- Mission, which describes the mission of a player,
- Player, which makes decisions on how to play, and
- Game, which handles all other aspects of the game.

When initialized, the board randomly distributes the territories amongst the players.
import random
random.seed(42)
from board import Board
b = Board.create(n_players=4)
We can get a graphical representation of the board using plot_board():
b.plot_board()
We can easily manipulate the armies on the board:
b.set_owner(territory_id=0, player_id=5)
b.set_armies(territory_id=0, n=15)
b.plot_board()
The board is aware of the layout of the territories:
for territory_id, player_id, armies in b.neighbors(territory_id=0):
    print('Territory', territory_id, 'owned by player', player_id, 'is occupied by', armies, 'armies')
And can handle attack moves:
b.attack(from_territory=0, to_territory=21, n_armies=3)
b.plot_board()
We can get all available missions using the missions function:
from missions import missions
all_missions = missions(n_players=4)
for m in all_missions:
    print(m.description)
Each mission is aware of the player it is assigned to:
mission = all_missions[0]
mission
mission.assign_to(player_id=0)
mission
...and can evaluate whether the mission has been achieved yet:
mission.evaluate(board=b)
There is a special case when a player's mission is to eliminate himself:
mission = all_missions[-1]
mission
mission.assign_to(player_id=3)
mission
A player object is required to have four methods:
- reinforce(),
- attack(),
- fortify(), and
- turn_in_cards().

Let's go through a whole game. We'll use four RandomPlayers, which take a random decision at every step of their turn.
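As a rough sketch of what such a player looks like (the argument and return conventions below are assumptions for illustration, not the framework's actual API):

import random

class SketchRandomPlayer(object):
    """Illustrative sketch of the player interface; the real framework
    may pass different arguments to these methods."""

    def reinforce(self, own_territories):
        # Place a new army on a randomly chosen territory we own.
        return random.choice(own_territories)

    def attack(self, possible_attacks):
        # Pick a random attack, or None to end the attack phase.
        return random.choice(possible_attacks + [None])

    def fortify(self, possible_moves):
        # Pick a random fortification move, or None to skip.
        return random.choice(possible_moves + [None])

    def turn_in_cards(self, card_sets):
        # Turn in a random matching set, or None to keep the cards.
        return random.choice(card_sets + [None])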
random.seed(42)
import game, player
risk_game = game.Game.create([player.RandomPlayer() for i in range(4)])
risk_game.plot()
Now the players take turns placing armies until they each have 30. First, initialize_single_army() has each player place a single army:
risk_game.initialize_single_army()
risk_game.plot()
Calling initialize_armies() will have them place all armies:
risk_game.initialize_armies()
risk_game.plot()
Now the first player may play his turn, starting with the reinforcement phase:
risk_game.reinforce(risk_game.current_player)
risk_game.plot()
Then the attack phase:
risk_game.attack(risk_game.current_player)
risk_game.attack(risk_game.current_player)
risk_game.plot()
And finally the fortification phase:
risk_game.fortify(risk_game.current_player)
risk_game.next_turn()
risk_game.plot()
We can do this more quickly by using the play_turn() method:
risk_game.play_turn()
risk_game.plot()
Now let's fast-forward 10 turns:
for i in range(10):
    risk_game.play_turn()
risk_game.plot()
And to the end of the game:
while not risk_game.has_ended():
    risk_game.play_turn()
risk_game.plot()
A genetic algorithm is a machine learning algorithm based on evolution and natural selection.
Imagine we are trying to solve a puzzle.
The solution of the puzzle is a string of 16 bits,
e.g. 0110 0010 1101 0001.
We can evaluate a candidate solution using a function
that returns the number of correct bits.
For example
0110 1101 1000 1010 would yield 7.
import random

def random_solution():
    return ' '.join([
        ''.join(['0' if random.random() < 0.5 else '1' for i in range(4)])
        for i in range(4)
    ])

def compare_solution(y):
    x = '0110 0010 1101 0001'
    return sum(
        1 if u == v and u != ' ' else 0
        for u, v in zip(x, y)
    )

for i in range(6):
    rs = random_solution()
    print(rs, compare_solution(rs))
We can randomly generate a few solutions:
- 1110 1010 0010 0101 $\rightarrow$ 9
- 0101 1100 1001 0011 $\rightarrow$ 9
- 1101 1111 0111 1111 $\rightarrow$ 5
- 0110 0101 0010 1010 $\rightarrow$ 6

To get further we can combine the best solutions (the parents) by splitting them up and pasting them together to form children, e.g. taking the first half of the first parent and the second half of the second gives 1110 1010 1001 0011 $\rightarrow$ 12.
Another way to improve a solution is to randomly mutate one (or more) bits, turning 1110 1010 1001 0011 $\rightarrow$ 12 into:
- 1010 1010 1001 0011 $\rightarrow$ 11
- 1110 1010 1101 0011 $\rightarrow$ 13
- 1110 1010 1001 0111 $\rightarrow$ 11
- 1110 1010 1001 0001 $\rightarrow$ 13

We can keep combining and mutating until we have found a satisfactory solution.
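Both operations are easy to express in code. A minimal sketch, reusing compare_solution() from above (the split point and single-bit mutation are arbitrary choices):

import random

def crossover(parent_a, parent_b, split=8):
    # Child takes the first `split` bits from one parent, the rest
    # from the other; spaces are re-inserted for readability.
    bits_a = parent_a.replace(' ', '')
    bits_b = parent_b.replace(' ', '')
    child = bits_a[:split] + bits_b[split:]
    return ' '.join(child[i:i + 4] for i in range(0, 16, 4))

def mutate(solution):
    # Flip one randomly chosen bit.
    bits = list(solution.replace(' ', ''))
    i = random.randrange(len(bits))
    bits[i] = '1' if bits[i] == '0' else '0'
    return ' '.join(''.join(bits[j:j + 4]) for j in range(0, 16, 4))

child = crossover('1110 1010 0010 0101', '0101 1100 1001 0011')
print(child, compare_solution(child))  # 1110 1010 1001 0011 -> 12
print(compare_solution(mutate(child)))  # 11 or 13, depending on the flipped bit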
Suppose we have four Risk players: [p1, p2, p3, p4]. Which is the best?
We could have them play a game:
game(p1, p2, p3, p4) $\rightarrow$ p3
And a few more:
game(p1, p2, p3, p4) $\rightarrow$ p1
game(p1, p2, p3, p4) $\rightarrow$ p1
game(p1, p2, p3, p4) $\rightarrow$ p4
game(p1, p2, p3, p4) $\rightarrow$ p1
game(p1, p2, p3, p4) $\rightarrow$ p2
game(p1, p2, p3, p4) $\rightarrow$ p2
What if we have 8 players?
[p1, p2, ..., p7, p8]
game(p1, p2, p3, p4) $\rightarrow$ p1
game(p5, p6, p7, p8) $\rightarrow$ p7
Is p1 better than p5?
We could play games with all combinations:
game(p1, p2, p3, p4)
game(p1, p2, p3, p5)
...
game(p5, p6, p7, p8)
That is $\binom{8}{4} = 70$ games.
Now what if we have 100 players?
[p1, p2, ..., p99, p100]
Playing a single game in every combination would require $\binom{100}{4} \approx 3.9$ million games.
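The counts are simple binomial coefficients (math.comb requires Python 3.8+):

from math import comb

print(comb(8, 4))    # 70 four-player combinations for 8 players
print(comb(100, 4))  # 3921225: almost 4 million for 100 players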
TrueSkill is a Bayesian ranking algorithm developed by Microsoft Research.





An implementation is readily available:
> pip install trueskill
import trueskill
Every new player gets a default rating of $\mu = 25$ (with uncertainty $\sigma \approx 8.3$):
a = trueskill.Rating()
b = trueskill.Rating()
print(a)
print(b)
After a game we can calculate the new ratings:
print('Before:', [(a, ), (b, )])
print('After: ', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
(a, ), (b, ) = trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2])
print('Before:', [(a, ), (b, )])
print('A wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
print('B wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[2, 1]))
TrueSkill also handles multiple players in a tie:
a = trueskill.Rating()
b = trueskill.Rating()
c = trueskill.Rating()
d = trueskill.Rating()
trueskill.rate(rating_groups=[[a], [b, c, d]], ranks=[1, 2])
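This maps directly onto our problem: after each Risk game the winner gets the best rank and the three losers tie. A sketch of that update (rating losers by elimination order would also be possible):

import trueskill

players = [trueskill.Rating() for _ in range(4)]

def rate_game(ratings, winner):
    # One rating group per player; the winner ranks first, the rest tie.
    groups = [(r,) for r in ratings]
    ranks = [0 if i == winner else 1 for i in range(len(ratings))]
    new_groups = trueskill.rate(rating_groups=groups, ranks=ranks)
    return [g[0] for g in new_groups]

players = rate_game(players, winner=2)  # game(p1, p2, p3, p4) -> p3
print(players[2])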
As said before, a player needs to implement:
- reinforce(),
- attack(),
- fortify(), and
- turn_in_cards().

Let's define the rank of a candidate territory as a weighted sum of features:
$$\text{rank} = w_\text{territory\_ratio} \cdot \text{territory\_ratio} + w_\text{mission} \cdot \text{mission}$$
Let's say $w_\text{territory\_ratio} = 1$ and $w_\text{mission} = 2$; then
$$\begin{array}{cccc} \textbf{territory} & \textbf{territory\_ratio} & \textbf{mission} & \textbf{rank} \\ 10 & 0.0 & 1 & 2.0 \\ 18 & 0.6 & 1 & 2.6 \\ 24 & 0.0 & 1 & 2.0 \\ 33 & 0.5 & 0 & 0.5 \\ 40 & 0.3 & 0 & 0.3 \\ 42 & 0.2 & 1 & 2.2 \end{array}$$

So we pick territory 18.
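In code the choice reduces to an argmax over this weighted sum. A sketch using the values from the table (in the real player, territory_ratio and mission would be computed from the board and the player's mission):

w_territory_ratio, w_mission = 1, 2

# (territory, territory_ratio, mission) rows from the table above
candidates = [
    (10, 0.0, 1), (18, 0.6, 1), (24, 0.0, 1),
    (33, 0.5, 0), (40, 0.3, 0), (42, 0.2, 1),
]

def rank(territory_ratio, mission):
    return w_territory_ratio * territory_ratio + w_mission * mission

best = max(candidates, key=lambda c: rank(c[1], c[2]))
print(best[0])  # 18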
How do we find good values for these weights? Using a GA!
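A minimal sketch of that loop, assuming a fitness() function that plays games with a weight vector and returns a score, e.g. a TrueSkill $\mu$ (everything here is illustrative, not the exact setup we used):

import random

def mutate_weights(weights, scale=0.5):
    # Perturb one randomly chosen weight.
    w = list(weights)
    i = random.randrange(len(w))
    w[i] += random.gauss(0, scale)
    return w

def evolve(population, fitness, n_generations=50):
    # population: list of weight vectors, e.g. [w_territory_ratio, w_mission]
    for _ in range(n_generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:len(scored) // 2]  # keep the best half
        children = [mutate_weights(random.choice(parents))
                    for _ in range(len(scored) - len(parents))]
        population = parents + children
    return max(population, key=fitness)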


Conclusion: territory ratio is ~15 times more important than the mission!




- Played Risk in Python
- Built a genetic representation of a Risk player
- Used a genetic algorithm to find better players

Watch our blog: blog.godatadriven.com