%matplotlib inline
The game of Risk
Risk in Python
Genetic Algorithms
Using a genetic algorithm to play Risk
Each player is assigned a mission. By completing the mission, the player wins the game.
Missions are, for example: conquering a given set of continents, occupying a given number of territories, or eliminating a particular player.
We built a framework that handles all aspects of the Risk game. It consists of five classes:
- Board, which handles the armies on the game board,
- Cards, which handles the reinforcement cards of a player,
- Mission, which describes the mission of a player,
- Player, which makes decisions on how to play, and
- Game, which handles all other aspects of the game.
When initialized, the board randomly distributes the territories amongst the players.
import random
random.seed(42)
from board import Board
b = Board.create(n_players=4)
We can get a graphical representation of the board using plot_board():
b.plot_board()
We can easily manipulate the armies on the board:
b.set_owner(territory_id=0, player_id=5)
b.set_armies(territory_id=0, n=15)
b.plot_board()
The board is aware of the layout of the territories:
for territory_id, player_id, armies in b.neighbors(territory_id=0):
    print('Territory', territory_id, 'owned by player', player_id, 'is occupied by', armies, 'armies')
And can handle attack moves:
b.attack(from_territory=0, to_territory=21, n_armies=3)
b.plot_board()
We can get all available missions using the missions function:
from missions import missions
all_missions = missions(n_players=4)
for m in all_missions:
    print(m.description)
Each mission is aware of the player it is assigned to:
mission = all_missions[0]
mission
mission.assign_to(player_id=0)
mission
...and can evaluate whether the mission has been achieved yet:
mission.evaluate(board=b)
There is a special case when a player's mission is to eliminate himself:
mission = all_missions[-1]
mission
mission.assign_to(player_id=3)
mission
A player object is required to have four methods:
- reinforce()
- attack()
- fortify()
- turn_in_cards()
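A minimal stub showing this interface might look as follows. Note that the class name, method signatures, and return conventions here are assumptions for illustration; the real definitions live in the framework's player.py.

```python
class DoNothingPlayer(object):
    """Hypothetical sketch of the player interface: a player that always passes."""

    def reinforce(self, board):
        # Decide where to place a reinforcement army; None means "pass".
        return None

    def attack(self, board):
        # Return an attack move, or None to end the attack phase.
        return None

    def fortify(self, board):
        # Return a fortification move, or None to skip fortifying.
        return None

    def turn_in_cards(self, cards):
        # Return the set of cards to turn in, or None to keep them.
        return None
```

Any object with these four methods can be dropped into a game, which is what makes it easy to experiment with different strategies.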
Let's go through a whole game.
We'll use four RandomPlayer instances, which take a random decision at every step of their turn.
random.seed(42)
import game, player
risk_game = game.Game.create([player.RandomPlayer() for i in range(4)])
risk_game.plot()
Now the players may place armies until they each have 30 armies:
risk_game.initialize_single_army()
risk_game.plot()
Calling initialize_armies() will have them place all armies:
risk_game.initialize_armies()
risk_game.plot()
Now the first player may play his turn.
risk_game.reinforce(risk_game.current_player)
risk_game.plot()
Then the attack phase:
risk_game.attack(risk_game.current_player)
risk_game.attack(risk_game.current_player)
risk_game.plot()
And finally the fortification phase:
risk_game.fortify(risk_game.current_player)
risk_game.next_turn()
risk_game.plot()
We can do this more quickly by using the play_turn() method:
risk_game.play_turn()
risk_game.plot()
Now let's fast-forward 10 turns:
for i in range(10):
    risk_game.play_turn()
risk_game.plot()
And to the end of the game:
while not risk_game.has_ended():
    risk_game.play_turn()
risk_game.plot()
A machine learning algorithm based on evolution and natural selection.
Imagine we are trying to solve a puzzle. The solution of the puzzle is a string of 16 bits, e.g. 0110 0010 1101 0001. We can evaluate a candidate solution using a function that returns the number of correct bits. For example, 0110 1101 1000 1010 would yield 7.
import random

def random_solution():
    return ' '.join([
        ''.join(['0' if random.random() < 0.5 else '1' for i in range(4)])
        for i in range(4)
    ])

def compare_solution(y):
    x = '0110 0010 1101 0001'
    return sum((
        1 if u == v and u != ' ' else 0
        for u, v in zip(x, y)
    ))

for i in range(6):
    rs = random_solution()
    print(rs, compare_solution(rs))
We can randomly generate a few solutions:
1110 1010 0010 0101 $\rightarrow$ 9
0101 1100 1001 0011 $\rightarrow$ 9
1101 1111 0111 1111 $\rightarrow$ 5
0110 0101 0010 1010 $\rightarrow$ 6
To get further we can combine the best solutions (the parents) by splitting them up and pasting them together to form children.
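This combination step can be sketched as a single-point crossover on the bit strings. The helper below is illustrative, not the framework's implementation:

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: the head of one parent glued to the tail of the other.

    Works on 16-bit strings formatted like '0110 0010 1101 0001'.
    """
    a = parent_a.replace(' ', '')
    b = parent_b.replace(' ', '')
    point = random.randint(1, len(a) - 1)  # cut somewhere strictly inside the string
    child = a[:point] + b[point:]
    # Re-insert the spaces between nibbles for readability.
    return ' '.join(child[i:i + 4] for i in range(0, len(child), 4))

random.seed(0)
child = crossover('1110 1010 0010 0101', '0101 1100 1001 0011')
```

Every bit of the child comes from one of its two parents, so good partial solutions can be recombined into better ones.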
Another way to improve a solution is to randomly mutate one (or more) bits:
1110 1010 1001 0011 $\rightarrow$ 12

becomes, with one bit flipped (shown in bold):

**1**010 1010 1001 0011 $\rightarrow$ 11
1110 1010 1**1**01 0011 $\rightarrow$ 13
1110 1010 1001 0**1**11 $\rightarrow$ 11
1110 1010 1001 00**0**1 $\rightarrow$ 13

We can keep combining and mutating until we have found a satisfactory solution.
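Putting the pieces together, a minimal genetic algorithm for this puzzle might look like the sketch below. The function names and parameters (population size, mutation rate, selection scheme) are my own choices for illustration, not those of the framework:

```python
import random

TARGET = '0110001011010001'  # the 16-bit solution, spaces dropped

def fitness(candidate):
    """Number of bits that match the target."""
    return sum(u == v for u, v in zip(candidate, TARGET))

def mutate(candidate, rate=1.0 / 16):
    """Flip each bit independently with probability `rate`."""
    return ''.join(
        ('1' if bit == '0' else '0') if random.random() < rate else bit
        for bit in candidate
    )

def crossover(a, b):
    """Single-point crossover of two bit strings."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def evolve(pop_size=20, generations=100):
    # Start from a random population of bit strings.
    population = [
        ''.join(random.choice('01') for _ in range(16))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the best half as parents (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        if fitness(parents[0]) == 16:
            break  # perfect solution found
        # Combine and mutate to refill the population.
        children = [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

random.seed(42)
best = evolve()
```

Because the best individuals survive each generation unchanged, the top fitness never decreases, and crossover plus mutation supplies the variation needed to climb toward the target.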
Suppose we have four Risk players: [p1, p2, p3, p4]
Which is the best?
We could have them play a game:
game(p1, p2, p3, p4)
$\rightarrow$ p3
And a few more:
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p4
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p1, p2, p3, p4)
$\rightarrow$ p2
game(p1, p2, p3, p4)
$\rightarrow$ p2
What if we have 8 players?
[p1, p2, ..., p7, p8]
game(p1, p2, p3, p4)
$\rightarrow$ p1
game(p5, p6, p7, p8)
$\rightarrow$ p7
Is p1 better than p5?
We could play games with all combinations:
game(p1, p2, p3, p4)
game(p1, p2, p3, p5)
...
game(p5, p6, p7, p8)
That is $\binom{8}{4} = 70$ games.
Now what if we have 100 players?
[p1, p2, ..., p99, p100]
Playing a single game in every combination would require millions of games.
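These counts are just binomial coefficients: every unordered group of 4 players out of n plays one game, which we can check with math.comb:

```python
from math import comb

# Each game seats 4 of the n players, so the number of distinct
# line-ups is "n choose 4".
games_for_8 = comb(8, 4)      # 70 games for 8 players
games_for_100 = comb(100, 4)  # 3,921,225 games for 100 players
print(games_for_8, games_for_100)
```

Nearly four million games per full round-robin is clearly infeasible, which is why we need a rating system instead.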
TrueSkill is a Bayesian ranking algorithm developed by Microsoft Research
source: Vincent Warmerdam - koaning.io
An implementation is readily available:
> pip install trueskill
import trueskill
Every new player gets a default score of 25:
a = trueskill.Rating()
b = trueskill.Rating()
print(a)
print(b)
After a game we can calculate the new ratings:
print('Before:', [(a, ), (b, )])
print('After: ', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
(a, ), (b, ) = trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2])
print('Before:', [(a, ), (b, )])
print('A wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[1, 2]))
print('B wins:', trueskill.rate(rating_groups=[[a], [b]], ranks=[2, 1]))
TrueSkill also handles multiple players in a tie:
a = trueskill.Rating()
b = trueskill.Rating()
c = trueskill.Rating()
d = trueskill.Rating()
trueskill.rate(rating_groups=[[a], [b, c, d]], ranks=[1, 2])
As said before, a player needs to implement:
- reinforce()
- attack()
- fortify()
- turn_in_cards()
Let's define a rank for each candidate territory as a weighted sum of its features: $\text{rank} = w_\text{territory\_ratio} \cdot \text{territory\_ratio} + w_\text{mission} \cdot \text{mission}$.
Let's say $w_\text{territory\_ratio} = 1$ and $w_\text{mission} = 2$; then
$$\begin{array}{cccc} \textbf{territory} & \text{territory\_ratio} & \text{mission} & \textbf{rank} \\ 10 & 0.0 & 1 & 2.0 \\ 18 & 0.6 & 1 & 2.6 \\ 24 & 0.0 & 1 & 2.0 \\ 33 & 0.5 & 0 & 0.5 \\ 40 & 0.3 & 0 & 0.3 \\ 42 & 0.2 & 1 & 2.2 \end{array}$$
So we pick territory 18.
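Assuming the rank is the weighted sum described above, the table can be reproduced in a few lines (the variable names are mine, for illustration):

```python
W_TERRITORY_RATIO = 1
W_MISSION = 2

candidates = {
    # territory_id: (territory_ratio, mission)
    10: (0.0, 1),
    18: (0.6, 1),
    24: (0.0, 1),
    33: (0.5, 0),
    40: (0.3, 0),
    42: (0.2, 1),
}

def rank(territory_ratio, mission):
    # Weighted sum of the two features.
    return W_TERRITORY_RATIO * territory_ratio + W_MISSION * mission

best = max(candidates, key=lambda t: rank(*candidates[t]))
# best == 18, matching the table
```

Changing the weights changes which territory wins, and that is exactly the knob the genetic algorithm gets to turn.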
Using a GA!
Conclusion: territory ratio is ~15 times more important than the mission!
So why ~15? What is it normalized to?
- Played Risk in Python
- Built a genetic representation of a Risk player
- Used a genetic algorithm to find better players
Watch our blog: blog.godatadriven.com