Proof of the minimax theorem

Author: Lucy McCoy
Lecture notes for the course 2D1441 Seminars on Theoretical Computer Science, taken on the seminar of April 5th, 2005 by Oscar Göthberg, d00-ogo@nada.kth.se.

Lemma: The theorem of separating hyperplanes

• Let $M$ be a convex, closed set in $\mathbb{R}^n$.
• Let $\bar{x}$ be a point with $\bar{x} \notin M$.

Then there exist a vector $\bar{c}$ and a number $k$ so that $\bar{c} \cdot \bar{x} = k$ and $\bar{c} \cdot \bar{y} > k$ for all $\bar{y} \in M$.

Given an $m \times n$ game matrix $A$:

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & \cdots & \cdots & \vdots \\
\vdots &        &        & \vdots \\
a_{m1} & \cdots & \cdots & a_{mn}
\end{pmatrix}
\]

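Where:

• Player I's objective: find $\bar{x}$ so that all components of $\bar{x}A$ are $\geq V_{\min}$.
• Player II's objective: find $\bar{y}$ so that all components of $A\bar{y}^{T}$ are $\leq V_{\max}$.

The minimax theorem states that $V_{\min} = V_{\max}$. To prove its correctness, we will first show that $V_{\min} < 0 < V_{\max}$ is impossible. Let $A_0$ be the matrix $A$ extended with the $m \times m$ identity matrix, so that it has $n+m$ columns:

\[
A_0 = \underbrace{\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & 1 & 0 & \cdots & 0 \\
a_{21} & \cdots & \cdots & \vdots & 0 & 1 & \cdots & 0 \\
\vdots &        &        & \vdots & \vdots & & \ddots & \vdots \\
a_{m1} & \cdots & \cdots & a_{mn} & 0 & \cdots & \cdots & 1
\end{pmatrix}}_{n+m}
\]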

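Now, let $B$ be the convex hull of the columns of $A_0$. We have two cases:

1. $\bar{0} \in B$.
2. $\bar{0} \notin B$.

1) $\bar{0} \in B$ (where $\bar{0} \in \mathbb{R}^m$)

Then there are $z_1, z_2, \ldots, z_n, z_{n+1}, \ldots, z_{n+m}$ so that $0 \leq z_i \leq 1$, $\sum_i z_i = 1$ and

\[
\sum_{j=1}^{n} a_{ij} z_j + \sum_{k=1}^{m} \delta_{ik} z_{n+k} = 0 \quad \text{for all } i,
\]

which gives us $\sum_{j=1}^{n} a_{ij} z_j \leq 0$ for all $i$. It is impossible for $z_j = 0$ to be true for all $j \leq n$ (the equation would then force $z_{n+i} = 0$ for all $i$ as well, contradicting $\sum_i z_i = 1$), and thus $\sum_{j \leq n} z_j > 0$.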

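Now, let $y_j = z_j / \sum_{l \leq n} z_l$, $1 \leq j \leq n$. Then $\sum_{j=1}^{n} a_{ij} y_j \leq 0$ for all $i$, and $\bar{y}$ is a mixed strategy. We can conclude that $V_{\max} \leq 0$.

2) $\bar{0} \notin B$

Then, by the lemma, there are $c_1, c_2, \ldots, c_m, k$ so that $\bar{c} \cdot \bar{0} = k$ (that is, $k = 0$) and $\bar{c} \cdot \bar{a} > 0$ for every column $\bar{a}$ of $A_0$. The identity columns give us $c_i > 0$ for all $i$, and the columns of $A$ give $\sum_{i=1}^{m} c_i a_{ij} > 0$ for $1 \leq j \leq n$. Now,

\[
x_i = \frac{c_i}{\sum_{l} c_l} \;\Rightarrow\; \sum_{i=1}^{m} x_i a_{ij} > 0 \quad \text{for } 1 \leq j \leq n,
\]

which shows that $V_{\min} > 0$.

We have shown that $V_{\min} < 0 < V_{\max}$ is impossible. Now we can use this result to show that $V_{\min} < t < V_{\max}$ is impossible for all $t$. Given a game matrix $A$, create $A'$, equal to $A$ with $t$ subtracted from every component. If $M(\bar{x}, \bar{y}) = \bar{x}A\bar{y}^{T}$ and $M'(\bar{x}, \bar{y}) = \bar{x}A'\bar{y}^{T}$, then $M'(\bar{x}, \bar{y}) = M(\bar{x}, \bar{y}) - t$, and $V_{\min} < t < V_{\max} \Rightarrow V'_{\min} < 0 < V'_{\max}$, which is impossible.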
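The shift step at the end of the proof can be checked numerically. The following is a minimal sketch, assuming a hypothetical 2 × 2 game matrix and hypothetical mixed strategies (none of the numbers come from the lecture); the helper names `payoff` and `shift` are illustrative. Exact fractions are used so that the identity $M' = M - t$ holds without rounding error.

```python
from fractions import Fraction as F

def payoff(x, A, y):
    """Expected payoff M(x, y) = x A y^T for mixed strategies x and y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

def shift(A, t):
    """Return A', equal to A with t subtracted from every component."""
    return [[a - t for a in row] for row in A]

# Hypothetical example data (not from the notes).
A = [[F(3), F(-1)],
     [F(0), F(2)]]
x = [F(1, 3), F(2, 3)]   # mixed strategy for player I, components sum to 1
y = [F(1, 4), F(3, 4)]   # mixed strategy for player II, components sum to 1
t = F(5, 7)

M = payoff(x, A, y)
M_shifted = payoff(x, shift(A, t), y)
print(M, M_shifted, M - t)
```

The identity relies only on the components of $\bar{x}$ and of $\bar{y}$ summing to 1, which is why it holds for every pair of mixed strategies and therefore carries over to $V_{\min}$ and $V_{\max}$.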

Structure theorem

Let $\bar{x}^*, \bar{y}^*$ be optimal strategies. Pure strategies $i$ with $x_i^* > 0$ and $j$ with $y_j^* > 0$ are called active. Let $k_i = \sum_j a_{ij} y_j^*$ and $l_j = \sum_i a_{ij} x_i^*$. Now:

• if $i$ is active, $k_i = v$.
• if $j$ is active, $l_j = v$.

...where $v$ is the payoff of the game.

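Proof

We have $M(\bar{x}^*, \bar{y}^*) = v$, $v = v \sum_i x_i^*$ and $v = \sum_i k_i x_i^*$. This implies that $\sum_i (v - k_i) x_i^* = 0$.

Since $v - k_i \geq 0$ and $x_i^* \geq 0$, every term of the sum is non-negative, so we get that $v - k_i = 0$ for all $i$ with $x_i^* > 0$.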
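As a sanity check of the structure theorem, the sketch below evaluates $k_i$ and $l_j$ for a small game. The 3 × 2 matrix and its optimal strategies are a hypothetical example chosen for illustration (they are not from the lecture); the third row is deliberately a strategy that player I never uses.

```python
from fractions import Fraction as F

# Hypothetical game matrix; row 3 is inactive for player I.
A = [[F(2), F(-1)],
     [F(-1), F(1)],
     [F(0), F(0)]]
x_opt = [F(2, 5), F(3, 5), F(0)]   # assumed optimal strategy for player I
y_opt = [F(2, 5), F(3, 5)]         # assumed optimal strategy for player II

rows, cols = len(A), len(A[0])
# k_i: payoff of pure row i against y_opt; l_j: payoff of x_opt against pure column j.
k = [sum(A[i][j] * y_opt[j] for j in range(cols)) for i in range(rows)]
l = [sum(A[i][j] * x_opt[i] for i in range(rows)) for j in range(cols)]
v = sum(x_opt[i] * k[i] for i in range(rows))   # v = M(x_opt, y_opt)

for i in range(rows):
    print("row", i, "active" if x_opt[i] > 0 else "inactive", "k =", k[i])
print("l =", l, "v =", v)
```

In this example the two active rows satisfy $k_i = v$, while the inactive row has $k_i < v$, so the theorem says nothing about it.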

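Special case

We have a special case if $A$ is an $n \times n$ matrix and all strategies are active. Assume $A$ is invertible, and let:

\[
\bar{J} = (1, 1, \ldots, 1), \qquad
\bar{J}^{T} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\]

Player II is looking for a mixed strategy $\bar{y}$ so that:

\[
A\bar{y}^{T} = v\bar{J}^{T} \;\Rightarrow\; \bar{y}^{T} = vA^{-1}\bar{J}^{T}
\]

We must have $\bar{J}\bar{y}^{T} = 1$, which involves the sum of the components of the matrix $A^{-1}$:

\[
v\bar{J}A^{-1}\bar{J}^{T} = 1 \;\Rightarrow\; v = \frac{1}{\bar{J}A^{-1}\bar{J}^{T}}
\]

The resulting mixed strategies are:

\[
\bar{y}^{T} = \frac{A^{-1}\bar{J}^{T}}{\bar{J}A^{-1}\bar{J}^{T}}, \qquad
\bar{x} = \frac{\bar{J}A^{-1}}{\bar{J}A^{-1}\bar{J}^{T}}
\]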

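If A is not invertible

Assume that $A$ is not necessarily invertible. We define $A^*$ (the adjugate of $A$) by the relation $AA^* = |A| \cdot I$ (if $|A| \neq 0$, use $A^{-1} = A^*/|A|$). Assume that $\bar{J}A^*\bar{J}^{T} \neq 0$. We can then show that

\[
v = \frac{|A|}{\bar{J}A^*\bar{J}^{T}}
\]

and that:

\[
\bar{y}^{T} = \frac{A^*\bar{J}^{T}}{\bar{J}A^*\bar{J}^{T}}, \qquad
\bar{x} = \frac{\bar{J}A^*}{\bar{J}A^*\bar{J}^{T}}
\]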

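The other way around

Assume that we have an $n \times n$ matrix $A$, and assume that $\bar{J}A^*\bar{J}^{T} \neq 0$ and that

\[
\frac{\bar{J}A^*}{\bar{J}A^*\bar{J}^{T}}, \qquad \frac{A^*\bar{J}^{T}}{\bar{J}A^*\bar{J}^{T}}
\]

are vectors with all components $> 0$. Then there exist optimal mixed strategies:

\[
\bar{y}^{T} = \frac{A^*\bar{J}^{T}}{\bar{J}A^*\bar{J}^{T}}, \qquad
\bar{x} = \frac{\bar{J}A^*}{\bar{J}A^*\bar{J}^{T}}
\]

and the game's payoff is:

\[
v = \frac{|A|}{\bar{J}A^*\bar{J}^{T}}
\]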
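The formulas above can be turned into a small solver. The following is a sketch only, assuming $A^*$ is computed as the adjugate by cofactor expansion (fine for small matrices, far too slow for large ones); the function names `minor`, `det`, `adjugate` and `solve_square_game` are illustrative, and the example matrix is hypothetical. The function refuses to answer when the positivity assumptions of this section fail.

```python
from fractions import Fraction as F

def minor(A, i, j):
    """A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """Adjugate A*: entry (i, j) is the cofactor of entry (j, i) of A (n >= 2)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def solve_square_game(A):
    """Return (v, x, y) from the adjugate formulas, or raise ValueError."""
    n = len(A)
    adj = adjugate(A)
    total = sum(sum(row) for row in adj)                 # J A* J^T
    if total == 0:
        raise ValueError("J A* J^T = 0, the formulas do not apply")
    x = [sum(adj[i][j] for i in range(n)) / total for j in range(n)]  # J A* / total
    y = [sum(adj[i][j] for j in range(n)) / total for i in range(n)]  # A* J^T / total
    if any(c <= 0 for c in x + y):
        raise ValueError("some component is not > 0, the formulas do not apply")
    return det(A) / total, x, y

# Hypothetical singular matrix: |A| = 0, so A has no inverse.
A = [[F(1), F(-1)],
     [F(-1), F(1)]]
v, x, y = solve_square_game(A)
print(v, x, y)
```

For this singular matrix the adjugate route still gives an answer (value 0, both players mixing evenly), which is the point of replacing $A^{-1}$ by $A^*$.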

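Example

We have a game matrix:

\[
A = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}, \qquad \alpha > 0,\ \beta > 0
\]

This kind of game is usually called "attacking a hidden object", where $\alpha$ is the probability that the object is destroyed by a successful attack (and $\beta$ likewise for the other hiding place). Where should the object be hidden?

\[
A^* = \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}, \qquad
\bar{J}A^*\bar{J}^{T} = \alpha + \beta, \qquad
|A| = \alpha\beta
\]

And the game's payoff is half the harmonic mean of $\alpha$ and $\beta$:

\[
v = \frac{\alpha\beta}{\alpha + \beta}, \qquad
\bar{x} = \left( \frac{\beta}{\alpha + \beta},\ \frac{\alpha}{\alpha + \beta} \right), \qquad
\bar{y}^{T} = \begin{pmatrix} \dfrac{\beta}{\alpha + \beta} \\[2ex] \dfrac{\alpha}{\alpha + \beta} \end{pmatrix}
\]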
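To make the example concrete, the sketch below plugs in numbers and checks that the closed-form strategies equalize the payoff. The values $\alpha = 1/2$ and $\beta = 1/3$ are hypothetical, and the function name `hidden_object` is illustrative.

```python
from fractions import Fraction as F

def hidden_object(alpha, beta):
    """Closed-form solution of the game A = diag(alpha, beta), alpha, beta > 0."""
    s = alpha + beta                  # J A* J^T
    v = alpha * beta / s              # |A| / (J A* J^T)
    x = [beta / s, alpha / s]         # J A* / (J A* J^T)
    y = [beta / s, alpha / s]         # A* J^T / (J A* J^T)
    return v, x, y

alpha, beta = F(1, 2), F(1, 3)        # hypothetical destruction probabilities
v, x, y = hidden_object(alpha, beta)

# Because A is diagonal, x A and A y^T reduce to componentwise products.
xA = [x[0] * alpha, x[1] * beta]
Ay = [alpha * y[0], beta * y[1]]
print(v, x, y, xA, Ay)
```

With these numbers both strategies put weight 3/5 on the second place, where an attack is less likely to succeed, and every component of $\bar{x}A$ and $A\bar{y}^{T}$ equals $v$.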



Extensive form

In the extensive form, a game is modelled as a tree, with states as nodes and the possible moves as edges. The leaves of the tree represent the possible payoffs. See the example in figure 1.

Figure 1: Example of a game in the extensive form. The root is I's position; I's moves lead to II's positions, and II's moves lead to the end positions V1, V2, ..., Vk.

Figure 2: Example of chance in the extensive form

Elements of chance do not break the extensive model; chance can be included as a "third player", see figure 2. If the game lacks elements of chance, and the payoff is either 1 (win) or 0 (loss), we have a purely combinatorial game.

The extensive form is not useful for modelling all kinds of games, however. Examples of games that can't be modelled completely using the extensive form:

• games where the players pick their moves simultaneously,
• games where the same position can occur an infinite number of times.
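One way to see how chance fits into the tree model is to evaluate a small tree bottom-up. The sketch below is an illustration under assumptions that the notes do not state: every player sees the full position (no information sets), player I maximizes the payoff, player II minimizes it, and a chance node S averages its children by probability. The tuple encoding and the example tree are hypothetical.

```python
from fractions import Fraction as F

def value(node):
    """Bottom-up value of a game tree, seen from player I."""
    kind = node[0]
    if kind == "leaf":
        return node[1]                              # payoff for player I
    if kind == "S":                                 # chance: (probability, child) pairs
        return sum(p * value(child) for p, child in node[1])
    child_values = [value(child) for child in node[1]]
    # Assumption: I maximizes the payoff, II minimizes it.
    return max(child_values) if kind == "I" else min(child_values)

# Hypothetical tree: I chooses between a II-node and a chance node.
tree = ("I", [
    ("II", [("leaf", F(3)), ("leaf", F(1))]),
    ("S", [(F(1, 2), ("leaf", F(4))), (F(1, 2), ("leaf", F(0)))]),
])
print(value(tree))
```

Here the II-node is worth 1 and the chance node is worth 2 on average, so player I prefers the move leading to the chance node.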

Information sets

With the addition of information sets, we can use the extensive form to model games where the players do not always have complete information. Let's say we have a game where the two players I and II pick a number 0 or 1 independently of each other. If they pick the same number, I wins; if they pick different numbers, II wins. Due to the tree structure used in the extensive form, one of the players has to pick first, giving the other player the freedom of picking just the right move to win. But with the introduction of information sets, we can specify that a number of game states are to be viewed as one and the same state from the active player's point of view. See the example in figure 3, where the two states in the cloud are the same as far as player II knows.

Figure 3: Example of an information set

Complete game model

A game is modelled as a finite tree Γ, where:

• every node is labeled I, II, S, or is a leaf.
• no child has the same label as its parent.
• every leaf is labeled with a number (the payoff for player I), or, for non-zero-sum games, a vector of the payoffs for all players.
• the I-nodes are partitioned into information sets.
• there is no downward path from an information set into itself.
• all nodes in the same information set have the same degree $k$. Every such node has its edges labeled with the $k$ symbols $s_1, s_2, \ldots, s_k$.
• the same applies to the II-nodes.
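The conditions in the list above can be checked mechanically. The following is a rough sketch, assuming a hypothetical dictionary encoding of the tree (keys `label`, `info_set`, `children`, `payoff` are illustrative, not from the notes). It checks the parent/child label rule, the rule that no downward path re-enters an information set, and that nodes in one information set carry the same edge symbols. The example encodes the number-picking game from the section on information sets, with payoff 1 when player I wins and 0 otherwise.

```python
def check_tree(node, parent_label=None, path=frozenset(), moves_of=None):
    """Validate a game tree; return a dict: information set -> edge symbols."""
    if moves_of is None:
        moves_of = {}
    label = node["label"]
    if label == "leaf":
        return moves_of
    if label == parent_label:
        raise ValueError("child has the same label as its parent")
    children = node["children"]                 # dict: edge symbol -> child node
    if label in ("I", "II"):
        key = (label, node["info_set"])
        if key in path:
            raise ValueError("downward path from an information set into itself")
        moves = tuple(sorted(children))
        # The first node seen in a set fixes its symbols; later nodes must match.
        if moves_of.setdefault(key, moves) != moves:
            raise ValueError("nodes in one information set have different moves")
        path = path | {key}
    for child in children.values():
        check_tree(child, label, path, moves_of)
    return moves_of

def leaf(payoff):
    return {"label": "leaf", "payoff": payoff}

# I picks a, II picks b without seeing a: both II-nodes share one information set.
tree = {"label": "I", "info_set": "root", "children": {
    str(a): {"label": "II", "info_set": "cloud", "children": {
        str(b): leaf(1 if a == b else 0) for b in (0, 1)}}
    for a in (0, 1)}}

print(check_tree(tree))
```

Keying the sets by `(label, info_set)` keeps the partitions of the I-nodes and the II-nodes separate, matching the last item of the list.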
