Scaling Up Security Games: Algorithms and Applications


Ph.D. Dissertation Proposal submitted by

Manish Jain

April 2012

Guidance Committee

Milind Tambe (Chairperson)
Vincent Conitzer
Fernando Ordonez
Bhaskar Krishnamachari
Mathew McCubbins (Outside Member)

Abstract

Protecting critical infrastructure and targets such as airports, historical landmarks, power generation facilities, and political figures is an important task for police and security agencies worldwide. Securing such potential targets using limited resources against intelligent adversaries, in the presence of the uncertainty and complexities of the real world, is a major challenge. My research uses a game-theoretic framework to model the strategic interaction between a defender (or security forces) and an attacker (or terrorist adversary) in security domains. The Bayesian Stackelberg game framework has been successfully used to model such security domains. However, existing methods cannot scale to realistic problem sizes (up to billions of action combinations), even in the absence of uncertainty. My thesis presents new models and algorithms that scale up to realistic problem sizes: (i) they can solve for billions of actions for the defender, (ii) they can solve for billions of actions for the attacker, and (iii) they provide orders of magnitude scale-up in attacker types for Bayesian Stackelberg games. These new models have not only advanced the state of the art in computational game theory, but have also been successfully deployed in the real world. For instance, I led the development of the algorithm ASPEN, which provides scale-up to billions of defender actions and has seen practical use through the IRIS system. IRIS has been in use by the Federal Air Marshal Service for scheduling officers on a section of international flights since October 2009. My thesis contributes to a very new area that solves game-theoretic problems using insights from the large-scale optimization literature. It represents a successful transition from game-theoretic advancements to real-world applications that are already in use, and it has opened exciting new avenues to greatly expand the reach of game theory.

In the future, I would like to generalize the insights from this work and provide algorithms for more complex security domains. Specifically, I would like to provide new algorithms that can scale up in domains where (i) multiple levels of security are deployed by the defender, and (ii) multiple attackers coordinate to conduct a single attack. Modeling these settings with current algorithms requires an exponentially large game representation, and thus novel algorithms are required for practical use. Finally, I would like to integrate insights from my scalable techniques with other work in robust optimization and build towards a unified scalable robust algorithm.


Chapter 1 Introduction

Deploying limited security resources to protect targets such as airlines and airports, economic centers and historical landmarks from the threat of international terrorism is a growing challenge. For example, in 2001, the 9/11 attack on the World Trade Center in New York City via commercial airliners resulted in $27.2 billion of direct short term costs (Looney, 2002) as well as a loss of 2,974 lives. The 2004 Madrid commuter train bombings resulted in 191 lives lost, 1755 wounded, and an estimated cost of 212 million Euros (Blanco, Valino, Heijs, Baumert, & Gomez, 2007). Finally, the 2008 terrorist attacks in Mumbai resulted in 195 lives lost and nearly 300 wounded (Chandran & Beitchman, 2008). Measures for protecting potential target areas include monitoring entrances and inbound roads, checking inbound traffic and patrolling aboard transportation vehicles. Bayesian Stackelberg games have been successfully used to model the security resource allocation problem (Paruchuri, Pearce, Marecki, Tambe, Ordóñez, & Kraus, 2008), since they can efficiently model strategic interaction between a defender and an attacker. However, beyond the initial application at LAX, the newer real-world application domains make it challenging to apply existing techniques for Bayesian Stackelberg games, and thus novel techniques to solve large games are required. For example, the Federal Air Marshal Service (FAMS) schedules armed officers on board passenger aircraft. The enormity of the challenge faced by the FAMS can be revealed by a small example: an instance with 100 flights and 10 officers would have more than a billion possible assignments of air marshals to flights; in reality, there are an estimated 3,000–4,000 officers and about 30,000 flights (Keteyian, 2010). Additionally, the FAMS also has to deal with spatiotemporal and logistical constraints inherent in airline schedules, making the problem even harder.
Another example domain for the security resource allocation problem is protecting targets in a city. In response to the attacks in 2008, the Mumbai police have started to schedule a limited number of inspection checkpoints on the road network throughout the city. Since the police could schedule any combination of checkpoints on the roads, they have exponentially many choices. Similarly, the attacker has exponentially many choices: a path from any source to any target is a feasible attacker strategy. Similar problems are faced in other security domains as well, especially when there are multiple defender resources to be scheduled.

Additionally, uncertainty in the real world adds to the complexity of these domains. For example, the police may be facing either a well-funded hard-line terrorist or criminals from local gangs. These two groups may have entirely different preferences, and the police may not know which attacker group they will be facing on any given day. These different attacker preferences are modeled as different attacker types using a Bayesian Stackelberg game. Again, the computational complexity of Bayesian Stackelberg games increases exponentially with the number of types, which also makes it challenging to apply the existing techniques.

I have provided algorithms especially designed to scale up to large real-world domains. These algorithms are built on the following insights: (i) Real-world domains have exponentially many pure strategies for the defender (e.g. a combination of checkpoints), and so an incremental approach is required. This avoids enumerating all the pure strategies, and only adds a pure strategy if it would help increase the defender's payoff. (ii) In domains with exponentially many attacker pure strategies, an incremental approach to generating pure strategies for the attacker (e.g. attack paths) should be used to avoid enumerating the pure strategy set of the attacker. (iii) A Bayesian Stackelberg game can be decomposed into hierarchically-organized smaller games, each with fewer attacker types, providing heuristics which can be used to eliminate the never-best-response (that is, dominated) pure strategies of the attacker.
These insights provide speed-ups by reducing the size of the game: while insights (i) and (ii) restrict the game size by efficiently generating sub-games that include a pure strategy only if it improves the player’s payoff, insight (iii) pre-processes the input Bayesian Stackelberg game instance and removes the attacker pure strategies that cannot be part of the optimal solution. Additionally, all these techniques provide mathematical guarantees and can generate optimal as well as approximate solutions efficiently. I have developed different algorithms using combinations of these insights. For example, the ASPEN algorithm mentioned earlier does strategy generation for the defender, using insight (i) (Jain, Kardes, Kiekintveld, Ordóñez, & Tambe, 2010). Similarly, RUGGED does strategy generation for both the defender and the attacker, using both insights (i) and (ii) (Jain, Korzhyk, Vanek, Conitzer, Pechoucek, & Tambe, 2011b). HBGS (Jain, Kiekintveld, & Tambe, 2011a) is provided to efficiently solve Bayesian Stackelberg games: it decomposes the problem into many hierarchically-organized smaller Bayesian games as suggested in insight (iii). Finally, HBSA (Jain et al., 2011a) combines insights (i) and (iii): it uses strategy generation for the defender and hierarchical decomposition to solve a Bayesian Stackelberg game with billions of defender pure strategies. These techniques overcome the significant limitation of previous solution methods (Conitzer & Sandholm, 2006; Paruchuri et al., 2008) that cannot scale up to the real-world domains, even in the absence of uncertainty.


My research has been applied and deployed in the real world. Indeed, ASPEN forms the core of IRIS, the scheduling assistant in use by the Federal Air Marshal Service since October 2009 to schedule air marshals on board international commercial flights (Jain, Tsai, Pita, Kiekintveld, Rathi, Tambe, & Ordóñez, 2010). My research has been successfully transitioned from theory to the real world, and has opened doors for further large-scale game-theoretic deployments for security in the real world. In the future, I propose to develop new scalable techniques for more complex security domains: specifically, domains with multiple levels of security and multiple attackers coordinating to conduct a single attack. This dissertation proposal is designed to highlight the contributions of my current research, and to outline the future directions that I intend to pursue. To keep this document short, I have avoided introducing notation and formal definitions; they are instead available in my publications (Kiekintveld, Jain, Tsai, Pita, Tambe, & Ordóñez, 2009; Jain et al., 2010, 2011a, 2011b).


Chapter 2 Background

Security problems are increasingly studied using Stackelberg games, since Stackelberg games can efficiently model the strategic interaction between a defender and an attacker. Stackelberg games were first introduced to model leadership and commitment (von Stackelberg, 1934), and are now widely used to study security problems ranging from the “police and robbers” scenario (Gatti, 2008) and computer network security (Lye & Wing, 2005) to missile defense systems (Brown, Carlyle, Kline, & Wood, 2005) and terrorism (Sandler & M., 2003). Arms inspections and border patrolling have also been modeled using inspection games (Avenhaus, von Stengel, & Zamir, 2002), a related family of Stackelberg games. The wide use of Stackelberg games has inspired theoretical and algorithmic progress leading to the development of fielded applications. Conitzer and Sandholm give complexity results and algorithms for computing optimal commitment strategies in Bayesian Stackelberg games, including both pure and mixed-strategy commitments (Conitzer & Sandholm, 2006). DOBSS (Paruchuri et al., 2008), an algorithm for solving Bayesian Stackelberg games, is central to a fielded application in use at the Los Angeles International Airport (Pita, Jain, Western, Portway, Tambe, Ordóñez, Kraus, & Paruchuri, 2008). The work in this thesis also builds on Stackelberg games for modeling security domains. I now describe security domains followed by security games, the model by which security domains are formulated in the Stackelberg framework.

2.1 Security Domains

In a security domain, a defender must perpetually defend a set of targets using a limited number of resources, whereas the attacker is able to surveil and learn the defender’s strategy and attacks after careful planning. This fits precisely into the description of a Stackelberg game if we map the defender to the leader’s role and the attackers to the follower’s role (Avenhaus et al., 2002; Brown, Carlyle, Salmeron, & Wood, 2006). An action, or pure strategy, for the defender represents deploying a set of resources on patrols or checkpoints, e.g. scheduling checkpoints at the LAX airport or assigning federal air marshals to protect flight tours. The pure strategy for an attacker represents an attack at a target, e.g. a flight. The strategy for the leader is a mixed strategy, a probability distribution over the pure strategies of the defender. Additionally, each target is associated with a set of payoff values that define the utility for both the defender and the attacker in case of a successful or a failed attack. These payoffs are represented using the security games model, described next.

2.2 Security Games

In a security domain, a set of four payoffs is associated with each target. These four payoffs are the reward and penalty to both the defender and the attacker in case of a successful or an unsuccessful attack, and are sufficient to define the utilities for both players for all possible outcomes in a security domain. Table 2.1 shows an example security game with two targets, t1 and t2 . In this example game, if the defender was covering (protecting) target t1 and the attacker attacked t1 , the defender would get 10 units of reward whereas the attacker would receive −1 units.

          Defender               Attacker
Target    Covered   Uncovered    Covered   Uncovered
t1        10        0            -1        1
t2        0         -10          -1        1

Table 2.1: Example security game with two targets.

Security games make the realistic assumption that it is always better for the defender to cover a target as compared to leaving it uncovered, whereas it is always better for the attacker to attack an uncovered target. Another crucial feature of security games is that the payoff of an outcome depends only on the target attacked, and whether or not it is covered by the defender (Kiekintveld et al., 2009). The payoffs do not depend on the remaining aspects of the defender allocation. For example, if an adversary succeeds in attacking target t1, the penalty for the defender is the same whether the defender was guarding target t2 or not. Therefore, from a payoff perspective, many resource allocations by the defender are identical. This is exploited during the computation of a defender strategy: only the coverage probability of each target is required to compute the utilities of the defender and the attacker.

The Bayesian extension to the Stackelberg game allows for multiple types of players, each associated with its own payoff values (Paruchuri et al., 2008; Jain et al., 2011a). Bayesian games are used to model uncertainty over the payoffs and preferences of the players; indeed, more uncertainty can be expressed with an increasing number of types. For the security games of interest, there is only one leader type (e.g. only one police force), although there can be multiple follower types (e.g. multiple attacker types trying to infiltrate security). Each follower type is represented using a different payoff matrix, as shown by an example with two attacker types in Table 2.2.

Attacker Type 1
          Defender           Attacker
Target    Cov.    Uncov.     Cov.    Uncov.
t1        10      0          -1      1
t2        0       -10        -1      1

Attacker Type 2
          Defender           Attacker
Target    Cov.    Uncov.     Cov.    Uncov.
t1        5       -4         -2      1
t2        4       -5         -1      2

Table 2.2: Example Bayesian security game with two targets and two attacker types.

The leader does not know the follower’s type, but knows the probability distribution over the types. The goal is to find the optimal mixed strategy for the leader to commit to, given that the defender could be facing any of the follower types.
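To make the role of coverage probabilities and attacker types concrete, the following minimal Python sketch evaluates a defender strategy on the payoffs of Table 2.2. It is an illustration only, not one of the algorithms discussed in this proposal: the coverage vector, the prior over types, and all function names are hypothetical choices, and ties are assumed to be broken in the defender's favor.

```python
from fractions import Fraction

# Payoffs copied from Table 2.2: (covered, uncovered) per target, per attacker type.
DEFENDER = {
    1: {"t1": (10, 0), "t2": (0, -10)},
    2: {"t1": (5, -4), "t2": (4, -5)},
}
ATTACKER = {
    1: {"t1": (-1, 1), "t2": (-1, 1)},
    2: {"t1": (-2, 1), "t2": (-1, 2)},
}


def expected(payoff, c):
    """Expected payoff at one target given its coverage probability c."""
    covered, uncovered = payoff
    return c * covered + (1 - c) * uncovered


def best_response(attacker_type, coverage):
    """Target attacked by one type; ties are broken in the defender's favor."""
    return max(
        coverage,
        key=lambda t: (
            expected(ATTACKER[attacker_type][t], coverage[t]),
            expected(DEFENDER[attacker_type][t], coverage[t]),
        ),
    )


def defender_utility(coverage, prior):
    """Defender's expected utility, averaged over the attacker types."""
    total = Fraction(0)
    for attacker_type, prob in prior.items():
        target = best_response(attacker_type, coverage)
        total += prob * expected(DEFENDER[attacker_type][target], coverage[target])
    return total


# Hypothetical inputs: one resource split evenly, both types equally likely.
coverage = {"t1": Fraction(1, 2), "t2": Fraction(1, 2)}
prior = {1: Fraction(1, 2), 2: Fraction(1, 2)}
print(best_response(1, coverage), best_response(2, coverage))
print(defender_utility(coverage, prior))
```

Only the per-target coverage enters the computation, which mirrors the observation above that many defender allocations are identical from a payoff perspective.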

2.3 Solution Concept: Strong Stackelberg Equilibrium

The solution to a security game is a mixed strategy for the defender that maximizes the expected utility of the defender, given that the attacker learns the mixed strategy of the defender and chooses a best-response for herself. This solution concept is known as a Stackelberg equilibrium (Leitmann, 1978). However, this work computes the strong form of the Stackelberg equilibrium (Breton, Alg, & Haurie, 1988), which assumes that the follower will always break ties in favor of the leader in cases of indifference. This is because a strong Stackelberg equilibrium (SSE) exists in all Stackelberg games, and additionally, the leader can always induce the favorable strong equilibrium by selecting a strategy arbitrarily close to the equilibrium that causes the follower to strictly prefer the desired strategy (von Stengel & Zamir, 2004). Indeed, SSE is the most commonly adopted concept in related literature (Osbourne & Rubinstein, 1994; Conitzer & Sandholm, 2006; Paruchuri et al., 2008). An SSE for security games is informally defined as follows (the formal definition of SSE is not introduced for brevity, and can instead be found in (Kiekintveld et al., 2009)):

Definition 1. A pair of strategies form a Strong Stackelberg Equilibrium (SSE) if they satisfy:
1. The defender plays a best-response, that is, the defender cannot get a higher payoff by choosing any other strategy.
2. The attacker plays a best-response, that is, given a defender strategy, the attacker cannot get a higher payoff by attacking any other target.
3. The attacker breaks ties in favor of the leader.
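The three conditions of Definition 1 can be checked by hand on the game of Table 2.1. The sketch below does so by brute force; it assumes a single defender resource (so the two coverage probabilities sum to 1) and searches a grid of coverage vectors, which is a hypothetical stand-in for illustration and not how the solvers discussed in this proposal compute an SSE.

```python
from fractions import Fraction

# Table 2.1 payoffs as (covered, uncovered) pairs.
DEFENDER = {"t1": (10, 0), "t2": (0, -10)}
ATTACKER = {"t1": (-1, 1), "t2": (-1, 1)}


def expected(payoff, c):
    """Expected payoff at one target given its coverage probability c."""
    covered, uncovered = payoff
    return c * covered + (1 - c) * uncovered


def attacked_target(coverage):
    """Attacker best response, with ties broken in the defender's favor (SSE)."""
    return max(
        coverage,
        key=lambda t: (expected(ATTACKER[t], coverage[t]),
                       expected(DEFENDER[t], coverage[t])),
    )


def grid_sse(steps=100):
    """Brute-force search over coverage vectors (c1, 1 - c1) on a grid."""
    best = None
    for k in range(steps + 1):
        c1 = Fraction(k, steps)
        coverage = {"t1": c1, "t2": 1 - c1}
        target = attacked_target(coverage)
        value = expected(DEFENDER[target], coverage[target])
        if best is None or value > best[0]:
            best = (value, coverage, target)
    return best


value, coverage, target = grid_sse()
print(value, coverage["t1"], coverage["t2"], target)
```

On this instance the search settles on equal coverage of both targets, where the attacker is indifferent and the tie-breaking rule sends the attack to t1, the target that is better for the defender.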


Chapter 3 Related Work

Game theory has been applied to a range of security problems, and different solution concepts and algorithms have been proposed. The related work on these game-theoretic approaches for security can be broadly divided into three categories: (i) algorithms for computing Stackelberg equilibria, (ii) algorithms to account for human behavior, and (iii) algorithms for computing robust defender strategies. The first category provides algorithms for computing strong Stackelberg equilibria for Bayesian Stackelberg games. The multiple-LPs approach solves many linear programs to compute the optimal defender strategy in Bayesian Stackelberg games (Conitzer & Sandholm, 2006), while DOBSS (Paruchuri et al., 2008) decomposes the Bayesian game into individual types and solves a mixed-integer linear program to compute the SSE strategy. Multiple-LPs, DOBSS and other previous work on Stackelberg games (Conitzer & Sandholm, 2006; Paruchuri et al., 2008; Basilico, Gatti, & Amigoni, 2009) have typically focused on a single defender, and face combinatorial explosion in the presence of multiple defender resources. In contrast, algorithms like ASPEN that employ strategy generation focus on games with large numbers of defenders, and can handle the combinatorial explosion in the defender’s strategy space. ORIGAMI and ERASER-C are algorithms that have been developed for larger and more complex security games (Kiekintveld et al., 2009). ORIGAMI is a polynomial time algorithm that computes optimal defender strategies for a security game when no scheduling constraints are present (an example of scheduling constraints is the logistical constraints faced by a federal air marshal). The ERASER-C mixed-integer linear program provides scale-ups by computing the defender coverage per target, instead of computing mixed strategies over a joint assignment for all defender resources.
Unfortunately, as the authors note, ERASER-C may fail to generate a correct solution in cases where the defender may face arbitrary scheduling constraints, as in the FAMS domain where a federal air marshal officer can fly multi-city tours with three or more flights. Thus, none of these algorithms are suitable for deployment in the real world.


Additionally, the general idea of attacker-defender games has led to a further family of games, like inspection games (Avenhaus et al., 2002), ambush games (Ruckle, Fennell, Holmes, & Fennemore, 1976) and hider-seeker games (Flood, 1972; Halvorson, Conitzer, & Parr, 2009). These games focus on different aspects of the domain: for example, ambush games consider a mobile attacker but fixed defenders, whereas hider-seeker games consider a mobile defender and a mobile attacker. However, these approaches need to be scaled up to be applied in domains with exponentially many defender and attacker strategies. The second category of related work that applies game theory to security domains computes defender strategies when facing human adversaries. This work takes into account the biases and bounded rationality of a human opponent. COBRA is one such algorithm that considers the anchoring bias of humans (Pita, Jain, Ordóñez, Tambe, Kraus, & Magori-cohen, 2009), as well as attacker indifference between pure strategies that are at most ε away from the optimal. Alternate solution techniques grounded in psychological concepts like Quantal Response Equilibrium (QRE) have also been proposed (Yang, Kiekintveld, Ordóñez, Tambe, & John, 2011). The focus of this line of work has been to develop algorithms that perform well against humans; indeed they have been evaluated against human subjects. The third category of related work aims at computing robust solutions. Kiekintveld et al. (Kiekintveld, Marecki, & Tambe, 2011) model distributions over preferences of an attacker using infinite Bayesian games, and propose an algorithm to generate approximate solutions for such games. The COBRA algorithm (Pita et al., 2009) mentioned before focuses on ε-rationality of the attacker, specifically for human subjects. In contrast, Yin et al. (Yin et al., 2010) consider the limiting case where an attacker has no observations and thus investigate the equivalence of Stackelberg vs Nash equilibria.
Even earlier investigations have emphasized the value of commitment to mixed strategies in Stackelberg games in the presence of noise (van Damme & Hurkens, 1997). Secrecy and deception have also been modeled for Stackelberg games (Zhuang & Bier, 2011). Outside of Stackelberg games, models for execution uncertainty in game-theory have been separately developed (Archibald & Shoham, 2009). Robust solution methods for simultaneous move games have also been studied (Aghassi & Bertsimas, 2006; Porter, Ronen, Shoham, & Tennenholtz, 2002).


Chapter 4 Contributions

Real-world problems, like the FAMS and urban road networks, present billions of pure strategies to both the defender and the attacker. Such large problem instances cannot even be represented in modern computers, let alone solved using previous techniques. I have provided new models and algorithms that compute optimal defender strategies for massive real-world security domains. In particular, my contributions are as follows: (i) ASPEN, an algorithm to compute strong Stackelberg equilibria (SSE) in domains with a very large number of pure strategies (up to billions of actions) for the defender (Jain et al., 2010); (ii) RUGGED, an algorithm to compute the optimal defender strategy in domains with a very large number of pure strategies for both the defender and the attacker (Jain et al., 2011b); and (iii) a new hierarchical framework for Bayesian games that is applicable to all Stackelberg solvers. I demonstrate its effectiveness for solving general Stackelberg games, and also in combination with the strategy generation of ASPEN (Jain et al., 2011a). Moreover, these algorithms have not only been experimentally validated, but ASPEN has also been deployed in the real world (Jain et al., 2010). These algorithms provide scale-ups in real-world domains by efficiently analyzing the strategy space of the players. ASPEN and RUGGED use strategy generation: the algorithms start by considering a minimal set of pure strategies for both the players (defender and attacker). Pure strategies are then generated iteratively, and a strategy is added to the set only if it would help increase the payoff of the corresponding player (a defender’s pure strategy is added if it helps increase the defender’s payoff). This process is repeated until the optimal solution is obtained. On the other hand, the hierarchical approach pre-processes the Bayesian Stackelberg game and eliminates strategies that can never be the best response of the players.
This approach of strategy generation and elimination not only provides the required scale-ups, but also the mathematical guarantees on solution quality.

4.1 Scaling up with defender pure strategies

In this section, I describe how ASPEN generates pure strategies for the defender in domains where the number of pure strategies of the defender can be prohibitively large. As an example, let us consider the problem faced by the Federal Air Marshal Service (FAMS). There are currently tens of thousands of commercial flights flying each day, and public estimates state that there are thousands of air marshals that are scheduled daily by the FAMS (Keteyian, 2010). Air marshals must be scheduled on tours of flights that obey logistical constraints (e.g., the time required to board, fly, and disembark). An example of a valid schedule is an air marshal assigned to a round trip tour from Los Angeles to New York and back. The scale of the domain is massive, since there are billions of possible assignments of air marshals to flight tours. For example, the number of ways in which 10 air marshals can be scheduled over 100 flights is $\binom{100}{10} \approx 1.7 \times 10^{4}$ billion. Simply finding schedules for the marshals is a computational challenge. The task is made more difficult by the need to find an optimal strategy over these schedules that meets the scheduling constraints of the domain, while also accounting for an adaptive attacker and the different payoff values of each flight. I cast this problem as a security game, where the attacker can choose any of the flights to attack, and each air marshal can cover one schedule. Each schedule here is a feasible set of targets that can be covered together; for the FAMS, each schedule would represent a flight tour which satisfies all the logistical constraints that an air marshal could fly. A joint schedule then would assign every air marshal to a flight tour, and there could be exponentially many joint schedules in the domain. A pure strategy for the defender in this security game is a joint schedule. Since all the defender pure strategies (or joint schedules) cannot be enumerated for such massive problems, strategy generation is employed in ASPEN (Jain et al., 2010). ASPEN decomposes the problem into a master problem and a slave problem, which are then solved iteratively, as described in Algorithm 1. Given a limited number of pure strategies, the master solves for the defender and the attacker optimization constraints, while the slave is used to generate a new pure strategy for the defender in every iteration.

Algorithm 1: Strategy generation in ASPEN
1. Initialize P // P: set of pure strategies of the defender
2. Solve Master Problem // compute optimal defender mixed strategy, given P
3. Calculate cost coefficients from solution of master
4. Update objective of slave problem with coefficients // compute whether a pure strategy will increase defender’s payoff
5. Solve Slave Problem // generate the pure strategy that is likely to increase the defender’s payoff the most
if Optimal solution obtained then
   6. Return (x, P)
else
   7. Extract new pure strategy and add to P
   8. Repeat from Step 2


Figure 4.1 (panels: Master, Slave): Strategy generation employed in ASPEN. The schedules for a defender are generated iteratively. The slave problem is a novel minimum-cost integer flow formulation that computes the new pure strategy to be added to P; J4 is computed and added in this example.

The iterative process of Algorithm 1 is graphically depicted in Figure 4.1. The master operates on the pure strategies (joint schedules) generated thus far (Step 2), which are represented using the matrix P. Each column of P, Jj, is one pure strategy (or joint schedule). An entry Pij in the matrix P is 1 if a target ti is covered by joint schedule Jj, and 0 otherwise. The objective of the master problem is to compute x, the optimal mixed strategy of the defender over the pure strategies in P. The objective of the slave problem is to generate the best joint schedule to add to P (Step 5). The best joint schedule is identified using the concept of reduced costs (Bertsimas & Tsitsiklis, 1994) (Steps 3–4), which measures whether a pure strategy can potentially increase the defender’s expected utility (the details of the approach are provided in (Jain et al., 2010)). While a naïve approach would be to iterate over all possible pure strategies to identify the pure strategy with the maximum potential, ASPEN uses a novel minimum-cost integer flow problem to efficiently identify the best pure strategy to add. ASPEN always converges on the optimal mixed strategy for the defender; the proof can be found in (Jain et al., 2010). Employing strategy generation for large optimization problems is not an “out-of-the-box” approach; the problem has to be formulated in a way that allows for domain properties to be exploited. The novel contribution of ASPEN is to provide a linear formulation for the master and a minimum-cost integer flow formulation for the slave, which enable the application of strategy generation techniques.
Additionally, ASPEN also provides a branch-and-bound heuristic to reason over attacker actions. This branch-and-bound heuristic provides a further order of magnitude speed-up, allowing ASPEN to handle the massive sizes of real-world problems. Indeed, ASPEN is currently being used by the FAMS to schedule air marshals on international flights.
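The data structures in this description can be illustrated with a small Python sketch. It builds columns of P from joint schedules, computes per-target coverage as the product P x, and implements the naïve slave that enumerates every joint schedule. Everything here is a toy under stated assumptions: the five targets, the four schedules, the rule that the two resources take distinct schedules, and the per-target weights (hypothetical stand-ins for the reduced-cost coefficients) are invented for illustration, and the enumeration is exactly what ASPEN's minimum-cost integer flow formulation is designed to avoid.

```python
from itertools import combinations

targets = ["t1", "t2", "t3", "t4", "t5"]

# Hypothetical schedules: each is a set of targets one resource can cover together.
schedules = [{"t1", "t2"}, {"t2", "t3"}, {"t3", "t4"}, {"t4", "t5"}]
num_resources = 2


def joint_schedules():
    """Naive enumeration: every way to give the resources distinct schedules."""
    for combo in combinations(schedules, num_resources):
        yield set().union(*combo)


def column(joint):
    """Column J_j of P: entry i is 1 if target t_i is covered by the joint schedule."""
    return [1 if t in joint else 0 for t in targets]


def coverage(P, x):
    """Per-target coverage probability, i.e. the matrix-vector product P x."""
    return [sum(col[i] * x_j for col, x_j in zip(P, x)) for i in range(len(targets))]


def naive_slave(weights):
    """Brute-force stand-in for the slave: best column under per-target weights."""
    return max((column(j) for j in joint_schedules()),
               key=lambda col: sum(w * p for w, p in zip(weights, col)))


# P is stored as a list of columns; these two were "generated so far".
P = [column({"t1", "t2", "t3"}), column({"t3", "t4", "t5"})]
x = [0.5, 0.5]  # mixed strategy over the columns of P
print(coverage(P, x))

weights = [1, 4, 2, 0, 5]  # hypothetical stand-ins for reduced-cost coefficients
print(naive_slave(weights))
```

With only six joint schedules the enumeration is instant, but the count grows combinatorially with the number of schedules and resources, which is why the slave needs a formulation that never lists them all.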


4.2 Scaling up with defender and attacker pure strategies

In this section, I describe how RUGGED generates pure strategies for both the defender and the attacker in domains where the number of pure strategies of the players is exponentially large. Let us consider as an example the urban network security game. In this domain, the pure strategies of the defender correspond to allocations of resources to edges in the network; for example, an allocation of police checkpoints to roads in the city. The pure strategies of the attacker correspond to paths from any source node to any target node; for example, a path from a landing spot on the coast to the airport. The pure strategy space of the defender grows exponentially with the number of available resources, whereas the pure strategy space of the attacker grows exponentially with the size of the network. For example, in a fully connected graph with 20 nodes and 190 edges, the number of defender pure strategies for only 5 resources is $\binom{190}{5} \approx 2$ billion, while the number of possible attacker paths without any cycles is ≈ 6.6e18. The graphs representing real-world scenarios are significantly larger, e.g., a simplified graph representing the road network in the southern tip of Mumbai has more than 250 nodes (intersections) and 700 edges (streets), and the security forces can deploy tens of resources.
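The strategy counts quoted above can be reproduced with a few lines of Python. The binomial coefficients follow directly from the text; the path count is a sketch under the assumption that cycle-free paths are counted over every ordered (source, target) pair of the complete graph, which is my reading of how the quoted order of magnitude arises. The same helper also covers the FAMS example of Section 4.1.

```python
from math import comb, factorial


def simple_paths_complete_graph(n):
    """Number of cycle-free paths over all ordered (source, target) pairs in K_n."""
    # Between a fixed pair, a path visits k of the other n - 2 nodes in some order.
    per_pair = sum(factorial(n - 2) // factorial(n - 2 - k) for k in range(n - 1))
    return n * (n - 1) * per_pair


print(comb(100, 10))  # FAMS example: 10 air marshals over 100 flights
print(comb(190, 5))   # 5 checkpoints over the 190 edges of a 20-node complete graph
print(f"{simple_paths_complete_graph(20):.2e}")  # attacker paths without cycles
```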

Figure 4.2 (panels: Minimax, Best Response Defender, Best Response Attacker): Strategy generation employed in RUGGED. The pure strategies for both the defender and the attacker are generated iteratively.

Here, strategy generation is required for both the defender and the attacker since the number of pure strategies of both the players is prohibitively large. Figure 4.2 shows the working of RUGGED: here, the minimax module generates the optimal mixed strategies ⟨x, a⟩ for the two players, whereas the two best response modules generate new strategies for the two players respectively. The rows Xi in the figure are the pure strategies for the defender; they would correspond to an allocation of checkpoints in the urban road network domain. Similarly, the columns Aj are the pure strategies for the attacker; they represent the attack paths in the urban road network domain. The values in the matrix represent the payoffs to the defender. RUGGED models the domain as a zero-sum game, and computes the minimax equilibrium, since the payoff of the minimax strategy is equivalent to the SSE payoff in zero-sum games (Yin et al., 2010). The contribution of RUGGED is to provide the mixed-integer formulations for the best response modules which enable the application of such a strategy generation approach. RUGGED can compute the optimal solution for deploying up to 4 resources in a real city network with as many as 250 nodes within a reasonable time frame of 10 hours (the complexity of this problem can be estimated by observing that both the best response problems are NP-hard themselves (Jain et al., 2011b)). While enhancements to RUGGED are required for deployment in larger real-world domains, RUGGED has opened new possibilities for scaling up to larger games.
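The role of the two best response modules can be sketched on a toy instance in Python. This is not RUGGED's mixed-integer formulation: both oracles below are brute-force enumerations, the five-edge network is hypothetical, and the payoff rule (the defender receives 0 if a checkpoint lies on the attack path and -1 otherwise) is an assumption made only to obtain a concrete zero-sum matrix.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical undirected road network with one source "s" and one target "t".
EDGES = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]


def edge(u, v):
    """Canonical (order-independent) name for an undirected edge."""
    return tuple(sorted((u, v)))


def simple_paths(source, target):
    """All cycle-free paths from source to target, each as a frozenset of edges."""
    neighbors = {}
    for u, v in EDGES:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    paths = []

    def walk(node, visited, used):
        if node == target:
            paths.append(frozenset(used))
            return
        for nxt in neighbors[node]:
            if nxt not in visited:
                walk(nxt, visited | {nxt}, used + [edge(node, nxt)])

    walk(source, {source}, [])
    return paths


def defender_payoff(allocation, path):
    """Assumed zero-sum payoff: 0 if a checkpoint lies on the path, else -1."""
    return 0 if allocation & path else -1


def defender_best_response(attacker_mix, num_checkpoints):
    """Brute-force oracle: the allocation with the highest expected payoff."""
    allocations = [frozenset(edge(*e) for e in combo)
                   for combo in combinations(EDGES, num_checkpoints)]
    return max(allocations,
               key=lambda alloc: sum(p * defender_payoff(alloc, path)
                                     for path, p in attacker_mix.items()))


def attacker_best_response(defender_mix, paths):
    """Brute-force oracle: the path that minimizes the defender's expected payoff."""
    return min(paths,
               key=lambda path: sum(p * defender_payoff(alloc, path)
                                    for alloc, p in defender_mix.items()))


paths = simple_paths("s", "t")
sat = frozenset({edge("s", "a"), edge("a", "t")})
sabt = frozenset({edge("s", "a"), edge("a", "b"), edge("b", "t")})
sbt = frozenset({edge("s", "b"), edge("b", "t")})

# Hypothetical mixed strategies produced by some earlier minimax step.
attacker_mix = {sat: Fraction(1, 2), sabt: Fraction(1, 4), sbt: Fraction(1, 4)}
defender_mix = {frozenset({edge("s", "a")}): Fraction(1, 2),
                frozenset({edge("a", "t")}): Fraction(1, 2)}

print(defender_best_response(attacker_mix, 1))
print(attacker_best_response(defender_mix, paths))
```

In a full strategy generation loop, each oracle's output would be added as a new row or column of the restricted game before the minimax step is solved again; the sketch stops at a single call to each oracle.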

4.3 Scaling up with attacker types

The different preferences of different attacker types are modeled through Bayesian Stackelberg games. Computing the optimal leader strategy in a Bayesian Stackelberg game is NP-hard (Conitzer & Sandholm, 2006), and polynomial time algorithms cannot achieve approximation ratios better than O(types) (Letchford, Conitzer, & Munagala, 2009). I have developed a new technique for solving large Bayesian Stackelberg games that decomposes the entire game into many hierarchically-organized restricted Bayesian Stackelberg games; it then utilizes the solutions of these restricted games to more efficiently solve the larger Bayesian Stackelberg game (Jain et al., 2011a).

[Figure 4.3(a) tree, one level per attacker type: root [*,*]; children [1,*], [2,*]; leaves [1,1], [1,2], [2,1], [2,2]]

(a) Attacker action tree, for a Bayesian game with attackers of two types, with two attacker actions each.

(b) Hierarchical game tree, decomposing a Bayesian Stackelberg game with 4 types into 4 restricted games with one type each

Figure 4.3: Hierarchical approach for solving Bayesian Stackelberg games.

Figure 4.3(a) shows the attacker action tree of an example Bayesian Stackelberg game with 2 types and 2 actions per attacker type. The leaf nodes of this tree are all the possible action combinations for the attacker in this game; for example, leaf [1, 1] implies that the attackers of both types λ1 and λ2 chose action a1. All the leaves of this tree (i.e., all combinations of attacker strategies for all attacker types) need to be evaluated before the optimal strategy for the defender can be computed. This tree is typically evaluated using branch-and-bound. This requirement to evaluate the exponential number of leaves (or pure strategy combinations) is the cause of the NP-hardness of Bayesian Stackelberg games; indeed, the performance of algorithms can be improved if leaves can be pruned by pre-processing.

The overarching idea of the hierarchical structure is to improve the performance of branch-and-bound on the attacker action tree (Figure 4.3(a)) by pruning leaves of this tree. It decomposes the Bayesian Stackelberg game into many hierarchically-organized smaller games, as shown by an example in Figure 4.3(b). Each of the restricted games (‘child’ nodes in Figure 4.3(b)) considers only a few attacker types, and is thus exponentially smaller than the Bayesian Stackelberg game at the ‘parent’. For instance, in this example, the Bayesian Stackelberg game has 4 types (⟨λ1, λ2, λ3, λ4⟩) and it is decomposed into 4 restricted games (leaf nodes), with each restricted game having exactly 1 attacker type. The solutions obtained for the restricted games at the child nodes of the hierarchical game tree are used to provide: (i) pruning rules, (ii) tighter bounds, and (iii) efficient branching heuristics to solve the bigger game at the parent node faster. Such hierarchical techniques have seen little application towards obtaining optimal solutions in Bayesian games, while Stackelberg settings have not seen any application of such hierarchical decomposition. I provide HBGS, which applies the hierarchical framework to general Stackelberg games, and HBSA, which combines the hierarchical decomposition with the strategy generation of ASPEN. I have shown that both HBGS and HBSA are orders of magnitude faster than other Bayesian Stackelberg algorithms for the respective problem settings.
Additionally, these algorithms are naturally suited to obtaining quality-bounded approximations since they are based on branch-and-bound, and they provide a further order of magnitude scale-up without any significant loss in quality if approximate solutions are allowed.
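To make the size of the attacker action tree concrete, the sketch below enumerates its leaves as the Cartesian product of per-type action sets and shows how removing actions per type shrinks that product. The pruning criterion used here (an action ruled out in a single-type restricted game is dropped at the parent) and the specific surviving action sets are illustrative assumptions; the text states only that restricted-game solutions supply pruning rules, bounds, and branching heuristics. The helper names are hypothetical.

```python
from itertools import product
from math import prod


def attacker_leaves(actions_per_type):
    """All joint attacker actions: one action chosen for each attacker type."""
    return list(product(*actions_per_type))


def leaf_count(actions_per_type):
    """Number of leaves without enumerating them."""
    return prod(len(actions) for actions in actions_per_type)


# The tree of Figure 4.3(a): two types, two actions each.
small_tree = attacker_leaves([[1, 2], [1, 2]])
print(small_tree)

# Hypothetical larger game: 4 attacker types with 10 actions each.
full = [list(range(1, 11)) for _ in range(4)]

# Hypothetical result of solving four single-type restricted games:
# the actions of each type that were not ruled out.
surviving = [[1, 4, 7], [2, 3], [5, 6, 9], [1, 10]]
pruned = [[act for act in acts if act in keep]
          for acts, keep in zip(full, surviving)]

print(leaf_count(full), "leaves before pruning")
print(leaf_count(pruned), "leaves after pruning")
```

Because the leaf count is a product over types, removing even a few actions per type reduces the tree multiplicatively, which is why pruning at the restricted games pays off at the parent.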

4.4 Real-world Applications

Game-theoretic approaches for security scheduling have been successfully deployed in the real world, with applications like ARMOR and IRIS in use by the Los Angeles airport police and the FAMS since August 2007 and October 2009 respectively (Jain et al., 2010). I have been involved in the development of both these applications. While the algorithm of choice in ARMOR has been DOBSS (Paruchuri et al., 2008) (the LAX domain is small compared to FAMS or urban network security since only 8 terminals need to be protected at LAX), IRIS uses the ASPEN algorithm described earlier. Currently, IRIS is used to schedule air marshals on board international flights; FAMS is working towards extending the scope of IRIS to domestic and other sectors. Furthermore, game-theoretic software assistants for other agencies like the TSA (Pita, Kiekintveld, Tambe, Steigerwald, & Cullen, 2011), Coast Guard and Border Patrol are under development as well.

Chapter 5 Future Work

My current work has provided scale-ups for domains with large numbers of attacker and defender pure strategies, as well as for Bayesian Stackelberg games. However, newer models and algorithms are required for the application of game-theoretic techniques in larger domains. For example, GUARDS, the game-theoretic system developed for the TSA (Pita et al., 2011), would require more scalable techniques before it can be deployed across all airports in the nation. Specifically, new algorithms are required in domains where multiple levels of security are deployed by the defender. In many real-world security scenarios, the security forces use a multilayered defense strategy: for example, while checkpoints may be set up on in-bound roads to prevent attackers from entering the airport, there are also officers inside the terminals to prevent an attack in case an attacker manages to enter the airport. Such a problem modeled in the current framework would require enumerating all possible combinations of pure strategies of security resources at every level, a number that grows exponentially in the number of levels, security resources and pure strategies. I would like to provide new scalable algorithms for such domains. This can be achieved by avoiding the explicit enumeration of all exponential possibilities and identifying equivalences in different defender assignments. Additionally, current algorithms for computing SSE assume the presence of a single attacker, who attacks a single target. However, this may not be a realistic assumption. This is another instance where new algorithms are required, since all combinations of all actions for all attackers would have to be explicitly enumerated when using the current set of algorithms. 
Recent unpublished work by Korzhyk, Conitzer and Parr shows that Nash Equilibrium strategies for multiple attackers can be computed in polynomial time; however, I would like to develop new algorithms for computing SSE with multiple attackers in large-scale Bayesian games. Finally, I would like to integrate insights from my scalable techniques with other work in robust optimization and build towards a unified scalable robust algorithm.


My proposed schedule for completing my thesis is the following:
1. March, 2011 - October, 2011: Develop new models that can handle challenges like multiple levels of defense.
2. October, 2011 - March, 2012: Develop new models for multiple attackers, coordinating to conduct a single attack.
3. March, 2012 - April, 2012: Write dissertation.
4. May, 2012: Defend thesis.


Chapter 6 Conclusions

Game-theoretic scheduling assistants are now being used daily to schedule checkpoints, patrols and other security activities by agencies such as LAX police, FAMS and the TSA. Augmenting the game-theoretic framework to handle the key challenge of scalability has been pivotal for the deployment of these game-theoretic approaches. The algorithms proposed in this work have provided the scale-ups required to overcome the challenges posed by some real-world problems, opening up new avenues for deploying game-theoretic techniques in real-world domains. Consequently, this work has been successfully transitioned from theory to applications deployed in the real world: indeed, I have been involved in the development of the ARMOR and IRIS applications, which have been in use since August 2007 and October 2009 respectively. This research complements other research focused on handling uncertainty in Stackelberg games (Kiekintveld et al., 2011; Aghassi & Bertsimas, 2006; Jenelius, Westin, & Holmgren, 2010; Zhuang & Bier, 2011; Pita et al., 2009), and could ultimately be part of a single unified scalable robust algorithm.


Bibliography

Aghassi, M., & Bertsimas, D. (2006). Robust game theory. Math. Program., 107, 231–273.
Archibald, C., & Shoham, Y. (2009). Modeling billiards games. In AAMAS.
Avenhaus, R., von Stengel, B., & Zamir, S. (2002). Inspection Games. In Aumann, R. J., & Hart, S. (Eds.), Handbook of Game Theory, Vol. 3, chap. 51, pp. 1947–1987. North-Holland, Amsterdam.
Basilico, N., Gatti, N., & Amigoni, F. (2009). Leader-follower strategies for robotic patrolling in environments with arbitrary topologies. In AAMAS, pp. 500–503.
Bertsimas, D., & Tsitsiklis, J. N. (1994). Introduction to Linear Optimization. Athena Scientific.
Blanco, M., Valino, A., Heijs, J., Baumert, T., & Gomez, J. G. (2007). The Economic Cost of March 11: Measuring the direct economic cost of the terrorist attack on March 11, 2004 in Madrid. Terrorism and Political Violence, 19(4), 489–509.
Breton, M., Alg, A., & Haurie, A. (1988). Sequential Stackelberg equilibria in two-person games. Optimization Theory and Applications, 59(1), 71–97.
Brown, G., Carlyle, M., Kline, J., & Wood, K. (2005). A Two-Sided Optimization for Theater Ballistic Missile Defense. In Operations Research, Vol. 53, pp. 263–275.
Brown, G., Carlyle, M., Salmeron, J., & Wood, K. (2006). Defending Critical Infrastructure. In Interfaces, Vol. 36, pp. 530–544.
Chandran, R., & Beitchman, G. (29 November 2008). Battle for Mumbai Ends, Death Toll Rises to 195. Times of India.
Conitzer, V., & Sandholm, T. (2006). Computing the optimal strategy to commit to. In ACM EC-06, pp. 82–90.
Flood, M. M. (1972). The Hide and Seek Game of Von Neumann. Management Science, 18(5-Part-2), 107–109.
Gatti, N. (2008). Game theoretical insights in strategic patrolling: Model and algorithm in normal-form. In ECAI-08, pp. 403–407.
Halvorson, E., Conitzer, V., & Parr, R. (2009). Multi-step multi-sensor hider-seeker games. In IJCAI, pp. 159–166.
Jain, M., Kardes, E., Kiekintveld, C., Ordóñez, F., & Tambe, M. (2010). Security games with arbitrary schedules: A branch and price approach. In AAAI.
Jain, M., Kiekintveld, C., & Tambe, M. (2011a). Quality-bounded solutions for finite Bayesian Stackelberg games: Scaling up. In AAMAS, p. to appear.
Jain, M., Korzhyk, D., Vanek, O., Conitzer, V., Pechoucek, M., & Tambe, M. (2011b). A double oracle algorithm for zero-sum security games on graphs. In AAMAS.
Jain, M., Tsai, J., Pita, J., Kiekintveld, C., Rathi, S., Tambe, M., & Ordóñez, F. (2010). Software Assistants for Randomized Patrol Planning for the LAX Airport Police and the Federal Air Marshals Service. Interfaces, 40, 267–290.
Jenelius, E., Westin, J., & Holmgren, A. J. (2010). Critical infrastructure protection under imperfect attacker perception. International Journal of Critical Infrastructure Protection, 3(1), 16–26.
Keteyian, A. (2010). TSA: Federal Air Marshals. http://www.cbsnews.com/stories/2010/02/01/earlyshow/main6162291.shtml, retrieved Feb 1, 2011.
Kiekintveld, C., Jain, M., Tsai, J., Pita, J., Tambe, M., & Ordóñez, F. (2009). Computing optimal randomized resource allocations for massive security games. In AAMAS, pp. 689–696.
Kiekintveld, C., Marecki, J., & Tambe, M. (2011). Approximation methods for infinite Bayesian Stackelberg games: Modeling distributional payoff uncertainty. In AAMAS, p. to appear.
Leitmann, G. (1978). On generalized Stackelberg strategies. Optimization Theory and Applications, 26(4), 637–643.
Letchford, J., Conitzer, V., & Munagala, K. (2009). Learning and approximating the optimal strategy to commit to. In Second International Symposium on Algorithmic Game Theory (SAGT), pp. 250–262.
Looney, R. (2002). Economic Costs to the United States Stemming From the 9/11 Attacks. Strategic Insights, 1(6).
Lye, K., & Wing, J. M. (2005). Game strategies in network security. International Journal of Information Security, 4(1–2), 71–86.
Osborne, M. J., & Rubinstein, A. (1994). A Course in Game Theory. MIT Press.
Paruchuri, P., Pearce, J. P., Marecki, J., Tambe, M., Ordóñez, F., & Kraus, S. (2008). Playing games with security: An efficient exact algorithm for Bayesian Stackelberg games. In AAMAS-08, pp. 895–902.
Pita, J., Jain, M., Ordóñez, F., Tambe, M., Kraus, S., & Magori-cohen, R. (2009). Effective solutions for real-world Stackelberg games: When agents must deal with human uncertainties. In AAMAS.
Pita, J., Jain, M., Western, C., Portway, C., Tambe, M., Ordóñez, F., Kraus, S., & Paruchuri, P. (2008). Deployed ARMOR protection: The application of a game-theoretic model for security at the Los Angeles International Airport. In AAMAS-08 (Industry Track), pp. 125–132.
Pita, J., Kiekintveld, C., Tambe, M., Steigerwald, E., & Cullen, S. (2011). Guards - game theoretic security allocation on a national scale. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Porter, R., Ronen, A., Shoham, Y., & Tennenholtz, M. (2002). Mechanism design with execution uncertainty. In UAI.
Ruckle, W., Fennell, R., Holmes, P. T., & Fennemore, C. (1976). Ambushing Random Walks I: Finite Models. Operations Research, 24, 314–324.
Sandler, T., & M., D. G. A. (2003). Terrorism and game theory. Simulation and Gaming, 34(3), 319–337.
van Damme, E., & Hurkens, S. (1997). Games with imperfectly observable commitment. Games and Economic Behavior, 21(1-2), 282–308.
von Stackelberg, H. (1934). Marktform und Gleichgewicht. Springer, Vienna.
von Stengel, B., & Zamir, S. (2004). Leadership with commitment to mixed strategies. Tech. rep. LSE-CDAM-2004-01, CDAM Research Report.
Yang, R., Kiekintveld, C., Ordóñez, F., Tambe, M., & John, R. (2011). Improved computational models of human behavior in security games. In International Conference on Autonomous Agents and Multiagent Systems (Ext. Abstract).
Yin, Z., Korzhyk, D., Kiekintveld, C., Conitzer, V., & Tambe, M. (2010). Stackelberg vs. Nash in security games: interchangeability, equivalence, and uniqueness. In AAMAS.
Zhuang, J., & Bier, V. (2011). Secrecy and deception at equilibrium, with applications to antiterrorism resource allocation. Defence and Peace Economics, 22, 43–61.
