CS182 − Lecture 13: Planning Algorithms
Agenda: Announcements; Planning as logic and search: reasoning about plans using PDDL, forward (progression) and backward (regression) search; Planning with planning graphs: Graphplan, SATplan
Acknowledgement: Slides taken or adapted from Dana Nau: Lecture slides for Automated Planning Licensed under the Creative Commons and connected with the text, Automated Planning: Theory and Practice: http://projects.laas.fr/planning/ CS 182: Intelligent Systems: Reasoning, Actions, & Plans
Fall 2013
Announcements Readings: Today: AIMA 3e, 10.3 Thursday, Oct 24: AIMA 3e, 11.2, skim 11.1; recommended: D. S. Nau. “Current trends in automated planning”, AI Magazine 28(4):43–58, 2007 Background: Ghallab, Nau and Traverso, Automated Planning: Theory and Practice, Morgan Kaufmann, 2004.
Sections this week: planning representations and algorithms.
Assignment date changes: Assignment 4: available Thursday, due November 5. Project proposals: now due Monday, November 4.
Planning = Logic + Search Logic to express information about actions and states, and logical representations that enable planners to take advantage of action structure in reasoning about which actions to do when.
Search to generate a sequence of actions that accomplishes goal (gets to state satisfying goal propositions).
Factored Representations of Actions: PDDL Agent: one that acts Actions: simple and complex Beliefs: what an agent knows Goals (aka Desires): the states of the world an agent “prefers”
Intentions: actions the agent is committed to Recipes: descriptions/representations of ways to do a complex action or accomplish a task.
Plans: recipes+beliefs+intentions
FOL Representations of States and Actions
Motivation: richer representations to support reasoning
states are sets of features (e.g., ground atoms in FOL)
actions are represented by structured operators
Represent relations among domain objects with ground atoms: Ground expression: contains no variable symbols; e.g., in(c1,p3) Unground expression: at least one variable symbol; e.g., in(c1,x)
Substitution: θ = {x1 ← v1, x2 ← v2, …, xn ← vn}
Each xi is a variable symbol; each vi is a term
Instance of exp: result of applying a substitution θ to exp; replace the variables of exp simultaneously, not sequentially
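The simultaneous-replacement point can be made concrete with a small Python sketch (the encoding is ours, not the slides': atoms are tuples, and variables are strings marked with a "?" prefix):

```python
def is_var(term):
    """A term is a variable if it is a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")

def substitute(expr, theta):
    """Apply substitution theta (a dict) to every variable of expr
    simultaneously, not sequentially."""
    return tuple(theta.get(t, t) if is_var(t) else t for t in expr)

def is_ground(expr):
    """Ground expression: contains no variable symbols."""
    return not any(is_var(t) for t in expr)
```

For example, `substitute(("in", "c1", "?x"), {"?x": "p3"})` yields the ground atom `("in", "c1", "p3")`; with θ = {?x ← ?y, ?y ← a}, simultaneous application maps `("p", "?x", "?y")` to `("p", "?y", "a")`, not `("p", "a", "a")`.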
PDDL Operators
Operator schemas are used to model (classes of) actions.
An operator schema includes:
Header (action name and arguments): a(p1,…,pk)
Preconditions: conjunction of literals; what must be true (immediately) before action is done.
Effects: add, delete lists: conjunction of positive and negative literals.
Operators, More Formally Operator: a triple o=(name(o), precond(o), effects(o)) precond(o): preconditions literals that must be true in order to use the operator effects(o): effects literals the operator will make true name(o): a syntactic expression of the form n(x1,…,xk) n is an operator symbol - must be unique for each operator (x1,…,xk) is a list of every variable symbol (parameter) that appears in o
Purpose of name(o) is so we can refer unambiguously to instances of o
Rather than writing each operator as a triple, it is often written as follows:
pack(agt,obj,cont,loc) ;; agent agt packs object obj in container cont at location loc
pre: at(agt,loc), at(obj,loc), at(cont,loc), empty(cont)
effect: in(obj,cont), ¬empty(cont), ¬at(obj,loc)
example action: pack(Robbie, Books2, Carton1,RobRoom)
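An operator triple and its ground instances can be sketched in Python as follows (a minimal encoding of our own: literals are tuples, variables carry a "?" prefix, and effects are split into add and delete sets):

```python
from collections import namedtuple

# Hypothetical encoding of an operator o = (name(o), precond(o), effects(o)),
# with effects split into positive (add) and negative (delete) literal sets.
Operator = namedtuple("Operator", ["name", "params", "precond",
                                   "pos_effects", "neg_effects"])

pack = Operator(
    name="pack",
    params=("?agt", "?obj", "?cont", "?loc"),
    precond={("at", "?agt", "?loc"), ("at", "?obj", "?loc"),
             ("at", "?cont", "?loc"), ("empty", "?cont")},
    pos_effects={("in", "?obj", "?cont")},
    neg_effects={("empty", "?cont"), ("at", "?obj", "?loc")},
)

def instantiate(op, theta):
    """Ground instance of an operator: apply theta to every literal."""
    sub = lambda lits: {tuple(theta.get(t, t) for t in lit) for lit in lits}
    return Operator(op.name,
                    tuple(theta.get(p, p) for p in op.params),
                    sub(op.precond), sub(op.pos_effects), sub(op.neg_effects))
```

With θ = {agt ← Robbie, obj ← Books2, cont ← Carton1, loc ← RobRoom}, `instantiate(pack, theta)` produces the slide's example action pack(Robbie, Books2, Carton1, RobRoom).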
PDDL Operators for Semester Abroad
Pack(agt,obj,cont,loc)
pre: at(agt,loc), at(obj,loc), at(cont,loc), empty(cont)
eff: in(obj,cont), ¬empty(cont), ¬at(obj,loc)
Walk(agt,loc1,loc2) pre: at(agt,loc1) eff: at(agt,loc2)
BikeGo(agt,loc1,loc2,bike)
pre: at(agt,loc1), at(bike,loc1)
eff: at(agt,loc2), at(bike,loc2)
PDDL Operators for Semester Abroad (cont’d) ApproveStudyPlan(advisor,student,plan,loc) pre: at(advisor,loc), at(student,loc), finished(student,plan) eff: approved(student,plan)
GetVaccine(agt,disease) pre: at(agt, clinic), ¬vaccinated(agt,disease) effect: vaccinated(agt,disease)
PlaneTicket(agt,fromcity,tocity)
pre: have(agt,fare)
effect: haveticket(fromcity,tocity), ¬have(agt,fare)
PDDL Operators for Semester Abroad (cont’d) CleanOutRoom(agt,room) pre: at(agt,room) eff: empty(room),clean(room)
AdjSuitcaseWeight(suitcase) pre: packed(suitcase) effect: weight(suitcase,OK)
3 groups to report!!
States and Actions State: a set s of ground atoms The atoms represent the things that are true in the state Only finitely many ground atoms, so only finitely many possible states
An action is a ground instance (via substitution) of an operator
Let θ = {agt ← Robbie, obj ← RTextBs, cont ← Carton1, loc ← RRm}
Then θ(pack(agt,obj,cont,loc)) is the following action: Pack(Robbie,RTextBs,Carton1,RRm)
precond: at(Robbie,RRm), at(RTextBs,RRm), at(Carton1,RRm), empty(Carton1)
effects: in(RTextBs,Carton1), ¬empty(Carton1), ¬at(RTextBs,RRm)
Notation
For S, a set of literals:
S+ = {atoms that appear positively in S}
S– = {atoms that appear negatively in S}
For a, an operator:
precond+(a) = {atoms that appear positively in a's preconditions}
precond–(a) = {atoms that appear negatively in a's preconditions}
effects+(a) = {atoms that appear positively in a's effects}
effects–(a) = {atoms that appear negatively in a's effects}
Notation Example
Pack(agt,obj,cont,loc)
pre: at(agt,loc), at(obj,loc), at(cont,loc), empty(cont)
eff: in(obj,cont), ¬empty(cont), ¬at(obj,loc)
pack(Robbie,RTextBs,Carton1,RRm) effects notation:
effects+(pack(Robbie,RTextBs,Carton1,RRm)) = {in(RTextBs,Carton1)}
effects–(pack(Robbie,RTextBs,Carton1,RRm)) = {empty(Carton1), at(RTextBs,RRm)}
Applicability
Let s be a state and a be an action
a is applicable to (or executable in) s if s satisfies precond(a):
precond+(a) ⊆ s and precond–(a) ∩ s = ∅
Notation: γ(s,a), defined only when a is applicable in s
Executing an applicable action: remove a's negative effects and add a's positive effects
γ(s,a) = (s – effects–(a)) ∪ effects+(a)
Planning Problems, Formally
Given a planning domain (language L, operators O):
Statement of a planning problem: a triple P = (O, s0, g)
O is the collection of operators
s0 is a state (the initial state)
g is a set of literals (the goal formula)
Planning problem: P = (Σ, s0, Sg)
s0 = initial state
Sg = set of goal states
Σ = (S, A, γ) is a state-transition system
S = {all sets of ground atoms in L}
A = {all ground instances of operators in O}
γ = the state-transition function determined by the operators
Plans and Solutions Let P=(O,s0,g) be a planning problem Plan: any sequence of actions π = a1, a2, …, an such that each ai is an instance of an operator in O
π is a solution for P=(O,s0,g) if it is executable and achieves g; i.e., if there are states s0, s1, …, sn such that
γ(s0,a1) = s1, γ(s1,a2) = s2, …, γ(sn–1,an) = sn, and sn satisfies g
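This solution condition can be checked mechanically; a sketch in our own encoding (each action a tuple of precondition and effect sets, a goal a pair of positive and negative literal sets):

```python
def is_solution(s0, plan, goal):
    """plan solves the problem iff every ai is applicable in turn
    and the final state sn satisfies g."""
    s = set(s0)
    for pre_pos, pre_neg, eff_pos, eff_neg in plan:
        if not (pre_pos <= s and not (pre_neg & s)):
            return False                     # some ai not applicable: not executable
        s = (s - eff_neg) | eff_pos          # s_{i} = gamma(s_{i-1}, ai)
    goal_pos, goal_neg = goal
    return goal_pos <= s and not (goal_neg & s)

# Toy action: consumes atom p, produces atom q.
toy_act = ({("p",)}, set(), {("q",)}, {("p",)})
```

Applying toy_act once from {p} yields {q}, which satisfies the goal {q}; applying it twice fails because the second occurrence's precondition p no longer holds.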
Using the Structure
The structure of the PDDL language can be exploited to search efficiently, in both directions.
[diagram: matching preconditions against the INITIAL STATE, only some actions are relevant; matching effects against the GOAL, only some actions could put us here]
Planning via Search We will search for a sequence of actions that take us from initial state to goal. Yields a totally-ordered plan. Next time we’ll look at partial plans.
State-space planning: Each node represents a state of the world, as represented in FOL.
Forward or Progressive Search: Start from the initial state, construct tree of actions, using any complete search technique.
Backward or Regressive Search: Start from the goal state and search for actions that can achieve each of the goals. (cf. backward chaining in logic)
Two Ways to Use PDDL Operators Progression: search forward from initial state looking for goal state: Determine operators that apply in current situation Huge search space: Domain-independent heuristics to the rescue (later).
Regression: search backwards from the goal state to the initial state. Pick a goal to satisfy and look for an operator effect (add list) that mentions it. Challenge: determine what must be true to apply an operator.
Progressive (Forward) Planning Initial State: set of initial world literals Goal State: goal literals Search operator: Choose an Action A, whose preconditions are satisfied Construct new state by using effects of Action A: adding positive literals and removing negative literals
Path Cost: 1 per step (number of actions)
State space is finite
Methods: any search algorithm, but some are better than others (coming up)
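The search-operator loop above can be sketched as a breadth-first progression planner (a minimal sketch in our own encoding; ground actions as tuples of name plus precondition/effect sets):

```python
from collections import deque

def forward_search(s0, goal, actions):
    """Breadth-first progression: returns a shortest list of action names, or None.
    actions: list of (name, pre_pos, pre_neg, eff_pos, eff_neg) ground tuples."""
    start = frozenset(s0)
    frontier = deque([(start, [])])
    seen = {start}                           # loop checking keeps the search finite
    while frontier:
        s, plan = frontier.popleft()
        if goal <= s:                        # goal literals all satisfied
            return plan
        for name, pre_pos, pre_neg, eff_pos, eff_neg in actions:
            if pre_pos <= s and not (pre_neg & s):         # applicable?
                s2 = frozenset((s - eff_neg) | eff_pos)    # gamma(s, a)
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append((s2, plan + [name]))
    return None
```

On a two-location toy domain (walking Robbie between LR and RRm), the planner returns the one-step plan; any complete search strategy could replace the deque here.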
Forward Search
goal: in(RClothes,RSuitcase)
s0: at(Robbie,LR), at(RSuitcase,Attic), at(RClothes,RRm), at(RBike,LR)
Candidate first actions: Walk(Robbie,LR,Attic), Walk(Robbie,LR,RRm), BikeGo(Robbie,LR,AdvOffice), …
Properties
Forward-search is sound: any plan returned by any of its nondeterministic traces is guaranteed to be a solution
Forward-search is also complete: if a solution exists, then at least one of Forward-search's nondeterministic traces will return a solution
Deterministic Implementations
Some deterministic implementations of forward search: breadth-first search, depth-first search, best-first search (e.g., A*), greedy search
[diagram: search tree from s0, branching via actions a1, …, a5 through states s1, …, s5 toward goal state sg]
Breadth-first and best-first search are sound and complete
But they usually aren't practical because they require too much memory: memory requirement is exponential in the length of the solution
In practice, more likely to use depth-first search or greedy search: worst-case memory requirement is linear in the length of the solution
In general sound but not complete
But classical planning has only finitely many states; thus, can make depth-first search complete by doing loop-checking
Two Ways to Use PDDL Operators Progression: search forward from initial state looking for goal state: Determine operators that apply in current situation Huge search space: Domain-independent heuristics to the rescue (later).
Regression: search backwards from the goal state to the initial state. Pick a goal to satisfy and look for an operator effect (add list) that mentions it. Challenge: determine what must be true to apply an operator.
Regressive (Backward) Planning
Initial state of the search: the goal literals
Search operator: choose an action A in state X such that A is
Relevant: A's effects include at least one of the literals of X
Consistent: A's effects do not negate another literal in X
Then perform regression ("regress the goal through the action"): new state = remove all positive effects of A, add the preconditions of A
Goal state of the search: the initial world literals
Method: again, any search technique applies
Backward Search, Formally
For forward search, we started at the initial state and computed state transitions: new state = γ(s,a)
For backward search, we start at the goal and compute inverse state transitions: new set of subgoals = γ⁻¹(g,a)
To define γ⁻¹(g,a), must first define relevance. An action a is relevant for a goal g if:
a makes at least one of g's literals true: g ∩ effects(a) ≠ ∅
a does not make any of g's literals false: g+ ∩ effects–(a) = ∅ and g– ∩ effects+(a) = ∅
Blocks World Regression Example
pickup(x):
pre: ontable(x), clear(x), HE (i.e., hand empty)
eff: holding(x), ¬ontable(x), ¬clear(x), ¬HE
putdown(x):
pre: holding(x)
eff: ontable(x), clear(x), HE, ¬holding(x)
stack(x,y):
pre: holding(x), clear(y)
eff: HE, on(x,y), clear(x), ¬holding(x), ¬clear(y)
unstack(x,y): …
Simple Regression Examples
Regress(on(A,B), Pickup(C)) = on(A,B)
Regress(on(A,B), Stack(A,B)) = True
Regress(HE, Pickup(A)) = False
If goal regresses to True, it can be eliminated from goal set.
If goal regresses to False, goal set cannot be satisfied, so prune branch.
Inverse State Transitions
If a is relevant for g, then γ⁻¹(g,a) = (g – effects+(a)) ∪ precond(a)
Otherwise γ⁻¹(g,a) is undefined
Example: suppose that
g = {on(b1,b2), on(b2,b3)}
a = stack(b1,b2)
What is γ⁻¹(g,a)?
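The relevance test and γ⁻¹ can be sketched in Python, and running the sketch on this example works it out (the encoding is ours: goals and effects split into positive and negative literal sets):

```python
def relevant(goal_pos, goal_neg, action):
    """a is relevant for g: makes some literal of g true and none false."""
    helps = bool(goal_pos & action["eff_pos"]) or bool(goal_neg & action["eff_neg"])
    harms = bool(goal_pos & action["eff_neg"]) or bool(goal_neg & action["eff_pos"])
    return helps and not harms

def regress(goal_pos, goal_neg, action):
    """gamma^-1(g,a) = (g - effects+(a)) | precond(a), for relevant a.
    Achieved positive goals drop out via the add list, achieved negative
    goals via the delete list; then a's preconditions are added."""
    assert relevant(goal_pos, goal_neg, action)
    new_pos = (goal_pos - action["eff_pos"]) | action["pre_pos"]
    new_neg = (goal_neg - action["eff_neg"]) | action["pre_neg"]
    return new_pos, new_neg

# stack(b1,b2) from the blocks-world operators above:
stack_b1_b2 = {
    "pre_pos": {("holding", "b1"), ("clear", "b2")},
    "pre_neg": set(),
    "eff_pos": {("on", "b1", "b2"), ("clear", "b1"), ("HE",)},
    "eff_neg": {("holding", "b1"), ("clear", "b2")},
}
```

Regressing g = {on(b1,b2), on(b2,b3)} through stack(b1,b2) removes the achieved on(b1,b2) and adds the preconditions, giving {on(b2,b3), holding(b1), clear(b2)}.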
[diagram: backward search tree from goal g0, branching through subgoal sets g1, …, g5 via actions a1, …, a5 toward s0]
Efficiency of Backward Search Backward search can also have a very large branching factor E.g., an operator o that is relevant for g may have many ground instances a1, a2, …, an such that each ai’s input state might be unreachable from the initial state
As before, deterministic implementations can waste lots of time trying all of them
[diagram: blocks-world instance with blocks b1, b2, b3, …, b50: initial state and goal]
Lifting
[diagram: achieving holding(b1) by backward search over ground actions: pickup(b1) (subgoal ontable(b1)), and unstack(b1,b1), unstack(b1,b2), …, unstack(b1,b50) (subgoals on(b1,b1), on(b1,b2), …, on(b1,b50))]
Can reduce the branching factor of backward search if we partially instantiate the operators; this is called lifting
[diagram: the lifted version: holding(b1) regresses through pickup(b1) (subgoal ontable(b1)) or unstack(b1,y) (subgoal on(b1,y))]
Lifted Backward Search
More complicated than Backward-search, because we have to keep track of the substitutions that were performed
But it has a much smaller branching factor
Search Space Still Too Large
Lifted-backward-search generates a smaller search space than Backward-search, but it still can be quite large
Suppose actions a, b, and c are independent, action d must precede all of them, and there's no path from s0 to d's input state
We'll try all possible orderings of a, b, and c before realizing there is no solution
More about this in Chapter 5 (Plan-Space Planning)
[diagram: search tree from s0 exploring every ordering of a, b, and c, each branch blocked at d, before the goal]
Aside: STRIPS Challenge 1: The Sussman Anomaly
[diagram: initial state: c on a, with b on the table; goal: a on b on c]
On this problem, STRIPS can't produce an irredundant solution. Try it and see!
Aside: STRIPS Challenge 2: Register Assignment Problem
Interchange the values stored in two registers
State-variable formulation: registers r1, r2, r3
s0: {value(r1)=3, value(r2)=5, value(r3)=0}
g: {value(r1)=5, value(r2)=3}
Operator: assign(r,v,r',v')
precond: value(r)=v, value(r')=v'
effects: value(r)=v'
STRIPS cannot solve this problem at all
Summary: PDDL for Progression and Regression
The PDDL language encoding can expose problem structure
Key take-away ideas:
(a) Action schemas allow systematic search forwards and backwards
(b) BUT efficiency still requires smarts (which actions when?); we need heuristics
Next: the action language lets you automatically generate heuristics!
[diagram: PRECOND, only some actions are relevant; EFFECT, only some actions could put us here]
Planning Graphs: An Alternative Representation for Search
Motivation: a big source of inefficiency in search algorithms is the branching factor, i.e., the number of children of each node
forward search may try operators that don't help
backward search may try lots of actions that can't be reached from the initial state
Planning graphs: a new data structure that simplifies search, and an incremental approach that starts with a "relaxed problem"
The Graphplan Algorithm
Constructing planning graphs
Mutual exclusion
Solution extraction
Graphplan
procedure Graphplan:
for k = 0, 1, 2, …
Graph expansion: create a "planning graph" that contains k "levels" (a relaxed problem recording the possible literals and possible actions at each step i)
Check whether the planning graph satisfies a necessary (but insufficient) condition for plan existence
If it does, then do solution extraction: backward search, modified to consider only the actions in the planning graph
if we find a solution, then return it
The Planning Graph
Search space for a relaxed version of the planning problem
Alternating layers of ground literals and actions
Nodes at action-level i: actions that might be possible to execute at time i
Nodes at state-level i: literals that might possibly be true at time i
Edges: preconditions and effects
State-level 0: the literals true in s0
Maintenance action: for the case where a literal remains unchanged
[diagram: state-level i–1 → action-level i (precondition edges) → state-level i (effect edges)]
Example (from Dan Weld, U. Washington)
Suppose you want to prepare dinner as a surprise for your sweetheart, who is asleep. The initial state and goal are
s0 = {garbage, cleanHands, quiet}
g = {dinner, present, ¬garbage}
cook():  pre: cleanHands  eff: dinner
wrap():  pre: quiet  eff: present
carry(): pre: none  eff: ¬garbage, ¬cleanHands
dolly(): pre: none  eff: ¬garbage, ¬quiet
Also have the maintenance actions: one for each literal
Example (continued)
state-level 0: {all atoms in s0} ∪ {negations of all atoms not in s0}
action-level 1: {all actions whose preconditions are satisfied and non-mutex in s0}
state-level 1: {all effects of all of the actions in action-level 1, including the maintenance actions}
[diagram: state-level 0 → action-level 1 → state-level 1]
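Ignoring mutex bookkeeping for now, one expansion step on the dinner example can be sketched as follows (our own string encoding, with "~x" standing for ¬x):

```python
# Dinner-example actions as (preconditions, effects) pairs of literal sets.
ACTIONS = {
    "cook":  ({"cleanHands"}, {"dinner"}),
    "wrap":  ({"quiet"}, {"present"}),
    "carry": (set(), {"~garbage", "~cleanHands"}),
    "dolly": (set(), {"~garbage", "~quiet"}),
}

def expand(state_level):
    """One expansion step: applicable actions plus maintenance (no-op)
    actions, then the union of their effects as the next state level."""
    acts = {name for name, (pre, _) in ACTIONS.items() if pre <= state_level}
    nxt = set(state_level)              # maintenance actions copy every literal
    for name in acts:
        nxt |= ACTIONS[name][1]
    return acts, nxt

# state-level 0: atoms of s0 plus negations of the atoms not in s0.
s0 = {"garbage", "cleanHands", "quiet", "~dinner", "~present"}
a1, s1 = expand(s0)
```

All four actions fire at action-level 1, and state-level 1 contains both each new effect (dinner, present, ¬garbage, …) and, via the maintenance actions, every literal of state-level 0.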
Mutual Exclusion
Two actions at the same action-level are mutex if:
Inconsistent effects: an effect of one negates an effect of the other
Interference: one deletes a precondition of the other
Competing needs: they have mutually exclusive preconditions
Otherwise they don't interfere with each other: both may appear in a solution plan
Two literals at the same state-level are mutex if:
Inconsistent support: one is the negation of the other, or all ways of achieving them are pairwise mutex
Recursive propagation of mutexes
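The three action-mutex tests are mechanical; a sketch in the same string encoding ("~x" for ¬x), with the previous level's literal mutexes passed in as a set of frozensets:

```python
def negate(l):
    """¬x for a literal encoded as "x" or "~x"."""
    return l[1:] if l.startswith("~") else "~" + l

def actions_mutex(a, b, prior_literal_mutexes):
    """a, b: (preconditions, effects) pairs of literal sets."""
    (pre_a, eff_a), (pre_b, eff_b) = a, b
    inconsistent_effects = any(negate(e) in eff_b for e in eff_a)
    interference = (any(negate(e) in pre_b for e in eff_a)
                    or any(negate(e) in pre_a for e in eff_b))
    competing_needs = any(frozenset((p, q)) in prior_literal_mutexes
                          for p in pre_a for q in pre_b)
    return inconsistent_effects or interference or competing_needs

# Dinner-example actions and a maintenance (no-op) action:
carry = (set(), {"~garbage", "~cleanHands"})
dolly = (set(), {"~garbage", "~quiet"})
cook = ({"cleanHands"}, {"dinner"})
wrap = ({"quiet"}, {"present"})
keep_garbage = ({"garbage"}, {"garbage"})
```

This reproduces the example on the next slide: carry is mutex with the maintenance action for garbage (inconsistent effects), dolly is mutex with wrap (it deletes wrap's precondition quiet), while cook and wrap are not mutex.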
Example (continued) Augment the graph to indicate mutexes
[diagram: state-level 0 → action-level 1 → state-level 1, with mutex links marked]
carry is mutex with the maintenance action for garbage (inconsistent effects)
dolly is mutex with wrap (interference)
¬quiet is mutex with present (inconsistent support)
each of cook and wrap is mutex with a maintenance operation
Example (continued)
[diagram: state-level 0 → action-level 1 → state-level 1]
Check to see whether there's a possible solution
Recall that the goal is {¬garbage, dinner, present}
Note that in state-level 1, all of the goal literals are present and none are mutex with each other
Thus there's a chance that a plan exists; try to find it (solution extraction)
Solution Extraction
g: the set of goals we are trying to achieve; j: the level of the state sj
procedure Solution-extraction(g, j)
if j = 0 then return the solution
for each literal l in g
nondeterministically choose an action (a real action or a maintenance action) to use in state sj–1 to achieve l
if any pair of chosen actions are mutex, then backtrack
g' := {the preconditions of the chosen actions}
Solution-extraction(g', j–1)
end Solution-extraction
Example (continued)
[diagram: state-level 0 → action-level 1 → state-level 1]
Two sets of actions for the goals at state-level 1
Neither of them works: both sets contain actions that are mutex
Recall what the algorithm does
procedure Graphplan:
for k = 0, 1, 2, …
Graph expansion: create a "planning graph" that contains k "levels"
Check whether the planning graph satisfies a necessary (but insufficient) condition for plan existence
If it does, then do solution extraction: backward search, modified to consider only the actions in the planning graph
if we find a solution, then return it
Example (continued)
[diagram: state-level 0 → action-level 1 → state-level 1 → action-level 2 → state-level 2]
Go back and do more graph expansion: generate another action-level and another state-level
Example (continued)
[diagram: state-level 0 through state-level 2]
Solution extraction
Twelve combinations at level 4: three ways to achieve ¬garbage, two ways to achieve dinner, two ways to achieve present
Example (continued)
[diagram: state-level 0 through state-level 2]
Several of the combinations look OK at level 2; here's one of them
Example (continued)
Call Solution-extraction recursively at level 2
[diagram: state-level 0 through state-level 2]
It succeeds: a solution whose parallel length is 2
Comparison with Plan-Space Planning
Advantage: the backward-search part of Graphplan (which is the hard part) will only look at the actions in the planning graph: smaller search space, thus faster
Disadvantage: To generate the planning graph, Graphplan creates a huge number of ground atoms Many of them may be irrelevant
Can alleviate (but not eliminate) this problem by assigning data types to the variables and constants Only instantiate variables to terms of the same data type
For classical planning, the advantage outweighs the disadvantage GraphPlan solves classical planning problems much faster than PSP
Planning Graph for Heuristic Search
Given a planning problem P, create a relaxed planning problem P' and use GraphPlan to solve it:
Convert to a set-theoretic representation with no negative literals; the goal is now a set of atoms
Remove the delete lists from the actions
Construct a planning graph until a layer is found that contains all of the goal atoms; the graph will contain no mutexes because the delete lists were removed
Extract a plan π' from the planning graph: no mutexes, so no backtracking, so polynomial time
|π'| is a lower bound on the length of the best solution to P
GraphPlan Heuristic
Use reachability "level" as a heuristic for forward search
Let P = (A, si, g) be a propositional planning problem and G = (N,E) the corresponding planning graph, with g = {g1, …, gn}
gk, k ∈ [1,n], is reachable from si if there is a proposition layer Pg such that gk ∈ Pg
In proposition layer Pm: if gk is not in Pm, then gk is not reachable in m steps
Define the (admissible) heuristic hPG(gk) = m for {gk} reachable in m steps
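On the relaxed (delete-free) problem, this level heuristic is just the index of the first proposition layer containing all goal atoms; a minimal sketch (encoding ours):

```python
def h_pg(s0, goal, actions, max_levels=50):
    """Reachability-level heuristic on the delete-free relaxation.
    actions: list of (preconditions, add-effects) pairs of atom sets."""
    level, m = set(s0), 0
    while m <= max_levels:
        if goal <= level:
            return m                    # first layer containing all goal atoms
        nxt = set(level)                # maintenance: atoms persist
        for pre, add in actions:
            if pre <= level:
                nxt |= add
        if nxt == level:
            return float("inf")         # fixpoint reached: goal unreachable
        level, m = nxt, m + 1
    return float("inf")
```

For the delete-free dinner relaxation (cook and wrap only), dinner and present both first appear at layer 1, so h = 1; a goal already true in s0 gets h = 0, and an unreachable goal gets infinity.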
Key Concepts in Modeling Plan-based Activities Agent: one that acts Actions: simple and complex Beliefs: what an agent knows Desires: the states of the world an agent “prefers” Intentions: actions the agent is committed to Recipes: descriptions/representations of ways to do a complex action or accomplish a task.
Plans: recipes+beliefs+intentions
Announcements Reading for Thursday, Oct 24: AIMA 3e, 11.2, skim 11.1. Also recommended: D. S. Nau. “Current trends in automated planning”, AI Magazine 28(4):43–58, 2007 Background: Ghallab, Nau and Traverso, Automated Planning: Theory and Practice, Morgan Kaufmann, 2004.
Assignment date changes: Assignment 4: available Thursday, due November 5 Project proposals: now due Sunday, November 3.
Sections this week: planning as logic+search
Course mid-semester questionnaire (updated)