Intelligent Agents
Chapter 2

Outline

♦ Agents and environments
♦ Rationality
♦ PEAS (Performance measure, Environment, Actuators, Sensors)
♦ Environment types
♦ Agent types

Agents and environments

[Figure: an agent (containing "?") connected to the environment: percepts arrive through sensors, actions are issued through actuators]

Agents include humans, robots, softbots, thermostats, etc.
The agent function maps from percept histories to actions:

    f : P* → A

The agent program runs on the physical architecture to produce f

Vacuum-cleaner world

[Figure: two squares, A and B, with the vacuum agent in square A]

Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp

A vacuum-cleaner agent

    Percept sequence            Action
    [A, Clean]                  Right
    [A, Dirty]                  Suck
    [B, Clean]                  Left
    [B, Dirty]                  Suck
    [A, Clean], [A, Clean]      Right
    [A, Clean], [A, Dirty]      Suck
    ...

What is the right function?
Can it be implemented in a small agent program?

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
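As a concrete (if impractical) sketch, the tabulated agent function f : P* → A can be written as a lookup table keyed by percept sequences. The table entries come from the slide above; encoding percepts as Python tuples is an illustrative assumption:

```python
# Sketch of the vacuum agent function as a lookup table over percept
# sequences (f : P* -> A).  Percepts are (location, status) tuples;
# this encoding is an illustrative assumption, not the slide's notation.
AGENT_FUNCTION = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Clean"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("A", "Dirty")): "Suck",
    # ... one entry for every possible percept sequence
}

def f(percept_sequence):
    """Map a percept history to an action by table lookup."""
    return AGENT_FUNCTION[tuple(percept_sequence)]
```

The table grows without bound as percept histories lengthen, which is exactly why the slide asks whether f can instead be produced by a small agent program.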
Rationality

Fixed performance measure evaluates the environment sequence
    – one point per square cleaned up in time T?
    – one point per clean square per time step, minus one per move?
    – penalize for > k dirty squares?

A rational agent chooses whichever action maximizes the expected value of the performance measure given the percept sequence to date

Rational ≠ omniscient
    – percepts may not supply all relevant information
Rational ≠ clairvoyant
    – action outcomes may not be as expected
Hence, rational ≠ successful

Rational ⇒ exploration, learning, autonomy
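The second candidate measure above can be computed directly from an environment sequence. A minimal sketch, assuming the history is encoded as one (squares, action) pair per time step; that encoding is mine, not the slide's:

```python
# Sketch of one candidate performance measure from the slide:
# one point per clean square per time step, minus one per move.
# The (squares, action) history encoding is an illustrative assumption.

def score(history):
    """history: list of (squares, action) pairs, one per time step,
    where squares maps square name -> "Clean" or "Dirty"."""
    total = 0
    for squares, action in history:
        total += sum(1 for s in squares.values() if s == "Clean")
        if action in ("Left", "Right"):  # moving costs one point
            total -= 1
    return total
```

Comparing agents under different candidate measures like this is what makes the choice of performance measure part of the design problem.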
PEAS

To design a rational agent, we must specify the task environment

Consider, e.g., the task of designing an automated taxi:

Performance measure?? safety, destination, profits, legality, comfort, . . .
Environment?? streets in Lower Mainland, traffic, pedestrians, weather, . . .
Actuators?? steering, accelerator, brake, horn, speaker/display, . . .
Sensors?? video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .
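A PEAS specification is just four lists, so it can be recorded as a simple structure. The dataclass below is only an illustrative way to organize the taxi slide's entries, not a prescribed format:

```python
# A PEAS task-environment description as a plain record.  Field values
# are taken from the taxi slide; the dataclass itself is an
# illustrative assumption about how to organize them.
from dataclasses import dataclass

@dataclass
class PEAS:
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

taxi = PEAS(
    performance_measure=["safety", "destination", "profits",
                         "legality", "comfort"],
    environment=["streets", "traffic", "pedestrians", "weather"],
    actuators=["steering", "accelerator", "brake", "horn",
               "speaker/display"],
    sensors=["video", "accelerometers", "gauges", "engine sensors",
             "keyboard", "GPS"],
)
```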
Internet shopping agent

Performance measure?? price, quality, appropriateness, efficiency
Environment?? current and future WWW sites, vendors, shippers
Actuators?? display to user, follow URL, fill in form
Sensors?? HTML pages (text, graphics, scripts)

Environment types

                     8-Puzzle    Backgammon    Internet shopping        Taxi
    Observable??     Yes         Yes           No                       No
    Deterministic??  Yes         No            Partly                   No
    Episodic??       No          No            No                       No
    Static??         Yes         Yes           Semi                     No
    Discrete??       Yes         Yes           Yes                      No
    Single-agent??   Yes         No            Yes (except auctions)    No

The environment type largely determines the agent design
The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types

Four basic types in order of increasing generality:
    – simple reflex agents
    – reflex agents with state
    – goal-based agents
    – utility-based agents
All these can be turned into learning agents

Simple reflex agents

[Figure: agent architecture: sensors report "What the world is like now"; condition-action rules select "What action I should do now"; actuators act on the environment]

Example

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
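The Reflex-Vacuum-Agent pseudocode above translates line-for-line into Python. The two-square world model and step loop below are illustrative assumptions added only to exercise the agent:

```python
# Direct Python translation of the Reflex-Vacuum-Agent pseudocode,
# plus a tiny two-square world to drive it.  The world model and the
# fixed-step loop are illustrative assumptions, not part of the slide.

def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"

def simulate(squares, location, steps):
    """Run the agent for a fixed number of steps; return final state."""
    for _ in range(steps):
        action = reflex_vacuum_agent((location, squares[location]))
        if action == "Suck":
            squares[location] = "Clean"
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return squares, location
```

Starting with both squares dirty in square A, three steps (Suck, Right, Suck) leave both squares clean.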
Reflex agents with state

[Figure: agent architecture: sensors report "What the world is like now"; internal state is maintained using "How the world evolves" and "What my actions do"; condition-action rules select "What action I should do now"; actuators act on the environment]

Example

function Reflex-Vacuum-Agent([location, status]) returns an action
    static: last_A, last_B, numbers, initially ∞
    if status = Dirty then . . .

Goal-based agents

[Figure: agent architecture: sensors report "What the world is like now"; state, "How the world evolves", and "What my actions do" predict "What it will be like if I do action A"; goals determine "What action I should do now"; actuators act on the environment]

Utility-based agents

[Figure: agent architecture: as for goal-based agents, but a utility function rates "How happy I will be in such a state" in order to select "What action I should do now"]

Learning agents

[Figure: learning-agent architecture: a performance standard feeds a critic; the critic gives feedback to the learning element, which makes changes to the performance element and poses learning goals to a problem generator; the performance element connects sensors to actuators in the environment]

Summary

Agents interact with environments through actuators and sensors
The agent function describes what the agent does in all circumstances
The performance measure evaluates the environment sequence
A perfectly rational agent maximizes expected performance
Agent programs implement (some) agent functions
PEAS descriptions define task environments
Environments are categorized along several dimensions:
    observable? deterministic? episodic? static? discrete? single-agent?
Several basic agent architectures exist:
    reflex, reflex with state, goal-based, utility-based
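The stateful Example above elides its body with ". . .". Purely as a hypothetical completion (the counter policy below is an assumption, not the slide's code), internal state lets the agent stop moving once it has recently seen both squares clean, avoiding the memoryless agent's endless shuttling:

```python
# Hypothetical completion of the stateful reflex vacuum agent.  The
# slide shows only "static: last_A, last_B, initially infinity"; the
# idle-when-recently-clean policy here is an illustrative assumption.
import math

last = {"A": math.inf, "B": math.inf}  # steps since square last seen clean

def stateful_vacuum_agent(percept):
    location, status = percept
    for square in last:                # age both counters each step
        last[square] += 1
    if status == "Dirty":
        return "Suck"
    last[location] = 0                 # this square is clean right now
    if last["A"] <= 2 and last["B"] <= 2:
        return "NoOp"                  # both squares recently clean: rest
    return "Right" if location == "A" else "Left"
```

Under the "minus one per move" performance measure, resting with NoOp once both squares are known clean scores strictly better than the simple reflex agent's perpetual Left/Right cycle.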