Author: Ursula Elliott
Intelligent Agents
Chapter 2

Vacuum-cleaner world

[figure: the two-square vacuum world, squares A and B]

Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp

Outline

♦ Agents and environments
♦ Rationality
♦ PEAS (Performance measure, Environment, Actuators, Sensors)
♦ Environment types
♦ Agent types

Agents and environments

[figure: the agent receives percepts from the environment through sensors and acts on it through actuators; a "?" marks the agent program to be designed]

Agents include humans, robots, softbots, thermostats, etc.

The agent function maps from percept histories to actions:

f : P* → A

The agent program runs on the physical architecture to produce f

A vacuum-cleaner agent

Percept sequence          Action
[A, Clean]                Right
[A, Dirty]                Suck
[B, Clean]                Left
[B, Dirty]                Suck
[A, Clean], [A, Clean]    Right
[A, Clean], [A, Dirty]    Suck
...                       ...

What is the right function? Can it be implemented in a small agent program?

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
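The agent program above can be sketched in Python. The two-square world, the square names A/B, and the percept format are as in the slides; the simulation helper and its dirt representation are illustrative assumptions, not part of the slides:

```python
# A minimal sketch of the reflex vacuum agent.
# A percept is a (location, status) pair, e.g. ("A", "Dirty").

def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"

# Hypothetical two-square simulation to exercise the agent program.
def simulate(dirt, location="A", steps=4):
    """dirt: dict like {"A": True, "B": False} (True = dirty).
    Returns the list of actions taken."""
    actions = []
    for _ in range(steps):
        status = "Dirty" if dirt[location] else "Clean"
        action = reflex_vacuum_agent((location, status))
        actions.append(action)
        if action == "Suck":
            dirt[location] = False
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return actions
```

Starting in square A with both squares dirty, `simulate({"A": True, "B": True})` produces the action sequence `["Suck", "Right", "Suck", "Left"]`, matching the tabulated agent function.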

Rationality

Fixed performance measure evaluates the environment sequence
– one point per square cleaned up in time T?
– one point per clean square per time step, minus one per move?
– penalize for > k dirty squares?

A rational agent chooses whichever action maximizes the expected value of the performance measure given the percept sequence to date

Rational ≠ omniscient
– percepts may not supply all relevant information
Rational ≠ clairvoyant
– action outcomes may not be as expected
Hence, rational ≠ successful

Rational ⇒ exploration, learning, autonomy
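The second candidate measure above can be made concrete. A sketch, assuming the environment history is recorded as a list of (dirt-state, action) snapshots; that representation is an illustrative assumption:

```python
# Score an environment history under "one point per clean square
# per time step, minus one per move".

def score(history):
    """history: list of (dirt, action) snapshots, where dirt maps
    square name -> True if dirty, and action is the agent's choice."""
    total = 0
    for dirt, action in history:
        total += sum(1 for dirty in dirt.values() if not dirty)  # clean squares
        if action in ("Left", "Right"):
            total -= 1  # penalty per move
    return total
```

For example, a history in which the agent sucks in A, moves right, sucks in B, then idles scores 0 + 0 + 1 + 2 = 3 points over its four time steps.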

PEAS

To design a rational agent, we must specify the task environment

Consider, e.g., the task of designing an automated taxi:

Performance measure?? safety, destination, profits, legality, comfort, . . .
Environment?? streets in Lower Mainland, traffic, pedestrians, weather, . . .
Actuators?? steering, accelerator, brake, horn, speaker/display, . . .
Sensors?? video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .
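A PEAS description is structured data, so it can be written down directly. A sketch, with values copied from the taxi example above; the dataclass layout itself is an illustrative assumption:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A task-environment specification: Performance measure,
    Environment, Actuators, Sensors."""
    performance: list
    environment: list
    actuators: list
    sensors: list

taxi = PEAS(
    performance=["safety", "destination", "profits", "legality", "comfort"],
    environment=["streets", "traffic", "pedestrians", "weather"],
    actuators=["steering", "accelerator", "brake", "horn", "speaker/display"],
    sensors=["video", "accelerometers", "gauges", "engine sensors", "GPS"],
)
```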

Internet shopping agent

Performance measure?? price, quality, appropriateness, efficiency
Environment?? current and future WWW sites, vendors, shippers
Actuators?? display to user, follow URL, fill in form
Sensors?? HTML pages (text, graphics, scripts)


Environment types

                    8-Puzzle    Backgammon    Internet shopping        Taxi
Observable??        Yes         Yes           No                       No
Deterministic??     Yes         No            Partly                   No
Episodic??          No          No            No                       No
Static??            Yes         Yes           Semi                     No
Discrete??          Yes         Yes           Yes                      No
Single-agent??      Yes         No            Yes (except auctions)    No

The environment type largely determines the agent design

The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types

Four basic types in order of increasing generality:
– simple reflex agents
– reflex agents with state
– goal-based agents
– utility-based agents

All these can be turned into learning agents

Simple reflex agents

[figure: sensors report "what the world is like now"; condition–action rules pick "what action I should do now"; actuators act on the environment]

Example

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left
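The environment-type table above can also be encoded as data, e.g. as a first step toward choosing an agent design automatically. A sketch; the dictionary layout and `describe` helper are illustrative assumptions, while the values come from the table:

```python
# Environment properties from the table, one tuple per task,
# in the order the rows appear.
ENVIRONMENTS = {
    "8-Puzzle":          ("Yes", "Yes",    "No", "Yes",  "Yes", "Yes"),
    "Backgammon":        ("Yes", "No",     "No", "Yes",  "Yes", "No"),
    "Internet shopping": ("No",  "Partly", "No", "Semi", "Yes", "Yes (except auctions)"),
    "Taxi":              ("No",  "No",     "No", "No",   "No",  "No"),
}

PROPERTIES = ("observable", "deterministic", "episodic",
              "static", "discrete", "single_agent")

def describe(task):
    """Return one task's properties as a name -> value dict."""
    return dict(zip(PROPERTIES, ENVIRONMENTS[task]))
```

For instance, `describe("Taxi")` shows "No" for every property, reflecting the remark that the real world is partially observable, stochastic, sequential, dynamic, continuous, and multi-agent.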

Reflex agents with state

[figure: an internal state, updated via "how the world evolves" and "what my actions do", combines with the sensors to estimate "what the world is like now"; condition–action rules then pick "what action I should do now" for the actuators]

Example

function Reflex-Vacuum-Agent([location, status]) returns an action
    static: last A, last B, numbers, initially ∞
    if status = Dirty then . . .

Utility-based agents

[figure: the state model predicts "what it will be like if I do action A"; a utility function rates "how happy I will be in such a state"; the agent chooses the action leading to the best-rated state]

Learning agents

[figure: a critic compares sensor feedback against a performance standard and sends feedback to the learning element, which makes changes to the performance element and sets learning goals for a problem generator]
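The reflex-agent-with-state design can be sketched concretely for the vacuum world. The policy here (issue NoOp once both squares are known clean) and the state representation are illustrative assumptions, not the elided program from the slides:

```python
class StatefulVacuumAgent:
    """Reflex vacuum agent with internal state: it remembers which
    squares it has observed to be clean."""

    def __init__(self):
        self.known_clean = set()

    def __call__(self, percept):
        location, status = percept
        if status == "Dirty":
            self.known_clean.discard(location)  # our model was stale
            return "Suck"
        self.known_clean.add(location)
        if self.known_clean == {"A", "B"}:
            return "NoOp"  # both squares known clean: stop moving
        return "Right" if location == "A" else "Left"
```

Unlike the stateless agent, which shuttles between A and B forever, this one stops acting once its internal model says there is nothing left to clean.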

Goal-based agents

[figure: as in the reflex agent with state, the model predicts "what it will be like if I do action A"; the agent compares this with its goals to decide "what action I should do now"]

Summary

Agents interact with environments through actuators and sensors

The agent function describes what the agent does in all circumstances

The performance measure evaluates the environment sequence

A perfectly rational agent maximizes expected performance

Agent programs implement (some) agent functions

PEAS descriptions define task environments

Environments are categorized along several dimensions:
observable? deterministic? episodic? static? discrete? single-agent?

Several basic agent architectures exist:
reflex, reflex with state, goal-based, utility-based
