Virtual Vision: A Simulation Framework for Camera Sensor Networks Research

Virtual Vision:
A Simulation Framework for Camera Sensor Networks Research
Demetri Terzopoulos, UCLA

2007 PhD Thesis of Faisal Qureshi, University of Toronto

Publications:
• 2008 Proceedings of the IEEE
• 2008 11th Communications and Networking Simulation Symposium (CNS)
• 2008 ACM Symposium on Virtual Reality Software and Technology (VRST)
• 2007 First ACM/IEEE Intl. Conf. on Distributed Smart Cameras (ICDSC)
• 2007 IEEE/ACM Intl. Conf. on Distributed Computing in Sensor Systems (DCOSS)
• 2007 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)
• 2006 ACM Intl. Wksp. on Distributed Smart Cameras (DSC)
• 2006 ACM Multimedia Systems Journal
• 2005 ACM Wksp. on Video Surveillance and Sensor Networks (VSSN)
• 2005 IEEE Intl. Wksp. on Visual Surveillance (VS-PETS)


Camera Networks and Smart Cameras
• Visual surveillance is becoming ubiquitous
  – London has roughly 4,000,000 cameras
• Effective visual coverage of large spaces requires multi-camera systems
• Operator monitoring is infeasible for large networks
• Need networks of smart cameras capable of autonomous operation
• Smart cameras = visual sensor nodes
  – Local on-board processing
  – Communication with neighbors

Difficulty of Doing Large-Scale Visual Sensor Networks Research
• Deploying large-scale camera networks in extensive public spaces for research purposes poses:
  – Hardware-related technical challenges
  – Privacy and legal issues
  – Prohibitive cost
• Infeasible for most computer vision researchers


Virtual Camera Sensor Networks

Virtual Vision: visually and behaviorally realistic simulators for designing and evaluating machine vision systems

[Framework diagram]
• Reality Emulator (Virtual Penn Station)
  – Environment models: geometry, texture, illumination
  – Pedestrian models: appearance, movement, behavior
• Camera model: virtual camera (pan, tilt, zoom, camera jitter, color response, lens distortions, etc.)
• Virtual Video
• Machine Vision: tracking, recognition, etc.
• High-Level Control: camera control, assignment, handover, etc.


Virtual Vision for Camera Networks Research
• Emulates the characteristics of a physical vision system
• Flexibility during system design and evolution
• Readily available ground truth
• Offline operation and testing
• No legal impediments
• No special hardware
• Repeatability
• Low cost


Environmental Model of the Original Penn Station in NYC [Shao & Terzopoulos, 2005]

[Station model regions: Platforms, Concourses, Main Waiting Room, Arcade]


Architecture of the Autonomous Pedestrian Model

[Model layers: Cognition, Behavior, Motion, Geometry, with Environment & Interaction]

Pedestrian Simulation


Pedestrian Simulation

Human Activity in the Station


Virtual Vision
• Testbed for multi-camera sensor networks

Computer Vision Emulation Using Synthetic Video
• Pedestrian detection
  – Background subtraction using a learned background model
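The learned-background detector can be sketched as a per-pixel running average with thresholding; the array sizes, the learning rate `alpha`, and the threshold value below are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential running average: the background model is learned over time."""
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25.0):
    """Pixels that deviate strongly from the learned background are foreground."""
    return np.abs(frame - bg) > thresh

# Toy example: a static 8x8 background with one bright "pedestrian" pixel.
bg = np.full((8, 8), 100.0)
frame = bg.copy()
frame[4, 4] = 200.0            # moving object appears here
mask = foreground_mask(bg, frame)
print(int(mask.sum()))         # -> 1
```

In a real detector the background keeps adapting via `update_background` on frames without motion, so slow illumination changes do not trigger false detections.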



Low-Level Computer Vision Emulation Using Synthetic Video
• Pedestrian tracker
  – Appearance based

Low-Level Computer Vision Emulation Using Synthetic Video
• Appearance-based pedestrian tracker
  – Histogram backprojection [Swain & Ballard 91]
  – Zoom invariant

[Figure: video frame, signature image, and backprojected image; high intensity suggests presence of the target]
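A grayscale sketch of histogram backprojection in the spirit of [Swain & Ballard 91]; real trackers use color histograms, and the bin count and toy images here are assumptions for illustration.

```python
import numpy as np

def signature(patch, bins=8):
    """Normalized intensity histogram of the target patch (its 'signature')."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / hist.sum()

def backproject(frame, sig, bins=8):
    """Replace each pixel with its bin's signature weight; high values suggest the target."""
    idx = np.clip((frame.astype(int) * bins) // 256, 0, bins - 1)
    return sig[idx]

target = np.full((4, 4), 240)      # bright target patch
frame = np.full((10, 10), 20)      # dark background
frame[5:7, 5:7] = 240              # target appears here
bp = backproject(frame, signature(target))
peak = tuple(int(i) for i in np.unravel_index(bp.argmax(), bp.shape))
print(peak)                        # -> (5, 5)
```

Because the signature is a normalized histogram, it depends only on the target's color distribution and not on its apparent size in the image, which is what makes this kind of tracker zoom invariant.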


Anatomy of a Camera Node

[Camera node block diagram]
• Decision logic
• Communication subsystem: message passing to neighboring nodes
• Task “relevance” computation framework
• Vision routines: pedestrian tracking
• State management: Free, Tracking, Searching, and Lost states, with region-of-interest, failure/reacquire, and timeout transitions
• Image-driven reactive behaviors: fixation and zooming PD controllers driving the active pan-tilt-zoom camera (pan left/right, tilt up/down, zoom)
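The fixation behavior can be approximated by a proportional-derivative loop that drives one camera axis toward the target; the gains, the 1D pan model, and all names below are illustrative assumptions, not the thesis controller.

```python
class PDController:
    """Proportional-derivative controller for one axis (pan, tilt, or zoom),
    driving the tracked target toward the image center."""
    def __init__(self, kp, kd):
        self.kp, self.kd = kp, kd
        self.prev_error = 0.0

    def step(self, error, dt):
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.kd * derivative

# Drive a simulated pan angle toward a target that starts 40 units off-center.
pan, target = 0.0, 40.0
ctrl = PDController(kp=0.5, kd=0.1)
for _ in range(50):
    pan += ctrl.step(target - pan, dt=1.0)
print(round(pan, 2))   # -> 40.0
```

The derivative term damps the proportional pull, so the simulated pan settles on the target without the sustained oscillation a purely proportional controller can exhibit.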

Virtual Vision: A Tool for Visual Sensor Network Research

[Parallel pipelines under a shared High-Level Camera Control:]
• Virtual Vision: Synthetic World (Reality Emulator) → Virtual Camera Network → Synthetic Video → Visual Sensing
• Real Vision: Real World → Physical Camera Network → Real Video → Visual Sensing


Camera Sensor Network #1:
Active PTZ Camera Scheduling

Scheduling Active PTZ Cameras
• Number of pedestrians > number of active cameras
• Task active cameras to observe pedestrians in the scene

[Figure: active PTZ camera and passive wide-FOV camera]


Scheduling Active PTZ Cameras
• Scheduling problem
  – Given n PTZ cameras and m pedestrians, persistently observe every pedestrian using one PTZ camera at a time
• Goal
  – Observe as many pedestrians for as long as possible

Virtual Active Camera Scheduling Setup
• Passive wide-FOV cameras
  – Calibrated
  – Pedestrian localization through triangulation
• Active PTZ cameras
  – Un-calibrated
  – Learn a coarse mapping between 3D locations and internal pan-tilt settings
• Reliable pedestrian identification across different cameras via appearance-based signatures
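Pedestrian localization from two calibrated wide-FOV views can be sketched as ray triangulation: each camera contributes a viewing ray, and the estimate is the midpoint of the shortest segment between them. The camera positions and pedestrian location below are made up for illustration.

```python
import numpy as np

def triangulate(c1, d1, c2, d2):
    """Midpoint of the closest approach between two viewing rays
    (camera center c, direction d toward the pedestrian)."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    # Solve for ray parameters t1, t2 minimizing |(c1 + t1*d1) - (c2 + t2*d2)|.
    A = np.stack([d1, -d2], axis=1)
    t, *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    p1, p2 = c1 + t[0] * d1, c2 + t[1] * d2
    return 0.5 * (p1 + p2)

# Two corner cameras (3 m high) both see a pedestrian standing at (5, 5, 0).
ped = np.array([5.0, 5.0, 0.0])
c1, c2 = np.array([0.0, 0.0, 3.0]), np.array([10.0, 0.0, 3.0])
est = triangulate(c1, ped - c1, c2, ped - c2)
print(est)   # close to [5, 5, 0]
```

With noisy image measurements the two rays no longer intersect exactly, which is why the midpoint (rather than an intersection) is used.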

[Figure: calibrated passive cameras at the four corners of the waiting room in the virtual train station]
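The coarse 3D-to-pan/tilt mapping for the un-calibrated PTZ cameras can be sketched as inverse-distance interpolation over sampled (position, pan/tilt) pairs; all sample values below are invented for illustration, and the thesis learns its mapping differently in detail.

```python
import numpy as np

# Hypothetical calibration samples: floor positions and the pan/tilt settings
# (in degrees) that centered a target there, gathered by sweeping the camera.
samples_xyz = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [10, 10, 0]], float)
samples_pt  = np.array([[-30, -10], [30, -10], [-30, 10], [30, 10]], float)

def pan_tilt_for(pos):
    """Coarse lookup: inverse-distance weighting of the stored samples."""
    d = np.linalg.norm(samples_xyz - pos, axis=1)
    w = 1.0 / (d + 1e-9)
    return (w[:, None] * samples_pt).sum(axis=0) / w.sum()

center = pan_tilt_for(np.array([5.0, 5.0, 0.0]))
print(np.round(center, 1))   # -> [0. 0.]
```

A lookup at a stored sample position returns (almost exactly) that sample's pan/tilt, and positions in between blend the neighbors, which is all the coarse mapping needs to point the camera roughly at a pedestrian before the reactive fixation loop takes over.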


Distinguishing Features of Our Active Camera Scheduling Strategy
• Previous attempts
  – Costello et al., 2004; Costello et al., 2006; Lim et al., 2006; Huang and Trivedi, 2003
• Exit times are not known a priori
• Recording duration is not known a priori
• Allows multiple observations of the same pedestrian
• Multiple pan/tilt/zoom cameras
• Handles tracking failures due to occlusions

Active Camera Scheduling Strategy
• Camera assignment via weighted round robin
• First-come, first-served tie breaking
• Multiple observations
• Preemption

[Close-up snapshots captured by active PTZ cameras]
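A minimal sketch of one weighted round-robin assignment pass with first-come, first-served tie breaking; the pedestrian records, the weight field, and the single-pass structure are simplifying assumptions, not the thesis scheduler.

```python
from collections import deque

def weighted_round_robin(pedestrians, n_cameras):
    """One scheduling pass: pedestrians with higher weight (e.g. shorter
    expected time before exiting) get cameras first; earlier arrival breaks ties."""
    queue = sorted(pedestrians, key=lambda p: (-p["weight"], p["arrived"]))
    assignment = {}
    free = deque(range(n_cameras))
    for ped in queue:
        if not free:
            break              # remaining pedestrians wait for the next pass
        assignment[ped["id"]] = free.popleft()
    return assignment

peds = [
    {"id": "A", "weight": 2, "arrived": 0},
    {"id": "B", "weight": 5, "arrived": 3},
    {"id": "C", "weight": 5, "arrived": 1},
]
sched = weighted_round_robin(peds, 2)
print(sched)   # -> {'C': 0, 'B': 1}
```

B and C tie on weight, so C wins by earlier arrival; preemption would additionally let a newly urgent pedestrian displace a currently assigned one between passes.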


Active Camera Scheduling Results

[Legend: preemption (P) vs. no preemption (NP); single observation (SO) vs. multiple observations (MO); single-class model (SC) vs. multi-class model (MC)]
• The multiple-observations, multi-class, preemption scheduler outperforms the other variants
• Up to 4 cameras; 10, 20 pedestrians

Camera Sensor Network #2:
Active PTZ Camera Assignment and Grouping


Smart Camera Network
• Cameras self-organize to observe pedestrians during their presence in the region
  – Smart, active PTZ cameras
  – Un-calibrated
  – Ad hoc deployment
  – Camera additions & removals
  – Handles camera failures
  – Deals with imperfect communication
• Previous work
  – Simmons et al., 2006; Mallet, 2006; Park et al., 2006; Kulkarni et al., 2005; Javed et al., 2000; Devarajan and Radke, 2006

Vision and Goal • Ad hoc deployment


Vision and Goal • Cameras work towards common sensing goals

Task assignment

Vision and Goal • Cameras work towards common sensing goals

Group formation


Vision and Goal • Cameras work towards common sensing goals

Group evolution

Vision and Goal • Cameras work towards multiple common sensing goals

Group 1

Group 2


Vision and Goal • Cameras work towards multiple common sensing goals

Camera failure

Vision and Goal • Cameras work towards multiple common sensing goals

Task assignment


Vision and Goal • Cameras work towards multiple common sensing goals

Conflict resolution

Camera Grouping and Reassignment


Active PTZ Camera Assignment and Grouping
• Camera selection, grouping, and handoff via an auction model
  – Announcement / Bidding / Selection
  – ContractNet [Smith, 1983]
• Conflict resolution within a Constraint Satisfaction Problem framework
  – Partially distributed
• A camera can only perform a single task at any given time
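A toy sketch of one announcement/bidding/selection round in the ContractNet style; the camera names, the relevance table, and the recruitment threshold are invented for illustration (ContractNet specifies the message protocol, not this code).

```python
def run_auction(leader, neighbors, relevance_of, threshold=0.5):
    """One auction round: the leader announces the task, neighbors bid
    with their relevance, and the leader recruits the strong bidders."""
    # Announcement + Bidding: each reachable neighbor reports its own
    # estimate of how successful it would be at the observation task.
    bids = {cam: relevance_of(cam) for cam in neighbors}
    # Selection: recruit cameras whose relevance clears the threshold.
    return sorted(cam for cam, rel in bids.items() if rel >= threshold)

relevance = {"cam2": 0.9, "cam3": 0.2, "cam4": 0.7}
group = run_auction("cam1", ["cam2", "cam3", "cam4"], relevance.get)
print(group)   # -> ['cam2', 'cam4']
```

In the real network some neighbors never receive the announcement or never respond; a timeout on the bidding phase lets the leader proceed with whatever bids arrived.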

Camera Grouping: Announcement
• Start with a single camera that is tasked to observe a person

[Diagram: leader camera]


Camera Grouping: Announcement
• The leader seeks out other cameras in the vicinity to form a group to help it with the observation task

[Diagram: leader announcing the task to nearby cameras]

Camera Grouping: Bidding
• One or more cameras that receive the task announcement respond with their relevance values

[Diagram: leader; responses with high and low relevance; some cameras never received the message or did not respond]


Camera Grouping: Bidding
• One or more cameras that receive the task announcement respond with their relevance values
• Relevance encodes how successful a camera will be at an observation task

[Diagram: leader; responses with high and low relevance; some cameras never received the message or did not respond]

Camera Grouping: Selection
• After the leader receives relevance messages from neighboring cameras, it selects suitable cameras to join the group

[Diagram: selected cameras join the leader's group]


Conflict Detection
• A conflict is detected when multiple tasks require the same camera node to proceed successfully

Conflict Detection
• A Red group member receives a recruit query from the Green group



Conflict Detection
• A Red group member receives a recruit query from the Green group

Conflict!


Conflict Detection
• Nodes belonging to both groups send information to one of the leaders
  – Leader selection; centralization

Conflict Detection
• The resulting node (camera) assignment is sent to the individual nodes

[Diagram: solution]


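The conflict-resolution step can be illustrated as a tiny backtracking constraint-satisfaction search in which each task needs one camera and no camera may serve two tasks; the task names, candidate sets, and the return-None-on-failure policy are hypothetical simplifications of the partially distributed CSP in the thesis.

```python
def resolve_conflicts(tasks, assignment=None):
    """Backtracking CSP: assign each task a camera from its candidate set
    such that no camera serves two tasks (the single-task constraint)."""
    assignment = assignment or {}
    if len(assignment) == len(tasks):
        return assignment
    task = next(t for t in tasks if t not in assignment)
    for cam in tasks[task]:
        if cam not in assignment.values():              # constraint check
            result = resolve_conflicts(tasks, {**assignment, task: cam})
            if result:
                return result
    return None   # unsatisfiable: some task must be dropped or re-auctioned

# Red and Green both want cam2; the search finds a non-conflicting assignment.
tasks = {"red": ["cam2", "cam1"], "green": ["cam2"]}
sol = resolve_conflicts(tasks)
print(sol)   # -> {'red': 'cam1', 'green': 'cam2'}
```

This also shows why the slide bounds the number of relevant sensors per task: backtracking cost grows quickly with the candidate sets, so keeping them small keeps centralized resolution cheap.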

Persistent Observation: Virtual Active PTZ Camera Assignment and Grouping


Network Model
• Does not require camera calibration or network calibration
  – Can take advantage of calibration information, if available
  – Ad hoc deployment
• Camera grouping is a strictly local negotiation
  – Typically camera groups are spatially local arrangements
• Camera groups are dynamic arrangements
• Camera handoffs occur naturally during negotiations
• Gracefully handles node and message failures

Network Model
• Camera nodes can be added or removed during the lifetime of a task
• Even assuming perfect sensing, the proposed model can still fail if
  – a significant number of messages are lost
  – there is a catastrophic node failure
  – group evolution can't keep up with a fast-changing observation task
• Scalability
  – Small group sizes
  – Conflict resolution is viable as long as the number of relevant sensors for each task remains low (< 10)
  – Optimal sensor assignment


Future Work
• Camera sensor networks
  – Consistent labeling of pedestrians
  – Who, what, where, when?
  – Cognitive modeling
• Physical sensor networks
  – Environment monitoring, urban sensing, intelligent environments, …
• Bigger, better reality emulators
  – High-fidelity synthetic video
  – An entire city …

Virtual LA, Urban Simulation Lab, Architecture Dept., UCLA


Virtual LA • Building level of detail

Virtual LA • Interior spaces


Strategy Looking Forward
• Collaborative engagement of the CV and CG communities
  – Open-source framework to develop and test state-of-the-art vision algorithms and systems
  – Modular, web-accessible environment
    • e.g., SRI's Open Agent Architecture (OAA)
  – "Any-world" emulator
  – More realism ("The Matrix")

Thank You!

