Extending Player/Stage/Gazebo towards Cognitive Robots Acting in Ubiquitous Sensor-equipped Environments Radu Bogdan Rusu, Alexis Maldonado, Michael Beetz

Brian Gerkey

Intelligent Autonomous Systems, Technische Universität München {rusu, maldonad, beetz}@cs.tum.edu

Artificial Intelligence Center, SRI International [email protected]

Abstract— Standardized middleware for autonomous robot control has proven to enable faster deployment of robots, to make robot control code more interchangeable, and to make experiments easier to replicate. Unfortunately, the support provided by current middleware is in most cases limited to what current robots do: navigation. However, as we tackle more ambitious service robot applications, more comprehensive middleware support is needed. We increasingly need the middleware to support ubiquitous sensing infrastructures, robot manipulation tasks, and cognitive capabilities. In this paper we describe and discuss current extensions of the Player/Stage/Gazebo (P/S/G) middleware, one of the most widely used robot middleware packages, of which we are active developers, that satisfy these requirements.

I. INTRODUCTION

Up to a decade ago the robotics community was focusing mainly on robots acting autonomously in real and unmodified environments. The goal was to enable the robots to do, as humans and animals do, all sensing, deliberation, and action selection on board. With the establishment of new research fields, including ubiquitous computing, alternative paths to competent robotic agency have become very plausible and even more promising. In ubiquitous computing, sensors, effectors, and computing devices are distributed and embedded invisibly into the objects of everyday life. They connect automatically to each other, exchange information, and pass commands to connected components. In sensor-equipped environments we have cupboards that “know” what is inside of them, because the objects are tagged with RFID tags and the cupboards are equipped with RFID tag readers. Or we have people instrumented with inertial measurement units at their joints, providing sensor data about the orientations of their limbs.

If we think about the future of service robotics, it seems likely that service robots will be competent and versatile agents in sensor- and effector-equipped operating environments rather than autonomous and insular entities. In ubiquitous robotics, a typical setting is the following. A service robot establishes a connection to the ubiquitous computing, sensing, and actuation infrastructure and makes itself a part of it. Having established the connection, the robot then perceives what is inside a cupboard in the same way as it perceives what is in its hand: by simply retrieving the respective sensor data and interpreting it, although it is not physically connected to the sensor.

To enable and support such seamless integration we propose a common middleware infrastructure for autonomous robots and ubiquitous computing environments [1], [2]. Our running example will be a mobile kitchen robot (see Figure 1) acquiring the skills for setting a table through imitation learning, where a sensor-equipped kitchen observes people setting the table. The robot learns activity models from these observations, and uses the acquired action models as resources for learning high-performance action routines. This is an interesting and challenging problem for cognitive robotics because it involves complex manipulation tasks, the acquisition and use of 3D object maps, the learning of complex action models and high-performance action routines, and the integration of a ubiquitous sensing infrastructure into robotic control, aspects that are beyond the scope of current autonomous robot control systems.

Fig. 1. Cognitive Household Robotic Assistant

This paper gives (1) an overview of recent additions to the Player/Stage/Gazebo software library that include drivers and interfaces for a ubiquitous sensing infrastructure, (2) a report on an extended and comprehensive experiment using this infrastructure for recording kitchen activities of people and learning action models, (3) a sketch of ongoing work to integrate higher-level support for complex manipulation and model acquisition tasks, and (4) our plans for developing and integrating cognitive mechanisms into P/S/G. Taken together, these components give an overview of the current state and the future plans of a key development thread of P/S/G. In more detail, the main contributions of this paper are:

1) Extensions for interfacing ubiquitous sensing infrastructure. These include a coherently designed library of interfaces to a variety of sensing and computing devices with incompatible native software interfaces, additional logging and synchronization mechanisms, and mechanisms for querying multiple sensors and combining the resulting data.

2) Extensions for the integration of robotic manipulation. The upcoming need for more sophisticated manipulation capabilities constitutes another interesting challenge for robotic middleware. Robotic manipulation in a kitchen scenario requires reach planning in sophisticated 3D maps that are generated on the fly during a reach motion. It also requires the recognition and sometimes even the identification of objects, measuring their form accurately in order to propose adequate grip positions.

3) Extensions for enabling cognitive processing in P/S/G. These will enable Player to realize self-calibrating sensor-equipped environments, to effectively support the simplification of sensing tasks by exploiting contextual information, and to realize higher-level perceptual tasks such as activity recognition, as needed to realize cognitive robots in human environments.

In the remainder of the paper we proceed as follows. Section II provides an overview of our system. Section III describes our architectural and cognitive layer. Section IV presents our Player/Stage/Gazebo extensions in greater detail. Section V describes our application scenario and some experimental results. Section VI presents related work and, finally, in Section VII we draw conclusions.

II. APPLICATION SCENARIO AND OVERVIEW

Let us now (Section II-A) look at a specific application scenario that we will use to motivate our P/S/G extensions and to derive requirements for these extensions (Section II-B).

A. Kitchen Scenario

As the example application scenario that we consider for the extension of P/S/G, we take an autonomous mobile robot equipped with two arms with grippers, acting in a sensor-equipped kitchen environment. The robot is an RWI B21 robot (see Figures 1, 2, 5) equipped with a stereo CCD system and laser rangefinders as its primary sensors. It also has two Amtec Powercube arms with simple grippers. The robot's task will be to set the kitchen table by getting plates and glasses out of the cupboard and putting them on the table.

The sensor-equipped kitchen environment consists of RFID tag readers placed in the cupboards for sensing the identities of the objects placed there. The cupboards also have contact sensors that sense whether the cupboard is open or closed. A variety of wireless sensor nodes equipped with accelerometers and/or ball motion sensors are placed on objects or other items in the environment. Several small, non-intrusive laser range sensors are placed in the environment to track the motions of the people acting there. The kitchen table is equipped with a suite of capacitive sensors that report the capacitance of different

areas of the table when an object is placed there. In addition, seven CCD cameras are mounted such that they cover the whole environment. Finally, machines and tools in the kitchen are also equipped with sensors.

Fig. 2. A general overview of the AwareKitchen architecture

Small ubiquitous devices offer the possibility to instrument people acting in the environment with additional sensors. In our case, we have built a glove equipped with an RFID tag reader (see Figure 7) that enables us to identify the objects manipulated by the person wearing it. In addition, the person is equipped with tiny inertial measurement units (XSens MTx), which provide us with detailed information about the person's limb motions and allow us to reconstruct them (see Figure 3).

It is characteristic for sensor-equipped environments to have sensors that provide redundant information, i.e., some state variables are covered by multiple sensors. Another aspect is that the sensors often provide only partial information about the state we are interested in. For example, the RFID tag reader in the glove senses that an object with a certain id is close to the glove, but it cannot tell us whether the object is in the hand, which may be the state we are actually interested in.

Fig. 3. Images from the AwareKitchen experiment

An important property of the sensing infrastructure of the kitchen is that the sensors are wirelessly connected to other small ubiquitous devices (like Gumstix) and personal

computers that perform state estimation and data-mining tasks. This way, activity data can be collected and activity models can be acquired in a distributed manner.

As our application task we consider the following scenario, which we want to support with our extensions to the P/S/G middleware. Several people set the table for different numbers of guests. The sensor-equipped kitchen observes the setting and learns abstract models of the setting activity. These abstract models are then transferred to the service robot, which uses them to imitate the coarse table-setting activity. It then uses the learned activity model as the starting point for behavior optimization.

B. Requirements for a Middleware in Ubiquitous Robotics

In order to support future application scenarios such as the one described in the previous section, a middleware infrastructure should successfully address the following requirements.

It has to address interoperability, making communication between heterogeneous sensors and actuators possible, and then leave room for fusion, processing, and reasoning extensions. As an example, it should be possible to easily combine an off-the-shelf navigation system (motors, encoders, and controller) with a range finder (sonar, laser, etc.) to build a mobile platform with minimal programming effort, and to use the same platform to develop localization and mapping algorithms, thus making the mobile platform aware of its location.

From an architectural point of view, the system must be as flexible and as powerful as possible. For an infrastructure to become widely adopted by the community, it must impose few, if any, constraints on how the system can be used. Specifically, we require independence with respect to programming language, control paradigm, computing platform, sensor/actuator hardware, and location within a network. In other words, a researcher should be able to: write a control program in any programming language, structure the program in the best way for the application at hand, run the program on any computer (especially low-power embedded systems), make no changes to the program after integrating new hardware, and remotely access the program over a network. Though not a strict requirement, we also aim to maximize the modularity of our architecture, so that researchers can pick and choose the specific components that they find useful, without using the entire system.

We also require a realistic, sensor-based simulation. Simulation is a key capability for ubiquitous computing infrastructure. The main benefits to the user of using a simulation over real hardware are convenience and cost: simulated devices are usually easy to use, their batteries do not run out, and they are much cheaper than real devices. In addition, simulation allows the user to explore system configurations and scales that are not physically realizable because the necessary hardware is not available. The simulation must present the user with the same interface as the real devices, so that moving an experiment between simulation and hardware is seamless, requiring no changes to the code. Though they

may seem lofty, these goals are in fact achievable, as we explain below.

As we envision technical cognitive robotic systems that will become more and more decentralized, using ubiquitous computing devices that communicate and exchange information with each other, one of our goals is to combine research results from both the robotics and the ubiquitous computing research communities, thus bringing them closer together.

III. P/S/G ARCHITECTURE

The Player/Stage/Gazebo (P/S/G) project produces tools for rapid development of robot control code. In the P/S/G software suite, Player provides a simple and flexible interface for communicating with and controlling physically distributed sensors and actuators. Stage and Gazebo are 2D and 3D simulators for multiple robots that include physical simulation.

Fig. 4. A general, brief overview of the Player/Stage/Gazebo architecture

Player provides a simple and flexible interface for robot control by realizing powerful classes of interface abstractions for interacting with robot hardware, in particular sensors and effectors. These abstractions enable the programmer to use devices with similar functionality identically from the code's point of view, thus increasing the transferability of the code. In addition, an enhanced client/server model, featuring auto-discovery mechanisms and permitting servers and clients to communicate with each other in a heterogeneous network, enables programmers to write their clients in a large variety of programming languages. Due to the standards that P/S/G imposes on its architectural and transport layer, client libraries for programming languages such as C, C++, Java, Lisp, Python, Ada, Tcl, Ruby, Octave, Matlab, and Scheme are already available [3].

An important aspect of the P/S/G middleware is that it is developed as a Free Software infrastructure for robots and embedded sensor systems that improves robotics research practice and accelerates development by handling common tasks and providing a standard development platform. By collaborating on a common system, we share the engineering burden and create a means for objectively evaluating published work. If you and I use a common development platform, then you can send me your code and I can replicate your experiments in my lab.

The core of the project is Player itself, which functions as the OS for a sensor-actuator system, providing an abstraction

layer that decouples the user's program from the details of specific hardware (see Figure 5). Player specifies a set of interfaces, each of which defines the syntax and semantics for the allowable interactions with a particular class of sensor or actuator. Common interfaces include laser and ptz, which respectively provide access to scanning laser range-finders and pan-tilt-zoom cameras. A hardware-specific driver does the work of directly controlling a device and mapping its capabilities onto the corresponding interface. Just as a program that uses an OS's standard mouse interface will work with any mouse, a program that uses Player's standard laser interface will work with any laser, be it a SICK or a Hokuyo range-finder.
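To illustrate how such an interface is used from a client, the following minimal sketch, written in the style of the C++ client library (libplayerc++), connects to a Player server and reads from whatever device sits behind the laser interface; the host, port, and device index are placeholder values, and the exact proxy API may differ between Player versions.

  #include <libplayerc++/playerc++.h>
  #include <algorithm>
  #include <iostream>

  int main()
  {
    // Connect to a Player server; which physical (or simulated) laser sits
    // behind laser:0 is decided entirely by the server's configuration file.
    PlayerCc::PlayerClient robot("localhost", 6665);
    PlayerCc::LaserProxy laser(&robot, 0);

    for (int i = 0; i < 100; ++i)
    {
      robot.Read();                               // fetch the latest data from the server
      double min_range = 1e9;
      for (unsigned int j = 0; j < laser.GetCount(); ++j)
        min_range = std::min(min_range, laser.GetRange(j));
      std::cout << "closest obstacle: " << min_range << " m" << std::endl;
    }
    return 0;
  }

The same client would work unchanged against a SICK, a Hokuyo, or a simulated Stage/Gazebo laser, since only the server-side driver changes.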

Fig. 5. An example of the Player ↔ client architecture

The 2D simulator Stage and the 3D simulator Gazebo (see Figure 4) also use a set of drivers to map their simulated devices onto the standard Player interfaces. Thus the interface that is presented to the user remains unchanged from simulation to hardware, and back. For example, a program that drives a simulated laser-equipped robot in Gazebo will also drive a real laser-equipped robot, with no changes to the control code. Programs are often developed and debugged first in simulation, then transitioned to hardware for deployment.

In addition to providing access to (physical or simulated) hardware, Player drivers can implement sophisticated algorithms that use other drivers as sources and sinks for data. For example, the lasercspace driver reads range data from a laser device and convolves that data with the shape of a robot's body to produce the configuration-space boundary. That is, it shortens the range of each laser scan such that the resultant scan delimits the obstacle-free portion of the robot's configuration space [4]. The lasercspace driver's output conforms to the laser interface, which makes it easy to use. Other examples of algorithm drivers include adaptive Monte Carlo localization [5], laser-stabilized odometry [6], and Vector Field Histogram navigation [7]. By incorporating well-understood algorithms into our infrastructure, we eliminate the need for users to individually re-implement them.

Our architectural layer consists of (1) low-level connections to the hardware platforms, (2) sensor fusion and processing, and (3) reasoning algorithms. One can think of the Player driver system as a graph (see Figure 6), where nodes represent the drivers, which interact via well-defined interfaces (edges). Because it is embodied, this graph is grounded in the robot's (physical or simulated) devices. That is, certain leaf drivers of the graph are connected to sensors and actuators. Internal drivers that implement algorithms (e.g., localization) are connected only to other drivers. Because the interfaces are well-defined, drivers are separable in that one can be written without knowledge of the internal workings of another. If my algorithm requires data from a laser range-finder, then the driver that implements my algorithm will take input over a laser edge from another driver; I don't care how that data gets produced, just that it is standard laser data. A wide variety of control systems can be constructed by appropriately configuring and connecting drivers. The control system is also accessible from the outside; an external program (e.g., a client) can connect to any driver via the same standard interfaces. Thus the system is, to a limited extent, reconfigurable at run-time.
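To make the configuration-space idea behind the lasercspace driver described above concrete, the following is an illustrative re-implementation for a circular robot footprint; it is only a sketch under that simplifying assumption (the actual driver works with the robot's real body shape), and all names are ours.

  #include <algorithm>
  #include <cmath>
  #include <vector>

  // Shorten each beam of a laser scan so that the resulting ranges delimit the
  // positions a circular robot of radius r can occupy without touching any
  // scanned obstacle point (a simplified configuration-space boundary).
  std::vector<double> cspace_ranges(const std::vector<double>& ranges,
                                    double angle_min, double angle_step, double r)
  {
    const std::size_t n = ranges.size();
    std::vector<double> px(n), py(n);
    for (std::size_t j = 0; j < n; ++j) {          // obstacle points in Cartesian form
      const double a = angle_min + j * angle_step;
      px[j] = ranges[j] * std::cos(a);
      py[j] = ranges[j] * std::sin(a);
    }

    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
      const double a  = angle_min + i * angle_step;
      const double ux = std::cos(a), uy = std::sin(a);
      double limit = ranges[i];                    // never longer than the raw range
      for (std::size_t j = 0; j < n; ++j) {
        const double along = px[j] * ux + py[j] * uy;        // projection onto beam i
        const double perp2 = px[j] * px[j] + py[j] * py[j] - along * along;
        if (perp2 > r * r) continue;               // point farther than r from this beam
        const double enter = along - std::sqrt(std::max(0.0, r * r - perp2));
        limit = std::min(limit, enter);            // robot center must stop before 'enter'
      }
      out[i] = std::max(0.0, limit);
    }
    return out;
  }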

Fig. 6. Graph of Player drivers and their appropriate interfaces

IV. EXTENDING P/S/G

The following sections describe the extensions that we are currently implementing in P/S/G as part of our ongoing research. Some of them are already available in the CVS repository on SourceForge, and the rest will be made available soon, together with the release of Player 3.0.

A. Ubiquitous Sensing Infrastructure

Extending P/S/G towards ubiquitous sensing includes a development effort to provide drivers for sensors such as heterogeneous Wireless Sensor Networks, RFID technologies, Inertial Measurement Units, and many more. A series of new hardware platforms is going to be supported with the upcoming release of Player 2.1. We have added support for:

• Wireless Sensor Networks - a wide variety of different sensor nodes, ranging from the RCores and Particles from TeCO/Particle Computers to the Mica2 and Mica2Dots from Crossbow, or the Spine;
• RFID technologies - several readers such as the Inside M/R300, the Skyetek M1, and the Skyetek M1-mini are now supported;
• Inertial Measurement Units - supporting the XSens MT9 as well as the XSens MTx, which provide drift-free 3D orientation and kinematic data.

Besides these, we implemented a number of virtual drivers which take care of sensor calibration, data fusion, synchronization, and logging. Furthermore, automatic feature extraction plugins are now available (see [1] for more details), and drivers for feature boosting as well as for learning SVM models are underway.

Fig. 7. Wireless Sensor Networks and RFID technologies in Player

Besides driver development, the integration of ubiquitous sensing and computing infrastructure yields interesting challenges, such as the incorporation of new sensing infrastructure during operation and the active querying of sensor networks for specific information, as is done, e.g., in TinyDB [8].

B. Supporting robotic manipulation

Most middleware for robots provides little support for manipulation tasks. As we move towards cognitive/service robots that are more sophisticated and have to solve more complicated tasks, manipulation becomes essential. Therefore, a key direction in extending P/S/G is to provide the adequate infrastructure. Our robot platform (see Section II-A) has been modified with two Amtec Powercube arms with six degrees of freedom each. They are state-of-the-art manipulators and support a variety of commands that make control very flexible. Each joint has its own integrated controller that supports position, velocity, and current commands. There is also support for synchronized commands to all joints of one manipulator, and synchronized sampling of position data. In order to make use of these advanced capabilities, we have created a P/S/G middleware infrastructure for the manipulators. These middleware components provide tools for acquiring and using 3D obstacle maps, and for efficient manipulator control. We discuss these issues by looking at 3D mapping, trajectory planning in 3D space, and finally the safe execution of such trajectories.

1) 3D Mapping: In order to be able to perform robotic manipulation in a complex environment, the system needs a fine-grained 3D polygonal map, updated preferably as often as possible. In the extended P/S/G such maps can be acquired using a probabilistic algorithm. The algorithm takes an unorganized 3D point cloud as input, and provides a 3D polygonal description as output. Model acquisition based on planar surfaces is particularly adequate for mapping indoor environments. Figure 8 depicts pre-processed laser range data (left) and a plane-based representation (right) of the kitchen from Figures 9 and 2.

Fig. 8. Preliminary results obtained with the polymap driver (left: corrected 3D point cloud; right: polygonal interpretation; each plane is visualized with a different color)

In order to separate the 3D space into planes and, later on, polygons, we use a random sample consensus based algorithm, which makes use of the distribution of points in space and clusters the points into inliers and outliers. Once a planar region has been found, the inliers are fitted with a least-squares technique and projected onto the plane, and a Delaunay triangulation followed by an outline shape detection algorithm is performed. One usage example is demonstrated in Figure 6. As shown there, the laserptzcloud driver takes a 2D scan from a laser interface and the pan-tilt angles of a PTZ unit via a ptz interface, computes the corrected 3D coordinates of each point in space, and outputs a 3D point cloud via the pointcloud3d interface. The polymap driver uses the resulting 3D point cloud and, via a few configuration requests, computes the best N planes that contain X (given) inliers, fits the inliers to the planes, extracts the appropriate clusters, and tries to find polygonal outlines (see Figure 8).
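The following sketch shows the random-consensus step in isolation; the point type, distance threshold, and iteration count are illustrative assumptions and not the polymap driver's actual code or parameters.

  #include <cmath>
  #include <cstdlib>
  #include <vector>

  struct Point3D { double x, y, z; };
  struct Plane   { double a, b, c, d; };   // a*x + b*y + c*z + d = 0, (a,b,c) unit normal

  // Pick three random points, build the plane through them, count the inliers
  // within dist_thresh of that plane, and keep the best-supported candidate.
  Plane ransac_plane(const std::vector<Point3D>& cloud, double dist_thresh,
                     int iterations, std::vector<int>& best_inliers)
  {
    Plane best = {0.0, 0.0, 1.0, 0.0};
    best_inliers.clear();

    for (int it = 0; it < iterations; ++it) {
      const Point3D& p0 = cloud[std::rand() % cloud.size()];
      const Point3D& p1 = cloud[std::rand() % cloud.size()];
      const Point3D& p2 = cloud[std::rand() % cloud.size()];

      // Normal of the plane spanned by (p1 - p0) and (p2 - p0).
      const double v1x = p1.x - p0.x, v1y = p1.y - p0.y, v1z = p1.z - p0.z;
      const double v2x = p2.x - p0.x, v2y = p2.y - p0.y, v2z = p2.z - p0.z;
      double nx = v1y * v2z - v1z * v2y;
      double ny = v1z * v2x - v1x * v2z;
      double nz = v1x * v2y - v1y * v2x;
      const double norm = std::sqrt(nx * nx + ny * ny + nz * nz);
      if (norm < 1e-9) continue;                       // degenerate (collinear) sample
      nx /= norm; ny /= norm; nz /= norm;
      const double d = -(nx * p0.x + ny * p0.y + nz * p0.z);

      std::vector<int> inliers;
      for (std::size_t i = 0; i < cloud.size(); ++i) {
        const double dist = std::fabs(nx * cloud[i].x + ny * cloud[i].y + nz * cloud[i].z + d);
        if (dist < dist_thresh) inliers.push_back(static_cast<int>(i));
      }
      if (inliers.size() > best_inliers.size()) {
        best = {nx, ny, nz, d};
        best_inliers.swap(inliers);
      }
    }
    // In the pipeline described above, the inliers would now be refit by least
    // squares, projected onto the plane, and triangulated into a polygonal outline.
    return best;
  }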

2) 3D Trajectory planning: For reaching and manipulation tasks with the robotic arms, it is important to find the best way to reach a desired position without colliding with obstacles; therefore, a trajectory planner is needed. Our proposal is to use the Motion Strategy Library (MSL) [9], which includes a variety of motion planners, among them an RRT planner [10]. The MSL needs a geometrical model of the environment in order to operate, a model which is provided by the 3D mapping driver described above.

3) Execution of the planned trajectories: Once we have a suitable trajectory for the end effector, the next step is finding the correct angles for each joint, so that it reaches the desired points in space with the desired orientations (known as the inverse kinematics problem). We have developed a Player driver that has a limb interface for inputting the desired position and pose, and that controls an actarray interface connected to a robotic manipulator. This driver reads the description of the arm expressed in the Denavit-Hartenberg convention, so it can be used for any robot manipulator that has a driver in P/S/G. In order to interface with our Powercube manipulators, we have developed a P/S/G driver that implements an actarray interface (which we extended accordingly). It receives commands from the driver described above and controls the arms of the robot through the CAN interface. Since the manipulators can move with high velocities and could easily collide with the environment, the driver also implements safety features such as a watchdog that stops the arm if the Player server stops responding for a fraction of a second.

One of the goals of our project is to create models of actions using information from the arms. In this context it is very important to have precise data regarding the state of all the joints in the arm, and the data for all joints should be sampled synchronously and at a constant rate. To solve this problem, the Powercube driver uses broadcast commands on the CAN bus to instruct all modules to record their positions at the same time, and the data is then retrieved serially from all of them. To maintain a constant cycle rate, the driver instructs the kernel scheduler to treat the Player server as a (soft) real-time process, and it also takes advantage of the recent improvements to POSIX timers in the Linux kernel, the High-Resolution-Timers infrastructure. In this way, the control loop of the arm runs with deviations of less than 10 µs in the cycle time on modern computers, without having to run the arm controller in kernel space.

Learning a model of the manipulator's dynamics is very useful if one wants to improve the arm's movements. The whole arm can be conceptualized as a kinematic chain made up of rigid sections and joints between them. The kinematic response of each joint depends on the torque applied by the motor of the joint, and on the positions, velocities, and accelerations of all joints in the arm. All of this makes the feature space for learning the model very large. To successfully learn a model for each joint, we apply an algorithm that works well in high-dimensional spaces: Locally Weighted Projection Regression (LWPR) [11].
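To make the scheduling mechanism behind the Powercube control loop concrete, here is a minimal sketch of a fixed-rate soft real-time loop using the standard POSIX facilities mentioned above (SCHED_FIFO plus an absolute-deadline sleep backed by high-resolution timers); the period, priority, and function names are illustrative, and this is not the actual driver code.

  #include <sched.h>
  #include <stdio.h>
  #include <time.h>

  int main()
  {
    // Ask the kernel to treat this process as a (soft) real-time task.
    struct sched_param sp = {};
    sp.sched_priority = 40;                              // illustrative priority
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
      perror("sched_setscheduler (needs root or rtprio rights)");

    const long period_ns = 10 * 1000 * 1000;             // 10 ms cycle, illustrative

    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
      // control_step();  // e.g. read joint states, send commands over CAN, feed the watchdog

      // Sleep until an absolute deadline so that timing errors do not accumulate;
      // with high-resolution timers the wake-up jitter stays in the microsecond range.
      next.tv_nsec += period_ns;
      if (next.tv_nsec >= 1000000000L) { next.tv_sec += 1; next.tv_nsec -= 1000000000L; }
      clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return 0;
  }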

C. Extensions for Cognitive Processing

We also extend P/S/G to support cognitive processing. It is certainly debatable whether cognitive capabilities should be incorporated into a middleware tool such as P/S/G. The arguments in favor are that many perceptual and sensing tasks can be drastically simplified using contextual information, and that behavior and action performance can often only be optimized if the cognitive processing is tightly coupled with the low-level perception/action loop. One argument against is that the provision of cognitive mechanisms could be viewed as a methodological commitment that we might not want to make at the middleware layer. We believe that the advantages outweigh the disadvantages.

We can take advantage of cognitive capabilities in the middleware layer to ease the interpretation of sensor data for safety and collision avoidance. For example, when the robotic arm is moving in what it believes to be free space and it senses a small force applied to it, it should stop immediately, because it has just crashed into something. But when the arm is picking up an object, a sensed force is expected and is part of the normal manipulation action. The threshold for the emergency stop of the arm based on sensed forces can thus be set according to the context and the intended action.

Conversely, in order to facilitate research in cognitive areas, the middleware has to support certain features. One example is applying learning to the arm manipulator, which requires a very tight integration of the low-level and high-level control processes. The sensor data in particular must be synchronized and sampled uniformly, and the middleware must support this. This has been taken into account when designing the Player driver for our robot manipulators, as explained in Section IV-B.

Another example of the benefits of cognitive processing embedded in the middleware is the use of specialized perception routines instead of general ones when the necessary context conditions of the specialized methods are met. Consider a robot navigating through an indoor environment: by using the 3D polygonal mapping driver, it can easily verify the presence of a uniform ground plane.

The different parts of the middleware can be combined to create powerful systems. For example, if we want to pick up a cereal box with the robotic arm, we can use the 3D mapping driver to obtain a geometric model of the object and find out its position, while using an RFID reader to detect its identity. Knowing the identity, the grasp planner can search for a detailed stored model and select the best grasping method.

An important lesson that we have learned is that the cognitive processing layer should not be something isolated on top of the middleware. Instead, embedding it into the middleware enables and eases the development of more powerful and flexible systems.
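As a minimal illustration of the context-dependent emergency-stop threshold discussed above (the contexts and force values are illustrative assumptions, not part of any P/S/G interface):

  // Choose the force threshold for the arm's emergency stop from the current
  // action context: strict in free-space motion, relaxed while grasping or carrying.
  enum class ArmContext { FreeMotion, Grasping, Carrying };

  double force_stop_threshold(ArmContext ctx)
  {
    switch (ctx) {
      case ArmContext::FreeMotion: return 2.0;   // N: any contact is unexpected
      case ArmContext::Grasping:   return 15.0;  // N: contact forces are part of the action
      case ArmContext::Carrying:   return 8.0;   // N: object weight plus a safety margin
    }
    return 2.0;                                  // fall back to the most conservative value
  }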

D. Usage examples

The usage of the extensions described above is straightforward and requires little programming effort. All the parameters of a driver, be it a low-level hardware driver or a higher-level virtual one, are set via the Player configuration file, and can also be modified later via client configuration requests. The following example depicts the usage of our automatic acceleration feature extraction driver for nodes in a Wireless Sensor Network.

  driver
  (
    name "accelfeatures"
    provides ["features:0" "wsn:1"]
    requires ["wsn:0"]
    window_size 16
    queue_size 10000
    overlapping 50
    feature_list ["wavelet_coeff" "ica"]
    wavelet_params ["daubechies" 20]
  )

  driver
  (
    name "accelfeatures"
    provides ["features:1" "wsn:2"]
    requires ["wsn:1"]
    window_size 16
    queue_size 10000
    overlapping 50
    feature_list ["energy" "rms" 11 "skewness" "magnitude" 15 18]
  )

In this case, the accelfeatures driver will spawn two different threads, using the output of the first as the input of the second. The first thread will receive acceleration data via the wsn:0 interface, calculate wavelet coefficients using Daubechies 20, perform Independent Component Analysis (ICA) in parallel, and finally pack the resulting values into the wsn:1 interface. The second thread will take the results from wsn:1 as input, compute standard features such as energy, RMS, magnitude, and so on, and provide the results to the user via the wsn:2 and features:1 interfaces. The calculated features will later be used to learn SVM (Support Vector Machine) models (see [1] for more information) and to build a library of motion blueprints.
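As an illustration of what the second stage computes per window, the sketch below derives the energy and RMS features from one 16-sample acceleration window; only the window size is taken from the configuration above, everything else is an illustrative assumption.

  #include <cmath>
  #include <vector>

  struct WindowFeatures { double energy; double rms; };

  // Simple per-window features over a block of acceleration samples, e.g. the
  // 16-sample windows (with 50% overlap) configured for the accelfeatures driver.
  WindowFeatures window_features(const std::vector<double>& window)
  {
    double sum_sq = 0.0;
    for (double a : window) sum_sq += a * a;

    WindowFeatures f;
    f.energy = sum_sq;                              // signal energy of the window
    f.rms = std::sqrt(sum_sq / window.size());      // root mean square amplitude
    return f;
  }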

V. EXPERIMENTAL RESULTS

We tested the ubiquitous sensing infrastructure in an experiment that lasted more than ten hours. The complete sensor network was controlled by P/S/G the whole time. Our experiment consisted of having a number of people, instrumented with tiny inertial measurement units as well as with an RFID-enabled glove, set the table while the system observes and gathers sensor data.

The environment was instrumented with a large number of sensors. Seven video cameras were positioned so that all possible angles were covered, and three small, non-intrusive laser sensors were carefully placed in the environment for tracking people's movements. Two long-range RFID readers were placed in the cupboards, and several kitchen objects were equipped with RFID tags. Additionally, four smaller RFID readers together with eight capacitive sensors were placed under a kitchen table for detecting where and what type of objects were on the table. All sensor data was synchronized and logged using several computing devices in the network. On each cupboard door we placed magnetic sensors in order to determine whether the door is open or closed. Light and temperature sensors together with wireless sensor network nodes were scattered throughout the AwareKitchen in strategic places.

We performed 10 table-setting activities. The collected data includes: (1) 75 measurements per second from the inertial units; (2) 10 scans per second from the laser scanners; (3) all changes of state of the cupboard doors; (4) all tags detected by the RFID readers; (5) movies from all cameras recorded at 15 frames per second; and (6) data from various other sensors. The data was recorded in log files with timestamps for each event.

These research efforts profit from using P/S/G because of the comprehensive and reliable sensor infrastructure, which enables us to perform replicable and long-duration experiments with little set-up time. In our ongoing research we use this data for first studies in learning action models for table-setting tasks. We will also use the data to investigate whether a subset of the sensors used is sufficient to recognize the activities, and to explore the effect of particular combinations of them on reliability, accuracy, and coverage.

Fig. 9. A Gazebo simulation of the AwareKitchen

The manipulation control system described in Section IV-B has also been tested in extensive learning experiments, in which we learned predictive forward models for movement routines when carrying objects of different weights. Those experiments required reliable recording of data and action control in the low-level perception/action loop, as described in Section IV-B.3. We also learned action models that were used to optimize sequences of movement actions, producing both substantial and statistically significant performance improvements. This would not have been possible without the extended P/S/G infrastructure.

Using our polygonal mapping algorithms, we were able to reconstruct environments such as the kitchen, resulting in detailed and accurate 3D maps (see Figure 8). We currently focus on extending these maps towards 3D object maps, i.e., maps in which we can identify objects and regions of interest, which the robot assistant can use for high-level planning. Most of our maps are created using the small laser rangefinders placed on the end effectors of the robot.

Before attempting to use our robot assistant in the real world, we go through several stages of fine-grained simulation of our plans and controllers. To perform an accurate 3D simulation, we make use of the Gazebo simulator, in which we created an environment similar to the real one (see Figure 9). By introducing error models in our sensory

data and employing probabilistic techniques in the simulator, we can achieve similar results in the real world without changing our programming infrastructure or algorithms.

VI. RELATED WORK

Related work can be categorized along several dimensions. On the middleware side, there are several other software packages positioned closely to Player/Stage/Gazebo. Due to space constraints, a comparison between them will not be made here. Instead, we refer to publications that cover the pros and cons of each project from an architectural point of view [12], [13], [2]. A very important aspect of a robotic middleware is that it has to keep up with the rapid advancements in the field, and thus be as flexible as possible. During the last eight years, Player/Stage has been consistently and continuously evolving. According to [3], over 50 different research laboratories and institutions around the world are currently involved in the active development of P/S/G, and even more are using it. Overall, we believe that one of the key features of a project of this size should be its ease of maintenance (achieved in P/S/G's case through simplicity and transparency of the developer API), because the pool of researchers and developers changes rapidly [12].

Research projects in the ubiquitous computing community have created fully instrumented sensor-equipped environments, one of the leading initiatives in this field being MIT's PlaceLab [14]. There are other efforts underway, but they do not cover complete environments (entire apartments), addressing instead more limited scenarios. In comparison to the PlaceLab, our AwareKitchen comprises a more comprehensive sensor suite, mostly because we are interested in transferring skills from humans to robots through developmental learning and thus need more complete action models. With respect to other related work regarding our P/S/G extensions, we have referenced the appropriate articles in our technical sections. To the best of our knowledge, the integration of such advanced techniques in such a comprehensive way has not been performed in other comparable initiatives.

VII. CONCLUSION AND FUTURE WORK

In this paper we have described current extensions of the P/S/G middleware for autonomous robot control. The extensions cover three areas of robot control: (1) extensions to integrate ubiquitous sensing infrastructure, (2) extensions needed for supporting robotic manipulation, and (3) extensions allowing for cognitive processing. Taken together, they enable P/S/G to effectively support the development and deployment of leading-edge service robots that can serve as robot assistants with comprehensive manipulation capabilities, acting in ubiquitous sensor-equipped environments. We have also argued that cognitive capabilities should not be realized strictly on top of the middleware layer, but rather should be tightly integrated into it. Among other things, the tight integration allows for much more powerful learning mechanisms and for the use of contextual information to

simplify low-level sensor data processing. While substantial parts of the extensions are still under investigation and development and will become part of future P/S/G versions, the kernel providing much of the basic functionality has been shown to work reliably in an extended experiment.

VIII. ACKNOWLEDGMENTS

This work has been jointly conducted by the Intelligent Autonomous Systems group at the Technische Universität München and the Artificial Intelligence Center at SRI International. This work is supported in part by the National Science Foundation under grant IIS-0638794. We would like to thank Matthias Kranz for his numerous contributions to the AwareKitchen project, and also our students Lorenz Mösenlechner and Nico Blodow for their excellent project work. Special thanks go to the entire P/S/G community for their continuous effort in developing and maintaining free software tools for robot and sensor applications.

REFERENCES

[1] M. Kranz, R. B. Rusu, A. Maldonado, M. Beetz, and A. Schmidt, "A Player/Stage System for Context-Aware Intelligent Environments," in Proceedings of UbiSys'06, System Support for Ubiquitous Computing Workshop, at the 8th Annual Conference on Ubiquitous Computing (Ubicomp 2006), September 17-21, 2006.
[2] B. Gerkey, R. T. Vaughan, and A. Howard, "The Player/Stage Project: Tools for Multi-Robot and Distributed Sensor Systems," in Proceedings of the 11th International Conference on Advanced Robotics (ICAR 2003), pp. 317-323, June 2003.
[3] "Player/Stage/Gazebo: Free Software tools for robot and sensor applications," http://playerstage.sourceforge.net.
[4] J.-C. Latombe, Robot Motion Planning. Norwell, Massachusetts: Kluwer Academic Publishers, 1991.
[5] D. Fox, "KLD-sampling: Adaptive particle filters," in Proc. of Advances in Neural Information Processing Systems (NIPS). MIT Press, 2001.
[6] F. Lu and E. Milios, "Robot pose estimation in unknown environments by matching 2D range scans," J. of Intelligent and Robotic Systems, vol. 18, no. 3, pp. 249-275, May 1997.
[7] I. Ulrich and J. Borenstein, "VFH+: Reliable obstacle avoidance for fast mobile robots," in Proc. of the IEEE Intl. Conf. on Robotics and Automation (ICRA), May 1998, pp. 1572-1577.
[8] S. R. Madden, J. M. Hellerstein, and W. Hong, "TinyDB: In-network query processing in TinyOS," http://telegraph.cs.berkeley.edu/tinydb, September 2003.
[9] S. M. LaValle et al., "The Motion Strategy Library Main Page (Online)," http://msl.cs.uiuc.edu/msl/.
[10] S. M. LaValle, Planning Algorithms. Cambridge, U.K.: Cambridge University Press, 2006. Also available at http://planning.cs.uiuc.edu/.
[11] S. Vijayakumar, A. D'Souza, and S. Schaal, "Incremental online learning in high dimensions," Neural Computation, vol. 17, no. 12, pp. 2602-2634, 2005.
[12] T. H. Collett, B. A. MacDonald, and B. P. Gerkey, "Player 2.0: Toward a Practical Robot Programming Framework," in Proceedings of the Australasian Conference on Robotics and Automation (ACRA 2005), December 2005.
[13] R. T. Vaughan, B. Gerkey, and A. Howard, "On Device Abstractions For Portable, Reusable Robot Code," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), pp. 2121-2427, October 2003.
[14] S. S. Intille, K. Larson, J. S. Beaudin, J. Nawyn, E. M. Tapia, and P. Kaushik, "A living laboratory for the design and evaluation of ubiquitous computing interfaces," 2005.
