TWO CASE STUDIES IN THE DESIGN AND EVALUATION OF A MOBILE INTERACTIVE SYSTEM

Author

___________________________________ Pekka Parhi

Supervisor

___________________________________ Timo Ojala

Accepted

______ / ______ 2007

Grade

___________________________________

Parhi, Pekka (2007) Two case studies in the design and evaluation of a mobile interactive system. University of Oulu, Department of Electrical and Information Engineering. Diploma Thesis, 87 p.

ABSTRACT

This thesis addresses the design and evaluation of mobile interactive systems in the form of an extensive literature review and two case studies. The literature review concerns the characteristics and key activities of mobile interactive systems design. In particular, suitable methods for evaluating user acceptance of mobile services are closely inspected. Based on the findings, guidelines for selecting appropriate methods for evaluation and data collection are developed.

The first case study is the design process of CampusSearch, a mobile service developed in the Rotuaari project. The service provides mobile users with tools to retrieve information on various resources on a university campus, accompanied with map-based guidance. Two empirical evaluations with real users in a realistic field setting were conducted to assess the acceptability of the service. Both tests showed CampusSearch to be useful, while a comparative evaluation between two devices in the latter test revealed that users preferred the Nokia 770 tablet to the Nokia 6670 phone for using the service because of the tablet's faster data connection, wider screen and more usable interface.

The second case is a two-phase study to determine optimal target sizes when using a small touchscreen device one-handed with a thumb. Two user interface designs for a PDA were developed and tested in a laboratory setting to study the interaction between target size and task performance in single- and multi-target selection tasks. The obtained results suggest that a target size of 9.2 mm for single-target selection tasks and 9.6 mm for multi-target tasks should be sufficiently large without degrading performance and user preference.

A methodological comparison of the two case studies confirmed the developed guidelines for the evaluation of mobile services. The results recommend a laboratory setting for evaluation when the focus is solely on user interface and device-oriented usability issues. Field experiments are considered indispensable if the goal is to study a wider range of factors affecting the overall acceptance of a mobile service.

Keywords: mobile services, empirical evaluation, data collection methods, user acceptance, usability

Parhi, Pekka (2007) Kaksi tapaustutkimusta mobiilin interaktiivisen järjestelmän suunnittelusta ja evaluoinnista. Oulun yliopisto, sähkö- ja tietotekniikan osasto. Diplomityö, 87 s.

TIIVISTELMÄ

Tässä työssä tutkitaan mobiilien interaktiivisten järjestelmien suunnittelua ja evaluointia kattavan kirjallisuuskatsauksen ja kahden tapaustutkimuksen muodossa. Kirjallisuuskatsauksessa keskitytään mobiilien interaktiivisten järjestelmien suunnittelun ominaispiirteisiin ja keskeisiin vaiheisiin. Erityisen tarkastelun alla ovat mobiilipalveluiden hyväksyttävyyden evaluointiin soveltuvat menetelmät. Löydösten pohjalta työssä kehitetään suosituksia sopivien evaluointi- ja tiedonkeruumenetelmien valintaan.

Ensimmäisenä tapaustutkimuksena kuvataan Rotuaari-projektissa kehitetyn Kampushaku-palvelun suunnitteluprosessi. Karttapohjaisella opastuksella varustetun mobiilipalvelun avulla liikkuvat käyttäjät voivat hakea tietoa erilaisista resursseista kampusalueella. Palvelun hyväksyttävyyttä arvioitiin kahdessa empiirisessä kenttäkokeessa oikeilla käyttäjillä todellisessa käyttöympäristössä. Molemmat testit osoittivat palvelun olevan hyödyllinen. Kahden laitteen välinen vertailu jälkimmäisessä kenttätestissä paljasti, että käyttäjät käyttivät Kampushakua mieluummin Nokia 770 -päätelaitteella kuin Nokia 6670 -matkapuhelimella johtuen edellä mainitun laitteen nopeammasta datayhteydestä, suuremmasta näytöstä ja käytettävyydeltään paremmasta käyttöliittymästä.

Toisena tapauksena esitellään kaksivaiheinen tutkimus, jossa selvitettiin optimaalista painikkeiden kokoa käytettäessä pientä kosketusnäytöllistä laitetta yhdellä kädellä peukalon avulla. Painikkeen koon vaikutusta käyttäjän suorituskykyyn tutkittiin laboratorioympäristössä kahdella PDA-laitteelle suunnitellulla käyttöliittymällä. Tulosten perusteella 9.2 mm todettiin riittävän suureksi kooksi yksittäisiin ja 9.6 mm peräkkäisiin painikkeenvalintatehtäviin heikentämättä suorituskykyä ja käyttömukavuutta merkittävästi.

Tapaustutkimusten metodologinen vertailu vahvisti mobiilipalveluiden evaluointiin kehitetyt suositukset. Tulosten perusteella laboratorio on sopivampi ympäristö evaluoinnille, kun keskitytään yksinomaan käyttöliittymään ja laitepainotteiseen käytettävyyteen liittyviin asioihin. Kenttäkokeet ovat puolestaan välttämättömiä, jos tavoitteena on tutkia laajemmin mobiilipalvelun hyväksyttävyyteen vaikuttavia tekijöitä.

Avainsanat: mobiilipalvelut, empiirinen evaluointi, tiedonkeruumenetelmät, hyväksyttävyys, käytettävyys

TABLE OF CONTENTS

ABSTRACT
TIIVISTELMÄ
TABLE OF CONTENTS
PREFACE
ABBREVIATIONS
1. INTRODUCTION
   1.1. Scope and Objectives
   1.2. Contributions
   1.3. Outline of the Thesis
2. INTERACTIVE SYSTEMS DESIGN
   2.1. Problem Definition
      2.1.1. Human Activity to be Supported
      2.1.2. Identifying the Users
      2.1.3. Level of Support
      2.1.4. The Form of Solution
   2.2. Design Processes and Representations
      2.2.1. User Study
      2.2.2. Modeling the User's Activity
      2.2.3. Requirements Specification
      2.2.4. Analysis of the Design
      2.2.5. Empirical Evaluation
   2.3. Aspects of Mobile Interactive Systems Design
3. USER ACCEPTANCE
   3.1. System Acceptability Framework
   3.2. Technology Acceptance Model (TAM)
   3.3. Innovation Diffusion Theory (IDT)
   3.4. Theory of Planned Behavior (TPB)
   3.5. Unified Theory of Acceptance and Use of Technology (UTAUT)
4. EMPIRICAL EVALUATION OF MOBILE SYSTEMS
   4.1. Usability Testing
   4.2. Laboratory vs. Field Experiments
   4.3. Data Collection Techniques for Mobile Environment
      4.3.1. Heuristic Evaluation
      4.3.2. Observation
      4.3.3. Interviews and Focus Groups
      4.3.4. Questionnaires
      4.3.5. Logging
   4.4. Summary
5. CASE 1: CAMPUSSEARCH – A MOBILE INTERACTIVE SYSTEM IN CAMPUS ENVIRONMENT
   5.1. Situation of Concern
   5.2. Problem Definition
   5.3. Requirements Specification
   5.4. Design and Implementation
      5.4.1. Architecture
      5.4.2. User Interface
      5.4.3. Content
   5.5. Empirical Evaluation 1: SmartCampus Field Trial
      5.5.1. Method
      5.5.2. Main Results
   5.6. Empirical Evaluation 2: CampusSurf Focus Groups
      5.6.1. Method
      5.6.2. Main Results
   5.7. Discussion
6. CASE 2: TARGET SIZE STUDY FOR ONE-HANDED THUMB USE ON MOBILE TOUCHSCREEN DEVICES
   6.1. Method
      6.1.1. Discrete Target Phase
      6.1.2. Serial Target Phase
   6.2. Results
   6.3. Discussion
7. COMPARISON OF TWO CASE STUDIES
   7.1. Field Evaluations vs. Laboratory Study
   7.2. SmartCampus vs. CampusSurf
8. SUMMARY
REFERENCES
APPENDICES

PREFACE

This thesis has been done at the MediaTeam research group, which operates in the Department of Electrical and Information Engineering at the University of Oulu. The thesis addresses the design and evaluation of mobile interactive systems in the form of an extensive literature review and two case studies. The thesis has been conducted as part of the Rotuaari project.

I would like to express my gratitude to the supervisor of my work, Professor Timo Ojala, and to the second reviewer, Professor Jukka Riekki. I would also like to acknowledge Professor Ben Bederson and Amy Karlson for the support I received during my research visit to the University of Maryland. In addition, my warmest thanks go to Annu Ristola for the enjoyable cooperation during the CampusSurf field trial, as well as to the other members of the Rotuaari project who have contributed to my work in numerous ways. And last but not least, I thank Piritta for her love and support in this whole ordeal and my family for encouraging me throughout my life.

Oulu, March 20, 2007

Pekka Parhi

ABBREVIATIONS

3G        Third Generation
AJP       Apache JServ Protocol
API       Application Programming Interface
ASQ       After Scenario Questionnaire
BT        Bluetooth
CASE      Computer-Aided Software Engineering
CPI       Content Provider Interface
CSS       Cascading Style Sheets
CSUQ      Computer System Usability Questionnaire
DFD       Data Flow Diagram
GOMS      Goals, Operators, Methods, and Selection rules
GPRS      General Packet Radio Service
GPS       Global Positioning System
HCI       Human-Computer Interaction
HQL       Hibernate Query Language
HTA       Hierarchical Task Analysis
HTML      Hypertext Markup Language
HTTP      Hypertext Transfer Protocol
IDT       Innovation Diffusion Theory
ISO       International Standards Organization
JSP       Java Server Pages
MVC       Model-View-Controller
ORM       Object Relational Mapping
PDA       Personal Digital Assistant
PUEU      Perceived Usefulness and Ease of Use
PUTQ      Purdue Usability Testing Questionnaire
RM-ANOVA  Repeated Measures Analysis of Variance
RPC       Remote Procedure Call
QUIS      Questionnaire for User Interface Satisfaction
SDK       Software Development Kit
SOAP      Simple Object Access Protocol
SQL       Structured Query Language
SUMI      Software Usability Measurement Inventory
SUS       System Usability Scale
TAM       Technology Acceptance Model
TCP       Transmission Control Protocol
TPB       Theory of Planned Behavior
TRA       Theory of Reasoned Action
UI        User Interface
UML       Unified Modeling Language
UTAUT     Unified Theory of Acceptance and Use of Technology
WAP       Wireless Application Protocol
WGS84     World Geodetic System 1984
WLAN      Wireless Local Area Network
XHTML     Extensible Hypertext Markup Language
XML       Extensible Markup Language

1. INTRODUCTION

Personal computing is on the move from the desktop to a more mobile environment. Powerful, versatile and pervasive handheld devices are changing the way we work, study and communicate. This trend is visible in the increasing capabilities of PDAs and smartphones, which enable these devices to be used for an ever-increasing variety of tasks. At the same time, mobile technologies are becoming a more and more integral part of our everyday lives and are gaining a wider and more diverse user base. Thus, an increasing number of interactive computing systems are designed to be used on mobile handheld devices.

Bringing interactive systems to mobile devices, however, introduces additional challenges for effective design and for the evaluation of user acceptance and usability. Compared to desktop terminals, handheld devices have various physical and technical constraints that need to be taken into account during the design, which makes software development and user interface design more demanding. Further, mobile services are often used in dynamic, crowded and unpredictable contexts that require users to divide their attention between several tasks. In terms of evaluation, such contexts of use may be very hard or impossible to control, making traditional methods difficult to apply. As mobile systems often require a network connection to function, designers are also faced with wireless connectivity issues that affect user acceptance and usability, such as slow or intermittent network connections.

1.1. Scope and Objectives

The work covered in this thesis has a number of objectives. First, the goal is to review the essential activities of interactive systems design and to identify the most important characteristics of mobile systems that need to be taken into account during the development of mobile services. In particular, the methods for evaluating user acceptance and usability of mobile interactive systems are discussed in detail. The inspection of mobile interactive systems design and evaluation is conducted through an extensive literature review on the research topic. Further, a few suitable data collection methods for evaluating the acceptance of mobile systems are covered.

The second goal of this thesis is to design and evaluate a mobile interactive service targeted at a campus environment. The service is developed as part of the SmartCampus field trial in the Rotuaari project, the goal of which is to research mobile services that aid working, studying and community life on a university campus. The stages carried out in the design are based on the design paradigms and activities recommended by the literature review. The purpose of the evaluation is to assess user acceptance of the service by using data gathering methods found applicable for evaluating mobile systems. Moreover, an important goal is to select the participants and a suitable test environment according to the developed guidelines.

The third aim of this thesis is to describe a target size study for one-handed use of mobile handheld devices equipped with a touch-sensitive screen. The goal of the study is to determine optimal sizes for interaction targets, maximizing performance and user preference when using a mobile device one-handed with a thumb. Instead of concentrating on user experience and perceived usefulness of the designed system, the focus of this evaluation is on quantitative performance measurements, as well as on device-oriented usability and user interface design issues.

1.2. Contributions

The first contribution of this thesis is a literature review concerning the design and evaluation of mobile interactive systems. Based on the review, the characteristics and essential stages of mobile interactive systems design are identified, and guidelines for the selection of evaluation methods and data collection techniques are presented. The developed guidelines are used as a basis for designing and evaluating the new mobile service.

The second contribution of this thesis is the design description of CampusSearch, a location-aware mobile browser-based service implemented as part of the SmartCampus service system. The service, wholly designed and implemented by the author, provides mobile users with tools to browse and retrieve information on places, activities and personnel in a campus area, complemented with map-based guidance. The design comprises a comprehensive user survey for gathering user needs for mobile services, the design and implementation of CampusSearch based on the identified user needs, and two empirical evaluations in the field. The test setup and the results of the two studies, which assessed the acceptability of the service with real users in a natural context of use, are also reported. In the first study, CampusSearch was evaluated in conjunction with an extensive one-month field trial named SmartCampus. Campus dwellers were given an opportunity to try and test various service prototypes with different types of smartphones and PDAs. Data on actual usage of the service was gathered by means of server-side logging, while demographics, subjective assessments of usefulness and user feedback were collected with questionnaires. In the second field experiment, CampusSurf, three task-based evaluation sessions with small groups of pre-selected test users were carried out within three days. Each session also involved two focus group meetings and a comparative evaluation of service usability between two different mobile handheld devices. Observation data was collected with a camcorder during the focus group meetings, while server-side logging and questionnaires were used in the task-based evaluation.

The final contribution is the reporting of a target size study for one-handed thumb use on small touchscreen devices. The study investigated the interaction between target size and task performance, considering single-target (discrete) and multi-target (serial) tasks. Two user interface designs for a PDA were developed and then tested in a user study with 20 participants. In contrast to the evaluations of CampusSearch, this evaluation was conducted in a laboratory. The main data collection methods, however, were similar to those used in the CampusSearch field tests. The author contributed to the study by designing and implementing the tested software, as well as by supervising the experiment and analyzing the results. A scientific paper on the study has been published in the proceedings of an international conference on human-computer interaction with mobile devices and services [1].

1.3. Outline of the Thesis

The rest of the thesis is organized as follows. Chapter 2 presents a general overview of interactive systems design based on the framework by Newman and Lamming [2]. The chapter introduces the key activities in the various stages of design and identifies the main characteristics of mobile interactive systems design. Chapter 3 describes various user acceptance models explaining the factors that affect how users come to accept and use a system or technology. The methods for evaluating user acceptance and usability of mobile interactive systems are inspected in Chapter 4. The chapter discusses whether empirical evaluation of mobile systems should be carried out in a laboratory or field setting, and investigates suitable data collection methods for the mobile environment. Chapter 5 reports the first case study covered in this thesis, the design process of CampusSearch, including the SmartCampus and CampusSurf field experiments. Chapter 6 presents the second case of this work, a target size study for one-handed thumb use on small touchscreen devices. A methodological comparison of these two case studies is presented in Chapter 7. Finally, the summary of the thesis is presented in Chapter 8.


2. INTERACTIVE SYSTEMS DESIGN

A system can be said to be interactive when it supports bi-directional communication between the human user and the computing system. The user's actions and the corresponding reactions of the system take place via the system's user interface, the part of the system that allows the user to interact with the machine and to accomplish various computing tasks. The most essential aspect of any interactive system, as emphasized by Newman and Lamming [2], is its support for people to carry out their activities faster, with fewer errors, and with greater quality. Understanding how to support human activity is the central purpose of designing interactive systems, but it is also a factor that makes designing usable and useful interactive systems challenging. In order to develop a system that is well adapted to the user's needs and the context of use, the designers have to think beyond merely what capabilities the system should have. They also need to understand and support the tasks and processes that users perform, as well as to study the interaction between the user and the computer. This interdisciplinary subject of Human-Computer Interaction (HCI) relates computer science to many other fields of study and research, such as cognitive psychology, sociology, human factors and ergonomics. [2], [3], [4]

The challenges surrounding the use of interactive systems have been noted for quite a while in the field of HCI, and a number of approaches to tackle them have been introduced. Norman and Draper recognized the importance of user-centered design [5], in which the needs, wants and limitations of the end user are given extensive attention at each stage of the design. The ISO 13407 standard [6] defines a general process for including human-centered activities throughout a development lifecycle. The four activities that form the main cycle of work in this model are presented in Figure 1a. Hartson and Hix propose a star lifecycle for the development of interactive systems [7]. This lifecycle, as illustrated in Figure 1b, provides an interconnected approach to development in which the central activity, evaluation, is viewed as an essential part of the design at all stages in the lifecycle. In scenario-based design [8], concrete descriptions of usage situations, scenarios, are used to guide the design. In this approach the designer focuses first on the human activities that need to be supported and then allows descriptions of those activities to drive everything else. Common to all of the models presented above is their focus on iteration and the gradual shaping of the design through various concurrent activities. Furthermore, they stress the interactive nature of design and recognize the value of user study and evaluation.

Figure 1. (a) Human-Centered Design (HCD) lifecycle, (b) Star lifecycle.

Usability engineering is a systematic approach to designing software that is easy to use. The usability engineering lifecycle model by Nielsen [9] defines a set of activities performed throughout the development lifecycle to increase the usability of the system. The first steps in the usability engineering process include studying users and their tasks, analyzing competing products, and setting usability goals. In the parallel design stage, various design alternatives are explored independently by several different designers. Participatory design attempts to actively involve the users in the design process to help ensure that the designed system meets their needs and is usable. Other essential stages of development include prototyping and empirical user testing, combined with iterative design. After releasing the product, it is important to gather statistics and feedback from real use of the system. The stages of the usability engineering lifecycle are presented in Figure 2.

Figure 2. The stages of the usability engineering lifecycle model.

This chapter provides a short overview of the key steps in the process of designing an interactive system, based on the well-known design paradigm by Newman and Lamming [2]. Their user-oriented framework for designing interactive systems covers the different stages of development from requirements analysis to implementation and evaluation, with an emphasis on iterative and parallel design and representations. Most stages of Nielsen's usability engineering lifecycle model are also present in the framework by Newman and Lamming, such as user studies, parallel and participatory design, prototyping, heuristic analysis and empirical evaluation. The description of the initial step, the problem definition, is followed by an introduction to the roles and relationships of the various design processes and representations involved in the design of an interactive system. Finally, the characteristics of designing mobile systems for human use are discussed.

2.1. Problem Definition

The first step before starting to design an interactive system is to specify the design problem we are trying to solve. Usually a design problem arises out of a situation of concern [10], a situation that needs changing, and we see ways to resolve the situation through the use of interactive technology. The main purpose of problem definition is to transform the situation of concern into a design problem and to specify the following four fundamental issues in the design process [2]:

• Identifying the human activity that the system will support.
• Identifying the users who will perform the activity.
• Setting the levels of support (usability) that the system will provide.
• Selecting the basic form of solution to the design problem.

By defining these aspects of the problem we can achieve a clear understanding, right at the outset, of what problem is to be solved. This way the project is more likely to avoid the dangers of grinding to a halt or drifting dangerously off course, resulting in a solution to the wrong problem or even no solution at all. Newman and Lamming suggest the use of a problem statement [2], a definition of design objectives that brings together the components of the design problem. At the very outset we can usually find a short phrase to describe each of the four components and incorporate them into a single sentence. An example of a one-sentence problem statement showing the components of a design problem is illustrated in Figure 3. As the design moves forward, each component will evolve and needs to be expanded and reformulated. The next sections of this chapter give a brief description of each essential component of the design problem.

Figure 3. One-sentence problem statement.

2.1.1. Human Activity to be Supported

To resolve the situation of concern, it is necessary to identify the people's activities whose performance we can improve. There are two basic approaches for doing this: we can either focus on inspecting individual tasks, or look more broadly at linked series of tasks representing processes. A task is a unit of goal-directed human activity and it usually involves a sequence of steps. Tasks can be distinguished from one another by identifying their different goals. Simple activities are usually modeled as tasks, whereas complex activities, involving several people over a period of time, would be modeled as processes consisting of a number of separate tasks contributing to the ultimate goal of the activity. Furthermore, tasks are dependent on single resources, while processes involve multiple dependencies on resources. Methods for modeling tasks and processes are further discussed in Section 2.2.2. [2]

2.1.2. Identifying the Users

In conjunction with identifying the activity to be supported, the potential users of the proposed interactive system need to be identified and their general and specific needs addressed. By taking a user-centered approach to design [5] and focusing on the people themselves, we can learn about the activities they perform and what they need in the way of support. First, human performance and behavior in general should be taken into account. Second, it is essential to know what kind of knowledge and skills the users already have.

To address the general needs of the human user, several theories and models for predicting physical, cognitive, social and organizational aspects of human performance and behavior have been developed. These theories, such as explanatory theories explaining observed behavior, empirical laws offering quantitative predictions of human performance, and dynamic models predicting how whole sequences of actions will be performed, can be useful when predicting human behavior in the use of interactive systems. In order to understand the specific needs of target users, various user study methods for gathering real data about them have been established. These methods, including interviews, observation and questionnaires, are briefly described in Section 2.2.1. The reason for identifying specific users in the early stages of development is to ensure that adequate attention is paid to their skills, expertise and working context throughout the design. This enables us to make better predictions about the outcome, resulting in better chances of achieving the objectives of the design. [2], [3]

2.1.3. Level of Support

It is crucial to achieve the target level of performance set for the activity supported by the system in order to resolve the situation of concern. Newman and Lamming describe usability as "a collective term for all aspects of an activity's performance that can be affected by the use of technology" [2]. The ISO 9241 standard defines usability in terms of the effectiveness, efficiency and satisfaction with which specified users achieve specified goals in a particular context of use [11]. The system acceptability model by Nielsen [9] defines usability as a part of usefulness, which in turn determines whether the system can be used to achieve a desired goal. As depicted in Figure 4, the framework presents usability as a composition of five attributes: learnability, efficiency, memorability, errors and satisfaction. Learnability corresponds to the ability of the user to learn to use the system, while efficiency stands for the level of productivity that the user is able to achieve with the system once having learned to use it. Memorability deals with the ease of remembering how the system works when returning to it after some period of time. Moreover, the users should make as few errors as possible and they should be able to easily recover from them. Catastrophic errors, such as ones that cause users to lose their work, should not occur. The final usability attribute, subjective satisfaction, represents how pleasant it is to use the system. Factors affecting the user's satisfaction depend on the goal of the interaction and may include, for instance, the entertainment value of the system and the speed at which tasks can get accomplished. [9]


Figure 4. Usability attributes by Nielsen.

The individual aspects of usability are known as usability factors or metrics, which provide measures for particular aspects of the activity's performance. Examples of usability metrics stated in the ISO 9241 standard [11] are shown in Table 1. During problem definition, it is essential to identify the aspects of activity performance affecting the situation of concern. Some of the usual main usability factors are speed, incidence of errors, and ease of learning. By identifying the key factors concerning the design problem, we can proceed to setting targets for the usability levels that need to be achieved with the new system in order to resolve the situation of concern.

When setting levels of performance, several aspects should be taken into account. First, it helps to consider whether the proposed system is supporting tasks or processes. Second, the levels of support already being achieved need to be known before targets for usability levels can be set for the new interactive system. This requires usability evaluation of the current system, as well as understanding how tasks and processes are being performed at present. Third, we need to set realistic targets for the levels of usability to be achieved with the new system. In the case of a new and unfamiliar design problem, however, it may be quite tricky to estimate what levels of performance can be easily achieved. This could be partly solved by specifying the form of solution, the final topic of problem definition. [2]

Table 1. Examples of usability metrics in the ISO 9241 standard

Usability objective: Suitability for the task
  Effectiveness: Percentage of goals achieved
  Efficiency: Time to complete a task
  Satisfaction: Rating scale for satisfaction

Usability objective: Appropriate for trained users
  Effectiveness: Number of "power features" used
  Efficiency: Relative efficiency compared with an expert user
  Satisfaction: Rating scale for satisfaction with "power features"

Usability objective: Learnability
  Effectiveness: Percentage of functions learned
  Efficiency: Time to learn criterion
  Satisfaction: Rating scale for "ease of learning"

Usability objective: Error tolerance
  Effectiveness: Percentage of errors corrected successfully
  Efficiency: Time spent on correcting errors
  Satisfaction: Rating scale for error handling

The explicit measures developed for system usability form a basis for usability analyses and empirical evaluation, which are further described in Sections 2.2.4 and 2.2.5, respectively. Various techniques have been developed for evaluating usability and other user acceptance attributes of interactive systems. Many of the methods applied to studying users, such as interviews, observation and questionnaires, can also be adapted to the evaluation phase. Chapter 3 describes various user acceptance models, while Chapter 4 presents a detailed discussion on methods for evaluating user acceptance of mobile interactive systems. [2]
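To make the link between such metrics and collected test data concrete, the following minimal Java sketch (illustrative only, not taken from the thesis; the record and method names are assumptions) computes three ISO 9241-style measures for a single task from logged attempts: effectiveness as the percentage of completed attempts, efficiency as the mean completion time of the successful attempts, and satisfaction as the mean subjective rating.

```java
import java.util.List;

/** Illustrative record of one participant's attempt at a single test task. */
record TaskAttempt(boolean completed, double secondsSpent, int satisfactionRating) {}

public class UsabilityMetrics {

    /** Effectiveness: percentage of attempts in which the task goal was achieved. */
    static double effectiveness(List<TaskAttempt> attempts) {
        long completed = attempts.stream().filter(TaskAttempt::completed).count();
        return 100.0 * completed / attempts.size();
    }

    /** Efficiency: mean time to complete the task, over successful attempts only. */
    static double meanCompletionTime(List<TaskAttempt> attempts) {
        return attempts.stream()
                .filter(TaskAttempt::completed)
                .mapToDouble(TaskAttempt::secondsSpent)
                .average()
                .orElse(Double.NaN);
    }

    /** Satisfaction: mean rating on a subjective scale (here assumed to be 1-5). */
    static double meanSatisfaction(List<TaskAttempt> attempts) {
        return attempts.stream()
                .mapToInt(TaskAttempt::satisfactionRating)
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical data from three participants attempting the same task.
        List<TaskAttempt> attempts = List.of(
                new TaskAttempt(true, 42.0, 4),
                new TaskAttempt(true, 55.5, 5),
                new TaskAttempt(false, 120.0, 2));

        System.out.printf("Effectiveness: %.1f %%%n", effectiveness(attempts));
        System.out.printf("Efficiency: %.1f s per completed task%n", meanCompletionTime(attempts));
        System.out.printf("Satisfaction: %.1f / 5%n", meanSatisfaction(attempts));
    }
}
```

In practice, figures like these would be compared against the usability targets set during problem definition.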

2.1.4. The Form of Solution

Defining the form of solution deals with the question of how the interactive system should be implemented to resolve the situation of concern. The various resources and layers of technology involved in the provision of interactive support are presented below: [2]

• User interface with which the user interacts directly.
• Application software that supports the user interface.
• Operating system that provides standard services to both the user interface and its supporting software.
• System resources accessed via the user interface and supporting software: information storage, communication, printing, and so on.
• Hardware that supports all of these resources.

During the design, most attention will probably be paid to the user interface and the design of the interactive software, as they have a particularly direct impact on the interactive support provided to the user. For the remaining layers and resources, such as operating systems, networks, databases and workstations, existing solutions are likely to be used. At the time of problem definition, it is usually necessary only to determine the constraints that apply to the selected solution. Later in the course of solving the problem, each layer will be specified in sufficient detail in order to implement the interactive system successfully. [2]

When defining the design problem, it is reasonable to consider whether it should be solved by enhancing an existing form of solution or by innovating something totally new. With an existing solution as a starting point the risks are smaller, as many of the design decisions can be easier and the outcome will be easier to predict. In contrast, innovations are often expensive, risky and time-consuming. However, innovations become necessary when existing solutions begin to reach their limits and the benefits from enhancing them are so small that it is better to invest in designing a novel solution. [2]

2.2. Design Processes and Representations

Designing an interactive system involves a great number of tasks that often support the two-stage process of analysis and synthesis. During analysis the current design is tested against the targets for usability and software quality, while in synthesis the design is enhanced based on the data gathered in the analysis phase. The interactive systems design model by Newman and Lamming [2] describes how various design activities and methods make their contributions to design. The model shows how the design consists of multiple concurrent sequences of activities, known as design processes. Further, it points out how these parallel processes are interlinked by various design representations, that is, artifacts produced during the design that can act as the outcome of one process and an input to others. Figure 5 illustrates the relationships between some of the processes and representations of interactive systems design, containing the basic iterative steps of analysis and synthesis. [2]


Figure 5. Interactive systems design model by Newman and Lamming.

The five main processes specified in the framework by Newman and Lamming are user study, model building, requirements specification, design analysis and empirical evaluation. This chapter discusses each of the processes in separate sections, as well as some of the principal representations these processes produce and use.

2.2.1. User Study

Conducting user studies is the first step towards understanding the potential users of the system and their activities. The data gathered from these studies acts as an input to the analysis and model building processes, as depicted in Figure 6. Various user study methods have been developed to provide real-life data about people's activities, enabling us to design interactive systems to support them. The three main categories of these methods are interviews, observation and questionnaires. Because each study method generates a different kind of data, different methods of analysis are also needed. Besides the problem definition phase, user studies may be needed at various other stages of design. They are also conducted in support of evaluation, in which the same data collection methods as used prior to design may apply. In addition, various user acceptance models may be helpful when planning user studies; they are further discussed in Chapter 3. [2]

Figure 6. The process of user study, leading to the collection of activity data.


Interviews are usually conducted by one interviewer asking a set of questions of one user at a time, and are typically audio-recorded to allow for later analysis. They tend to produce qualitative data, are quite easy to conduct and require less prior planning and preparation than questionnaires. Interviews are useful for discovering facts and opinions held by potential users in the early stages of design, but can also be used for gathering user feedback about the system during evaluation. Specific types of interviews include focus groups and contextual inquiries. The former are relatively unstructured group interviews, typically with 6-9 participants, and can provide a diverse range of opinions [12]. They can help in assessing user needs and also allow observing group dynamics and organizational issues. The latter method is a mixture of interview and observation with a focus on gathering field data from users [13]. In this approach the users are observed and asked questions while performing their activities in a real-world situation, with as little interference from the interviewer as possible. Interviews and focus groups are further discussed in the context of empirical evaluation in Chapter 4. [2], [3], [4]

Observation is used for capturing descriptions of how various activities are performed by the users in a real context. It is a fast and useful method for obtaining qualitative data at the early stages of design. User behavior can be observed either directly or indirectly. In direct observation the observer is present during the observed tasks, which allows attention to be focused on specific areas of interest. However, direct observation can be obtrusive, as users may be constantly aware of being monitored, which can affect their performance levels and behavior, a phenomenon known as the Hawthorne effect [3]. Thus, it is important to use techniques that interfere with task performance as little as possible. In indirect observation the investigator is not present and the activities are captured by some other means, such as a video recorder or self-reported logs in which the users are requested to record their actions and observations themselves. Indirect observation can be less intrusive, is able to capture data that would otherwise be left unrecorded or unnoticed, and provides permanent records from which the data can be reviewed later for further analysis. On the other hand, it may produce huge amounts of data that can be very time-consuming to analyze. Moreover, the experimenter has less control and has to rely on the reported observations. Indirect observation also creates more distance between observers and users, and there can be practical problems with setting up the equipment, especially if the context of the activities is highly dynamic. Specific types of observational methods include verbal protocols such as the think-aloud protocol, in which the user says aloud what he or she is thinking while performing a task or solving a problem. Other types of observational techniques include passive observation, ethnographic field studies and action research. The use of observation in the empirical evaluation phase is further discussed in Chapter 4. [2], [3], [14]

User surveys and questionnaires are series of questions designed to gather specific information about users and their opinions. They are often used in conjunction with other techniques and can give quantitative or qualitative data. User surveys are good at reaching large, dispersed groups of people and thus at gathering enough data for performing statistical analyses. Moreover, questionnaires are suitable for obtaining users' subjective preferences about the system during the evaluation phase. A detailed discussion of questionnaires and their use for evaluating interactive systems is presented in Chapter 4. [2], [4]
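As a concrete example of how questionnaire answers can be turned into a quantitative measure, the sketch below (illustrative only, not part of the original text) scores the System Usability Scale (SUS), a standardized ten-item questionnaire listed among the abbreviations of this thesis. In the standard scoring, odd-numbered items are positively worded and even-numbered items negatively worded, and the scaled sum yields a score between 0 and 100.

```java
/**
 * Illustrative sketch: scoring a System Usability Scale (SUS) questionnaire.
 * The ten answers are ratings from 1 (strongly disagree) to 5 (strongly agree).
 */
public class SusScore {

    /** Returns the SUS score (0-100) for one respondent's ten ratings. */
    static double score(int[] ratings) {
        if (ratings.length != 10) {
            throw new IllegalArgumentException("SUS has exactly 10 items");
        }
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            // Odd-numbered items (1st, 3rd, ...) are positively worded: rating - 1.
            // Even-numbered items are negatively worded: 5 - rating.
            sum += (i % 2 == 0) ? ratings[i] - 1 : 5 - ratings[i];
        }
        return sum * 2.5; // Each item contributes 0-4; scale the sum to 0-100.
    }

    public static void main(String[] args) {
        int[] oneRespondent = {4, 2, 5, 1, 4, 2, 5, 1, 4, 2}; // Hypothetical answers.
        System.out.printf("SUS score: %.1f%n", score(oneRespondent));
    }
}
```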

2.2.2. Modeling the User's Activity

System analysis and design is a collective term for the two closely coupled activities of analyzing user study data and synthesizing the design. Although there are various methods of system analysis and design, including task analysis, software systems analysis and participative design, they all involve the three fundamental design processes of model building, synthesis (requirements specification) and analysis of the design. These processes and their interconnections are depicted in Figure 7, with a focus on gradually developing the specification. In model building, a model of the current user activity is constructed based on data gathered in user studies. The created activity model is then utilized in the synthesis and analysis processes. In requirements specification, the design is enhanced to support the activities presented in the model, while in analysis the model is used for predicting usability and identifying needs for improvement. This section focuses on model building, while specification and analysis are further discussed in the two following sections. [2]

Figure 7. Three essential processes of systems analysis and design: (1) modeling the user activity, based on activity data from user studies; (2) synthesis of the design (requirements specification); (3) analysis of the design in terms of the model.

Activity models provide views into the range of supporting functions that the interactive system may need to provide. The greatest benefit of models is that they enable usability assessment of the design in terms of the usability factors presented in Section 2.1.3. Model construction from the collected user data is a time-consuming process. Therefore, rather than developing an activity model from scratch, designers tend to choose an existing model as a basis, reuse it during the design and make small changes to it if required. Thus, the choice of the model type is one of the important decisions to be made during the early stages of design. Approaches to modeling activities can be categorized into task-oriented and process-oriented methods, based on the scope of the human activity to be supported. Task modeling methods, such as Hierarchical Task Analysis (HTA), represent a hierarchy of tasks that produces the observed activity, while process modeling methods represent sets of tasks that happen sequentially or in parallel. Figure 8 presents a hierarchical task description using the technique of HTA, while Figure 9 shows a process flow chart with Data Flow Diagram (DFD) notation. Besides HTA, other task modeling methods include Task Knowledge Structures (TKS) [15], which allow modeling of the knowledge people possess about tasks, and Participatory Task Modeling (PTM) [16], which combines the strengths of task analysis and participatory design. Moreover, Use Case Modeling, using the graphical notation defined in the Unified Modeling Language (UML) standard [17], has become a popular technique for modeling tasks and scenarios of use.

Use case modeling is used for capturing the functional requirements of the system, which will be further discussed in the next section. A number of CASE tools have also been developed for modeling activities, such as TaskArchitect [18] supporting HTA and various UML tools for modeling use cases. [2], [3], [4]
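To illustrate what a task-analytic model can look like in machine-readable form, the following Java sketch (an illustration only, not one of the tools mentioned above) represents an HTA-style hierarchy in which a task is either a single action or a goal decomposed into subtasks, together with a plan that states the order in which the subtasks are carried out.

```java
import java.util.List;

/** Illustrative HTA-style task hierarchy: a task is an action or a decomposed goal. */
public class TaskModelSketch {

    record Task(String name, String plan, List<Task> subtasks) {
        /** A leaf task with no further decomposition. */
        static Task action(String name) {
            return new Task(name, "", List.of());
        }
        /** A goal decomposed into subtasks, with a plan stating their order. */
        static Task goal(String name, String plan, Task... subtasks) {
            return new Task(name, plan, List.of(subtasks));
        }
    }

    /** Prints the hierarchy with indentation, similar to a textual HTA outline. */
    static void print(Task task, int depth) {
        System.out.println("  ".repeat(depth) + task.name()
                + (task.plan().isEmpty() ? "" : "  [plan: " + task.plan() + "]"));
        for (Task sub : task.subtasks()) {
            print(sub, depth + 1);
        }
    }

    public static void main(String[] args) {
        // Hypothetical example: finding a lecture room with a mobile campus guide.
        Task findRoom = Task.goal("0. Find a lecture room", "do 1, then 2, then 3",
                Task.action("1. Open the campus guide"),
                Task.action("2. Search for the room by name"),
                Task.goal("3. Follow the guidance", "repeat 3.1 until the room is reached",
                        Task.action("3.1. Check position on the map")));
        print(findRoom, 0);
    }
}
```

A descriptive model built from observation data could be stored in the same structure and compared against a normative one, in line with the model comparisons discussed below.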

Figure 8. Task model using notation of Hierarchical Task Analysis (HTA).

Figure 9. Process model using notation of Data Flow Diagram (DFD).

Activity models can also be divided into normative models, which try to predict what people normally do, and descriptive models, which describe what people have actually been observed to do. A common usage of activity models is to start with a normative model, then build several descriptive models based on the normative model and data collected from user studies, and finally synthesize the descriptive models into a general activity model on which we can base predictions of usability. Further, we can make comparisons between different activity models to predict usability improvements, for instance by checking normative models against descriptive models based on observed behavior. [2]

2.2.3. Requirements Specification

The requirements specification defines the system to be designed and implemented. The problem statement can be considered an initial specification, stating in a very general form the functional and performance requirements that the system must meet. Further, the gradual enhancement of the problem statement into a final specification can be seen as the central process of the whole interactive system design. A well-designed requirements specification is the first artifact of the design and provides the first chance to analyze and evaluate the designed form of solution against the target levels of usability and system acceptability.

It also answers the question of how the system can be built in order to resolve the design problem with the given constraints on resources and costs. The role of the requirements document as a watershed in the specification process is illustrated in Figure 10. [2]

Figure 10. The requirements document enables the design to be validated.

In the requirements specification process of an interactive system, the most essential activity is to define a usable functional form for the system. In other words, the four components of the initial problem statement (the users, the activity to be supported, the level of support, and the form of solution) need to be expanded into an adequate specification of the system's functionality and usability. In the functional requirements definition, a set of functions providing sufficient support for the users' activities is specified. This process depends on two focal activities: building a model of the supported activities, as discussed in the previous section, and selecting a form of solution. Functional requirements can be specified using charting techniques, such as context and dataflow diagrams, along with detailed descriptions of the functional components. Usability requirements specification has a special emphasis in interactive systems design. It refers to the definition of targets for the performance and usability provided by the system, expressed by means of usability metrics. In addition, requirements for the structure and meaning of data need to be considered. Data structures are usually specified using Entity Relationship (ER) diagrams or UML class diagrams [17] along with formal descriptions of the data elements. [2], [3]

The requirements gathering process also involves defining requirements for the various layers of technology, which were introduced along with the form of solution in Section 2.1.4. A particularly important layer is the user interface, whose requirements definition typically involves the following basic design activities: [2]

• Adopting a style of interaction appropriate to the task performed and the person performing it.
• Conceptual design, ensuring that the users have an adequate mental model of the system.
• Performing analyses of usability.
• Making decisions about specific aspects of the design, based on design guidelines.
• Documenting the user interface, with the aid of appropriate notations.

User interface design usually also involves sketching multiple representations of the appearance and behavior of the user interface. For the layers on which existing solutions are used, such as operating systems, system resources and hardware, it is important to define how the new system needs to be integrated with these existing systems. Typically, existing systems constrain the design, and thus it is vital to identify the limitations these systems impose on the levels of usability. [2]

Once all the requirements for the interactive system have been defined, the specification can be validated with respect to the initial design objectives, as depicted in Figure 10. Before implementation, analytical methods can be used for validation. However, as stressed by Newman and Lamming, the real evaluation of an interactive system can be done only empirically, by building and testing prototypes of the design with real users. In empirical evaluation, the prototype is tested against the requirements for user acceptance, a broad concept consisting of various factors, such as usability, social acceptance and cost, that affect how users will accept and use the system. Usability analysis and empirical evaluation of the design are discussed in the two subsequent sections, while Chapter 4 investigates appropriate methods for evaluating user acceptance of mobile interactive systems.

2.2.4. Analysis of the Design

Usability analyses can be carried out before building the solution and without involving real users. They are often quicker to conduct and require less prior planning than empirical evaluations. Analysis is usually performed in the two stages of sequence and performance analysis. The first stage focuses on the sequence or method by which the activity is performed, while in the second stage the steps in the sequence are analyzed to specify usability measures. Both stages utilize the requirements specification and operate on the activity models derived from user studies, resulting in a refinement of the model, as illustrated in Figure 11. [2]

Figure 11. The two stages of analysis, transforming the general activity model into more specific sequence and performance models.

Various methods for inspecting the usability of interactive systems have been developed, including GOMS analyses, walkthrough methods, and Heuristic Evaluation. GOMS stands for Goals, Operators, Methods, and Selection rules. Goals are what the user intends to accomplish, while operators are basic actions performed in service of a goal. Methods are sequences of operators for achieving a goal, and if multiple methods exist, selection rules can be used for choosing one of them. Keystroke-Level Analysis is based on the GOMS model of task performance. The method is used for predicting task execution speed by trained users. The idea is to divide each task completion method into components and assign execution times to each component, derived from repeated experiments. GOMS techniques are especially useful if the number of alternative methods for performing a task is small, such as in dialling a telephone number. However, they are not appropriate if the method of operation is not known or the users are novices. [2]
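As an illustration of such a component-level prediction, the sketch below (illustrative; the operator times are commonly cited textbook approximations rather than values from this thesis) sums standard Keystroke-Level Model operator times to estimate how long a trained user would take to execute one method for a task.

```java
import java.util.Map;

/**
 * Illustrative Keystroke-Level Model estimate: execution time is predicted by
 * summing standard operator times (values are common textbook approximations).
 */
public class KeystrokeLevelModel {

    // Approximate times in seconds for a skilled user.
    static final Map<Character, Double> OPERATOR_SECONDS = Map.of(
            'K', 0.2,   // pressing a key or button
            'P', 1.1,   // pointing at a target with a pointing device
            'H', 0.4,   // homing the hand between keyboard and pointing device
            'M', 1.35); // mental preparation before a step

    /** Sums the operator times of a method written as a string such as "MHPKK". */
    static double estimateSeconds(String operatorSequence) {
        double total = 0.0;
        for (char op : operatorSequence.toCharArray()) {
            Double t = OPERATOR_SECONDS.get(op);
            if (t == null) {
                throw new IllegalArgumentException("Unknown operator: " + op);
            }
            total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical method: think, point at a field, then type a four-digit code.
        String method = "MP" + "KKKK";
        System.out.printf("Predicted execution time: %.2f s%n", estimateSeconds(method));
    }
}
```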

Walkthrough methods, such as Cognitive Walkthrough, are typically used for sequence analyses. In a cognitive walkthrough, usability issues are identified by simulating the way users explore and gain familiarity with interactive systems. The method is carried out by expert evaluators who walk through the sequence of steps in task scenarios, asking themselves a set of questions at each step. It is useful for analyzing designs of systems that will be used by novice users, as well as systems whose designs have been modified or extended. The usability metrics typically inspected in a cognitive walkthrough are users' success rates in task completion and their recoverability from errors. However, the method is not suitable for measuring the speed of task performance or for analyzing systems to be used by experienced users. [2]

In contrast to cognitive walkthroughs and GOMS analyses, Heuristic Evaluation can be applied to designs where the method of operation is not fully predictable and the user is not a complete novice. It is part of the discount usability engineering approach by Nielsen [9], which is based on the three techniques of paper prototyping, simplified thinking aloud and heuristic evaluation. A detailed description of heuristic evaluation is presented in the context of data collection methods in Chapter 4. Additional information about various usability inspection methods can be found, for example, in Nielsen [9] and Preece [3], as well as on Nielsen's website [19].

2.2.5. Empirical Evaluation

Empirical evaluation is an essential part of interactive systems design and usability engineering, since designs can be validated only partially with analytical methods. Furthermore, because the system is interactive, it needs to be tested with people as subjects. While analytical methods such as cognitive walkthrough and heuristic evaluation involve usability experts, empirical methods are based on real end users testing the designed system. As already highlighted, Newman and Lamming [2] state that the real evaluation of interactive systems can be carried out only empirically. Similarly, Nielsen promotes testing with real end users as the essential method for assessing usability and overall acceptability [9]. Evaluations with real users are also seen as crucial when assessing user acceptance and adoption of new technology. Expert inspection approaches have also been criticized for finding fewer problems in total and relatively more cosmetic problems in comparison to user-based evaluations. There are two main types of evaluation: formative evaluation is done at various stages of design to help form a system that meets its requirements, while summative evaluation assesses the quality of the final product. For interactive systems design, formative evaluation through iterative 'design-test-redesign' cycles is generally seen as a more effective approach.
The basic evaluation process, as depicted in Figure 12, involves building prototypes of the design based on the current specification, testing them with real users, analyzing activity data gathered from the experiments, and finally reporting the results. Activity data may be collected using the same methods as in user studies, such as observation and questionnaires, but this time using them to evaluate the design. Further, techniques especially designed for evaluation can be applied, including monitoring actual use of the system by logging, and conducting quantitative or qualitative performance measurements when the system is used for completing specific tasks. The analysis of gathered data usually follows a similar pattern to pre-design modeling and analysis. [2], [3], [4], [9]
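
To make the idea of logging and performance measurement concrete, the following is a minimal sketch of an evaluation logger that records timestamped events and derives task completion times from paired start/end events; the event names, task identifiers and file format are illustrative assumptions rather than part of any method described here.

import csv
import time

class InteractionLogger:
    """Minimal event logger for evaluation sessions (illustrative sketch)."""

    def __init__(self, path):
        self.path = path
        self.events = []  # (timestamp, participant, event, detail)

    def log(self, participant, event, detail=""):
        # Record a timestamped event, e.g. "task_start", "task_end" or "error".
        self.events.append((time.time(), participant, event, detail))

    def save(self):
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerows(self.events)

def task_times(events):
    """Compute task completion times from paired task_start/task_end events."""
    starts, durations = {}, {}
    for ts, participant, event, detail in events:
        key = (participant, detail)          # detail holds the task id
        if event == "task_start":
            starts[key] = ts
        elif event == "task_end" and key in starts:
            durations[key] = ts - starts.pop(key)
    return durations

# Example use during a hypothetical test session:
# logger = InteractionLogger("session01.csv")
# logger.log("P01", "task_start", "task1")
# logger.log("P01", "error", "wrong menu item selected")
# logger.log("P01", "task_end", "task1")
# print(task_times(logger.events))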

Figure 12. The basic stages of formative evaluation process.

Two crucial activities in evaluation are the planning of the investigation and the documenting of the results. The investigation plan states the objectives of the evaluation and links the outcome to these objectives. Further, it describes in detail how the evaluation will be conducted, including the usability and other user acceptance properties to be measured, the prototype and experimental methods to be used, the selection of subjects and the activities they should perform, the time allocated for the test, and the type of data to be gathered. Before conducting the actual test, it is necessary to review the plan and run pilot studies to test out the procedures to be used, helping to ensure success in the main study. The investigation report summarizes the results and provides input to the next iteration round by drawing conclusions about any improvements needed in the specification. Besides the empirical evaluation process by Newman and Lamming presented above, the six-step usability testing procedure by Nielsen [9] is presented in Chapter 4. [2], [3], [9]

Empirical evaluation may be time-consuming to plan and conduct. To reduce the time and effort spent on evaluation, Newman and Lamming identify four general methods for evaluating interactive system designs efficiently: [2]

• Learning through prototyping, i.e. from designing and building prototypes
• Informal testing of the initial prototype
• Field experiments, conducted under real-world conditions
• Controlled experiments, typically conducted in a usability lab

Prototyping provides an effective way to test various design aspects, illustrate ideas or features, gather early user feedback, and identify usability problems in the interaction design. Further, prototype building allows discovering many aspects of the user interface that would be difficult to determine in the earlier stages of design, such as how wide a text box in the user interface should be. Prototypes can be categorized into low-fidelity and high-fidelity ones, as well as horizontal and vertical. Low-fidelity prototypes, such as the paper mock-up shown in Figure 13, can be used in the earlier stages of development. They are typically paper-based, cheap, quick to develop, and easily modifiable. However, they are far from the final product and are better for demonstration than for user evaluation. High-fidelity prototypes are fully interactive computer-based prototypes, mimicking the actual interface as closely as possible. They provide more functionality and are well suited for evaluation, although they are not as quick and easy to create. Horizontal prototypes cover a broad spectrum of features while the level of functionality is reduced, thus making them suitable for evaluating interface design but not actual use of the interface. Vertical prototypes, on the other hand, lack several features but contain all interactive functionality of the selected parts of the system, thus enabling usability testing on a subset of the system. [2], [3], [9], [20]

Figure 13. Paper prototype of a tabs-based design.

An important phase in the design and implementation of prototypes is the selection of appropriate tools to assist the designers and speed up the prototyping process. The requirements for prototyping tools and toolkits usually include ease of developing and modifying screen layouts, ease of linking screens together and modifying links, support for various types of user interfaces and input devices, as well as means of calling external programs and media. Several tools and toolkits for rapidly producing prototypes have been developed over the years, including Microsoft PowerPoint and Visio, script-based programming systems such as Visual Basic, and HyperCard [21] and many of its clones, including ToolBook [22]. Further, various web technologies, such as HTML, CSS and JavaScript, can be used in tandem to prototype web applications and other interactive systems effectively in an iterative manner. [2], [23], [24]

Prototyping and informal testing can be applied quite easily to any design, while it is harder to decide whether to conduct empirical evaluation in the field, in the lab, or both. In lab experiments the conditions under which the system is used can be controlled carefully, thus providing more accurate measurements of usability levels than other evaluation methods. However, they may lack ecological validity and may not be able to simulate the real use context adequately. Field trials usually offer higher realism than laboratory tests as they take place in the natural context of use. Unfortunately they are also often more difficult and time-consuming to conduct as well as harder to replicate, and evaluators may lack control and precision in data collection under real-world conditions. The discussion of lab versus field testing is continued in Chapter 4 in the context of empirical evaluation of mobile interactive systems. [2], [25]

2.3. Aspects of Mobile Interactive Systems Design

In recent years, computing has rapidly evolved from being stationary to mobile. Mobile devices, such as smartphones and PDAs, have become more powerful and versatile, and they can be used for an increasing number of tasks. Moreover, mobile technologies are becoming increasingly widespread and people have adopted them into their daily lives. Thus, it is common today to resolve situations of concern by using mobile interactive technology. Mobile interactive systems differ from their stationary counterparts in various ways, and hence designers are faced with new challenges in the course of the design. Johnson considers aspects of mobile systems from an HCI perspective and raises four concerns of usability and mobility: [26]

• The demands of designing for mobile users, their tasks and contexts
• Accommodating the diversity and integration of devices, network services and applications
• The current inadequacy of HCI models to address the varied demands of mobile systems
• The demands of evaluating mobile systems

Possibly the most crucial distinction between mobile and desktop applications is their different usage patterns. Mobile systems tend to be used for frequent, short and instantaneous search and retrieval-oriented activities. They are typically designed for supporting a few specific activities very well, rather than providing a general-purpose exploratory environment as desktop systems usually do. An important aspect of mobile services is that they are typically used in highly dynamic and mobile contexts, involving several people distributed in the user's physical setting. These physical spaces are often unstable, crowded and noisy, and the user's attention might be divided among various tasks, such as walking down the street and carrying a suitcase while reading e-mail with a mobile device. Thus, understanding the real use context of the proposed system becomes extremely crucial. Further, if the system is targeted at wider everyday use instead of some specific place and application field, the demands on the system's adaptation to the changing usage contexts become higher. Also modeling the user's activity and interaction becomes increasingly complex and difficult in mobile usage contexts. [26], [27], [28]

Regarding the form of solution, mobile handhelds have a number of physical and technical constraints that affect user interface design and the underlying layers of technology. First, mobile devices need to be small so that they fit into a pocket and can be carried easily by mobile users. This places constraints on screen size as well as on input and interaction capabilities. Since handhelds are generally operated with only one hand and possibly held in the other, interaction techniques for mobile devices are quite different from those used with fixed terminals. In the case of PDAs, stylus-based interaction with a touch-sensitive screen is usually applied, while interaction with mobile phones is typically keypad-based. Despite recent advances in mobile and wireless computing, mobile devices are still limited in terms of memory, processing power, storage space, and battery life. Further, as mobile applications are often network-based, various problems with wireless connectivity, including slow or intermittent network connections, can heavily affect the usability of the system. In terms of reliability requirements, mobile devices actually resemble servers more than desktops. For instance, mobile devices and systems are often left running for long periods of time without reboots, thus requiring efficient resource management. Typically they also need to provide instant access to services and have to deal effectively with unexpected failures. Various characteristics of mobile and desktop applications are summarized in Table 2. [26], [27]

Table 2. Characteristics of mobile vs. desktop applications

Feature               | Mobile applications                                      | Desktop applications
Context of use        | Dynamic, unpredictable, many people in the surroundings  | Stationary, predictable, few people in the surroundings
Duration of activity  | Short                                                    | Long
Frequency of activity | High                                                     | Low
Type of activity      | Focused, few specific tasks                              | Exploratory, general-purpose
Interaction           | Simple, one- or two-handed, keypad- or stylus-based      | Complex, two-handed, keyboard- and mouse-based
Start-up time         | Short                                                    | Long
Reboot interval       | Long                                                     | Short
Connection failures   | Frequent                                                 | Infrequent

Besides the design of the solution, characteristics of mobile interactive systems also raise new concerns and challenges for effective evaluation of user acceptance and usability. These issues are considered in Chapter 4.


3. USER ACCEPTANCE

The framework by Newman and Lamming and other human-centered design models presented in the previous chapter provide well-established practices for designing interactive systems for usability. Good usability, however, does not guarantee that the system is eventually accepted and adopted by its intended users. This chapter introduces various approaches to studying user acceptance of interactive systems that can be used to enhance the development process. Understanding the forces shaping user acceptance is important, since the success of any interactive system on the market correlates with the acceptance of its intended users. The purpose of user acceptance models is to explain the key factors that affect how users come to accept and use a system or technology. These models can be applied in the evaluation phase as well as in earlier stages of interactive systems design.

3.1. System Acceptability Framework

Figure 14 depicts the system acceptability framework by Nielsen [9]. The concept of system acceptability basically determines whether the system is good enough to satisfy all user needs and requirements. The overall acceptability of an interactive system is defined as a combination of its practical and social acceptability. Practical acceptability is further broken down into various categories, including usefulness, reliability, cost, compatibility with existing systems, and so on. Usefulness specifies whether the system can be used to achieve a desired goal. Usability and utility are defined as its subcomponents. Usability corresponds to how well users can use the functionality of the system, while utility corresponds to whether the desired goal can be achieved with the provided functionality. Thus, usability can be seen as quite a narrow concern when considering the overall acceptability of an interactive system. The five attributes of usability in Nielsen's model were discussed earlier in Chapter 2.

Figure 14. System acceptability framework by Nielsen.

3.2. Technology Acceptance Model (TAM)

Technology acceptance models aim at studying how users' perceptions affect the intention to use information technology and the actual use of it. The Technology Acceptance Model (TAM) by Davis [29] provides a solid framework for identifying issues that may affect user acceptance of technical solutions. The model emphasizes that perceived ease of use and perceived usefulness affect the intention to use. Each user perceives the characteristics of a system individually, based on various factors such as personal characteristics, attitudes, previous experiences and social environment. Davis defines perceived ease of use as "the degree to which a person believes that using a particular system would be free from effort" and perceived usefulness as "the degree to which a person believes that using a particular system would enhance his or her job performance". Further, the model points out that perceived ease of use has a direct impact on perceived usefulness, while the intention to use affects real usage behavior. The model has later been extended to TAM2, which provides a detailed description of the key forces underlying judgments of perceived usefulness. TAM2 stresses that both social influence processes (subjective norm, voluntariness and image) and cognitive instrumental processes (job relevance, output quality, result demonstrability and perceived ease of use) have a significant impact on the acceptance of technology. The original TAM is illustrated in Figure 15. [30]

Figure 15. Technology Acceptance Model (TAM) by Davis.

TAM is not based on observing real usage of the system but on users reporting their perceptions. Acceptance data is collected with questionnaires, such as the standardized PUEU questionnaire [29], whose questions are constructed so that they reflect the different aspects of TAM. The model was initially targeted at studying technology at work, but it has later been utilized for inspecting the acceptance of consumer services such as Internet services and e-commerce. Besides assessing existing systems, TAM can also be used for studying user acceptance of planned product concepts. This indicates that the model could also be applied in technology development projects and processes to assess the usefulness of proposed system designs, thus supporting the human-centered design approach. [30]

Based on case studies of mobile Internet and location-aware systems targeted at consumers, Kaasinen proposed a Technology Acceptance Model for Mobile Services [30]. The model suggests that user acceptance and intention to use a mobile service are built on three factors: perceived ease of use, perceived value and trust. To move from an intention to use to real usage, the user has to take the service into use. This transition is affected by a fourth acceptance factor: perceived ease of adoption. Thus, perceived value, perceived ease of use, trust and perceived ease of adoption need to be studied in order to assess user acceptance of mobile systems. [30]
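
As a simple illustration of how TAM-style questionnaire data (such as PUEU responses) is typically summarized into construct scores, the sketch below averages Likert-type item responses into scores for perceived usefulness and perceived ease of use; the item grouping and the 7-point scale are illustrative assumptions, not the exact PUEU instrument.

import statistics

# Hypothetical responses of one participant on a 7-point agreement scale
# (1 = strongly disagree, 7 = strongly agree), grouped by the TAM construct
# each item is intended to measure.
responses = {
    "perceived_usefulness": [6, 5, 7, 6],   # e.g. "Using the service would improve my performance"
    "perceived_ease_of_use": [4, 5, 3, 4],  # e.g. "I would find the service easy to use"
}

def construct_scores(responses):
    # Average the item responses of each construct into a single score.
    return {construct: statistics.mean(items) for construct, items in responses.items()}

print(construct_scores(responses))
# -> {'perceived_usefulness': 6, 'perceived_ease_of_use': 4}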


Figure 16. Technology Acceptance Model for Mobile Services by Kaasinen.

3.3. Innovation Diffusion Theory (IDT)

Innovation Diffusion Theory (IDT) by Rogers explains the adoption of new practices and technologies in society across different user groups. The theory aims at predicting the likelihood and the rate of an innovation being adopted by various adopter categories. According to Rogers, the five main factors affecting the rate of adoption are relative advantage, compatibility, complexity, trialability and observability. The definitions of these core factors are given in Table 3. Later, Moore and Benbasat expanded the factor set with voluntariness, result demonstrability and image. [30], [31], [32]

Table 3. Core factors of Innovation Diffusion Theory

Factor             | Definition: "The degree to which ..."
Relative advantage | "an innovation is perceived as being better than the practice it supersedes."
Compatibility      | "adopting the innovation is compatible with what people do."
Complexity         | "an innovation is perceived as being difficult to understand and use."
Trialability       | "an innovation may be experimented with before making the decision."
Observability      | "the results of an innovation are visible to others."

Rogers categorizes innovation adopters into five groups: innovators, early adopters, early majority, late majority and laggards. Innovators are described as venturesome risk-takers who serve as gatekeepers for those who follow. Early adopters are opinion leaders who are the first adopters within their group, and willing to maintain their position by assessing innovations for the others. Both innovators and early adopters are also seen as educated technology enthusiasts and visionaries. The early majority are described as pragmatists who value new solutions and convenience. They are deliberate in their adoption decision but want to wait until others have evaluated the innovation. The late majority includes skeptical and cautious users who prefer to wait until most of the population has adopted the innovation. Laggards are the last to adopt, basing their decision on the past rather than the future. They may be suspicious of new innovations or may have limited resources. The adopter categories, based on a bell curve, are illustrated in Figure 17. [30], [31], [33]

Low-cost innovations may have a rapid take-off, while innovations whose value increases with widespread adoption may have faster late growth. Critical mass occurs when enough individuals in a society have adopted an innovation so that the further rate of adoption becomes self-sustaining. Critical mass is especially important in the diffusion of interactive innovations, where each additional adopter increases the utility of adopting the innovation for all adopters. The Crossing the Chasm model by Moore adds "cracks" to the IDT curve, which refer to different needs and expectations across adopter groups. According to this model, innovations succeeding among innovators and early adopters may fail among the early majority or late majority, if the innovation lacks characteristics that appeal to these groups. Moore argues that this chasm needs to be bridged in order to hit the mass market. The chasm and other cracks defined by Moore are also depicted in Figure 17. [30], [31], [33]

IDT targets the innovation adoption phase, but it can also be utilized in earlier system design phases. When evaluating a designed interactive system, test users can be categorized into different adopter groups, and these groups can be weighted according to the objectives of the evaluation. [30]

Figure 17. Innovation adopter categories in a bell curve featuring the chasm.

3.4. Theory of Planned Behavior (TPB)

The Theory of Planned Behavior (TPB) by Ajzen [34] proposes a model for predicting deliberate human behavior. According to the theory, individual behavior is driven by behavioral intention, which is an indication of a person's readiness to perform a given behavior. As depicted in Figure 18, the intention is based on attitude toward the behavior, subjective norm and perceived behavioral control. As a general rule, the more favorable the attitude and subjective norm, and the greater the perceived control, the stronger should be the person's intention to perform the behavior in question. TPB extends the Theory of Reasoned Action (TRA) by adding perceived behavioral control as the third intention predictor to the model. The primary instruments for evaluating planned behavior are questionnaires. [34], [35]

Attitude toward the behavior refers to the person's positive or negative feelings about performing a certain behavior. It is guided by the set of accessible behavioral beliefs linking the behavior in question to expected outcomes. 'Accessible' means that while a person may hold many beliefs with respect to any behavior, only a relatively small number are accessible at a given moment. Subjective norm is the perceived social pressure to engage or not to engage in a certain behavior, guided by accessible normative beliefs concerning the behavioral expectations of people important to the person. The strength of each belief is weighted by the motivation to comply with the person in question. Behavioral control is the perceived ease or difficulty of performing a certain behavior, determined by the set of accessible control beliefs. These are beliefs about the presence of factors that may facilitate or hinder performance of the behavior. Perceived behavioral control can also have a direct effect on actual behavior, since successful performance of the behavior depends not only on a favorable intention but also on a sufficient level of behavioral control. Actual behavioral control refers to the extent to which a person has the resources and other prerequisites needed to perform a given behavior, such as money, time and skills. [34], [35]

Figure 18. Theory of Planned Behavior (TPB) by Ajzen.

3.5. Unified Theory of Acceptance and Use of Technology (UTAUT)

In an attempt to integrate the main competing user acceptance models, Venkatesh et al. [32] designed the Unified Theory of Acceptance and Use of Technology (UTAUT). It aims to explain user intentions to use an interactive system and subsequent usage behavior. The proposed view combines elements across eight existing approaches, including TAM, IDT, TRA, and TPB. UTAUT contains four core determinants of intention and usage: performance expectancy, effort expectancy, social influence, and facilitating conditions. Performance expectancy replaces perceived usefulness in TAM and relative advantage in IDT, while effort expectancy replaces perceived ease of use (TAM) and complexity (IDT). Social influence has been constructed from subjective norm (TAM2 and TPB) and image (IDT). Facilitating conditions have been adopted from TPB and IDT, featuring perceived behavioral control and compatibility, respectively. UTAUT also defines four moderators influencing the core determinants: gender, age, experience and voluntariness of use. The UTAUT model is illustrated in Figure 19. [30], [32]

Figure 19. Unified Theory of Acceptance and Use of Technology by Venkatesh et al.


4. EMPIRICAL EVALUATION OF MOBILE SYSTEMS

Mobile HCI is a relatively young research field, and so far relatively little knowledge exists on the applicability of traditional HCI research methodology to evaluating mobile systems. However, it is widely acknowledged that adapting established evaluation methods to the mobile environment is challenging. The questions, concerns and challenges surrounding the empirical evaluation of mobile systems are mainly derived from the mobile, changing and unpredictable context of use. Such a context may be very hard or impossible to control, and it makes various traditional data collection methods difficult to use. In addition, the users of mobile systems are likely to have to divide their attention among the elements of the environment and the technology in use. Also the constraints in the software and hardware capabilities of mobile equipment may limit the options in evaluation [36]. This chapter starts with an overview of a usability testing process by Nielsen that can be applied in the empirical evaluation of mobile systems. After that, the chapter proceeds with a discussion of whether to conduct field or lab-based evaluations of mobile systems. Further, data collection techniques suitable for the evaluation of mobile systems are introduced.

4.1. Usability Testing

Besides the empirical evaluation process by Newman and Lamming described in Chapter 2, Nielsen has defined a well-established six-step procedure for successful usability testing of interactive systems as part of the usability engineering lifecycle. The six essential stages of usability testing by Nielsen are specified as follows [9]:

• Clarify test goals and write the test plan
• Get test users
• Choose experimenters
• Consider ethical aspects of tests with human subjects
• Choose test tasks
• Conduct the test

A test plan should be made before the start of the test. The issues the test plan should address according to Nielsen are largely similar to those in the investigation plan proposed by Newman and Lamming. These issues include the goals of the test, where and when the test will be conducted, the expected duration of each test session, the software and experimental methods to be used, the selection of test users and the tasks they will be asked to perform, the type of data to be collected, and the data analysis methods to be used. According to Nielsen, the test plan should also address the expected network load and response times, the experimenters selected for the test, what user aids will be available, and to what extent the evaluator will be allowed to help the subjects during the test. A test budget should also be included in the plan. In addition, Nielsen stresses the importance of running pilot tests before the actual test. [9]

The test users should be as representative as possible of the intended users of the system. If the test will be conducted with very few users, additional care should be taken to involve average users. If more participants are to be used, the users should be selected from several different subpopulations to cover the main categories of expected users. The users could also be categorized into different innovation adopter groups according to Innovation Diffusion Theory [31]. One of the main distinctions between user categories is that between novice and expert users. Typically these two groups should be tested in separate tests. It should also be considered whether to employ between-subject or within-subject testing. In the former method, each user only participates in a single test session, while in the latter all the users get to use all the systems that are being tested. Recruiting suitable test users may be challenging. Some recruitment methods include advertising, getting students in the domain of interest from a local university, or using recruiting agencies. It should also be determined what kind of incentive will be provided to participants. A training session before the actual test might be worthwhile if the users are unfamiliar with the interaction devices and techniques used in the test. Otherwise the effects of the users' struggle with these issues may dominate the test, and no valuable information on the usability of the user interface will be gained. [9]

Other important stages of usability testing include choosing the experimenters, considering the ethical aspects of testing with human users, and selecting the test tasks. Obviously, it is preferable that the evaluators have previous experience of usability testing and knowledge of the test method used. Moreover, the experimenter must have extensive knowledge of the application and its user interface. The evaluator is responsible for making the users feel as comfortable as possible during and after the test, and their privacy must be respected. The test tasks should provide reasonable coverage of the most important parts of the system and the user interface. They also need to be small enough to be completed within the time allocated for the test. The tasks should define precisely what result the user is being asked to produce, and they should normally be given to the users in writing. [9]

A typical usability test consists of four stages: preparation, introduction, the test itself, and debriefing. First, the evaluator should make sure that the place for the test is ready for the experiment and all necessary equipment and materials are available. During the introduction, the experimenter welcomes the test user and gives a brief explanation of the purpose of the test. Then the evaluator introduces the test procedure, explains about any video or audio recording that may be taking place, gives specific instructions for the kind of test that is being conducted, invites the user to ask any clarifying questions before starting the experiment, and finally gives any written instructions for the test. During the test itself, the evaluator should normally refrain from interacting with and helping the user. The main exception to this rule is when the user is clearly stuck and is getting unhappy with the situation. After the test, the user is debriefed and asked to fill in any questionnaires. The questionnaires should be administered before any discussion to avoid bias. During debriefing, users are asked for any comments they might have about the system and for any suggestions for improvement. [9]

4.2. Laboratory vs. Field Experiments

In recent years, there has been quite a lot of debate on whether mobile systems should be evaluated in the field or in the more traditional laboratory environment.
First of all, usability researchers and specialists have been concerned that laboratory evaluations do not adequately simulate the context where mobile systems are used, and also lack the desired ecological validity. Moreover, various aspects of real-world settings that could affect the user's performance and behavior, such as interruptions, movement, noise, multitasking, lighting levels and weather, are not present or may be very difficult to recreate realistically in lab-based experiments. Abowd and Mynatt [37] state that effective evaluation of ubiquitous computing systems requires deployment into an authentic setting, as the scaling dimensions of device, space, people and time that characterize these systems make it impossible to use traditional usability laboratories. Further, field experiments have been seen as the preferred method for evaluating mobile guides [38], [39], as these systems and the information they present are closely related to the physical location of the users and the objects in their surroundings. Evaluation in the natural usage context has also been recommended for mobile systems targeted at highly dynamic and unpredictable urban environments [40], such as parks, shopping districts and city centers. [26], [39], [41]

Kjeldskov et al. [38] evaluated a mobile guide designed to support the use of public transport with four different approaches: lab and field evaluation, heuristic walkthrough [42] and rapid reflection. The identified usability problems were categorized as critical, serious and cosmetic, which are analogous to the catastrophic, major and minor/cosmetic severity ratings by Nielsen [9]. The results showed that critical problems were generally uncovered regardless of approach, while field and lab evaluations overlapped considerably in the serious problems found. On the other hand, both heuristic walkthrough and rapid reflection missed several serious problems from those collectively detected in the field and lab. Although field evaluation missed most of the cosmetic problems, it was the most effective method in finding serious problems. None of the cosmetic problems were found by all methods, while lab evaluation and heuristic walkthrough interestingly identified the same set of these problems. The researchers noted that the field study stressed problems of mobile use rather than simply device usability, and drew attention to issues such as real-world validity, social acceptability and the precision of the data presented by the system. In contrast, the lab evaluation tended to produce more device-centered findings. It was also stated that contextually related problems are best identified in the field, as they may appear and be experienced and described very differently, or pass unremarked, in laboratory settings. An overview of the types of usability issues identified with the different techniques in the study is illustrated in Figure 20.

Figure 20. Schematic overview of the usability issue types identified with different evaluation methods in the study by Kjeldskov et al. [38].

In comparison to lab-based studies, field trials are often seen as substantially more time-consuming and harder to conduct, involving a number of additional challenges such as the difficulty of controlling possibly confounding variables. Goodman et al. [39], however, argued that field-based evaluations of mobile services need not be as difficult as commonly thought, and presented a set of techniques and suggestions to make running such experiments easier. In their method for evaluating a mobile guide, the participants were given specific tasks to find their way along specific routes. The guide was tested against a standard paper map for the test area, and two distinct but similar routes were used. Quantitative data was gathered using direct observation by the evaluator walking a few steps behind the user, while user experiences were collected with questionnaires or interviews after the experiment. The measures included timings, errors, perceived workload, traveled distances and subjective preferences. Further, Goodman et al. considered how to deal with variables that were difficult or impossible to control in the field, such as light, noise levels, weather conditions and traffic. Instead of spending time and resources on trying to keep the levels of such variables consistent during the field trial, they found it more effective to let them vary across conditions as they would in real-world use.

Kellar et al. [40] applied traditional experimental methods for evaluating two mobile services in urban environments. In the first experiment, the participants performed three rendezvous scenarios in a busy downtown shopping district using a location-aware map application with cell phones and handheld computers. The user behavior was observed in the context of the rendezvous scenarios by evaluators following each participant. In the second study, the benefits of sharing annotations across mobile handhelds for co-located users were explored during an organized event in which teams performed navigational tasks in a city. One team was directly observed, while two other teams were subject to audio capture and software logging. The rendezvous and shared annotations studies are depicted in Figure 21.

Figure 21. Two urban navigation experiments by Kellar et al.: (a) direct observation during the rendezvous study, (b) an observer (on the left) prepares to pull a video camera out of his backpack during the shared annotations study.

During the experiments, several issues affecting experimental control and the ability to observe user behavior were encountered, involving problems related to software, materials, social considerations, weather, audio and video recording, and mobility. First, issues with Bluetooth connectivity caused interruptions and affected user behavior in the rendezvous study. Further, the study was notably influenced by the lack of a "home base", meaning that the equipment had to be carried throughout the study. This required careful management of battery power as there were no power outlets. Also the general public and weather conditions such as sun, wind and rain affected the flow of each study. Quality audio capture was difficult due to background noise, while video recording was heavily affected by high mobility. Further, mobility and time pressure had a significant impact on technology use and observation. Although field experiments in urban contexts were found problematic, the authors emphasized that it is also crucial to understand the factors, such as mobility, dynamic environments and context-dependent behavior, which make these evaluations challenging. Further, the realism that was felt by the users during the experiments was highlighted. [40]

Despite general agreement on the importance of field-based evaluations for mobile interactive systems, a literature study by Kjeldskov and Graham [25] revealed that 71% of mobile device evaluations were done in laboratory settings, while only 19% were carried out in the field. The review also showed that evaluations are often focused on functionality rather than contextual issues. One possible reason for the low usage of field experiments, as noted by Goodman et al. [39], is the lack of a clear, carefully worked out method for carrying out such experiments. Other reasons may be that lab experiments are simply easier to conduct and manage than field studies, as well as the strong roots of mobile HCI in computer science and human-computer interaction [25].

Opposing views on the necessity of evaluating mobile systems in the field have also been presented. Kjeldskov et al. [43] questioned whether conducting expensive field experiments is "worth the hassle". In their study, the effectiveness of field and lab evaluations for identifying usability issues in a context-aware mobile electronic patient record system was compared. The realism of the laboratory setting was improved by recreating a hospital ward in a usability lab. As a result, the lab study revealed far more usability problems than a similar field study for roughly half the cost, and thus the researchers concluded that field studies are not efficient for testing mobile services. The experiment also showed that simulating real-world aspects in a laboratory environment by including mobility and context can diminish the differences between results achieved in the laboratory and in the field. The result of this study, however, has raised objections among scientists to its generality, and the experiment has been criticized for focusing solely on finding usability issues in contrast to studying long-term use and adoption of new technology [44]. Further, as noted by Kaikkonen et al. [41], the tested service was designed for professional users who usually have a clear picture of their tasks. The results might be different with a consumer application, the users of which do not necessarily have a clear idea of the possibilities provided by the system.

Kaikkonen et al. [41] reported a study that compared the effectiveness of field and laboratory testing in the usability evaluation of a mobile consumer application. The tested application was used for transferring files between computers and the mobile device. Both evaluations were task-based and used the think-aloud protocol. The laboratory test sessions were held in a typical usability lab and were recorded with three video cameras and a microphone. The field tests were carried out in various locations around Helsinki, including a metro trip from an office district to a large shopping center. To allow data gathering while the test users were on the move, the users wore special equipment consisting of a microphone and three video cameras for recording the display and the keyboard of the mobile device, as well as the user's surroundings.
In addition, the moderator wore a unit that included a video camera for recording the user's surroundings from the moderator's point of view, as well as a monitor that enabled the moderator to see what the user was doing with the application, which would have been impossible without special equipment on various occasions during the test. Both the test user and moderator units were also equipped with wireless video signal transceivers. The equipment and an ongoing test session are shown in Figure 22.


Figure 22. Field testing equipment and an ongoing test by Kaikkonen et al. [41].

The results revealed that the same problems were found in both environments, while differences occurred in the frequency of findings between the contexts. The issues that came out more frequently in the field seemed to relate to understanding the logic of the mobile service. Task performance times were also the same in both studies, although the total time taken by the study was longer in the field. Interestingly, the researchers noted that potential interruptions did not seem to affect user performance and behavior much during the field study. Instead, they interpreted that users were performing the tasks "inside a bubble", concentrating on their own activities and ignoring others ongoing around them, unless those activities were noisy or threatening. In the case of more complex tasks, the users sought a safe haven to interact with the device. Further, it was noted that the field test seemed to be more casual, and the users talked about the use of the application and their feelings more freely. As a conclusion, Kaikkonen et al. stated that conducting time-consuming field experiments may not be worthwhile when searching for problems with the user interface to improve user interaction. However, field testing in a natural context of use was seen as useful when user behavior and the effect of the environment are also investigated, instead of focusing solely on interaction with the system. Further, field experiments were seen as necessary for evaluating services with location information, as the laboratory setting may be inadequate for yielding valid results. [41]

To further stress the importance of field evaluations, Wagner defended field research based on the results of a project exploring the potential for introducing mobile support systems for Danish pig farmers [44]. He argued that only real-world evaluation with end users can reliably prove whether mobile technologies are usable and acceptable enough to replace existing tools and technologies in industry. Moreover, the 2005 UbiApp workshop emphasized evaluating ubiquitous computing applications in realistic use contexts [45]. The workshop aimed to identify methodological problems in the way researchers carry out application-led research in ubiquitous computing and to recommend how to address these problems. As a result, the workshop produced four guidelines to boost the effectiveness of application-led research: [45]

• Choose applications carefully. Rather than settling for trivial, low-value problems, aim to build applications that address severe, persistent problems.
• Share technical infrastructure. Design applications so that the developed software and infrastructure can be reused to the wider community's benefit.
• Evaluate applications in realistic environments. Real-world deployment is the only way to fully investigate the complex three-way interactions between ubiquitous computing applications, their users, and the environment.
• Perform comparative evaluations. Publicly release data sets derived from field trials so that other researchers can verify the published results and analyze the data. Design studies should not only validate a single application but also explicitly compare similar applications.

Small-scale laboratory studies, however, were also seen as useful in the early stages of user-centered design. The data gathered in these tests should be used to continue designing and evaluating ubiquitous systems further in larger-scale field trials. [45]

4.3. Data Collection Techniques for Mobile Environment

This section presents data gathering techniques suitable for the evaluation of mobile interactive systems. As indicated earlier in the comparison between laboratory and field evaluations, data collection can be problematic when evaluating mobile systems in realistic usage contexts. In particular, many of the traditional observational methods, introduced earlier in Section 2.2, have been found cumbersome to use in changing, unpredictable, crowded and noisy usage environments. To overcome the challenges caused by such field settings, various novel observational techniques have been proposed for capturing high-quality data while users are on the move. Some of these techniques are presented in the observation section of this chapter.

Although new techniques have been introduced in response to the challenges with data collection in mobile usage contexts, a few traditional methods have been found to adapt well to field evaluation of mobile systems. Two principal types of such techniques are software logging and questionnaires. Logging allows automated data collection on actual usage of the system. As noted by Goodman et al. [39], it is well suited for recording events and taking measurements such as timings and errors when evaluating mobile systems in dynamic, unpredictable contexts. Logging is also cheap, time-saving and easy to instrument compared to special equipment for mobile settings such as small wearable video cameras. Questionnaires, on the other hand, have been found efficient for gathering users' perceptions of user acceptance when carrying out evaluations in mobile contexts (e.g. [39]). A combination of logging and qualitative methods has also been seen as a promising approach for collecting data from evaluations of increasingly complex services and applications [46]. Other suitable traditional methods for collecting qualitative data on mobile systems include interviews and focus groups. Expert-based usability inspection methods, such as heuristic evaluation, can also be utilized to complement empirical evaluation and to achieve a deeper understanding of the usability of mobile services. The methods presented above are discussed further in the following sections of this chapter.

4.3.1. Heuristic Evaluation

Heuristic evaluation is a usability inspection method that can be conducted at various stages of the development lifecycle. In heuristic evaluation, typically 3-5 usability experts independently evaluate the system using a set of heuristic rules or guidelines.

Ten usability heuristics defined by Nielsen that have been widely used for usability inspection of user interfaces are presented in Table 4. Alternative sets of heuristics have also been developed, such as the research-based set of heuristics by Gerhardt-Powals [47]. Severity ratings can be used to categorize the identified problems, often using a 0 to 4 rating scale in which 0 = not a usability problem, 1 = cosmetic, 2 = minor, 3 = major and 4 = catastrophic. Further, evaluators can be assisted in using the inspected interface by providing typical usage scenarios [8], listing the steps a user would take to carry out a sample set of realistic tasks. [2], [9], [19]

Table 4. Ten usability heuristics for user interface design by Nielsen

Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Match between system and the real world: The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom: Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
Consistency and standards: Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
Recognition rather than recall: Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
Flexibility and efficiency of use: Accelerators, unseen by the novice user, may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
Help and documentation: Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.
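
To make the use of the 0 to 4 severity scale introduced above concrete, here is a minimal sketch of aggregating severity ratings from several independent evaluators by averaging them per problem; the problem descriptions and ratings are invented for illustration.

from statistics import mean

# Hypothetical 0-4 severity ratings given independently by three evaluators
# (0 = not a problem ... 4 = catastrophic) for the problems they identified.
ratings = {
    "label of search button unclear": [2, 3, 2],
    "no feedback while map is loading": [3, 4, 3],
    "font too small in result list": [1, 1, 2],
}

# Average the ratings per problem and list the most severe problems first.
averaged = {problem: mean(scores) for problem, scores in ratings.items()}
for problem, severity in sorted(averaged.items(), key=lambda item: item[1], reverse=True):
    print(f"{severity:.1f}  {problem}")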

The main advantages of heuristic evaluation are that it is cheap, intuitive, quick and easy to perform, and requires no advance planning. It can be applied to analyzing low-fidelity prototypes of the design, such as paper mock-ups, as well as high-fidelity prototypes, after a fully functional software prototype has been built during the empirical evaluation process. On the other hand, heuristic evaluation is targeted at identifying problems and does not focus much on finding solutions. Based on several case studies [48], a single evaluator finds major problems more easily than minor problems, although minor problems are typically found in greater numbers. Although heuristic evaluation has proven to be an efficient technique for finding usability problems that would be harder to find with other usability inspection or testing techniques, it has also been criticized for its validity [49], suggesting that many of the identified problems are cosmetic or not true problems at all. Moreover, rating the severity of each problem has been found to be difficult. [9], [19]

4.3.2. Observation

To address the practical problems related to observation when evaluating mobile systems in the field, several novel techniques have been developed for gathering high-quality data. Isomursu et al. [50] presented Experience Clip, a self-reporting method based on the use of mobile camera phones. The technique encourages the users themselves to take short video and audio clips during the field experiment for collecting user experience information. In addition, small video cameras worn by the user [41], [51] or attached to the mobile device [43], as depicted in Figure 23, have been introduced to enable observation with high-quality data collection in highly mobile and changing contexts. Common to the techniques presented above is also the effort to prevent the Hawthorne effect [3], i.e. to overcome the obtrusive influence of a moderator during observation in a mobile context. Goodman et al. proposed techniques for measuring the distance traveled and the route taken by the users during field trials of mobile devices. While pedometers can be used for measuring the traveled distance by counting the number of steps taken, GPS navigators and other location-sensing equipment go one step further and also enable tracking the route. These techniques overcome the obtrusiveness caused by a researcher "shadowing" the user during the test, but may lack reliability in terms of technical problems and inaccurate measurements. [39]
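
To give a concrete idea of how logged location data can be turned into a distance measure, the following is a minimal sketch that sums great-circle (haversine) distances between consecutive GPS fixes; the sample coordinates are made up for illustration.

import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def route_length_m(points):
    """Total distance along a logged route given as a list of (lat, lon) fixes."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))

# Hypothetical GPS fixes logged during a field trial (latitude, longitude):
route = [(65.0593, 25.4663), (65.0601, 25.4690), (65.0612, 25.4721)]
print(round(route_length_m(route)))  # total traveled distance in metres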

Figure 23. Novel observational data collection techniques for mobile usage contexts: (a) "helmet cam", (b) wireless camera mounted on a PDA.

4.3.3. Interviews and Focus Groups

Interviews were introduced earlier in Chapter 2. Compared to questionnaires, they require less planning and preparation. Interviews may be structured, semi-structured or unstructured. Structured interviews are tightly scripted and often quite similar to questionnaires. They are replicable and easier to conduct and analyze, but may lack richness. They are also more suitable for evaluations than for pre-design studies. Unstructured interviews are flexible, less formal and are allowed to take their course without constraints. They can offer rich and detailed data about the facts and opinions of the user and may be used early in design. However, they are not replicable and may be time-consuming and difficult to analyze. Semi-structured interviews are guided by a script but allow interesting issues to be explored in more depth. Thus they can provide a good balance between richness and replicability. [2], [3], [4]

A focus group is a form of qualitative research in which a group of people are asked about their attitude towards a product, service, concept, or idea. Questions are asked in an interactive group setting, typically with 6-9 participants who are free to talk with other group members. Each group is run by a moderator who is responsible for keeping the focus on the issues of interest. The moderator also needs to ensure that all group members get to contribute to the discussion and guard against having the opinions of any single participant dominate excessively. A short demonstration, a video, or examples of artifacts relevant to the focus group topic (e.g. design prototypes) may also be used to start the discussion. [9], [12]

In the world of marketing, focus groups are an important tool for acquiring feedback regarding new products. In particular, focus groups allow companies to discuss, view and test their new products before they are made available to the public. This can provide valuable information about the potential market acceptance of the product. Focus groups can be used to assess user needs and feelings both prior to design and after the new product has been taken into use. Focus groups often bring out spontaneous reactions and ideas through the interaction between the participants, and also allow observing group dynamics and organizational issues. Because of their unstructured nature, focus groups are often difficult and time-consuming to analyze. Further, since users are asked what they want instead of measuring or observing what they do, focus groups involve the risk that the users may think they want one thing even though they in fact need another. According to Nielsen, focus groups are not particularly suitable for evaluating interaction styles or design usability, and should not be used as the only source of usability data. However, focus groups can complement the evaluation well by discovering what users want from the system, in what kinds of usage scenarios it may be useful, and how it could be improved. [9], [12]

4.3.4. Questionnaires

Questionnaires have long been used to evaluate software systems and user interfaces [52]. They can be paper prints or interactive presentations on a computer. Kirakowski [53] defines a questionnaire as a method for eliciting information from the respondents' minds and recording the collected data onto a permanent medium for further analysis and reference. The biggest single advantage of using questionnaires in the evaluation of interactive systems is that they provide data on system acceptance from the users' point of view. Questionnaires are also relatively cost-effective to distribute and analyze, and they can be administered to large populations without the need for the evaluator to be present. Although questionnaires are good for gathering subjective measures as well as demographic data about the respondents, they are usually unreliable for collecting performance measures. [4], [9], [53]
Another use of questionnaires, employed e.g. by Root and Draper [52], is pre- and post-questionnaires that allow researchers to compare changes in attitudes and performance. In this approach, the participants are given a questionnaire to bring out their expectations or test their performance before an experiment. Then, after the experiment, they are given the same questionnaire again. [3]

Questions can be either closed or open. In the case of closed-ended questions, the respondents are asked to select an answer from a set of fixed alternatives, whereas with open-ended questions the respondents are requested to answer in their own words. There are advantages and disadvantages to both types. By restricting the respondents' choice of reply, closed questions are easier to analyze – especially if there are huge quantities of data to be processed. They are also good for gathering numeric data. However, the results of such questionnaires can easily be biased, as respondents have to react to very tightly focused questions. Open questions avoid imposing any restrictions on the respondents, who can be asked for very specific and individual comments that cannot be gathered in terms of predefined responses. However, they may leave room for misinterpretation and may produce irrelevant or confusing answers. Moreover, analysis becomes more complex and time-consuming, requiring extra phases such as condensing answers into categories. [3], [53], [54]

Closed questions are often provided with some form of rating scale. Questionnaire formats may include checklists with basic alternatives (e.g. a three-point scale of "yes", "no" and "don't know"), multi-point rating scales with the meanings of individual points or just the end points given, Likert scales (a special case of multi-point rating scale that measures the strength of agreement with a clear statement), semantic differentials with bipolar adjectives at the end points of the scale, or ranked order questions where the alternatives are ranked. Figure 24 shows examples of basic alternative, Likert, semantic differential and ranked order rating scales. [3], [4]

Figure 24. Rating scale types: (a) basic alternatives, (b) Likert scale, (c) semantic differential, (d) ranked order questionnaire. An important aspect of using questionnaires for user acceptance assessment is that they should give reliable and valid results. Reliability means that the questionnaire is able to gather the same results when filled out by like-minded people in similar conditions, whereas validity is the degree to which the results actually reflect the issues the questionnaire is intended to gather [53]. If we start developing a questionnaire from scratch, it can be problematic and time-consuming to ensure whether the questionnaire has adequate reliability and validity. Therefore, as suggested by Kirakowski [53], it is often a good idea to use questionnaires that have already been developed and standardized rather than composing one of your own. The reliance on standard instrumentation has two important advantages. First of all, the instruments have already been assessed for validity and reliability. Secondly, by using common instruments it is easier to compare results between different studies [54]. At present, there are several standardized questionnaires available that are specifically designed for usability and user acceptance evaluation, including Perceived Usefulness and Ease of Use (PUEU) [29], Questionnaire for User Interface

44 Satisfaction (QUIS) [55], System Usability Scale (SUS) [56], Software Usability Measurement Inventory (SUMI) [57], Computer System Usability Questionnaire (CSUQ) [58], After Scenario Questionnaire (ASQ) [58], and Purdue Usability Testing Questionnaire (PUTQ) [59]. The original version of QUIS, developed at the University of Maryland, is composed of 27 closed questions with semantic differential scale. It also provides a space for respondents to mention any other issues that they consider important. Both short and long versions of the questionnaire are provided. The current version of QUIS, 7.0, is commercial and it contains a demographic questionnaire, a measure of overall system satisfaction along six scales, and hierarchically organized measures of eleven specific interface factors. ASQ is intended to be used after each task of an evaluation session. It measures the ease and subjective speed of task completion, as well as availability of support information. SUS provides a high-level subjective view of usability with 10 five-point Likert-scale questions. It has been widely used for evaluating a range of systems, as well as comparative evaluation between systems. It has also been translated into Finnish. SUS data can be analyzed quite easily, yielding a single number representing a composite measure of the overall usability of the system. These SUS scores have a range of 0 to 100. The original English version of SUS questionnaire is depicted in Figure 25. [3], [56], [58], [60]
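The composite score mentioned above follows a fixed scoring rule: each of the ten five-point items is converted to a 0-4 contribution (the item value minus 1 for the positively worded odd items, 5 minus the item value for the negatively worded even items), the contributions are summed, and the sum is multiplied by 2.5 to give a score between 0 and 100. The following minimal sketch is written purely for illustration and is not taken from either case study:

    public final class SusScore {

        /**
         * Computes a System Usability Scale score (0-100) from the ten raw
         * item responses, each given on a 1-5 scale in questionnaire order.
         * Odd-numbered items are positively worded, even-numbered negatively.
         */
        public static double compute(int[] responses) {
            if (responses == null || responses.length != 10) {
                throw new IllegalArgumentException("SUS requires exactly 10 responses");
            }
            int sum = 0;
            for (int i = 0; i < 10; i++) {
                int r = responses[i];
                if (r < 1 || r > 5) {
                    throw new IllegalArgumentException("Responses must be on a 1-5 scale");
                }
                // Items 1,3,5,7,9 contribute (r - 1); items 2,4,6,8,10 contribute (5 - r).
                sum += (i % 2 == 0) ? (r - 1) : (5 - r);
            }
            return sum * 2.5;   // scale the 0-40 sum to the 0-100 range
        }

        public static void main(String[] args) {
            // Example: this fairly positive set of answers yields a score of 82.5.
            int[] answers = {5, 2, 4, 1, 4, 2, 5, 1, 4, 3};
            System.out.println("SUS score: " + compute(answers));
        }
    }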

Figure 25. System Usability Scale (SUS) questionnaire. Few studies to determine the effectiveness of various standardized questionnaires have been reported. Tullis and Stetson [61] compared five questionnaires (SUS, QUIS, CSUQ, Words, and their own questionnaire) for assessing website usability. As a result, SUS was found to be the most reliable. Tullis and Stetson also noticed that SUS was the only questionnaire where questions addressed different aspects of the user’s reaction to the website as a whole. Further, they found that sample sizes of at least 12-14 participants were needed to get reasonably reliable results for the conditions of their study. Designing a novel questionnaire might be necessary, if existing standardized questionnaires seem to be unsuitable for collecting data from the system under evaluation. This may be the case if we find existing questionnaires unable to measure aspects we are interested in our study, or we need to gather more detailed

information about the research subject than standardized questionnaires would be able to bring out. Instead of designing a new questionnaire from scratch, we might also be able to use an existing questionnaire as a template and modify it. This could save time and ease the reliability and validity assessment process considerably; however, it may also introduce new problems (e.g. if the original questionnaire is copyrighted, we may need permission to change it). Moreover, we might not be able to compare our results with other studies that have used the original standardized form, if the questionnaire has been altered substantially. Kitchenham and Pfleeger [54] mention four reasons for amending existing questionnaires:

• The questionnaire is too long to be used in its entirety.
• A different population is being studied from the one for which the original questionnaire was designed.
• The questionnaire needs to be translated.
• The data collection method is different in some way from the method of the original questionnaire.

A long questionnaire needs to be shortened if it includes questions that are unessential regarding the system to be evaluated and therefore would only frustrate respondents. A shortened questionnaire is also necessary if the respondents are not willing to commit the amount of time required to fill in the original form, or if this needs to be done in a place where spending a long time for answering questions would be inconvenient. If the abilities and knowledge required to answer the questions differ substantially between the original target population and the population being studied, the questions should be reformulated so that current respondents could also answer them easily and accurately. Influential factors between populations might be the level of experience (novice vs. expert users), or age (adults vs. children). [54] One evident reason for modifying standardized questionnaires is the need for translation, which rarely succeeds word for word. Moreover, a different context and timeframe in which the users draw their answers might affect the interpretation of the questions, requiring them to be reformulated for unambiguity. Questionnaires can also be used as templates for interviews, in which case the questions may need editing before they can be asked verbally. In addition to the reasons listed by Kitchenham and Pfleeger, we may need to alter an existing questionnaire in order to gather data about more detailed aspects of the system that cannot be asked with the standard questions being too abstract. This can be done by modifying the standard questions to be more detailed, or by adding extra questions. [54] Whether deciding to create an entirely new questionnaire or to use an existing one as a template, certain aspects should be considered when developing a questionnaire. First of all, the questions need to be purposeful and concrete. Purposeful questions are worded so that the respondent can see the relationship between the intention of the question and the objectives of the study, whereas concreteness means that the questions are precise and unambiguous. It is also reasonable to decide on whether phrases will all be positive, negative or mixed. Kitchenham and Pfleeger [54] suggest avoiding the use of negative questions as they are usually more difficult to understand than straightforward ones. Secondly, it is wise to consider ways to increase the chance of respondents completing and returning the questionnaire. Thus the questionnaire should be made as easy as possible to use. One rule of thumb for

46 achieving this is to try to keep the questionnaire short. Preece et al. [3] recommend to aim for no more than two sides of paper, unless absolutely necessary. However, in case we need to prune questions in a long questionnaire, we must maintain the balance between the objectives it should address and the time and workload acceptable for the participants for answering the questions. If a long questionnaire is needed, it may be a good idea to provide an incentive to respondents. In addition, it may be useful to provide two versions of the questionnaire: a full version and a short version for impatient and busy respondents. [3], [54] In case of closed questions, we need to pay attention to the design of the rating scales associated with them. With closed categorized questions, it is important to ensure that the categories are mutually exclusive and do not overlap in case we want the respondent to give only one answer. It is also encouraged to use standardized response formats, such as terms ‘strongly agree’ and ‘strongly disagree’ used in Likert scales, to reduce the time taken to complete the questionnaire. By standardizing all responses, the respondents can usually answer the questions more quickly than in the case of non-standard ones, as they need not read the choices carefully, question by question. Another issue is the number of response options. First, we need to decide whether to have an odd or even number of them. If there is a possibility of having a ‘neutral’ response, then we should have an odd number of options with the central point being the neutral one. Secondly, we need to select an appropriate width for the rating scale. In standardized usability questionnaires 3, 5, 7 and 9-point rating scales have been widely used. Furthermore, we have to consider whether to include a ‘no answer’ option or not for the questions. Some researchers think that ‘not applicable’ boxes allow respondents to avoid answering a question and are just cutting down the amount of data to be gathered. On the other hand, it may be disadvantageous to force people to answer questions they do not want to, or to force them to make a choice about which they feel ambivalent. However, as Kirakowski, Kitchenham and Pfleeger state, this as a case-specific issue that depends on the type and particular requirements of the questionnaire. [53], [54] Besides asking an appropriate number of unambiguous well-structured questions, it is important to ensure that the whole questionnaire is clearly laid out and formally structured. It is often reasonable to include some open-ended questions in an otherwise closed-ended questionnaire in order to get specific information that cannot be gathered in terms of predefined responses. During the design it should be decided how the data will be analyzed later, and the anonymity of the participants needs to be guaranteed as well. It is also important to consider when the users will be asked to rate their task experience, as the time when the questionnaire is answered may have a significant impact on subjective ratings. The study by Teague et al. [62] showed that there were significant differences between questionnaire ratings whether the users were asked to rate task ease and satisfaction during task execution (concurrent) or after task completion (post-task). When asked concurrently with task execution, more encountered usability problems were remembered and their descriptions were more detailed than after the test. 
However, concurrent questionnaires may distract users and alter their performance levels and behavior during the test. And finally, it is essential to carry out a pilot study to test the questionnaire design and to avoid many of the potential problems during the actual study. [3], [4], [53] Online questionnaires can be reached via Internet and their creation usually requires familiarity with web authoring programs, HTML and script programming. Online questionnaires have the advantage that data can be automatically validated

47 and directly stored into a database for analysis, thus saving time and effort of the researchers. They can also reach people at distant locations and are relatively cheap. Disadvantages of online questionnaires include uncertainty over the validity of the data, users may experience technical problems while answering questions and may respond more than once, and evaluators may have concerns surrounding the design and implementation of the electronic questionnaire. [4], [63] 4.3.5. Logging Software logging involves an automatic collection of statistics about detailed use of the system. Typically, a log file contains data on the occurrences of various events, or the frequency with which each user has used each feature in the system. The analysis requires data retrieval from the logs, which can be done manually or by using special programs designed for data extraction. Software logging has become a popular method for gathering data among researchers for several reasons. First, it does not require presence of the researcher, and data analysis can be at least partly automated. Second, it enables more accurate performance measurements than the manual recording of data. For example, log files may show the situations where the user paused or otherwise wasted time, and what errors were encountered most frequently. Logging also finds highly used or unused features of the system and can be run continuously. Further, it is unobtrusive as the users are not constantly aware of their performance being monitored, thus not altering their performance levels and behavior during the evaluation. However, logging also has some weaknesses. In contrast to recording accurate performance data, it cannot provide subjective data about user preference and behavior. It may also raise ethical issues by violating users’ privacy, and thus the users must be informed that their actions will be logged during the evaluation. Further, special programs are often required for analysis as automated logging has an aptitude for generating huge amounts of data. [3], [9] Preece et al. [3] categorize software logging tools into two main types, timestamped key presses and interaction logging. The former involves recording of each keystroke of the user and the exact time of these events. The latter is similar except that the recording is made and can be replayed in real time, letting the observer see the interaction between the user and the computer exactly as it happened. In practice, keystroke or interaction logging is often combined with other data capturing techniques such as interviews, questionnaires and video or audio protocols. [3] Logging techniques for usability studies on web-based applications fall into two categories: server-side and client-side logging. Server logs are stored on the server machine. They may collect data about which pages are getting visited on a website, along with timestamps and data on IP addresses, browsers, and operating systems of the computers making requests for the documents. Client logs are stored on the client device and may identify problems and difficulties that users are experiencing with the user interface. They may contain data on actions performed by a user while viewing a web page, such as filling out a form, clicking, scrolling, and so on. Both logging techniques can be used for recording various events and user-initiated activities during the use of a client-server application, depending on which side the processing level (i.e. the core functionality) of the application is located. 
One advantage of server-side logging is that the server provides a centralized location for the log files, thus reducing overhead on data retrieval from the logs during analysis.
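As a simple illustration of server-side logging in a Java web application (the kind of platform used later in this thesis), a servlet filter can write one timestamped entry per HTTP request before handing the request on to the application. The sketch below is a generic example rather than code from any of the systems discussed here; the logged format and the example URL in the comment are assumptions.

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import org.apache.log4j.Logger;

    /** Logs one line per HTTP request on the server side. */
    public class RequestLogFilter implements Filter {

        private static final Logger LOG = Logger.getLogger(RequestLogFilter.class);

        public void init(FilterConfig config) { /* nothing to configure */ }

        public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
                throws IOException, ServletException {
            long start = System.currentTimeMillis();
            chain.doFilter(req, resp);   // let the application handle the request first
            if (req instanceof HttpServletRequest) {
                HttpServletRequest http = (HttpServletRequest) req;
                // e.g. "GET /search?category=places took 120 ms" (illustrative format)
                LOG.info(http.getMethod() + " " + http.getRequestURI()
                        + (http.getQueryString() != null ? "?" + http.getQueryString() : "")
                        + " took " + (System.currentTimeMillis() - start) + " ms");
            }
        }

        public void destroy() { }
    }

Such a filter is mapped to the desired URL patterns in the web application's deployment descriptor (web.xml), after which every matching request produces an entry without any change to the application code itself.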

48 Currently there is a multitude of logging and log analysis tools available on the market, including both commercial products such as WebTrends Log Analyzer [64] and NetTracker [65], and free tools such as Analog [66] and AWStats [67]. 4.4. Summary As a summary, I consider that field experiments with real users are necessary for evaluating mobile interactive systems, when the focus of the study is not solely on identifying device-oriented usability issues. When inspecting user behavior and experience, social acceptability, effects and issues related to mobility and the natural use context, real-world validity, or long-term use and adoption of mobile technology, empirical evaluations carried out in the field are indispensable. If usability issues regarding interaction with the device and the user interface are merely investigated, laboratory evaluations may be sufficient and more efficient as they are usually easier, cheaper and less time-consuming to conduct. Various traditional data collection techniques, such as software logging and questionnaires, have proven to be feasible and relatively cost-effective for evaluating mobile systems. On the other hand, direct observation, a focal method in traditional user-centered research, has been found difficult to use in a mobile and changing use context of mobile services. However, with the support of emerging techniques for gathering data while the users are highly mobile, such as wearable or device-attached video cameras, high-quality data capture during observation in a mobile environment has become easier than before. The next chapter covers the first case study of this thesis; the design and evaluation of CampusSearch, a mobile location-based service targeted for campus community. The second case, a laboratory study with focus on determining optimal target sizes for one-handed interaction with touchscreen-based handheld devices, is presented in the subsequent Chapter 6. Finally, a methodological comparison of the two cases is presented in Chapter 7.


5. CASE 1: CAMPUSSEARCH – A MOBILE INTERACTIVE SYSTEM IN CAMPUS ENVIRONMENT This chapter describes the design and evaluation of CampusSearch, a location-aware mobile service targeted for campus environment. The service provides mobile users with tools to browse and retrieve information of places and activities in a campus area, complemented with map-based guidance. CampusSearch was developed as part of SmartCampus field trial in Rotuaari project [68], the objective of which was to research mobile services that aid working, studying and community life in a university campus. The design follows the interactive system design paradigm by Newman and Lamming [2], comprising of a comprehensive user survey for gathering user needs for such services, the design and implementation of CampusSearch, and two field experiments for evaluating the developed service. The purpose of the two field experiments presented in this chapter was to evaluate overall acceptability of CampusSearch with real users in a real-world use context. Realistic field setting was a natural choice over laboratory environment for the context of empirical evaluation, since the goal of both studies was to investigate user acceptance and service usage when also mobility, social aspects and other effects of natural context of use are present. Further, realistic test environment was crucial for assessing whether the designed mobile interactive service was usable and acceptable enough to replace existing methods for locating various resources on campus. In the first experiment, SmartCampus, the service was evaluated by collecting perceived usefulness and ease of use as well as actual usage data in an extensive one-month field trial. The second study, CampusSurf, employed a task-based method for collecting user feedback during three evaluation sessions in three days. The study also included a comparative evaluation between two different mobile devices. Rotuaari is a research project funded by the National Technology Agency of Finland and operating at the University of Oulu, with a goal to develop and test technologies and business models for future context-aware mobile multimedia services. SmartRotuaari [69] service system developed in the project comprises of wireless multiple access networks, service platforms, prototype services used with different kinds of mobile devices including laptops, PDAs and smartphones, and content. The developed services are tested in field trials with actual end users in real usage environments. The purpose of the field trials is to gather valuable data on the technical performance of the service system, the usability of the developed services, real consumer needs, consumer behavior and the functioning of earnings models. The services are developed further on the basis of the obtained results. CampusSearch is part of SmartRotuaari service system. [68], [69] 5.1. Situation of Concern A situation of concern regarding navigation and guidance in a university campus environment was identified by an extensive online user survey called Mobiilikysely, conducted in the University of Oulu in April 2004. The survey investigating the use of and need for mobile services facilitating work and studies in the university was targeted to students and staff. The purpose of the survey was to find out what kind of services staff and students are willing to use, which services they are already acquainted with, what kind of mobile devices they possess, and how prepared they

50 are to use the services. The gathered information will be utilized in projects focused on the research of mobile technologies and services in the university. [70] A total of 880 people answered the online questionnaire, of which 719 were students. The age of the students varied from 20 to 58 with a mean of 24.8 years (SD = 4.1), while the average duration of the studies taken was 3.7 years (SD = 2.7). 62.2% of the student respondents were men. Most respondents were studying in the faculty of technology (46.2%), followed by the faculties of science (30.0%), humanities (8.2%), education (7.9%), and economics and business administration (6.8%). The majority (76.8%) used mobile phones several times a day, while only 16.5% of the respondents used PDAs at least a few times a week. Mobile phones were almost solely used for calls and text messaging, and web browsing and e-mail were still quite rare activities (68.5% and 71.3% of the subjects never used their mobile devices for these tasks, respectively). This can be partly explained by the relatively old phone models the respondents possessed, as the majority of these devices could not be used for reading e-mail and browsing Internet, or featured only a WAP browser. The respondents were asked to rate the usefulness of various scenarios supporting studies using a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). When asked about a mobile map-based guidance service for locating people, places and items in the university, the majority of the respondents (59.2%) saw this kind of service as useful or very useful, as illustrated in Figure 26a. Further, the participants were asked with an open question about what kind of mobile services they would like to use and to be developed in the university. As a result, several respondents wished for a mobile guide for retrieving information about places, events and personnel in the university, with maps to find the located resources. Examples of such information included locations of rooms, places for lectures and exams, contact information of personnel as well as their office room codes and locations. The respondents were also asked how interested they were in mobile technology and services in general. The results showed that 64.2% of the student respondents were either very interested or interested, as depicted in Figure 26b, and 100 students (14.2%) were willing to participate in the development of mobile services e.g. by allowing to be interviewed.

Figure 26. Answer distribution when asked about (a) usefulness of locating people, places and items with a mobile device, (b) interest in mobile technology and services.

51 5.2. Problem Definition The understanding gained from the user survey about the situation of concern on finding various resources in the university assisted the project team in defining the design problem. The problem statement [2], specifying the four essential components of the interactive system to be designed, was defined as follows (Figure 27):

Figure 27. One-sentence problem statement of CampusSearch. Besides defining the four main aspects of the design problem, the problem statement also specified the context of use, which in this case was the university campus area. The intended users of the interactive system were the three main groups of people staying in the university campus. Besides students and staff whose views and needs were studied in the online survey, also visitors to the campus were seen as potential users of the service, since the campus area may often be an unfamiliar environment to them. The most significant target user group, however, was students, as they are by far the largest community in the university and the situation of concern was brought out from their opinions in Mobiilikysely. The selected human activity to be supported by the system was based on the results of the online survey, which were discussed in Section 5.1. As with many other types of previously designed interactive guidance systems, the form of solution was chosen to be mobile, enabling users to use the service with their mobile devices while moving around campus. This also allows users to continue using the service while navigating their way to the place they are looking for, which would not be possible with fixed terminals. The target level of support was to provide a quicker and easier way to locate and retrieve information about different resources on campus, compared to traditional methods such as asking help from an information desk or other people. 5.3. Requirements Specification After studying users and supported activity in the user survey and defining the design problem to be solved, the project team proceeded to requirements definition phase. This meant that activity modeling stage was skipped, saving time for the latter stages of design in a tightly-scheduled project, but which might have provided a deeper understanding on the human activity to be supported. To define the requirements for the mobile guidance service to be developed, several use cases illustrating typical usage scenarios of the system were designed. The typical flow of a use case in which

Alice finds a room with CampusSearch is shown in Table 5. As a result, the following most important functional requirements for the system were specified:

• User must be able to search objects and resources located in campus area by using search words and 3 predefined categories (places, activities, personnel).
• User must be able to see the locations of searched places on campus maps.
• User must be able to see detailed information related to the searched places, activities and personnel, if there is any detailed information available.
• User must be able to zoom and pan maps.
• The system must detect if the user's location can be shown on the map.
• Activities must contain at least one occurrence, and every occurrence must be linked with exactly one place.
• Content provider must be able to add, modify and delete places, activities and people in CampusSearch database with a content provider interface.
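As a purely hypothetical illustration of how the first three requirements might map onto program code, the search operation could be exposed through an interface of roughly the following shape; none of the names or types below are taken from the actual CampusSearch implementation.

    import java.util.List;

    /** Hypothetical search API reflecting the functional requirements listed above. */
    public interface CampusDirectory {

        /** The three predefined search categories required of the service. */
        enum Category { PLACES, ACTIVITIES, PERSONNEL }

        /** Returns the objects whose name or description matches the given keyword. */
        List<SearchResult> search(String keyword, Category category);

        /** Minimal result item: an identifier, a display name and optional details. */
        final class SearchResult {
            public final String id;
            public final String name;
            public final String details;   // may be null if no detailed information exists

            public SearchResult(String id, String name, String details) {
                this.id = id;
                this.name = name;
                this.details = details;
            }
        }
    }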

The requirements definition phase also involved developing specifications for the user interface and other layers of technology, as well as for the structure and meaning of data. UML deployment and dialogue diagrams [17] were used for defining the technology layers, while UML class diagrams and relational database schemas along with their data dictionaries were used for specifying the data structures in the software and the database, respectively. The next section of this chapter presents the overview of the designed solution conforming to these requirements, and describes the implementation stage of the design.

Table 5. Use case in which Alice finds a room by using CampusSearch

Actors: Alice

Preconditions: Alice has a terminal device which has a working network connection.

Description: Alice wants to find a room with the code ‘TS335’. She opens the search view of CampusSearch, selects ‘places’ as the search category, enters the room code into a search field and clicks ‘Search’ button. The results view is shown, containing result ‘TS335’. Alice clicks the result, which opens the details view presenting information about the place (e.g. name, facilities in the room and capacity) and a ‘Locate’ link. Then Alice clicks the ‘Locate’ link in order to locate the room on campus. A map view is opened and the locations of the room and Alice are shown on a map, where Alice’s location is centered. Alice’s location on the map is static, and Alice selects to refresh the map as she walks forward. Then Alice wants to change the zooming level of the map and selects to zoom in. The view is refreshed and the locations of the room and Alice are marked on the map if they are within the displayed area. Alice walks to the right location and finds the room.

Post-conditions: Alice found the room inside the campus area.

5.4. Design and Implementation CampusSearch was implemented as a browser-based web application. The software is based on three-tiered client-server architecture, in which an XHTML web browser of the client device provides a graphical user interface (GUI) for end users to interact with the service. The core functionality of the application is placed on an application server, while the data is maintained in a relational database on which the application

53 operates. Hyper-Text Transfer Protocol (HTTP) is used as the application layer protocol for data transfer between clients and the server, while TCP (Transmission Control Protocol) acts as the underlying transport protocol. CampusSearch was designed as platform independent, and thus it can be deployed on a variety of different computing environments. Although CampusSearch was primarily designed for use with mobile devices, it can be used with fixed terminals as well. 5.4.1. Architecture Figure 28 illustrates the run-time environment of CampusSearch as part of the SmartCampus service system. HTTP requests from clients are primarily handled by Apache HTTP server (later Apache) [71], which also serves the provision of static content such as images and HTML documents. Dynamic content is served by Tomcat Servlet/JSP container (later Tomcat) [72], an application server developed by Apache Foundation that provides an environment for Java code to run in cooperation with a web server. Both CampusSearch and CampusSearch CPI are deployed on Tomcat. Apache forwards all requests for dynamic content to Tomcat by using mod_jk module, which implements the interface between the two servers. Apache JServ Protocol version 1.3 (AJP13), a binary format protocol based on HTTP, is used for transferring data between Apache and Tomcat. Although Tomcat can also function as an independent HTTP server (using port 8080 by default), there are several reasons to integrate it with Apache. First, Apache is faster and more configurable at serving static content than Tomcat. Second, letting Apache act as a front door allows clustering content to multiple Tomcat instances, improving error tolerance in case one of the Tomcats fails [72]. Further, MySQL [73] relational database is used for storing CampusSearch content, and its proprietary protocol (MySQL protocol) over TCP is used for transferring data between web applications and the database.

Figure 28. Run-time environment of CampusSearch.
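For readers unfamiliar with the Apache-Tomcat coupling described above, the integration is normally configured in two small files: a worker definition for mod_jk and a set of JkMount rules telling Apache which URL patterns to forward to Tomcat over AJP13. The snippet below is only an illustrative sketch, not the configuration used in the project; the worker name and the /campussearch path are assumptions.

    # workers.properties: one AJP13 worker pointing at the local Tomcat instance
    worker.list=campus
    worker.campus.type=ajp13
    worker.campus.host=localhost
    worker.campus.port=8009

    # httpd.conf: load mod_jk and forward requests for dynamic content to Tomcat
    LoadModule jk_module modules/mod_jk.so
    JkWorkersFile conf/workers.properties
    JkMount /campussearch/* campus

With rules of this kind, Apache keeps serving static files itself, while everything under the mounted path is passed to Tomcat over the AJP13 connector (port 8009 is the Tomcat default).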

54 CampusSearch was implemented in Java using Servlets and Java Server Pages (JSP) technologies. The service consists of an end user application (CampusSearch) and a content provider interface (CPI) for managing content in the service. While the service operates on CampusSearch database, it is also integrated with Timmi space reservation system administered by the janitors of the university, containing information about rooms reserved for various activities on campus. CPI fetches activity data from Timmi database and stores it into CampusSearch database. CampusSearch also has access to MyCampus, a central web application in SmartCampus service system which provides users with a trusted, centralized and personalized service portal with links to other SmartCampus services. MyCampus database contains the profiles of registered SmartCampus users, which can be accessed by other SmartCampus services to personalize their functionality. For instance, the physical location of the user can be utilized in CampusSearch by showing it on the map, given that the location is available in the user’s profile. The location information in SmartCampus is collected by Positioning Service, utilizing WLAN and Bluetooth-based positioning technologies. As depicted in Figure 28, SOAP (Simple Object Access Protocol) over HTTP with RPC (Remote Procedure Call) method is used for messaging between various SmartCampus services. The software architecture of CampusSearch is based on Model-View-Controller (MVC) architectural pattern, as illustrated in Figure 29. MVC separates the data model, user interface, and control logic of an application into three distinct components so that modifications to one component can be made with minimal impact to the others. It also supports scalability, ease of maintenance, and distribution of coding effort in application development. Here, the ActionDispatcher servlet acts as the controller determining the overall flow of the application, while JSP pages represent the view component of MVC architecture. ActionDispatcher processes all requests received from clients and, depending on the type of the action, passes control to one of the action handlers. The selected action handler performs the requested business logic, creates and modifies any beans or objects stored as request or session attributes and used by the JSP, as well as deciding to which JSP page to forward the request. The JSP page accesses the objects created by the action handler, extracts dynamic content for insertion within a static template, and generates an HTML document to be sent to the browser.

Figure 29. Software structure of CampusSearch, using MVC architectural pattern.
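The controller role described above can be summarized with a heavily simplified front controller servlet. This sketch illustrates the general Servlet/JSP variant of the MVC pattern rather than the actual ActionDispatcher code; the action parameter, the handler interface and the JSP paths are invented for the example.

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    /** Simplified front controller in the spirit of the dispatcher described above. */
    public class DispatcherServlet extends HttpServlet {

        /** An action handler runs the business logic and names the JSP to render. */
        interface ActionHandler {
            String handle(HttpServletRequest request);   // returns a JSP path
        }

        private final Map<String, ActionHandler> handlers = new HashMap<String, ActionHandler>();

        public void init() {
            // Hypothetical mapping from an "action" request parameter to a handler.
            handlers.put("search", new ActionHandler() {
                public String handle(HttpServletRequest request) {
                    String keyword = request.getParameter("keyword");
                    // ... perform the search and store the results as request attributes ...
                    request.setAttribute("keyword", keyword);
                    return "/WEB-INF/jsp/results.jsp";
                }
            });
        }

        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            ActionHandler handler = handlers.get(request.getParameter("action"));
            String view = (handler != null) ? handler.handle(request) : "/WEB-INF/jsp/main.jsp";
            // Forward to the JSP (the view), which renders the HTML sent to the browser.
            request.getRequestDispatcher(view).forward(request, response);
        }
    }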

55 Hibernate [74] framework has been used as an adapter between business logic and the database. Hibernate is an object-relational mapping (ORM) and query framework for Java that takes care of mapping data representation of an object model to a relational data model and its corresponding database schema. Besides mapping Java classes to database tables and Java data types to SQL data types, it also provides data query and retrieval facilities powered by object-oriented Hibernate Query Language (HQL). Several benefits were gained after designing CampusSearch as a browser-based service. Due to the ubiquity of the browser as a client application, the service can be used with a wide variety of different mobile devices on disparate platforms and operating systems. Further, there is no need to install any additional software on a client device, enabling users to start using the service instantly. The service can also be updated and maintained without distributing and installing software on client devices. Due to its popularity, a web browser also provides a relatively familiar user interface for users, thus lowering the learning curve of using the service. From the developers’ point of view, browser-based web applications are fast and easy to prototype by using HTML editors and other tools for web design. Using browser as the client application, however, carries also some disadvantages. Browser-based applications may have relatively slow response times, and actions such as clicking on a button or link may cause considerable delays if the web page needs to be reloaded. Also difficulties with pushing content to the client without the need of reloading a page manually may discourage the use of a browser as a client application for a web service. 5.4.2. User Interface The user interface of CampusSearch was prototyped by using XHTML and CSS technologies. This provided a fast and easy way for building horizontal high-fidelity prototypes, that is, prototypes which closely imitated the actual interface and covered a rich set of features with a reduced level of functionality. During the user interface prototyping process, the early prototypes were informally analyzed and tested by the developers, then replaced with static templates implemented in XHTML and CSS, and finally augmented with JSP code to provide full functionality. The user interface design of CampusSearch followed the guidelines for creating web content for mobile devices, developed by Nokia [75]. Since the service was directed for mobile devices, especially to mobile phones that mainly use low bandwidth GPRS or 3G data connection, the user interface was kept very simple in order to keep the amount of data to be transferred as small as possible. Several HCI issues regarding small screen size and limited input and interaction capabilities of mobile devices were also taken into account. For example, the user interface was constructed with a minimum need for text entry, and thus text input is only used for entering keywords in the search interface. Further, the need for horizontal scrolling was eliminated, and the amount of vertical scrolling needed to find the desired information was minimized. Content was optimized for presentation on different devices by creating separate CSS style sheets for different device types. The main view of CampusSearch, depicted in Figure 30a, provides links to other CampusSearch views. 
Besides the map and the search interface, the two essential features of the service, also links to search history and favorites are provided, which

56 are additional features for registered and logged-in SmartCampus users. At the bottom of the view, links to SmartCampus login page, MyCampus service portal, and Finnish and English versions of the user interface are available. The search view (Figure 30b) contains a drop-down menu for category selection and a text input field for entering keywords. The advanced search view also features two drop-down menus for time span selection, which can be used in activity searches. Due to law regulations, personnel search was only enabled for registered SmartCampus users.

Figure 30. CampusSearch user interface scaled for a S60 mobile phone, (a) Main View, (b) Search View, (c) Results View. The results view (Figure 30c) lists the results that matched with the given search criteria. Each result link in the list leads to the details view presenting information about the selected object, as illustrated in Figure 31a. If the selected place (or activity) has relations to any occurrences, they will be presented along with links to activities in which they belong to (or to locations where they take place). Also links to other objects mapped to the selected object will be shown, such as people working in the selected room or lectures held by the selected person. If the selected object is a place, also a ‘Locate’ link for showing its location on the map will be available.

Figure 31. CampusSearch user interface, (a) Details View, (b) Map View.

57 In the map view, displayed in Figure 31b, the maps of different floors in the university campus can be browsed and places selected for positioning can be located on a map. Tools for zooming and panning maps are provided in the form of three zooming levels and arrow buttons to eight different directions. The locations of the located places are denoted with partially transparent filled polygons. In the example snapshot depicted in Figure 31b, the located cafeteria is marked on the map with a red polygon. A list of located places along with their color codes is presented below the map. In case the user is logged in and his/her location is detected, the user’s location will be marked on the map with a smiley if it is within the displayed area. By clicking ‘Center’ link next to each entry, the selected object will be centered on the map. The location of the user can be refreshed by clicking the map image. 5.4.3. Content All places, activities and people in CampusSearch database can be inserted, modified and deleted by using browser-based CampusSearch Content Provider Interface (CPI). Figure 32a illustrates the view for modifying textual data of a place, while the view for determining geographical location of a place by using Areatool applet is shown in Figure 32b. Besides the browser-based editing interface, few batch processing tools for adding scores of campus objects into CampusSearch database were developed as part of the CPI. A tool called TimmiDaemon is used for fetching activities from the database of Timmi room reservation system and storing them into CampusSearch. Each activity entry in Timmi has a time span and a room code, which enables mapping an activity to a place with the same code in CampusSearch database. Activities, the places of which do not exist in SmartCampus database, will not be stored. Every night, TimmiDaemon performs a single query to Timmi in order to fetch the changes, thus keeping traffic load to the external database low. Further, a tool for importing people’s data from personnel lists in text files was developed. During the field evaluations of CampusSearch, all the place and personnel content was available both in Finnish and English, while activity content fetched from Timmi was merely Finnish. Places were manually added by a group of researchers, while activities and personnel were inserted by using automated tools.

Figure 32. CampusSearch CPI, (a) Place Editing View, (b) Area Editing View.

58 The maps used in the service are adapted from the floor plan drawings of the university campus. The graphics of the maps are designed to be simple and clear. Different symbols on the map are color-coded: walls and other fixed structures are drawn with black, corridors with grey and outdoor areas with orange. Also areas of several pre-defined landmarks are superimposed on the map with different color codes: libraries are painted purple, restaurants turquoise, and large lecture halls yellow. One map image per zooming level is designed for each floor. The maps are calibrated to the WGS84 (World Geodetic System 1984) coordinate system [76] by using affine transformation-based Gauss-Krüger projection, as reported in [77]. The conversion parameters for the transformation between pixel and geographical coordinates were estimated by measuring three calibration points from each map by using a GPS device. The same method is used to convert geographical coordinates of located places to pixel coordinates of map images. 5.5. Empirical Evaluation 1: SmartCampus Field Trial The first empirical evaluation of CampusSearch took place in conjunction with a large-scale field trial named SmartCampus, arranged on the campus of the University of Oulu in April 2005 as a joint effort by Rotuaari and Virtual Campus projects. In the field trial a range of different prototype mobile multimedia services facilitating work and studies were tested by university students, staff and visitors. This section reports the test realization including device and network setup, participants, and data collection methods, as well as the results obtained in the study. 5.5.1. Method The field trial was coordinated from an information booth named SmartCampus field office, illustrated in Figure 33. The staff at the office recruited volunteers as test users, loaned out mobile devices, provided guidance, and collected feedback from participants. The office was movable and its location varied between three places, including the central lobby of the campus, and the lobbies of Restaurant Snellmania in the Faculty of Humanities and Cafeteria Datania in the Department of Electrical Engineering and Information Technology. At office hours during the field trial, all materials, including devices, battery chargers, papers and office supplies, were stored at the field office, and were brought to a locked room for nights. Thus, the field office provided a suitable “home base” for evaluation, improving experimental conditions and control in a dynamic and unpredictable environment, as highlighted by Kellar et al. [40]. To motivate campus dwellers to sign up as test users, each participant was rewarded with a cup of coffee and a pastry, as well as participation to a raffle among all test users. No specific tasks were given, and hence the participants were allowed to try and test features provided by the offered services freely. Test users were given an opportunity to loan a mobile phone without a fee for two working days from the office, together with instructions for activating the services. Various phone models based on Nokia S60, S80 or S90 platforms were available, all equipped with XHTML browsers and either GPRS or 3G network connection. Windows Mobile 2003 based PDAs with WLAN access were also used by specific test user groups recruited during the field trial. Participants could test CampusSearch

also with their personal mobile devices as well as public desktop terminals distributed across campus. At the time of the study, the access networks available to mobile users in the university campus area consisted of panOULU public wireless local area network [78], and GPRS and 3G networks.

Figure 33. (a) SmartCampus field office, (b) a staff member instructing a test user. To test Bluetooth cell ID positioning similar to that reported by Aalto et al. [79], a grid of ten BT location sensors was built along the main corridor crossing the campus. The purpose of the grid was to sense mobile BT devices passing by and to update respective SmartCampus users’ profile with the location of the sensor. Besides that, positioning technology based on WLAN signal strength measurements by Ekahau [80] was provided for PDA users within panOULU coverage. Demographics and data on perceived usefulness and ease of use were collected with a questionnaire containing both closed and open-ended questions (Appendix 1). It was partly based on the PUEU questionnaire [29], including 5-point Likert scale questions adapted to the context of the field trial. The questionnaire also contained questions specific to CampusSearch. Further, general verbal feedback was observed at the field office, although not recorded systematically. Actual usage of the service was monitored by means of server-side logging, providing statistics of prime actions such as performing searches and viewing of different pages. Apache log4j API [81] was used for runtime logging. Each log entry described in XML consisted of header and body sections, as pictured in Figure 34. The header section contains the time stamp and the names of the service and the file that logged the event, while the body section contains the actual data regarding the event.

Figure 34. CampusSearch log entry. A registered SmartCampus user with S60 mobile phone has performed a place search using ‘TF104’ as a keyword.
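To illustrate how an entry of this kind might be produced at runtime, the sketch below assembles an XML fragment with a header and a body and writes it through log4j. It is not the project's actual logging code; the element names follow the description above, but the exact structure is an assumption.

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import org.apache.log4j.BasicConfigurator;
    import org.apache.log4j.Logger;

    /** Writes one XML-formatted log entry per logged event (illustrative only). */
    public final class EventLog {

        private static final Logger LOG = Logger.getLogger(EventLog.class);

        /** Logs a search event with a header (time, service, source file) and a body (event data). */
        public static void logSearch(String service, String sourceFile,
                                     String terminal, String category, String keyword) {
            String timestamp = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date());
            String entry =
                "<entry>"
              +   "<header><time>" + timestamp + "</time>"
              +     "<service>" + service + "</service>"
              +     "<file>" + sourceFile + "</file></header>"
              +   "<body><event>search</event>"
              +     "<terminal>" + terminal + "</terminal>"
              +     "<category>" + category + "</category>"
              +     "<keyword>" + keyword + "</keyword></body>"
              + "</entry>";
            LOG.info(entry);
        }

        public static void main(String[] args) {
            BasicConfigurator.configure();   // console output for this standalone example
            // Example corresponding to the kind of event shown in Figure 34.
            logSearch("CampusSearch", "SearchHandler.java", "S60 phone", "places", "TF104");
        }
    }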

60 5.5.2. Main Results A total of 203 participants returned a completed questionnaire. Male interest in the field trial outweighed female interest by 74% to 26%. Most respondents were students (84%), while staff (14%) and visitors (2%) were harder to reach. The majority of the test users were 18-24 years of age (63.6%), followed by age groups of 25-34 years (25.6%) and 35-49 years (8.2%). Most respondents were studying or working in the faculty of technology (57.5%), followed by the faculties of science (21.0%), economics and business administration (9.1%), humanities (5.4%), education (5.4%), and medicine (1.6%). 155 of the participants stated in the questionnaire that they had used CampusSearch at least one or two times. Log data revealed that 142 of them actually used CampusSearch as a registered SmartCampus user. The results reported above are summarized along with the distribution of loaned phones in Figure 35.

Figure 35. SmartCampus test user demographics: (a) gender, (b) age, (c) category, (d) faculty, (e) frequency of CampusSearch usage, (f) distribution of loaned phones. As depicted in Figure 36, the feedback obtained with the questionnaire indicated that CampusSearch was fairly well received by the test users. 55.6% of them agreed or strongly agreed that the service was useful (M = 3.50 on a 5-point Likert scale), while 59.0% considered that the information provided by the service was worthwhile (M = 3.49). The perception of the ease of use was quite mixed: while 41.2% of the respondents found the service easy or very easy to use, 31.4% considered it hard to use (M = 3.10). The ease of learning to use the service, however, received positive ratings (60% agreed or strongly agreed, M = 3.64), and even 81.6% of the respondents saw the possibility to use the service with a mobile device as a good thing (M = 4.19). When asked whether they would use a similar service in the future, the respondents had quite divergent views: 39.0% agreed, 33.1% disagreed, and as many as 27.9% did not either concur or disagree with the statement (M = 3.06). Searching and finding places with CampusSearch was considered easiest with desktop and laptop computers (agreed by 68.5% and disagreed by only 1.9% of the

respondents), while the ratings were least positive for mobile phones (38.3% and 37.5%, respectively). It is notable that the number of respondents answering the questions on ease of use with desktop/laptop computers (N = 54) and PDAs (N = 29) was much smaller than with mobile phones (N = 128). 46.3% of the test users regarded the map images as clear and helpful in navigation. Due to problems with the BT positioning grid caused by frequent crashes of the BT sensors, only a few test users could test user positioning successfully. As pictured in Figure 36, showing the user's location on the map was not regarded as very useful. Place search was ranked the most useful search option (64.1%), followed by activity search (20.6%) and personnel search (15.3%).

Figure 36. User feedback on CampusSearch in the SmartCampus field trial. The actual usage data obtained with server-side logging showed that a total of 1449 searches were performed during the field trial between April 4th and April 29th 2005. “All categories” was the most frequently selected category (666 times), followed by “places” (375), “personnel” (260), and “activities” (148). Thus, while place search was perceived as the most useful search option, it was also a more

frequently used category compared to activities and personnel. Test users were most active in making searches during the first field trial week, when the field office was located in the central lobby of the university campus (567 searches). The second most active week was the third, when the office site was the lobby at the Faculty of Humanities (376 searches). The search statistics gathered with server-side logging are summarized in Table 6. Hit statistics on the main view of CampusSearch showed that the service was most frequently accessed with mobile phones (1359 hits during the field trial), which was no surprise as the emphasis in the field trial was put on testing the offered services with mobile terminals. The main page was entered 520 times by using desktop/laptop computers during the study period, while the amount of hits with PDAs was only 38. The reason for such a low hit rate can be explained by the fact that only small recruited test user groups tested the service with PDAs during the second and third week, and the field office did not offer PDAs for loan. The hit statistics on the main view of CampusSearch are presented in Table 7.

Table 6. Search statistics obtained from the server logs

Searches per category        "all categories"   "places"   "activity"   "personnel"   Total
Week 15 (4.4.-10.4.2005)            316            151          35           66         567
Week 16 (11.4.-17.4.2005)           137             85          20           65         306
Week 17 (18.4.-24.4.2005)           126             99          58           91         376
Week 18 (25.4.-29.4.2005)            87             40          35           38         200
Total                               666            375         148          260        1449

Table 7. CampusSearch main view hit statistics obtained from the server logs

Hits on CampusSearch main view   mobile phone   desktop/laptop   PDA   Total
Week 15 (4.4.-10.4.2005)              533             193          0     726
Week 16 (11.4.-17.4.2005)             334              89          9     432
Week 17 (18.4.-24.4.2005)             258             155         29     442
Week 18 (25.4.-29.4.2005)             234              83          0     317
Total                                1359             520         38    1917
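Statistics such as those in Tables 6 and 7 can be produced from the raw logs with a simple aggregation pass. The following sketch is illustrative only; it assumes a hypothetical LogEntry type in which the week number and terminal type have already been parsed from each entry.

    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    /** Counts hits per (week, terminal type), as in Tables 6 and 7 (illustrative only). */
    public final class HitCounter {

        /** Hypothetical parsed log entry: ISO week number and terminal type of the request. */
        public static final class LogEntry {
            public final int week;
            public final String terminal;   // e.g. "mobile phone", "desktop/laptop", "PDA"

            public LogEntry(int week, String terminal) {
                this.week = week;
                this.terminal = terminal;
            }
        }

        /** Returns a table keyed by "week/terminal" with the number of hits in each cell. */
        public static Map<String, Integer> countHits(List<LogEntry> entries) {
            Map<String, Integer> counts = new TreeMap<String, Integer>();
            for (LogEntry e : entries) {
                String key = e.week + "/" + e.terminal;
                Integer old = counts.get(key);
                counts.put(key, old == null ? 1 : old + 1);
            }
            return counts;
        }
    }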

The open comments in the questionnaire revealed that several test users praised CampusSearch for providing relatively fast and easy access to useful information. It was also considered well suited for use on mobile devices (“enables access to helpful information in places where no computers are available”), and found speeding up and easing search tasks in the campus area (“the service eased finding lecture halls”, “useful when some room is lost and I have no time wandering around campus trying to find it”, “you can find lecture halls without need for going asking from the information desk”). Some test users thought that the service may be useful for new students but not for people who are more familiar with the environment (“the map service could be useful for freshmen, but ‘the old’ won’t need it”). As expected, the data transfer rate of the network connection had a major effect on the user experience. Many users with GPRS phones were irritated with the slow connection and did not bother waiting for web pages to load (“while waiting, I already managed to walk to a guide map”). 3G phone users, instead, were often positively surprised with the data transmission rate (“with a 3G phone the connection speed was nice while with GPRS connection the use of the service was really painful”). PC and PDA users with wired or wireless broadband connection were generally most satisfied with using the service, as indicated earlier in Figure 36.

63 A number of other usability factors were also estimated to cause negative bias to the feedback obtained from mobile phone users. Many users complained of the small screen and awkward user interface of the web browser in mobile phones (“the map was so small that it was hard to read”). Registration and logging in to the SmartCampus system with phones equipped with small keypads was also regarded as difficult. Some users encountered reliability problems due to a software bug in map image retrieval, causing occasional failures on rendering the map view of the service (“the map did not always get loaded”, “sometimes an error report was shown when I tried to move the map to some direction with arrow buttons”). Many users also wished for better instructions on how to start using the offered services and loaned devices (“poor quick start instructions gave a negative impression”). 5.6. Empirical Evaluation 2: CampusSurf Focus Groups The second CampusSearch field experiment on the university campus, named CampusSurf, was designed as a smaller-scale task-based evaluation with focus groups to assess usability and utility of two mobile services (CampusSearch and MobiiliOptima). Before the study, both services were modified on the basis of the feedback gathered within the first field trial. The most significant improvement to CampusSearch was the fixing of the map image loading bug that hindered the use of the service in the previous field trial. Also a few minor user interface updates were implemented. The utilized evaluation method, participants, equipment, tasks, data gathering techniques, and the main results are presented in this section. 5.6.1. Method The study was divided into three field experiment sessions, each of them carried out in one day. Each session comprised of three phases. The initial phase consisted of a focus group meeting in which the participants were briefly interviewed, followed by a short introduction and training session on the mobile devices and services to be tested. In the end of the meeting, the participants were given two mobile devices and a questionnaire including quick start instructions and task descriptions. In the second phase the subjects performed a predefined set of tasks with CampusSearch while moving around campus, and answered questions in the questionnaire after completing each task. The final phase consisted of a focus group debriefing session, in which the participants were asked about experiences during task-based evaluation as well as general opinions about mobile services targeted for campus community. The participants also returned the loaned equipment and the questionnaire, and received a cup of coffee and a movie ticket as a compensation for taking part in the experiment. Both focus group meetings took place in a small library room on the premises of the Department of Electrical Engineering and Information Technology. The duration of each phase in one experiment session are presented in Table 8. Before executing the actual tests, a pilot test was conducted to validate the test plan. A total of 18 test users participated in the study, 5 of which were female. The group sizes on each test day were 7, 6 and 5, respectively. All participants were students aged between 20 and 25 years with an average of 23.3 years. Nine of them studied in the field of technology (50.0%), while four natural science students

(22.2%), three economics and business administration students (16.7%), and two humanists (11.1%) formed the other half of the test user pool. Participants were inexperienced in mobile web browsing: 4 users (22.2%) had never used a web browser on a mobile device before, while 12 users (66.7%) had briefly tried one and 2 users (11.1%) had used one occasionally. Test users perceived themselves as rather familiar with the university building. On a five-point rating scale (1 = very unfamiliar, 5 = very familiar), two users (11.1%) gave a score of 2, six users (33.3%) a score of 3, and 10 users (55.6%) a score of 4. A test user performing a task and filling in the questionnaire is shown in Figure 37, while background data of the users are presented in Figure 38.

Table 8. Duration of each phase in CampusSurf field experiment

Phase                                          Duration
Initial focus group meeting and orientation    30 min
Task-based field evaluation                    90 min (reserved time)
Focus group debriefing session                 40-60 min
Total                                          180 min = 3 hours

Figure 37. CampusSurf test user performing a task and filling in the questionnaire.

Figure 38. Background data of the participants in the CampusSurf experiment.

The two mobile devices used in the study were the Nokia 6670 S60 smartphone with GPRS connectivity (Figure 39a) and the Nokia 770 Internet tablet with WLAN access (Figure 39b). The 6670 has a traditional smartphone user interface with a 2.1” TFT display (176 x 208 pixels), an alphanumeric keypad and a five-way navigation key, while the 770 features a high-resolution (800 x 480 pixels) stylus-operated touch screen and a full-featured web browser. During the test, the 770 tablets were connected to the panOULU public WLAN [78], while the 6670 phones utilized the Octopus GPRS network [82]. Since the bug that had caused the BT location sensors to crash in the SmartCampus field trial had been fixed, Bluetooth-based user positioning was tested with the 770 tablets, and a grid of seven BT sensors was deployed in close proximity to the places to be located.

Figure 39. Devices used in the CampusSurf study, (a) Nokia 6670, (b) Nokia 770.

The tasks were performed independently by each participant without the presence of an observer. However, during the evaluation a moderator was on call at cafeteria Datania, a central location in the test environment, and could be contacted in person or by phone in case a participant encountered insurmountable problems and needed assistance. The tasks were divided into two groups, one consisting of six CampusSearch tasks and another of three MobiiliOptima tasks. The CampusSearch tasks, listed in Table 9, were further divided into two groups of three tasks, one for the Nokia 6670 smartphone and another for the Nokia 770 Internet tablet. The order of the task groups was counter-balanced among the participants in order to reduce learning effects; a small sketch of such an assignment is given after Table 9. The actual locations of the searched places in each task, as well as the locations of the Bluetooth sensors, the focus group room and cafeteria Datania, are shown on the map of the university campus in Figure 40.

Table 9. Tasks related to CampusSearch in the CampusSurf experiment

Place, activity and personnel search tasks with the Nokia 6670 S60 phone
1.1 “You have an appointment with your professor in room TS310 tomorrow. Find out where the room is and navigate to the location.”
1.2 “Find out where the lectures of the Computer Networks course are being held, and navigate to that location.”
1.3 “You need to enroll on a course held by researcher Teemu Myllylä, but a registration list is missing from the bulletin board. Locate his office using CampusSearch and navigate to the location.”

Place, activity and personnel search tasks with the Nokia 770 Internet tablet
2.1 “You are on your way to your next class held in room YL124. Find out where the room is and navigate to the location.”
2.2 “You are going to attend a Matrix Algebra lecture on April 10th. Find out where it will be held and navigate to the location.”
2.3 “You need to visit Perttu Laurinen’s office. Locate his room using CampusSearch and navigate to the location.”
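As a rough illustration of the counter-balancing mentioned above, the sketch below alternates the order of the two task groups across participants. It is a minimal example only: the participant identifiers are hypothetical and this is not the actual assignment procedure or tooling used in the study.

    # Alternate which device's task group comes first, so that any learning
    # effect is balanced between the Nokia 6670 and Nokia 770 conditions.
    participants = [f"P{i:02d}" for i in range(1, 19)]          # 18 test users
    orders = [("Nokia 6670 tasks", "Nokia 770 tasks"),
              ("Nokia 770 tasks", "Nokia 6670 tasks")]
    assignment = {p: orders[i % 2] for i, p in enumerate(participants)}
    for p, order in assignment.items():
        print(p, "->", " then ".join(order))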


Figure 40. Locations of the searched places, BT sensors, focus group room, and Datania. Numbers inside the circles indicate the numbers of the related tasks.

Data was collected with an audio/video camera during the focus group meetings, while server-side logging and questionnaires were used in the task-based evaluation. The CampusSearch part of the questionnaire was based on the After Scenario Questionnaire (ASQ) and System Usability Scale (SUS) questionnaires, complemented with a few service-specific questions. After each task, the participants answered questions on the subjective ease and speed of task performance modeled after the ASQ, as well as a question related to the surroundings of the place to be located, to ensure the task had been completed successfully. The SUS questionnaire was answered after completing all tasks on each device. Due to the counter-balancing of the task order and the fact that all participants were Finnish-speaking, two versions of the questionnaire were designed, both in Finnish. Version A of the CampusSurf questionnaire, without the MobiiliOptima part, is presented in Appendix 2.

5.6.2. Main Results

The tests were carried out on 4–6 April 2006 on the campus area of the University of Oulu. At the beginning of the first focus group session the participants were asked which methods they had previously used for finding places in the campus area. The answers revealed that the most common ways were asking for directions from the information desk in the central lobby or from fellow students and friends, looking at the map of the campus area on the university website, or using a traditional paper map. Some test users also mentioned navigating by reasoning from the code of the searched room. In the evaluation, all participants carried out the tasks related to CampusSearch successfully, despite the fact that the positioning sensor grid turned out to be too sparse and location data was updated to the users’ profiles too infrequently.

The results of the SUS questionnaire measuring the subjective usability of CampusSearch are illustrated in Figure 41. A question regarding the clarity and ease of navigation with CampusSearch maps is also included in the diagram. As indicated in the graph, the test users gave good grades for the usability of CampusSearch when used on the Nokia 770 Internet tablet, with an overall score of 78.1 out of 100 points. When used on the Nokia 6670 phone, the service received a considerably lower score of 56.6 points. The most notable differences in the results between the devices were in the cumbersomeness (M6670 – M770 = 3.72 – 2.00 = 1.72), ease (M770 – M6670 = 4.28 – 2.83 = 1.45), and likeability (M770 – M6670 = 3.83 – 2.61 = 1.22) of using the service with the tested device. 16 test users (89%) gave a score of 4 or 5 to the ease of use on the Internet tablet, while only 4 test users (22%) gave similar ratings for the S60 phone. Ease of learning to use the service was highly rated on both devices (M770 = 3.94, M6670 = 3.29), and the maps were unsurprisingly perceived as clearer and more helpful in navigation on the Internet tablet equipped with a large screen (M770 = 3.94, M6670 = 3.47).
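For reference, the SUS scores reported above are on the standard 0–100 SUS scale, which is computed from the ten questionnaire items with the usual SUS scoring rule, as sketched below. The example responses are made up for illustration and are not data from the study.

    def sus_score(responses):
        """responses: ten ratings on a 1-5 scale, in SUS item order."""
        total = 0
        for i, r in enumerate(responses, start=1):
            # Odd items are positively worded, even items negatively worded.
            total += (r - 1) if i % 2 == 1 else (5 - r)
        return total * 2.5               # scale the 0-40 sum to 0-100

    print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))   # -> 80.0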

Figure 41. Results of the SUS questionnaire for Nokia 770 and Nokia 6670.

The results obtained from the ASQ questions revealed that the participants were generally more satisfied with the ease and speed of completing the search tasks with the Internet tablet than with the S60 phone. As shown in Figure 42, the differences between the devices were greater for satisfaction with the speed of completing the tasks. As indicated by the test users’ comments, the main reason for the divergent speed ratings was the major difference in the data transfer rates of the GPRS and WLAN network connections. There were no differences in the results between the search task types.


Figure 42. Results of the ASQ questionnaire for Nokia 770 and Nokia 6670: (a) subjective ease and (b) subjective speed of completing each search task.

The ratings given to the device-independent, CampusSearch-specific questions are presented in the two diagrams in Figure 43. The results revealed that CampusSearch was generally considered a useful service by the test users (M = 3.71 on a 5-point Likert scale), as was also the case in the SmartCampus field trial (M = 3.50). When asked whether the possibility to use the service while being mobile was a good thing, the test users’ perceptions were quite similar (M = 4.12) to the attitudes in the earlier field trial (M = 4.19). The participants’ views on their willingness to use the service in the future were slightly more positive (M = 3.25) than the answers to a similar question in the previous study (M = 3.06). Unlike in SmartCampus, the views on the most useful search option were equally distributed among the three search categories. Possible reasons for the deviating results in CampusSurf are the more extensive orientation to the services and devices, as well as the higher control over the activities performed by the test users, which ensured that each main feature provided by the service was equally introduced and tested.

Figure 43. Obtained results for CampusSearch-specific questions: (a) subjective preferences rated on a 5-point Likert scale, (b) the most useful search option.

The test users’ open comments obtained from the questionnaires and debriefing sessions provided a deeper understanding of the contexts of use in which the service was seen as useful, and of whether the service would be usable enough to replace the existing methods for finding places in the campus environment. Comments regarding identified usability problems and ideas on how to improve the service were also gathered. Most of the usability issues perceived by the test users were device-oriented. The most influential problem was the slow GPRS network connection on the S60 phones, which hindered service usage to such an extent that the majority of the participants were unwilling to use the service with a GPRS phone in the future (“GPRS is too slow a network for mobile services!”, “The loading of the map took quite a long time on the phone”, “The most irritating thing was that the phone was so slow. Just waiting, waiting, and waiting.”). Further, the small screen and cumbersome user interface of the phone, which limited interaction to keypad-mapped menus and directional navigation, degraded the user experience (“On the phone, you could not point a link directly.”, “It was awkward to browse since I had to press the buttons many times to get where I wanted.”). The Internet tablet was commended for its fast network connection, bigger display and intuitive user interface (“Navigation with touchscreen and stylus was much more intuitive than by using phone keypad.”, “In this kind of service it is especially good to have a bigger screen and better resolution. The map looks totally different on this device.”, “The device was really easy to use, although I had never used this pen system before.”). However, its large form factor and the necessity to use two hands for interaction were seen as weaknesses by some users.

The test users presented several scenarios in which they considered the availability of the service on mobile devices worthwhile (“If you’re attending a lecture and you want to know what’s next in the schedule.”, “In the morning you could check on the bus which classroom to go to.”). The availability of the map during navigation was also found advantageous when using the service on a mobile terminal. Some participants, however, stated that they would rather use the service on a desktop computer at home or on one of the numerous fixed terminals placed along the corridors in the campus area. Suggestions for improving the service included clarifying the viewed location on the map (“The code of the located room should be viewed on the map.”, “I did not realize at first that the red square on the map was the viewed location.”), ideas for new content (“opening hours for entrance doors”, “lecture cancellation notifications”), and map-based routing (“It would be great to see crowded areas on the map so I could avoid them.”).

5.7. Discussion

The results of the evaluations showed that CampusSearch was found to be a useful mobile service that could potentially replace the existing methods for searching and finding places in the campus area. The tests also helped to identify a number of usability issues and yielded improvement ideas that will help in the further development of the service. The results revealed that a GPRS connection is too slow for a mobile map-based guidance service of this kind. The majority of the test users also found the web browser cumbersome to use with phone keypads, and stressed the impact of screen size on the user experience. The results of the comparative evaluation in the CampusSurf study showed that users preferred the Nokia 770 Internet tablet to the Nokia 6670 smartphone because of its faster network connection, larger high-resolution screen and more convenient and intuitive user interface featuring targets for direct manipulation. A comparison of the methods used in the field evaluations is presented in Chapter 7.

Besides fixing the identified usability problems, potential directions for future work include adding more useful content to the service, analysis of the suggested features to be implemented, and further development of the user positioning system. Features utilizing the user’s profile and location awareness, such as using distance as a search criterion, might also be worth considering. Finally, conducting an empirical evaluation of CampusSearch on mobile phones with WLAN connectivity would be an attractive idea.


6. CASE 2: TARGET SIZE STUDY FOR ONE-HANDED THUMB USE ON MOBILE TOUCHSCREEN DEVICES

This chapter describes the last empirical evaluation covered in this thesis, a two-phase study carried out to determine optimal target sizes for one-handed thumb use of touchscreen-equipped mobile handheld devices. Since touchscreen user interface widgets compete with other information for limited screen space, it is desirable to keep the dimensions of interaction targets as small as possible without degrading performance and user satisfaction. Similar target size studies have been conducted for two-handed interaction with a stylus on a handheld, as well as for interaction with index fingers on a desktop-sized display. However, no study has so far been reported for thumbs when holding a mobile handheld in a single hand. As the study focused on device-oriented usability, interaction and user interface design issues, a laboratory setting was selected as the test environment. The following sections of this chapter present the study design, test realization, obtained results and discussion, as well as the final conclusions. The reported work was done while the author of this thesis was visiting the Human-Computer Interaction Lab [83] at the University of Maryland as an intern during fall 2005. A more detailed description of the study is reported in [1].

6.1. Method

The study consisted of two phases. The first phase investigated the required target size for single target selection tasks, such as activating buttons and checkboxes. The second phase explored optimal target sizes for tasks involving multiple taps, such as text entry. The phases for the two types of task were termed discrete and serial, respectively. This design was motivated by a study conducted by Colle and Hiszem [84], who explored the appropriate size and spacing of targets for interaction with index fingers on a touch-sensitive kiosk display. Their results showed that while the error rate decreased when targets increased from 10 mm to ≥15 mm for tasks with sequences of 4 and 10 taps, the error rate remained constant for single target pointing tasks. This finding suggests that there is a difference between discrete and serial target selection tasks. Because of the limited extent and mobility of the thumb while holding a mobile device, target positions were varied in both phases to find out whether performance depended on screen location.

Colle and Hiszem [84] identified two approaches to evaluating tap accuracy. One approach is to vary the target size experimentally and then reason about appropriate target sizes according to the hit rate. The second approach offers users small fixed-sized targets and derives a required target size from the hits distribution. The advantage of the second approach is that it also reveals hit bias with respect to the target location. Since the primary goal of the study was to capture accuracy in hitting actual interface objects, both phases were modeled after the first approach of varying target sizes. However, to understand how screen location may affect error rate and thus target size, the actual tap locations were also tracked to produce hits distribution data.

The course of the empirical evaluation was designed as follows. After completing an initial questionnaire to collect demographics and prior device use, the participants performed the discrete target phase followed by the serial target phase. After each phase, participants recorded subjective ratings of their interaction experience.

Performance was evaluated by both the speed and the accuracy of task completion across the various target sizes and locations. The duration of a test session, including instruction, both data collection phases and questionnaire filling, was approximately 45 minutes. The questionnaire used in the study is presented in Appendix 3.

Twenty participants (17 male, 3 female) took part in the study. All of them were right-handed and aged between 19 and 42 years (M=25.7). Participants received $10 for their time. While 18 of them used keypad-based mobile devices regularly, only 5 used touchscreen-based handhelds even occasionally. Users were asked to rate on a 5-point scale (1 = never, 5 = always) how often they had used different interaction techniques with mobile devices. With keypad-based handhelds, all users strongly favored one-handed thumb use (M=4.17) over a two-thumb technique (M=2.56). The few users experienced with touchscreen handhelds had regularly used a pen for touch input (M=4.60), while one-handed thumb (M=2.20) and two-handed index finger (M=2.00) techniques had been practiced less often. A two-handed technique using both thumbs had almost never been used (M=1.40). Hand width and thumb length were recorded for each participant. Thumb length varied between 99 and 125 mm (M=115 mm), while hand width varied between 75 and 97 mm (M=88 mm).

Both phases were performed on an HP iPAQ PDA measuring 7.1 x 1.4 x 11.4 cm with an 8.9 cm screen, measured diagonally. The display resolution was 240 x 320 pixels. The user interfaces and the underlying control software were developed using the Piccolo.NET graphics toolkit [85].

6.1.1. Discrete Target Phase

The target sizes used in the discrete target study were 3.8, 5.8, 7.7, 9.6 and 11.5 mm on each side. The appropriate sizes were determined in a pilot study, whose results indicated that performance rates leveled off for target sizes larger than 11.5 mm. Thus, the 11.5 mm target represented the largest practical recommended size for single targets, while the smallest target (3.8 mm) represented an average target size on existing handheld devices. Nine target locations were defined by dividing the screen into a 3x3 grid of equal-sized cells. For each trial the target was located in the center of one of the regions. In the actual test a total of 225 trials were distributed across five blocks. Each size x location combination was tested once per block in a randomized order, resulting in 45 trials per block. Before completing the official trial blocks, the users began with a practice session consisting of one block of trials.

The participant’s task for each discrete target trial was to tap a start position and then the target to be selected. All tasks were performed standing and one-handed, using only the right-hand thumb for interacting with the touchscreen. For each trial, the start position was indicated by a large green button from which the movement distance could be measured. The distance between the green button and the target was constant for all tasks, while the relative location of the green button varied depending on the region in which the target was located. The movement direction was standardized by positioning the green button either North or South of the target. North ↔ South movement was chosen over East ↔ West movement since it better matches the thumb’s natural axis of rotation. The green button was located in the cell above the target, except when the target was in the top row of the grid; in that case the green button was located in the cell below the target. The interface for the discrete target phase and a participant performing the test are presented in Figure 44.
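To relate the physical target sizes to the display, the stated screen geometry can be used to estimate the pixel pitch. The sketch below is a rough approximation only: it assumes the 8.9 cm diagonal and 240 x 320 pixel resolution quoted above, square pixels, and no bezel; the resulting pixel values are illustrative, not figures taken from the study.

    import math

    diag_mm = 89.0                      # 8.9 cm screen diagonal
    res_x, res_y = 240, 320             # display resolution in pixels
    diag_px = math.hypot(res_x, res_y)  # 400 px diagonal for a 3:4 screen
    px_per_mm = diag_px / diag_mm       # roughly 4.5 pixels per millimetre

    for size_mm in (3.8, 5.8, 7.7, 9.6, 11.5):
        print(f"{size_mm} mm ~ {round(size_mm * px_per_mm)} px")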


Figure 44. User interface for the discrete target phase, (a) the startup view for a trial testing a 5.8 mm target in the center zone, (b) user selects the 7.7 mm target in the upper left zone, (c) a participant performing tasks in a laboratory setting.

The pilot study showed that lone targets were perceived as easier to select than those near other objects. To address this issue, each intended target was surrounded by ‘distractor’ targets, meaning that users not only had to hit the target but also had to avoid the others. This design also provided an interface closer to real-world applications, which often present multiple targets close to each other. Further, the distractors were presented in randomized locations around the target to promote a sense that the user was not moving the exact same distance and in the same direction on every trial. The goal of this design decision was to prevent users from completing tasks with a routine or pre-programmed movement rather than by explicit aiming. The intended target was labeled with an ‘x’ while the distractors were marked with other letters. To de-emphasize the target and discourage locating it preattentively, the target and distractors were shown with a white background and light-gray lettering at the start of a trial (Figure 44a). When the green button was tapped and released, the keys turned pink and the labels turned black to draw attention to all on-screen objects (Figure 44b).

A lift-off selection strategy was used in the study, meaning that the locations of the users’ selections were recorded upon thumb release. This strategy is also currently used for the standard interface widgets of the Pocket PC operating system. A successful target selection further required that both the tap and release positions were located within the target area. Users were provided with auditory and visual feedback. When pressed, the ‘x’ target was highlighted in red (Figure 44b), and a success or error sound was played upon thumb release to indicate whether the target was hit successfully or not.

The data collection methods in the discrete target study were software logging and questionnaires. The logs recorded the time between the green button tap and the target tap, the absolute position of the target tap, and trial success or failure. After completing all tasks, the participants were asked to rate how comfortable they felt tapping the ‘x’ target in each region using a 7-point scale (1 = uncomfortable, 7 = comfortable), as well as which target size was the smallest they felt comfortable using in each region.

6.1.2. Serial Target Phase

The target sizes used in the serial target study ranged from 5.8 mm to 13.4 mm with 0 mm edge-to-edge spacing. The sizes were similar to those in the discrete target phase, except that the smallest target (3.8 mm) was replaced with a still larger target (13.4 mm). This decision was based on the previous finding that error rates tend to increase for sequential selections [84], and on the fact that error rates were very high for the smallest target in the pilot study. To study the effect of location on task performance, the display was divided into four regions using a 2x2 grid. Each target size was tested five times per region for a total of 100 trials. The trials were divided into five blocks in a similar manner to the discrete target phase.

For each serial target trial, the users were required to enter a four-digit code using a soft numeric keypad. As in the discrete target phase, tasks were performed with the right-hand thumb while standing. For each task, a green ‘start’ button, a numeric keypad and a randomly generated 4-digit goal sequence were displayed. Backspace and ‘END’ keys were also presented in the bottom corners of the keypad. Depending on the keypad location, the green button was placed in the cell above or below the keypad, and the 4-digit goal sequence appeared to the left or right of the keypad. The task was to tap the green button first, enter the target sequence with the keypad, and finally touch the ‘END’ key to confirm and proceed to the next task. The input string was shown below the goal sequence. Several interaction features were retained from the discrete target phase, such as the lift-off selection strategy and the auditory and visual feedback. The user interface for the serial target phase is shown in Figure 45.

In the serial target phase, the logs recorded the total task time from the release of the green button to the release of the ‘END’ button, as well as the transition time between the green button and the first keypad button selection. Uncorrected errors were measured by comparing the goal and input sequences, and corrected errors by counting the number of backspace sequences. After completing all tasks, comfort ratings for keypad sizes and regions were gathered similarly to the discrete target phase.
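A minimal sketch of how corrected and uncorrected errors could be derived from such logs is given below. The event representation and the example values are hypothetical and do not reflect the actual log format used in the study.

    def count_errors(goal, entered, key_events):
        """goal/entered: 4-digit strings; key_events: key labels in press order."""
        # Uncorrected errors: digits that differ between goal and final input,
        # plus any missing or extra digits.
        uncorrected = sum(1 for g, e in zip(goal, entered) if g != e)
        uncorrected += abs(len(goal) - len(entered))
        # Corrected errors: mistakes the user noticed and fixed with backspace.
        corrected = sum(1 for k in key_events if k == "BACKSPACE")
        return uncorrected, corrected

    print(count_errors("4711", "4712",
                       ["4", "7", "3", "BACKSPACE", "1", "2", "END"]))
    # -> (1, 1): one uncorrected digit error, one backspace correction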

Figure 45. User interface for the serial target phase, (a) the startup view for a trial testing a keypad with 7.7 mm targets in the upper right region, (b) the display for the same trial as the user selects the second digit of the sequence (6).

6.2. Results

Discrete target task times: Task time, defined from the release of the start button to the release of the ‘x’ target, was analyzed using a 5 x 9 repeated measures analysis of variance (RM-ANOVA) with factors of target size and location. Erroneous trials were removed from the data and the mean total time of the remaining trials was computed. A significance level of 5% was used to determine statistical significance.

As a result, a main effect of target size (F(1,25) = 70.42, p < .001) was observed. No effects of target location or interactions between target size and location were found. Unsurprisingly, participants were able to tap targets faster as the targets grew in size (Figure 46a). Post-hoc comparisons showed that the time differences between all target sizes were significant, even between the two largest targets (p = .04). These results are consistent with Fitts’ law [86], a model of human movement, which defines the movement time (MT) with respect to the distance to (A) and the size of (W) the target as

MT = a + b · ID = a + b · log2(A/W + 1).          (1)

The constants a and b have been described as representing the efficiency of the pointing device (here, the thumb), while the index of difficulty (ID), defined as log2(A/W + 1) in [87], captures the fact that targets are harder to hit the farther away they are, and easier to hit the larger they are. Thus, since the distance was constant across the different target sizes, Fitts’ law explains the decrease in difficulty and time with the increase in target size. A small numerical illustration of Equation (1) is given after the next paragraph.

Discrete target task error rate: A 5 x 9 RM-ANOVA was carried out on the percentage of trials that were performed in error. Again, a main effect of target size was observed (F(1,27) = 49.18, p < .001), but no other main effects or interactions were found. As depicted in Figure 46b, errors declined as target size increased. Post-hoc comparisons revealed that the error rates between target sizes differed significantly from one another, except between the two largest targets (9.6 vs. 11.5 mm). This means that while speed improves significantly as targets grow from 9.6 mm to 11.5 mm, the error rate does not.
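As a numerical illustration of Equation (1): with the movement distance A held constant, the index of difficulty, and hence the predicted movement time, decreases as the target width W grows. The distance and the constants a and b below are made-up values chosen only for the sketch; they are not parameters fitted to the study data.

    import math

    def fitts_mt(A, W, a=0.1, b=0.2):
        """Predicted movement time for distance A and target width W (same unit)."""
        ID = math.log2(A / W + 1)       # index of difficulty in bits
        return a + b * ID, ID

    A = 40.0                             # constant movement distance in mm (illustrative)
    for W in (3.8, 5.8, 7.7, 9.6, 11.5):
        mt, ID = fitts_mt(A, W)
        print(f"W = {W:4.1f} mm  ID = {ID:4.2f} bits  predicted MT = {mt:4.2f} s")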

Figure 46. Results for the discrete target phase; (a) mean total time between the releases of the start button and ‘x’ for each target size, (b) mean percentage of erroneous trials, (c) subjective preferences for interacting with targets in each region.

Discrete target task hits distribution: The hits distribution for the smallest four targets in all regions is shown in Figure 47. White blocks indicate the true targets with black crosshairs at their centers, while gray and black dots represent successful and erroneous hits, respectively. Gray bounding boxes indicate hits that fall within 2 standard deviations (2-SD) of the means in the X and Y directions. Along with each diagram, the maximum width and height of the 2-SD areas are shown, giving the minimum-sized block that would be expected to enclose 95% of hits at any location. In general, the hit area increased with target size, so with smaller targets users evidently traded speed for tap accuracy. Considering the relative position of the 2-SD boxes with respect to the target centers, a surprising right-leaning trend can be seen for targets on the right side, even though the movement direction was always from directly above or below.
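A minimal sketch of how such a 2-SD bounding box can be computed from logged tap positions is shown below. It assumes NumPy and a made-up array of tap coordinates; the actual log format and data of the study are not reproduced here.

    import numpy as np

    # Hypothetical tap positions (x, y) in millimetres relative to the target centre.
    taps = np.array([[0.4, -0.2], [1.1, 0.3], [-0.5, 0.9], [0.8, -1.0], [0.2, 0.1]])

    mean = taps.mean(axis=0)
    sd = taps.std(axis=0, ddof=1)        # sample standard deviation per axis
    lower, upper = mean - 2 * sd, mean + 2 * sd
    width, height = upper - lower        # size of the 2-SD box in x and y

    print("2-SD box centre:", mean, "width x height:", width, "x", height)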

Discrete target task user preferences: Subjective ratings for interacting with targets in each screen location are presented in Figure 46c. The mean comfort rating (1–7; 7 = most comfortable) is shown in the upper left corner of each region, while the white block in each cell indicates the mean size, in mm, of the smallest comfortable target in that region. The center region was perceived as the most comfortable (M = 5.7), while the NW and SW regions were rated as the least comfortable locations for tapping targets with the thumb (both with M = 3.7). Users felt they would be comfortable with smaller targets within the center column, and in the center zone in particular (M = 6.0 mm). Moreover, they felt the largest targets would be required in the NW, SW, and SE corners of the screen. In summary, the subjective ratings correlate with the performance results in Figure 47: across the different target sizes, corner regions tended to have larger 2-SD boxes than the center region. The subjective ratings and hit locations also indicate that users had the most difficulty interacting with objects on the left side and in the bottom right corner of the display, and were most at ease in the center region.
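The repeated-measures ANOVAs reported in this section could, in principle, be reproduced with standard statistics packages. The sketch below uses the AnovaRM class of the Python statsmodels package on a hypothetical long-format data frame; the column names and values are made up for illustration, and this is not the analysis software actually used in the study.

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # Long format: one mean task time per participant x size x location cell.
    data = pd.DataFrame({
        "subject":  [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
        "size_mm":  [3.8, 3.8, 11.5, 11.5] * 3,
        "location": ["NW", "C", "NW", "C"] * 3,
        "time_s":   [1.21, 1.05, 0.84, 0.76,
                     1.30, 1.10, 0.90, 0.81,
                     1.18, 1.02, 0.88, 0.74],
    })

    result = AnovaRM(data, depvar="time_s", subject="subject",
                     within=["size_mm", "location"]).fit()
    print(result)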

Figure 47. Hits distribution for 3.8, 5.8, 7.7, and 9.6 mm targets from left to right.

Serial target task times: A 5 x 4 RM-ANOVA with factors of target size and location was performed on the task time data, defined from the release of the first digit in the sequence to the release of the ‘END’ button. Trials with either corrected or uncorrected errors were eliminated from the data set and the mean total time after the first transition of the remaining trials was calculated. As with the discrete target results, a main effect of key size was found (F(1,25) = 60.02, p < .001). No effect of keypad location or interactions between size and location were present. As depicted in Figure 48a, users were able to enter the 4-digit sequences faster as the key sizes grew. Post-hoc comparisons showed that the differences between all key sizes were significant. However, in contrast to the results of the discrete phase, Fitts’ law does not explain this finding: since the keypads scaled uniformly, the index of difficulty remained equal across keypads of different sizes, and Fitts’ law would therefore predict equal movement times across the conditions in the serial target phase. One possible explanation is that finger size interacted with key size. Because all but the largest targets were smaller than the average thumb, users may have taken actions to increase accuracy, such as reorienting the thumb, which would have slowed their performance. As a result, we hypothesize that ”the actions users take to accommodate touchscreen targets smaller than the thumb acts upon Fitts’ law as if the target size is smaller than it actually is, thereby increasing total movement time” [1].

Serial target task error rate: A 5 x 4 RM-ANOVA was carried out on the percentage of erroneous trials. A trial was considered successful only if no errors were made. A main effect of target size was observed (F(2,43) = 11.83, p < .001), but no effect of keypad location was found. However, key size interacted with location (F(12,228) = 1.87, p = .039). As shown in Figure 48b, errors declined as key size grew. Post-hoc comparisons revealed that the keypad with the smallest keys (5.8 mm) caused significantly more errors than those with key sizes ≥ 9.6 mm. The differences in error rates between the other key sizes were not significant.

Serial target task user preferences: Subjective ratings for interacting with keypads in each region are illustrated in Figure 48c. The NE region was considered the most comfortable (M = 5.7) and the SE region the least comfortable location for performing sequences of taps with a thumb (M = 5.0). Further, the participants thought they would be comfortable with smaller keys in the NE cell (8.3 mm), while larger keys would be needed in the NW, SW and SE zones (8.9, 8.8 and 8.8 mm, respectively).

Figure 48. Results for the serial target phase; (a) mean transition time between taps after the first transition, (b) mean error rate, (c) subjective ratings.

6.3. Discussion

Although speed continued to improve significantly even with the largest targets in both phases, there was no difference in error rates for target sizes ≥ 9.6 mm in discrete tasks and key sizes ≥ 7.7 mm in serial tasks. It is notable that error rates for all target sizes were higher in serial tasks than in discrete tasks. Together with the users’ ratings for serial tasks (key size should be at least 8.9 mm in the hardest region) and the fact that the mean error rate did not decline when keys grew from 9.6 to 11.5 mm, we conclude that no key size smaller than 9.6 mm should be recommended for serial target selection tasks. The evaluation of the hits distribution in the discrete target phase showed that for 9.6 mm targets the minimum-sized box expected to enclose 95% of hits at any location was 9.1 x 8.9 mm. Together with the subjective ratings for the smallest comfortable target size, we could expect to reach optimal performance and user preference for discrete tasks with a 9.2 mm target size without decreasing speed substantially. The limitations of the study include the single posture (standing) and the single touchscreen-equipped mobile device (a PDA) used for performing the tasks. It would be useful to investigate appropriate target sizes while users are on the move, as well as for other touchscreen handhelds with different form factors.


7. COMPARISON OF TWO CASE STUDIES

The findings of the case studies confirmed the guidelines for the empirical evaluation of mobile services derived from the literature review. Selecting the laboratory as the test environment for the target size study proved successful, while the evaluation of CampusSearch had to be conducted in a realistic field setting. This chapter explains this statement and presents a methodological comparison between the two case studies. In addition, the evaluation methods used in the SmartCampus and CampusSurf field experiments are compared.

7.1. Field Evaluations vs. Laboratory Study

A major distinction between the two case studies was the type of mobile interactive system under evaluation. The interactive system in the target size study consisted of two touchscreen user interface designs, which did not provide any specific service. The objective was to evaluate novel user interfaces designed to support one-handed use of small touchscreen devices, not a mobile service designed to support a specific human activity and provide targeted value to end users. The study investigated the effect of target size and location on task performance and perceived ease of interaction in target selection tasks. Thus, the evaluation focused solely on device-oriented usability measures – particularly efficiency, error rate and satisfaction, three usability attributes defined by Nielsen [9]. A laboratory setting was found sufficient for studying these aspects, and it provided a well-controlled environment for taking accurate quantitative measurements.

A number of factors affecting overall user acceptance were not inspected in the target size study. Since the system did not support a specific human activity or provide targeted value to the users, the usefulness of the system could not be assessed. Secondly, issues related to the actual usage context and the content provided by the system were out of scope, as the context of use was not well defined and the system did not provide any content. The effects of mobility were not included either, since the study inspected the usability of different-sized targets while users performed tasks standing still. Furthermore, various other aspects affecting the overall acceptability of mobile services were not areas of interest, including social influence, reliability and ease of adoption.

In the CampusSearch case study, the mobile system under evaluation was aimed at supporting specific human activities in specified contexts of use. The supported activities were browsing and retrieving information on various resources at a university campus, as well as locating and finding these resources. Thus, the goals of the CampusSearch evaluations differed significantly from those of the target size study. The experiments were not restricted to identifying issues related to the user interface and device-oriented usability. Instead, the focus was on evaluating the usefulness of the service and on assessing whether it was usable and acceptable enough to replace the existing methods for performing the supported tasks. CampusSearch was targeted to be used in changing, unpredictable, crowded and noisy usage contexts in which the users might be either stationary or mobile. Thus, it was important to explore the issues related to mobility and the actual usage context of the service. Further, the content of the service played a crucial role in assessing overall acceptability. To motivate users to use the service, the content must be topical, accurate and relevant. The service should also be easy to adopt and needs to operate reliably in changing usage situations. The findings from the CampusSearch studies suggest that in order to assess these factors affecting overall user acceptance, it is necessary to carry out evaluations in realistic field settings. It would be impossible to simulate all the factors sufficiently in a laboratory.

Both case studies proved questionnaires and logging to be cheap and effective indirect data collection methods. In the evaluations of CampusSearch, they were unaffected by the mobile and dynamic context in which the service was used. The methods did not require evaluators to be present during an ongoing evaluation session and allowed several people to test the service simultaneously, thus saving time and resources. Server-side logging provided precise data on how each test user had been using CampusSearch. The modified PUEU questionnaire in the SmartCampus field trial and the SUS questionnaire in CampusSurf provided established instruments for assessing perceived usefulness and usability. Moreover, the SUS questionnaire allowed comparisons of usability between CampusSearch and other interactive systems. In the target size study, software logging provided accurate speed, error rate and hits distribution measurements. It also enabled statistical analysis of the gathered task time and accuracy data. The questionnaire was found to be a suitable method for collecting demographics and user feedback on perceived ease of interaction.

The developed guidelines for selecting a suitable environment for evaluating mobile interactive systems are depicted in Figure 49. The model recommends a laboratory setting for evaluation when the focus is solely on user interface and device-oriented usability issues. In such cases, laboratory evaluations should be sufficient and more efficient, as they are usually easier, cheaper and less time-consuming to conduct. However, the model emphasizes that field experiments are indispensable if the goal is to inspect a wider range of factors affecting the overall acceptability of the designed mobile service. Such factors include perceived usefulness, relevance of content, effects of actual usage contexts, social and cultural aspects, and ease of adoption. Moreover, studying long-term usage requires field trials that allow users to test the service by themselves for a relatively long period of time.

Figure 49. Guidelines for selecting suitable methods for evaluating mobile services, based on the set of inspected factors affecting overall user acceptance.

7.2. SmartCampus vs. CampusSurf

The evaluation method used in the SmartCampus field trial allowed the overall acceptance of CampusSearch to be evaluated with a relatively large population, thus providing a basis for statistical analysis. The decision to let users try and test the service freely, without specific tasks, increased the realism of the test setting. Further, it enabled users to test CampusSearch in the middle of their daily activities, in situations they thought they could resolve by using the service. This strategy, however, did not give the researchers any control over what the users actually did with the service. In the worst case this may lead to a situation where many users do not use the service at all. The participants were not necessarily fully aware of all the features provided by the service, since no special training or orientation session was held prior to testing. Thus, the time used for testing and the level of familiarity with the service could vary greatly among the participants who returned a completed questionnaire. Besides providing a brief orientation session, it might be worthwhile to define certain rules and agree with the users on “mandatory tasks” or a minimum amount of service usage before the test.

The task-based evaluation method used in the CampusSurf field experiment gave the evaluators greater control over the tasks the test users carried out with the service. The method ensured that all the main features of the service were tested and that similar search tasks were performed by each participant. The short-term evaluation was also faster, cheaper and easier to conduct than a large-scale field trial. The initial focus group session provided the users with decent training and orientation in using the test devices and services, while the final meeting allowed the participants to discuss their experiences and the researchers to gain deeper insight into various aspects of usability and user acceptance. The level of realism in the CampusSurf experiment, however, could not match that of SmartCampus, because the users were instructed to perform predefined tasks within a few hours instead of being allowed to test the service freely over a lengthy period of time. Moreover, longer-term field trials allowing users to test mobile services as part of their daily lives are needed for a profound investigation of the various aspects affecting overall system acceptance, including diverse usage contexts, content, adoption, social influence and trust. Further, it would have been better to conduct the comparative evaluation in CampusSurf between devices with less difference in data connection speed. Unfortunately, the lack of suitable SIM cards for 3G phones forced the researchers to select a mobile phone model with GPRS connectivity for the evaluation. One weakness of both field evaluations was that there was no comparative evaluation against the existing methods for finding places in the campus area. In addition, it might have been worthwhile to complement the data collection with direct observational methods suitable for mobile evaluation to obtain more reliable results.


8. SUMMARY

This thesis studied the design and evaluation of interactive systems for mobile handheld devices. An extensive literature review on design processes and evaluation methods was presented, addressing the special characteristics of mobile devices and services that should be taken into account during the different stages of design. In addition, the literature review yielded recommendations for selecting appropriate methods and data collection techniques for evaluating mobile interactive systems. As a result, field experiments with real users were considered necessary for evaluation when the focus is not simply on device-centered usability issues. Otherwise, a laboratory environment should be sufficient and more efficient as an evaluation setting. Moreover, software logging and questionnaires were suggested as suitable data gathering methods, since they are hardly affected by mobility and the dynamic and unpredictable contexts in which mobile services are typically used.

The first case study of this work included the design and implementation of CampusSearch, a mobile browser-based service for browsing and retrieving information on various resources in a university campus area. The need for such a mobile service was identified in a comprehensive user survey, and the obtained results were used as a basis for the development of the new service. The service was empirically evaluated in two field experiments with real end users in the natural context of use, which in this case was the university campus. Empirical data was collected using the techniques recommended by the literature review. The results revealed that CampusSearch was found useful and that the users were generally willing to use a similar service in the future. The latter experiment, CampusSurf, showed that users preferred the Nokia 770 Internet tablet to the Nokia 6670 mobile phone for using the service because of its wider screen, faster network connection and more convenient user interface.

The second case of this thesis consisted of a two-phase study to determine optimal sizes for direct thumb interaction targets when using a mobile touchscreen-based device with a single hand. The study, carried out in a laboratory, looked in detail at the interaction between target size and task performance in both single- and multi-target selection tasks. The results of the experiment suggest that a target size of 9.2 mm for single-target tasks and targets of 9.6 mm for multi-target tasks should be sufficiently large without degrading performance and user preference. Potential areas for future work include exploring appropriate target sizes while users are mobile, as well as conducting similar studies with other types of small touchscreen-equipped devices. Finally, this work presented a methodological comparison of the two case studies, in which the developed guidelines for the evaluation of mobile services were confirmed.


REFERENCES

[1] Parhi, P., Karlson, A. & Bederson, B. (2006) Target size study for one-handed thumb use on small touchscreen devices. Proceedings of the MobileHCI 2006, Espoo, Finland, pp. 203-210.
[2] Newman, W.M. & Lamming, M.G. (1995) Interactive System Design. Addison-Wesley, 468 p. ISBN 0-201-63162-8.
[3] Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S. & Carey, T. (1994) Human-Computer Interaction. Addison-Wesley, 775 p. ISBN 0-201-62769-8.
[4] Preece, J., Rogers, Y. & Sharp, H. (2002) Interaction Design: Beyond Human-Computer Interaction. John Wiley & Sons, 519 p. ISBN 0-471-49278-7.
[5] Norman, D.A. & Draper, S. (1986) User Centered System Design: New Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, NJ, USA, 544 p. ISBN 0-898-59872-9.
[6] ISO 13407. Human-centered design processes for interactive systems. ISO/TC159/SC4. International Standard. 1999.
[7] Hartson, H.R. & Hix, D. (1989) Toward empirically derived methodologies and tools for human-computer interface development. International Journal of Man-Machine Studies, vol. 31, no. 4, pp. 477-494.
[8] Carroll, J.M. (1995) Scenario-Based Design: Envisioning Work and Technology in System Development. John Wiley & Sons, 408 p. ISBN 0-471-07659-7.
[9] Nielsen, J. (1993) Usability Engineering. Academic Press, San Diego, CA, USA, 362 p.

[10] Checkland, P. & Scholes, J. (1990) Soft Systems Methodology in Action. John Wiley & Sons, 418 p. ISBN 0-471-98605-4.
[11] ISO 9241-11. Ergonomic requirements for office work with visual display terminals (VDTs) – Part 11: Guidance on usability. ISO/TC159/SC4. International Standard. 1998.
[12] The Use and Misuse of Focus Groups. (3.5.2006) useit.com: Jakob Nielsen’s Website. URL: http://www.useit.com/papers/focusgroups.html.
[13] Contextual inquiry. (3.5.2006) UsabilityNet: usability resources for practitioners and managers. URL: http://www.usabilitynet.org/tools/contextualinquiry.htm.
[14] User observation and field studies. (4.5.2006) UsabilityNet: usability resources for practitioners and managers. URL: http://www.usabilitynet.org/tools/userobservation.html.
[15] Johnson, P. (1992) Human-Computer Interaction: Psychology, Task Analysis and Software Engineering. McGraw-Hill, 256 p. ISBN 0-07-707235-9.
[16] O’Neill, E. & Johnson, P. (2004) Participatory task modeling: users and developers modeling users’ tasks and domains. Proceedings of the TAMODIA’04 conference on Task models and diagrams, Prague, Czech Republic, pp. 67-74.
[17] Unified Modeling Language. (5.5.2006) URL: http://www.uml.org/.
[18] TaskArchitect. (5.5.2006) URL: http://www.taskarchitect.com/.
[19] Heuristic Evaluation. (12.5.2006) useit.com: Jakob Nielsen’s Website. URL: http://www.useit.com/papers/heuristic/.
[20] Rudd, J., Stern, K. & Isensee, S. (1996) Low vs. high-fidelity prototyping debate. Interactions, vol. 3, no. 1, pp. 76-85.
[21] HyperCard. (17.5.2006) Wikipedia, the free encyclopedia. URL: http://en.wikipedia.org/wiki/HyperCard.
[22] ToolBook. (17.5.2006) URL: http://www.toolbook.com/.
[23] Rapid prototyping. (17.5.2006) UsabilityNet: usability resources for practitioners and managers. URL: http://www.usabilitynet.org/tools/rapid.htm.
[24] Vaidyanathan, J., Robbins, J. E. & Redmiles, D. F. (1999) Using HTML to create early prototypes. Proceedings of the CHI’99 Conference on Human Factors in Computing Systems, Pittsburgh, PA, USA, pp. 232-233.
[25] Kjeldskov, J. & Graham, K. (2003) A review of mobile HCI research methods. Proceedings of the MobileHCI 2003, Udine, Italy, pp. 317-335.
[26] Johnson, P. (1998) Usability and mobility: Interactions on the move. Proceedings of the First Workshop on Human-Computer Interaction with Mobile Devices, Glasgow, UK, GIST Technical Report G98-1.
[27] Salmre, I. (2005) Writing Mobile Code: Essential Software Engineering for Building Mobile Applications. Addison-Wesley, 792 p. ISBN 0-321-26931-4.
[28] Kjeldskov, J. & Stage, J. (2004) New techniques for usability evaluation of mobile systems. International Journal of Human-Computer Studies (IJHCS), vol. 60, pp. 599-620.
[29] Davis, F.D. (1989) Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, vol. 13, no. 3, pp. 319-340.
[30] Kaasinen, E. (2005) User acceptance of mobile services – value, ease of use, trust and ease of adoption. VTT Publications 566, 151 p. + app. 64 p. ISBN 951-38-6640-6.
[31] Diffusion of Innovations. (9.1.2007) Theories used in IS research. URL: http://www.istheory.yorku.ca/diffusionofinnovations.htm.
[32] Venkatesh, V., Morris, M. G., Davis, G. B. & Davis, F. D. (2003) User acceptance of information technology: toward a unified view. MIS Quarterly, vol. 27, no. 3, pp. 425-478.
[33] Crossing the Chasm. (9.1.2007) Wikipedia, the free encyclopedia. URL: http://en.wikipedia.org/wiki/Crossing_the_Chasm.
[34] Theory of Planned Behavior. (14.1.2007) Icek Ajzen’s homepage. URL: http://www.people.umass.edu/aizen/tpb.html.
[35] Theory of Planned Behavior. (14.1.2007) Theories used in IS research. URL: http://www.istheory.yorku.ca/theoryofplannedbehavior.htm.
[36] Garzonis, S. (2005) Usability evaluation of context-aware mobile systems: A review. 3rd UK-UbiNet Workshop, Bath, UK.
[37] Abowd, G. D. & Mynatt, E. D. (2000) Charting past, present, and future research in ubiquitous computing. ACM Transactions on Computer-Human Interaction, vol. 7, issue 1, pp. 29-58.
[38] Kjeldskov, J., Graham, C., Pedell, S., Vetere, F., Howard, S., Balbo, S. & Davies, J. (2005) Evaluating the usability of a mobile guide: The influence of location, participants and resources. Behavior and Information Technology, vol. 24, issue 1, pp. 51–65.
[39] Goodman, J., Brewster, S. & Gray, P. (2004) Proceedings of HCI in Mobile Guides, workshop at MobileHCI 2004, Glasgow, UK.
[40] Kellar, M., Reilly, D., Hawkey, K., Rodgers, M., MacKay, B., Dearman, D., Ha, V., MacInnes, W. J., Nunes, M., Parker, K., Whalen, T. & Inkpen, K. M. (2005) It’s a jungle out there: Practical considerations for evaluation in the city. Proceedings of the CHI’05 Conference on Human Factors in Computing Systems, Portland, OR, USA, pp. 1533-1536.
[41] Kaikkonen, A., Kekäläinen, A., Cankar, M., Kallio, T. & Kankainen, A. (2005) Usability testing of mobile applications: A comparison between laboratory and field testing. Journal of Usability Studies, vol. 1, issue 1, pp. 4-16.
[42] Vetere, F., Howard, S., Pedell, S. & Balbo, S. (2003) Walking through mobile use: novel heuristics and their application. Proceedings of the OzCHI 2003, pp. 24-32.
[43] Kjeldskov, J., Skov, M. B., Als, B. S. & Høegh, R. T. (2004) Is it worth the hassle? Exploring the added value of evaluating the usability of context-aware mobile systems in the field. Proceedings of the MobileHCI 2004, Glasgow, UK, pp. 61-73.
[44] Iachello, G. & Terrenghi, L. (2005) Mobile HCI 2004: Experience and reflection. IEEE Pervasive Computing, vol. 4, no. 1, pp. 88-91.
[45] Sharp, R. & Rehman, K. (2005) The 2005 UbiApp workshop: what makes good application-led research? IEEE Pervasive Computing, vol. 4, no. 3, pp. 80-82.
[46] Kort, J. & de Poot, H. (2005) Usage analysis: Combining logging and qualitative methods. Proceedings of the CHI’05 Conference on Human Factors in Computing Systems, Portland, OR, USA, pp. 2121-2122.
[47] Gerhardt-Powals, J. (1996) Cognitive engineering principles for enhancing human-computer performance. International Journal of Human-Computer Interaction, vol. 8, no. 2, pp. 189-211.
[48] Nielsen, J. (1992) Finding usability problems through heuristic evaluation. Proceedings of the CHI’92 Conference on Human Factors in Computing Systems, Monterey, CA, USA, pp. 373-380.
[49] Redish, J., Bias, R.G., Bailey, R., Molich, R., Dumas, J. & Spool, J.M. (2002) Usability in practice: formative usability evaluations – evolution and revolution. Proceedings of the CHI’02 Conference on Human Factors in Computing Systems, Minneapolis, MN, USA, pp. 885-890.
[50] Isomursu, M., Kuutti, K. & Väinämö, S. (2004) Experience Clip: Method for user participation and evaluation of mobile concepts. Proceedings of Participatory Design 2004, Toronto, Canada, pp. 83-92.
[51] Ojala, T., Korhonen, J., Sutinen, T., Parhi, P. & Aalto, L. (2004) Mobile Kärpät – A case study in wireless personal area networking. Proceedings of the MUM 2004, College Park, MD, pp. 149-156.
[52] Root, R.W. & Draper, S. (1983) Questionnaires as a Software Evaluation Tool. Proceedings of the CHI’83 Conference on Human Factors in Computing Systems, Boston, MA, USA, pp. 83-87.
[53] Kirakowski, J. (14.3.2006) Questionnaires in Usability Engineering. URL: http://www.ucc.ie/hfrg/resources/qfaq1.html.
[54] Kitchenham, B.A. & Pfleeger, S.L. (2002) Principles of survey research part 3: Constructing a survey instrument. Software Engineering Notes, vol. 27, no. 2, pp. 20–24.
[55] Chin, J.P., Diehl, V.A. & Norman, L.K. (1988) Development of an Instrument Measuring User Satisfaction of the Human-Computer Interface. Proceedings of the CHI’88 Conference on Human Factors in Computing Systems, Washington, DC, USA, pp. 213-218.
[56] Brooke, J. (1996) SUS: A Quick and Dirty Usability Scale. In: P.W. Jordan, B. Thomas, B.A. Weerdmeester & I.L. McClelland (Eds.), Usability Evaluation in Industry. Taylor & Francis, 224 p. ISBN 0-748-40460-0.
[57] SUMI questionnaire. (1.8.2006) URL: http://sumi.ucc.ie/.
[58] Lewis, J.R. (1995) Computer Usability Satisfaction Questionnaires: Psychometric Evaluation and Instructions for Use. International Journal of Human-Computer Interaction, vol. 7, no. 1, pp. 57-78.
[59] Lin, H.X., Choong, Y-Y. & Salvendy, G. (1997) A Proposed Index of Usability: A Method for Comparing the Relative Usability of Different Software Systems. Behavior and Information Technology, vol. 16, no. 4-5, pp. 267-277.
[60] QUIS, Questionnaire for User Interaction Satisfaction. (26.5.2006) URL: http://lap.umd.edu/QUIS/.
[61] Tullis, T. S. & Stetson, J. N. (2004) A comparison of questionnaires for assessing website usability. Usability Professional Association (UPA) Conference, Minneapolis, MN, USA.
[62] Teague, R., De Jesus, K. & Nunes-Ueno, M. (2001) Concurrent vs. post-task usability test ratings. Proceedings of the CHI’01 Conference on Human Factors in Computing Systems, Seattle, WA, USA, pp. 289-290.
[63] Wright, K.B. (2005) Researching Internet-based populations: advantages and disadvantages of online survey research, online questionnaire authoring software packages, and web survey services. Journal of Computer-Mediated Communication, vol. 10, issue 3, article 11.
[64] WebTrends. (1.9.2006) URL: http://www.webtrends.com/.
[65] Unica NetTracker for Web Analytics. (1.9.2006) URL: http://www.unica.com/product/product.cfm?pw=untng.
[66] Analog: WWW log file analysis. (1.9.2006) URL: http://www.analog.cx/.
[67] AWStats – Free log file analyzer for advanced statistics (GNU GPL). (1.9.2006) URL: http://awstats.sourceforge.net/.
[68] Rotuaari project. (3.7.2006) URL: http://www.rotuaari.net/.
[69] Ojala, T., Korhonen, J., Aittola, M., Ollila, M., Koivumäki, T., Tähtinen, J. & Karjaluoto, H. (2003) SmartRotuaari – context-aware mobile multimedia services. Proceedings of the MUM 2003, Norrköping, Sweden, pp. 9-18.
[70] Mobiilikysely, Oulun yliopiston mobiilipalvelut. (3.7.2006) URL: http://virtuaalikampus.oulu.fi/mobiilikysely.html.
[71] Apache HTTP Server. (3.7.2006) URL: http://httpd.apache.org/.
[72] Apache Tomcat. (3.7.2006) URL: http://tomcat.apache.org/.
[73] MySQL. (3.7.2006) URL: http://www.mysql.com/.
[74] Hibernate. (3.7.2006) URL: http://www.hibernate.org/.
[75] XHTML guidelines for creating web content v1.3, Forum Nokia. (3.7.2006) URL: http://sw.nokia.com/id/7f3f1424-b51e-4067-a3ef-acaab08e484f/XHTML_Guidelines_For_Creating_Web_Content_v1_3_en.pdf.
[76] World Geodetic System. (3.7.2006) Wikipedia, the free encyclopedia. URL: http://en.wikipedia.org/wiki/World_Geodetic_System.
[77] Aittola, M. (2003) Paikkatietoiset palvelut pienpäätelaitteissa. Diploma thesis. University of Oulu, Department of Electrical and Information Engineering, 72 p.
[78] panOULU public access network. (3.7.2006) URL: http://www.panoulu.net/.
[79] Aalto, L., Göthlin, N., Korhonen, J. & Ojala, T. (2004) Bluetooth and WAP push based location-aware mobile advertising system. Proceedings of the MobiSys ’04, Boston, MA, USA, pp. 49-58.
[80] Ekahau. (3.7.2006) URL: http://www.ekahau.com/.
[81] Log4j project. (3.7.2006) URL: http://logging.apache.org/log4j/.
[82] Octopus Network. (1.8.2006) URL: http://www.octo.fi/.
[83] University of Maryland, Human-Computer Interaction Lab. (1.8.2006) URL: http://www.cs.umd.edu/hcil/.
[84] Colle, H. A. & Hiszem, K. J. (2004) Standing at a kiosk: effects of key size and spacing on touch screen numeric keypad performance and user experience. Ergonomics, vol. 47, no. 13, pp. 1406-1423.
[85] Piccolo.NET. (1.8.2006) URL: http://www.cs.umd.edu/hcil/piccolo/.
[86] Fitts, P. M. (1954) The information capacity of the human motor system in controlling amplitude of movement. Journal of Experimental Psychology, vol. 47, pp. 381-391.
[87] MacKenzie, I. S. (1989) A note on the information-theoretic basis for Fitts’ law. Journal of Motor Behavior, vol. 21, pp. 323-330.

87

APPENDICES

Appendix 1    SmartCampus field trial questionnaire
Appendix 2    Version A of the CampusSurf concept evaluation questionnaire
Appendix 3    Target size study questionnaire

Appendix 1: SmartCampus field trial questionnaire (original in Finnish)

© Oulun yliopisto 2005

Returned: ___ .04.2005 at ___:___

SMARTCAMPUS

Details of your mobile phone:
Brand  _______________ (e.g. Nokia, Siemens)
Model  _______________ (e.g. 3110, M65)
I do not have a mobile phone.

Did you test the SmartCampus services with a loaner phone?   Yes / No    Loaner phone no. ____________
If you had a loaner phone, estimate the loan period: _______________ days/hours

Did you log in to the SmartCampus system at any point?   Yes / No
If you logged in, why? _______________________________________________________________________
If you logged in to the Oma Kampus service, did you use the "Remember my login on this device" feature?   Yes / No
Your username, if you registered with the system: _________________________________________

Estimate how much you used the following services (Not once / 1-2 times / 3-5 times / 5-10 times / More often / Don't know):
a) MobiiliOptima
b) Oodi | Mobile
c) Kampushaku
d) Kampustagi
e) Lehtiset
f) SmartLibrary
g) External services

Which was the best service? Give your reasons.
____________________________________________________________________________________________

The following problems occurred while using the services (Never / Rarely / Sometimes / Often / All the time):
a) Dropped connections
b) Usability problems (e.g. a complicated menu)
c) The information was difficult to make use of
d) Poor responsiveness of the terminal to commands
e) Poor screen visibility
f) The keypad was difficult to use
g) Other technical fault; what? ____________________________________________________________

Was there anything really difficult about using the services? Describe the difficulties.
____________________________________________________________________________________________

The following questions concern the use of the services in general. Answer only for the services you used. Each statement was rated separately for a) MobiiliOptima, b) Oodi | Mobile, c) Kampushaku, d) Kampustagi, e) Lehtiset, f) SmartLibrary and g) External services, on a scale of 1 = strongly disagree ... 5 = strongly agree, with a "Don't know" option.

The service is useful.
With the service I get the information I want.
The service supports me in matters related to (studying/working).
The service is easy to use.
Learning to use the service was easy.
I like using the service.
Being able to use the service on a mobile device is a good thing.
My use of the service was influenced by noticing that others were using it as well.
My use of the service was influenced by the university encouraging/enabling the use of the services.
I have the knowledge needed to use the service.
The "Help" function in the service was useful to me.
Using the service made me feel unsure of myself.
I was afraid of making a mistake by pressing the wrong button.
I intend to use a similar service in the future.

I believe I will use advanced mobile services more in the future. (rated on a single scale of 1 = strongly disagree ... 5 = strongly agree, or "Don't know")

What made using the services fun or less fun?
________________________________________________________________________________

How should the SmartCampus services be developed?
________________________________________________________________________________

The following questions concern only the Kampushaku service.

Searching for and finding targets with the service was easy (answer on a scale of 1 = strongly disagree ... 5 = strongly agree, or "Don't know")
a) on a desktop/laptop computer
b) on a PDA
c) on a mobile phone.

The map images were illustrative and it was easy to navigate with them. (1 ... 5 / Don't know)

Showing my own location on the map was useful to me. (1 ... 5 / Don't know / My location was not shown)

Which of the following search functions was the most useful? Choose only one option.
a) Place search
b) Activity search
c) Person search

The following questions concern only the Kampustagi service.

Touching a reader is, in my opinion, an easy way to open content related to that place. (1 ... 5 / Don't know)

Were the Kampustagi readers located in relevant places? Tell us where you would have wanted Kampustagi reader points.
________________________________________________________________________________________________

What kind of content would you have wished for in the Kampustagi service?
________________________________________________________________________________________________

In your opinion, did the locations of the reader points and the content attached to them match each other?
________________________________________________________________________________________________

Background information

Gender:   Male / Female

Age:   under 18 / 18-24 / 25-34 / 35-49 / 50-64 / 65 or over

Are you a:   Student / Staff member / Visitor / Other

Faculty:   Humanities / Education / Science / Medicine / Economics and Business / Technology

Appendix 2: Version A of the CampusSurf concept evaluation questionnaire (original in Finnish)

© Oulun yliopisto 2006

CampusSurf user test – Task and questionnaire form (Type A)

Starting questionnaire

1. Your age: ______

2. Your gender:   Female / Male

3. What field do you study / did you last study? (Institution, faculty, department)
__________________________________________________________________________________

4. Have you used a web browser on a mobile device?
   I have not used one / I have tried one briefly / I use one somewhat / I use one a lot
   If you have, what kind?   Old WAP text browsers / New S40 and S60 smartphones / Other (e.g. a PDA) / I cannot say

5. How well do you feel you know the university building? (How well can you find your way inside the building, and do you know where things are located?)
   Scale from Very poorly to Very well

6. How often do you use the Optima learning environment?
   Daily / 3-5 times a week / Once a week / A couple of times a month / Less often

Instructions before starting the test

• Links to the services can be found at mobile.oulu.fi. The "CampusSurf" link in the devices' bookmarks also leads to the services' link page.
• The tasks in this test are divided into two parts. Perform all tasks on your own without asking others for help.
• Use the Nokia N770 Internet tablet near the main corridor running through the university or in Tietotalo, where the panOULU network (wireless broadband network) is certain to work.
• The tasks in part 1 relate to the Kampushaku service. Perform tasks 1-3 with the Nokia 6670 mobile phone and tasks 4-6 with the Nokia N770.
• If you run into insurmountable problems while performing the tasks, you can contact Pekka Parhi (room TS376, tel. 040-7078475, e-mail [email protected]).

Your test user account: kampus19, password: 2ssc19b.

TASKS :: PART I :: KAMPUSHAKU

Perform tasks 1-3 with the Nokia 6670 mobile phone. Go to Kampushaku via the CampusSurf link in the device's bookmarks. First log in to the service with your test user account. Select @rotuaari.net on the login page.

Task 1: place search with the mobile phone
"You have a meeting with a professor tomorrow in room TS310. Use Kampushaku to find out where the room is located and navigate to it."
After completing task 1, answer the following questions.
T1.1 Which cartoonist's comic strip is on the door of the room? ____________________________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T1.2 In my opinion, completing this task was easy.
T1.3 In my opinion, completing this task was fast.

Task 2: activity search with the mobile phone
"You want to find out where teaching for the tietokoneverkot (computer networks) course is held. Use Kampushaku to find out the location of the teaching facility and navigate to it."
After completing task 2, answer the following questions.
T2.1 Until what time is the facility open on Fridays? ___________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T2.2 In my opinion, completing this task was easy.
T2.3 In my opinion, completing this task was fast.

Task 3: person search with the mobile phone
"You want to sign up for a laboratory course held by researcher Teemu Myllylä, but the sign-up sheet is missing from the notice board. Locate Myllylä's office with Kampushaku and navigate to it."
After completing task 3, answer the following questions.
T3.1 What name is on the door sign above Myllylä's name? _______________________________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T3.2 In my opinion, completing this task was easy.
T3.3 In my opinion, completing this task was fast.

The following questions concern using the Kampushaku service with the Nokia 6670 mobile phone.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) I would like to use this service on this device.
b) I found the service unnecessarily complex.
c) In my opinion, the service was easy to use on this device.
d) I think I would need technical support to be able to use the service on this device.
e) In my opinion, the functions of the service formed a well-integrated whole.
f) In my opinion, there was too much inconsistency in the service.
g) I would imagine that most people would learn to use this service on this device very quickly.
h) I found the service cumbersome to use on this device.
i) I felt confident using the service on this device.
j) I needed to learn a lot before I could get going with the service on this device.
k) On this device, the map images were illustrative and it was easy to navigate with them.
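
Items a)-j) above appear to follow the ten-item System Usability Scale (SUS) template, adapted to refer to the device under test; item k) is an additional map-readability statement. For readers replicating the analysis, the sketch below shows how standard SUS scoring would aggregate such responses into a 0-100 score (odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5). The example ratings are hypothetical and are not data from the field trial; the sketch assumes complete numeric responses and does not handle "Don't know" answers.

    # Minimal SUS scoring sketch for items a)-j) above (standard SUS scheme).
    # Ratings are assumed to be integers 1-5; "Don't know" answers are not handled.
    def sus_score(ratings):
        """ratings: ten item ratings in the order of items a)-j)."""
        if len(ratings) != 10:
            raise ValueError("SUS scoring requires exactly ten item ratings")
        total = 0
        for i, r in enumerate(ratings):
            # Items a, c, e, g, i (even indices) are positively worded,
            # items b, d, f, h, j (odd indices) are negatively worded.
            total += (r - 1) if i % 2 == 0 else (5 - r)
        return total * 2.5  # scale the 0-40 raw sum to the 0-100 SUS range

    # Hypothetical example response, not data from the trial:
    print(sus_score([4, 2, 4, 1, 4, 2, 5, 2, 4, 2]))  # prints 80.0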

Now switch devices and perform the following tasks with the Nokia N770 Internet tablet. Remember to first log in to the service with your test user account (select @rotuaari.net).

Task 4: place search with the Internet tablet
"You are on your way to your next lesson, which is held in lecture hall YL124. Use Kampushaku to find out where the hall is located and navigate to it."
After completing task 4, answer the following questions.
T4.1 What colour is the door of the hall? ____________________________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T4.2 In my opinion, completing this task was easy.
T4.3 In my opinion, completing this task was fast.
T4.4 Showing my own location on the map was useful to me. (or: My location was not shown)

Task 5: activity search with the Internet tablet
"You are going to the lecture of the matriisialgebra (matrix algebra) course on 10 April. Use Kampushaku to find out where it is held and navigate to that place."
After completing task 5, answer the following questions.
T5.1 What colour is the door of the hall? ____________________________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T5.2 In my opinion, completing this task was easy.
T5.3 In my opinion, completing this task was fast.
T5.4 Showing my own location on the map was useful to me. (or: My location was not shown)

Task 6: person search with the Internet tablet
"You need to visit your project work supervisor Perttu Laurinen. Locate his office with Kampushaku and navigate to it."
After completing task 6, answer the following questions.
T6.1 What animal is pictured on the postcard below Laurinen's name plate? ________________________
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
T6.2 In my opinion, completing this task was easy.
T6.3 In my opinion, completing this task was fast.
T6.4 Showing my own location on the map was useful to me. (or: My location was not shown)

The following questions concern using the Kampushaku service with the Nokia N770 Internet tablet.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a)-k) The same eleven statements as in the corresponding Nokia 6670 block above, now rated for this device.

The following questions concern the use of the Kampushaku service in general.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) In my opinion, the service is useful.
b) Being able to use the service on a mobile device is a good thing.
c) I would like to use this service for similar search tasks in the future.

Which of the following search functions do you find the most useful? Choose only one option.
Place search / Activity search / Person search / I cannot say

You have now completed all the tasks related to Kampushaku. Well done! You can take a short break before starting the second part, in which the MobiiliOptima service is tested. Perform the tasks with the Nokia N770 Internet tablet near the main corridor running through the university or in Tietotalo, where the panOULU network (wireless broadband network) is certain to work.

TASKS :: PART II :: MOBIILIOPTIMA

Perform the tasks with the Nokia N770 Internet tablet. Go to MobiiliOptima via the CampusSurf link in the device's bookmarks. Start by logging in to the service with your test user account.

Task 1: "Read all the welcome messages you have received."
After completing task 1, answer the following questions.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) In my opinion, I completed the task easily.
b) What was (1) easy and (2) difficult about completing the task?
____________________________________________________________________________________________
c) How do the "Paluu" (Back) and "Alkuun" (To the start) links at the bottom of the page differ from each other in function?
____________________________________________________________________________________________

Task 2: "There is an ongoing discussion about MobiiliOptima in the Kenttäkoe (field trial) workspace. Reply to the message thread 'Adjektiivit' (Adjectives) and write the first three adjectives about MobiiliOptima that come to your mind. If you still have time, you may also reply to the other threads."
After completing task 2, answer the following questions.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) In my opinion, I completed the task easily.
b) What was (1) easy and (2) difficult about completing the task?
____________________________________________________________________________________________

Task 3: "Send a message to the user Optima Mobiili, give it the subject 'Ohi on' and write in your message a school grade (4-10) for MobiiliOptima based on this field trial."
After completing task 3, answer the following questions.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) In my opinion, I completed the task easily.
b) What was (1) easy and (2) difficult about completing the task?
____________________________________________________________________________________________
c) How did you / would you exit the MobiiliOptima service?
____________________________________________________________________________________________

The following questions concern the use of the MobiiliOptima service in general.
Answer on a scale of 1 = strongly disagree ... 5 = strongly agree (or "Don't know"):
a) I would like to use this service often.
b) I found the service unnecessarily complex.
c) In my opinion, the service was easy to use.
d) I think I would need technical support to be able to use the service.
e) In my opinion, the functions of the service formed a well-integrated whole.
f) In my opinion, there was too much inconsistency in the service.
g) I would imagine that most people would learn to use this service very quickly.
h) I found the service cumbersome to use.
i) I felt confident using the service.
j) I needed to learn a lot before I could get going with the service.
k) In my opinion, the service is useful.
l) The current MobiiliOptima is a prototype. I would be interested in using MobiiliOptima if it is developed further.
m) It is enough for me to be able to use the normal Optima "on the move" with a laptop and a WLAN connection.
n) In my opinion, it is necessary for Optima to be usable on a mobile phone.

You have now completed all the tasks related to Kampushaku and MobiiliOptima. Well done! Return this form and the borrowed devices at the group's closing meeting, which will be held in room TS334. At the end of the meeting, the participants will be rewarded with Finnkino movie tickets.

In this field you may, if you wish, give feedback on the user test and the tested services:
_______________________________________________________________________________________________

Appendix 3: Target size study questionnaire

Pre Experiment Questionnaire

1. Your age: ______

2. Gender:   M / F

3. Do you use a touchscreen-based handheld device? (Examples: Pocket PC and Palm PDA with or without keyboard)
   a. Yes    Model/Type: ____________________________
   b. No – Go to question #6.

4. How often do you use a touchscreen-based handheld device?
   a. Several times a day
   b. No more than once or twice a day
   c. No more than a few times a week
   d. No more than a few times a month
   e. Hardly ever

5. When you use the touchscreen, how often do you use the following techniques? (1 = Never ... 5 = Always)
   a. Stylus                    1  2  3  4  5
   b. Index finger              1  2  3  4  5
   c. One handed with thumb     1  2  3  4  5
   d. Two handed with thumbs    1  2  3  4  5

6. Do you use a keypad-based handheld device? (Examples: Cell phone, Blackberry, Pocket PC and Palm PDA with keyboard)
   a. Yes    Model/Type: ____________________________
   b. No – You're done with this questionnaire!

7. How often do you use a keypad-based handheld device?
   a. Several times a day
   b. No more than once or twice a day
   c. No more than a few times a week
   d. No more than a few times a month
   e. Hardly ever

8. When you use the keypad, how often do you use the following techniques? (1 = Never ... 5 = Always)
   a. Index finger              1  2  3  4  5
   b. One handed with thumb     1  2  3  4  5
   c. Two handed with thumbs    1  2  3  4  5

Post Experiment Questionnaire 1 – Discrete Tap Study

1. Please rate how comfortable you felt tapping the 'X' target in each region of the screen (mark the rating inside each region).
   Rating range: 1-7, where 1 = Uncomfortable and 7 = Comfortable.

2. Please mark the corresponding letter of the smallest target size you felt comfortable using in each region:
   a. Extra Small   b. Small   c. Medium   d. Large   e. Extra Large
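
The size labels above (Extra Small through Extra Large) refer to physical target widths on the PDA touchscreen. For anyone relating such labels to millimetres when reproducing the study, the conversion needs only the target width in pixels and the display's pixel density, as the sketch below illustrates. The pixel density and pixel widths in the example are placeholder assumptions for illustration, not the values used in the thesis experiments.

    # Sketch: converting on-screen target widths from pixels to millimetres.
    # The pixel density and pixel widths below are illustrative placeholders,
    # not the actual parameters of the device used in the target size study.
    MM_PER_INCH = 25.4

    def target_width_mm(width_px, pixels_per_inch):
        """Physical width of a target drawn width_px pixels wide."""
        return width_px * MM_PER_INCH / pixels_per_inch

    example_ppi = 128.0  # placeholder display density
    for label, px in [("Extra Small", 20), ("Small", 30), ("Medium", 40),
                      ("Large", 50), ("Extra Large", 60)]:
        print(f"{label}: {target_width_mm(px, example_ppi):.1f} mm")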

Post Experiment Questionnaire 2 – Serial Tap Study

1. Please rate how comfortable you felt using the keypad in each region of the screen (mark the rating inside each region).
   Rating range: 1-7, where 1 = Uncomfortable and 7 = Comfortable.

2. Please mark the corresponding letter of the smallest keypad size you felt comfortable using in each region:
   a. Extra Small   b. Small   c. Medium   d. Large   e. Extra Large

3. Comments about the whole experiment? (optional)
   __________________________________________________________________________
   __________________________________________________________________________
   __________________________________________________________________________

Thank You!
