Face Recognition using MMS-Mobile Devices


O. Kao (1), U. Rerrer (1), G. Steinert (1), S. Witting (2)
(1) Department of Computer Science, University of Paderborn, Germany
(2) Department of Computer Science, Technical University of Clausthal, Germany
e-mail: [email protected]

Abstract This paper describes a flexible, modular system for processing multimedia information acquired with mobile devices. The workflow is described using a sample scenario in which a user takes a photo with an MMS-capable cell phone, sends it to a corresponding service for face identification, and receives the meta-information via SMS as a response. The main component in this chain is a gateway, which offers interfaces for different networking technologies on the one side and interfaces for various multimedia services on the other side. The design of the gateway accounts for the rapid development of mobile technology by keeping the architecture as open as possible.

Keywords: mobile devices, multimedia services, face recognition

1 Introduction

The rapid development of mobile devices in recent years has significantly improved the multimedia capabilities of cell phones, PDAs, etc. Users can, for example, take a picture and send it via MMS (Multimedia Messaging Service) to other cell phones or to a given e-mail address. Some providers also offer additional services such as printing the MMS image on a postcard and sending it via regular mail. However, further approaches for image processing, either locally (on the cell phone) or remotely (on a certain MMS server), are currently not in use. Local processing is restricted by the small displays, storage capacity, and compute power. Therefore, the processing must be performed on a powerful background infrastructure, which receives the message (MMS server), processes it locally or forwards it to a selected service resource, receives the response, and sends it to the mobile device.

There are a number of application fields. In the case of voice messages, for example, this approach can be used to compare the voice of a specific person with voice patterns in a database and thus immediately identify a person from any place covered by a wireless network. An archaeologist on a historic site can request an immediate identification of a found object by comparing a photo of the object with pictures in the corresponding database.

This paper presents a general architecture for realising the described approach with a mobile client and a cluster-based database that processes the information from the mobile device. The architecture supports different types of mobile devices and connectivity (GSM, WLAN, Bluetooth, ...) and imposes no limitations on the supported services or the technologies used to develop them; it only defines the interfaces and the protocols. The described prototype addresses a sample scenario in which image comparison is used to identify a person based on a photo taken with an MMS-capable cell phone, for example a picture of an old friend ("I know him/her from somewhere ...").

The paper is organised as follows. Section 2 gives an overview of the underlying architecture. Subsequently, the gateway between the mobile clients and the services is described in more detail (Section 3). Section 4 covers the background infrastructure, and Section 5 the algorithms for face recognition together with the experimental results.

Figure 1: Workflow overview of the sample scenario from a cell phone via MMS server to face recognition
[Diagram labels: Client, MMS-Center, SMS-Center, MMS Interface, Gateway, Service Face Recognition; numbered messages: 1. MMS, 2. E-mail, 3. Request, 4. Request, 5. Result, 6. Result, 7. SMS, 8. SMS]

2 Workflow overview

We assume a common mobile network infrastructure with an MMS server from a commercial provider, which receives the messages and forwards them to the multimedia services. The user captures a picture, selects a site and service for the processing (usually identified by an e-mail address or a phone number), and sends all information to the assigned MMS server. The MMS server transforms the message into an e-mail and sends it to the addressed gateway. These steps could be simplified by integrating the MMS server and the service gateway; however, such an integration is likely to remain uncommon because of the high implementation effort and system costs.

As soon as the gateway has received the e-mail, it separates the content, determines the image and the addressed service, executes the necessary pre-processing (e.g. format conversion), and activates the required service. The gateway workflow is described in detail in Section 3. The selected service, in this case face recognition, compares the source image with all archived face images and, if a match is found, delivers the meta-information of the identified person (name, address, e-mail, etc.) to the gateway. Depending on the selected network and protocol, the gateway sends the response to a mobile device, in this scenario via SMS (Short Message Service).

A graphical representation of the workflow is given in Figure 1. The MMS and the SMS center are usually combined into a single resource. When the MMS message is transformed into an e-mail, the transmitted pictures are integrated as attachments, so the MMS interface produces one recognition request per attached picture.
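The step in which the MMS interface turns one e-mail into several recognition requests can be illustrated with a minimal sketch based on Python's standard `email` package. The paper does not specify how the addressed service is encoded, so deriving it from the local part of the recipient address is an assumption made here; `RecognitionRequest`, `split_mms_email`, `build_sample_mms_email`, and the example addresses are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List


@dataclass
class RecognitionRequest:
    sender: str      # address the reply (e.g. an SMS) would be sent to
    service: str     # assumed: taken from the local part of the "To" address
    filename: str
    image: bytes


def split_mms_email(raw: bytes) -> List[RecognitionRequest]:
    """Produce one recognition request per image attachment of the e-mail."""
    msg = message_from_bytes(raw, policy=policy.default)
    service = str(msg["To"]).split("@")[0]
    requests = []
    for part in msg.iter_attachments():
        if part.get_content_maintype() != "image":
            continue  # ignore non-image attachments in this sketch
        requests.append(RecognitionRequest(
            sender=str(msg["From"]),
            service=service,
            filename=part.get_filename() or "unnamed",
            image=part.get_content(),  # bytes for image/* parts
        ))
    return requests


def build_sample_mms_email() -> bytes:
    """Hypothetical stand-in for the e-mail an MMS server might generate."""
    msg = EmailMessage()
    msg["From"] = "sender@mms.example.net"
    msg["To"] = "facerec@gateway.example.org"
    msg.set_content("MMS message")
    msg.add_attachment(b"fake-jpeg-1", maintype="image", subtype="jpeg",
                       filename="photo1.jpg")
    msg.add_attachment(b"fake-jpeg-2", maintype="image", subtype="jpeg",
                       filename="photo2.jpg")
    return msg.as_bytes()
```

Calling `split_mms_email(build_sample_mms_email())` yields two requests addressed to the same service, one per attached picture.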

3 Gateway between clients and services

The gateway offers a number of interfaces for various communication technologies on the client side and a plug-in concept for the integration of multimedia services on the server side. The client interfaces can be adapted to current technologies such as GSM, UMTS, WLAN, and Bluetooth. Clients using future networking technologies are supported as well: the interface architecture is kept as flexible as possible, so that forthcoming standards can be mapped onto it. The gateway core manages the system and accepts, converts, and transmits the messages from the mobile clients to the services and vice versa. Furthermore, it allows plug-ins to be added or removed at runtime. These three components define the gateway structure, which is presented in Figure 2.
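Adding and removing plug-ins at runtime can be pictured as a registry that is safe to modify while requests are being served. The following is only a sketch under assumptions: the paper does not describe the plug-in interface, so `PluginRegistry`, the `ServicePlugin` signature, and the service identifier strings are hypothetical.

```python
import threading
from typing import Callable, Dict, List

# Hypothetical plug-in signature: pre-processed payload in, textual result out.
ServicePlugin = Callable[[bytes], str]


class PluginRegistry:
    """Thread-safe table of service plug-ins that may change while the gateway runs."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._plugins: Dict[str, ServicePlugin] = {}

    def register(self, service_id: str, plugin: ServicePlugin) -> None:
        with self._lock:
            if service_id in self._plugins:
                raise ValueError(f"service {service_id!r} is already registered")
            self._plugins[service_id] = plugin

    def unregister(self, service_id: str) -> None:
        # Removing an unknown service is treated as a no-op in this sketch.
        with self._lock:
            self._plugins.pop(service_id, None)

    def lookup(self, service_id: str) -> ServicePlugin:
        with self._lock:
            try:
                return self._plugins[service_id]
            except KeyError:
                raise LookupError(f"no service registered as {service_id!r}") from None

    def available(self) -> List[str]:
        with self._lock:
            return sorted(self._plugins)
```

The lock only protects the table itself; a plug-in that has already been looked up keeps running even if it is unregistered in the meantime, which is one possible design choice among several.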

Figure 2: Main gateway components

A flexible design is one of the main requirements for the gateway. Other requirements concern failover functionality, concurrent and asynchronous processing of multiple queries, mechanisms for user authentication, and the management of interrupted connections. Moreover, accessing and working with the system should be as easy and intuitive as possible for the user. Therefore, mechanisms that report the status and progress of the current query, automatic software download and update, and transparent support of accounting systems have to be considered.

The structure of the current gateway design together with the included protocols is presented in Figure 3. The connection between the server and client interfaces is realised using specific protocols. The main client-service protocol (CSP) is located at the top of the stack, as shown in Figure 3, and uses the functionality of the following protocols:
• Client-Gateway-Protocol (CGP) connects the client with the client interface and has to be adapted to the current communication technology. The common properties of all CGP sub-protocols are the data transfer mode, addressing of the targeted multimedia service, handling of timeouts and transmission problems, etc.
• Service-Gateway-Protocol (SGP) covers the data transmission and the system administration, in particular log-in and log-off of the multimedia services, exchange of status information, watch-dog functionality for the service, etc.
• Client-sided-Interface to Gateway Protocol (CIGP) and server-sided-Interface to Gateway Protocol (SIGP) are internal protocols for the communication between the gateway core and the interfaces.

Finally, the gateway-to-gateway protocol (GGP) governs the cooperation of multiple concurrent gateways, which can be installed to provide an increased level of failover functionality or to perform load balancing. In particular, it covers the exchange of information about available services, the re-direction of request/reply units, and the creation of unique service identifiers.

The workflow inside the gateway starts with the incoming message. The included service identifier and the login information are used to verify the access restrictions and to select the corresponding service for the data processing. The submitted data is checked against the defined remote interfaces and extended with internal information for further processing and query identification. Subsequent pre-processing steps, usually format conversion, prepare the data for the addressed service. The package is delivered to the gateway core, which submits it to the service interface and blocks the thread until the results are available. The server interface acts as a proxy: it transmits the data to the server and starts the processing. It also processes and extends the result data according to the defined protocols and sends the results to the mobile client via the gateway and the client interface. In post-processing steps, the results are adapted to the presentation mode of the mobile client.
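The workflow inside the gateway can be sketched as a single request handler. This is an illustration under assumptions rather than the authors' implementation: the login table, the `Message` fields, the error strings, the timeout, and the truncation to 160 characters (the length of a single SMS) are all hypothetical choices, and a thread pool stands in for the service interface.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict

SMS_LIMIT = 160  # assumed presentation limit of the mobile client


@dataclass
class Message:
    user: str
    password: str
    service_id: str
    payload: bytes


class Gateway:
    def __init__(self, accounts: Dict[str, str],
                 services: Dict[str, Callable[[bytes], str]],
                 timeout_s: float = 30.0) -> None:
        self._accounts = accounts      # hypothetical login table: user -> password
        self._services = services      # service identifier -> service proxy
        self._timeout_s = timeout_s
        self._query_ids = itertools.count(1)
        self._pool = ThreadPoolExecutor(max_workers=4)

    def handle(self, msg: Message) -> str:
        # 1. Verify the access restrictions using the login information.
        if msg.user not in self._accounts or self._accounts[msg.user] != msg.password:
            return "ERROR: access denied"
        # 2. Select the service named by the service identifier.
        service = self._services.get(msg.service_id)
        if service is None:
            return "ERROR: unknown service"
        # 3. Check the submitted data and attach an internal query identifier.
        if not msg.payload:
            return "ERROR: empty payload"
        query_id = next(self._query_ids)
        # 4. Pre-processing (placeholder for e.g. format conversion).
        prepared = self.preprocess(msg.payload)
        # 5. Submit to the service interface and block until the result is available.
        future = self._pool.submit(service, prepared)
        try:
            result = future.result(timeout=self._timeout_s)
        except Exception as exc:  # timeout or failure inside the service
            return f"ERROR: query {query_id} failed ({type(exc).__name__})"
        # 6. Post-processing: adapt the result to the client's presentation mode.
        return f"[{query_id}] {result}"[:SMS_LIMIT]

    @staticmethod
    def preprocess(payload: bytes) -> bytes:
        return payload  # identity in this sketch

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

A gateway created with one account and one service answers `handle(Message("alice", "secret", "facerec", b"img"))` with the service result prefixed by the query identifier, and rejects wrong passwords or unknown services before any processing starts.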

Figure 3: Overview of gateway protocols

4 Background Infrastructure for Efficient Execution of Multimedia Services

Processing and retrieval of multimedia data require large computational resources. This is particularly true if the multimedia instance under investigation has to be analysed at run-time, i.e. pre-extracted feature vectors are not available, for example because queries considering varying feature types are allowed or because a precise search for objects has to be performed [4]. The latter leads to processing times of several hours, which are not tolerable in the mobile scenario described above. But even for standard searches such as face recognition based on pre-extracted feature vectors, the processing time has to be as short as possible in order to allow acceptable interaction between the mobile user and the system. In contrast to access to traditional systems, sending an MMS requires a long transmission time and thus increases the response time significantly. A shorter response time can therefore mainly be achieved by shortening the processing time.

For this purpose we adapted an existing cluster-based architecture for image and video retrieval called Cairo to the mobile scenario and created the necessary interfaces [5]. The selection of a cluster architecture for the background infrastructure is based on a number of performance evaluations and comparisons of shared-everything, shared-disk, and cluster architectures. These experiments have shown that the distributed storage and transfer of media instances over a number of nodes allows the most efficient execution of multimedia retrieval operations [6].

Figure 4 shows a schematic of the developed cluster architecture, using image retrieval as an example. Based on their functionality, the nodes of the parallel image database are subdivided into three classes:
• Query stations host the user interfaces and provide web-based access to the image database.
• Master node controls the cluster, receives the query specification, and broadcasts the algorithms with the parameters and the query image to the computing nodes. Furthermore, it acts as a redundant storage server, unifies the intermediate results of the computing nodes, and produces the final ranking.
• Computing nodes perform the image processing and comparisons. Each of these nodes contains a disjoint subset of the existing images with their meta-information and executes all operations on the data stored on its local devices. The individual results are sent to the master node.

Figure 4: Schematic of the proposed cluster architecture
[Diagram labels: query access, Internet, master node, slave node 1, slave node 2, ..., slave node n (each slave node with CPU 1 and CPU 2)]

For the initial distribution of the images and their feature vectors over the available nodes, a content-independent partitioning strategy is selected such that the memory size of the images and feature vectors stored on the local devices is approximately equal for all nodes. If a new image has to be inserted into the database, the partition with the smallest storage size is determined and the image is sent to the corresponding node. The advantages of this strategy are its simple implementation and management. A partition is processed element-wise, as the individual operations per image are independent of one another. The initial partitioning gives all nodes almost uniform processing times, assuming a homogeneous, dedicated execution platform and a query that considers all images in the database. The management overhead depends on the operator used and the structure of the partial results.

Distributing the images and the meta-information across a number of nodes enables the parallelisation of the retrieval: all nodes execute the same operations on their local subsets. Four components, called transaction, distribution, computation, and result manager, are necessary to implement this approach. They are based on the well-known parallel libraries PVM and MPI, which are used for distributed and parallel computations in a network of workstations.

The transaction manager analyses the operators to be applied and determines the execution order. In contrast to a traditional database management system, the data is usually only read, so that no access conflicts need to be resolved. The order of operations must be set in such a way that the time for processing and presenting the system response is minimised and all suitable images are considered. The distribution manager generates the local system calls for every individual cluster node and sends them to all nodes via the communication routines of the active virtual machine. The computation manager controls the execution of the algorithms on the images and meta-information stored on the local memory devices of the computing nodes, so that the intra-cluster communication during runtime is minimised. In the final step, the result manager unifies the intermediate results.

The performance speedup and efficiency were measured on a configuration consisting of 15 computing nodes and a single master node (Dual Pentium III, 667 MHz, 256 Mbyte main memory). All images were searched with operations of minimal, moderate, and high time effort in order to evaluate the management overhead and the performance gain. A linear speedup with the number of nodes is achieved regardless of the applied configuration. All nodes work at nearly full capacity because the communication between the cluster nodes during query processing is minimised. Therefore, the efficiency values remain in a narrow interval between 94% and 98%. The absolute runtimes depend on the performance of the hardware used and have no significance for the analysis of the retrieval efficiency. It should be noted that the time required for the internal initialisation and communication as well as the unification of the individual results is on average less than 0.0016% of the total processing time.
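The partitioning strategy and the division of work between master and computing nodes can be sketched in a few lines. Note the substitutions: the paper's system is built on PVM and MPI across physical cluster nodes, whereas this sketch uses a thread pool inside a single process purely for illustration. `ComputingNode`, `MasterNode`, and the use of Euclidean distance on feature vectors as the search operator are assumptions introduced here.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class ComputingNode:
    name: str
    # image id -> (storage size in bytes, pre-extracted feature vector)
    images: Dict[str, Tuple[int, Vector]] = field(default_factory=dict)

    @property
    def stored_bytes(self) -> int:
        return sum(size for size, _ in self.images.values())

    def local_search(self, query: Vector, k: int) -> List[Tuple[float, str]]:
        """Rank only the images held on this node; no inter-node communication."""
        scored = ((math.dist(query, vec), image_id)
                  for image_id, (_, vec) in self.images.items())
        return heapq.nsmallest(k, scored)


class MasterNode:
    def __init__(self, nodes: List[ComputingNode]) -> None:
        self.nodes = nodes

    def insert(self, image_id: str, size: int, features: Vector) -> ComputingNode:
        # Content-independent partitioning: pick the partition with the least stored data.
        target = min(self.nodes, key=lambda node: node.stored_bytes)
        target.images[image_id] = (size, features)
        return target

    def search(self, query: Vector, k: int) -> List[Tuple[float, str]]:
        # Broadcast the query, let every node search its local subset, then unify.
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partial = pool.map(lambda node: node.local_search(query, k), self.nodes)
            return heapq.nsmallest(k, (hit for hits in partial for hit in hits))
```

Because every node returns its own top-k list, the master only has to merge small partial results, which mirrors the low unification overhead reported above.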

5 Face Recognition: Algorithm and Performance Measurements

We used a module for face recognition as a sample multimedia service, as a number of developed and tested algorithms for face recognition already exist [7]. Well-known and frequently used algorithms are Elastic Bunch Graph Matching, introduced by Wiskott et al. [8], and the Eigenfaces approach proposed by Turk and Pentland [9]. For the prototype implementation we selected the Eigenface approach, whose main component is the principal component analysis (PCA), where each image is represented by vectors called principal components.

Let I be a monochrome image with dimensions W × H. This image can be represented as a vector of dimension WH and thus as a point in a WH-dimensional space Σ. Images containing faces are, due to their similarity, concentrated in certain regions of Σ, so the corresponding clusters can be transferred into a subspace of Σ with lower dimensionality. The PCA delivers the principal vectors that most precisely specify the distribution of the face images in this subspace, which is called face space. The eigenvectors of the covariance matrix are computed for a set of training images and are subsequently linearly combined in order to approximate the original faces. Turk and Pentland call the results of this operation Eigenfaces. The original derivation of the Eigenfaces and the covariance matrix is given in [9].

The result of the algorithm is a so-called weight vector Θi for each of the face images stored in the database. These representing vectors are stored in the image database and used for the similarity comparison. At runtime, the incoming picture is transformed in the same way by projecting the query image into the existing face space. Subsequently, the computed weight vector of the query image is compared to all other vectors in the database using the Euclidean distance. In addition, the difference between the original image and its reconstruction is determined. Depending on these two parameters, three cases are possible:
1. The query image contains no face, and the system replies with a failure message.
2. The face in the query image is not available in the database.
3. One of the archived faces matches the input face in the query picture, and the meta-information of the image is sent as a reply.

A number of empirical measurements were carried out in order to determine the average processing time per query and the average retrieval quality. One of the main criteria for the selection of Eigenfaces as the retrieval method is its fast processing. This was confirmed during tests, as retrieval in a large database with more than 14000 face images takes only about 10 seconds on a simple workstation. Using the powerful parallel infrastructure Cairo brought processing times below one second.

The retrieval quality is currently well below the rates achieved by face recognition methods on traditional databases. Depending on the query image and the database used, recognition rates between 80% and 88.6% are achieved. The main reason for this rather unsatisfactory rate is the (still) poor quality of the MMS images coupled with their low resolution. A number of pre-processing steps to equalise illumination differences could improve the rate significantly. On the other hand, the development of cell phones and their embedded photographic lenses is progressing fast, so an appropriate hardware improvement can be expected soon.
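The runtime part of the Eigenface approach, i.e. projection into face space followed by the three-way decision, can be sketched as follows. The sketch assumes that the mean face and an orthonormal set of eigenfaces have already been computed from training images; the two thresholds, the function names, and the toy three-pixel "images" in the usage note are hypothetical and not taken from the paper.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def project(image: Vector, mean: Vector, eigenfaces: List[Vector]) -> List[float]:
    """Weight vector: one dot product of the mean-centred image per eigenface."""
    centred = [x - m for x, m in zip(image, mean)]
    return [sum(u_i * c_i for u_i, c_i in zip(u, centred)) for u in eigenfaces]


def reconstruct(weights: Vector, mean: Vector, eigenfaces: List[Vector]) -> List[float]:
    """Map a weight vector back from face space into image space."""
    image = list(mean)
    for w, u in zip(weights, eigenfaces):
        for i, u_i in enumerate(u):
            image[i] += w * u_i
    return image


def classify(image: Vector, mean: Vector, eigenfaces: List[Vector],
             database: Dict[str, Vector],
             face_threshold: float, match_threshold: float) -> Tuple[str, Optional[str]]:
    weights = project(image, mean, eigenfaces)
    # Case 1: large reconstruction error, so the image is probably not a face.
    if math.dist(image, reconstruct(weights, mean, eigenfaces)) > face_threshold:
        return "no_face", None
    if not database:
        return "unknown_face", None
    # Compare against all stored weight vectors using the Euclidean distance.
    person, distance = min(
        ((name, math.dist(weights, theta)) for name, theta in database.items()),
        key=lambda item: item[1],
    )
    # Case 2: a face, but no archived face is close enough.
    if distance > match_threshold:
        return "unknown_face", None
    # Case 3: the nearest archived face is accepted as a match.
    return "match", person
```

With a toy setup of `mean = (1.0, 1.0, 1.0)`, eigenfaces `[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]`, and a database `{"alice": (1.0, 0.0), "bob": (0.0, 1.0)}`, an image close to Alice's stored weights is reported as a match, an image with a large component outside face space as `no_face`, and an image inside face space but far from both entries as `unknown_face`. In practice both thresholds would have to be tuned on real data.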

6 Conclusions

This paper presents a system that connects mobile devices with a powerful background infrastructure in order to enable more complex processing of multimedia information acquired with mobile devices such as MMS cell phones. A sample workflow is given using face recognition as an example. Future work includes the integration of further, already existing communication technologies into the gateway. In addition, different multimedia services based on voice and image recognition are currently in development. Finally, related face recognition methods have to be investigated with respect to the given quality of the MMS pictures in order to increase the recognition rate.

References
[1] R. Semper, M. Spasojevic: The Electronic Guidebook: Using Portable Devices and a Wireless Web-based Network to Extend the Museum Experience. Technical Report, HP Laboratories Palo Alto, March 2002
[2] J. Arreymbi, M. Dastbaz: Issues in Delivering Multimedia Content to Mobile Devices. Sixth International Conference on Information Visualisation, 2002, pp. 622-626
[3] S. Nylander, M. Bylund, A. Waern: The Ubiquitous Interactor - Device Independent Access to Mobile Services. Human-Computer Interaction, May 2003
[4] O. Kao: On Parallel Image Retrieval with Dynamically Extracted Features. Journal of Parallel Computing, Elsevier Science, to appear 2004
[5] O. Kao, S. Stapel: Case Study: Cairo - A Distributed Image Retrieval System for Cluster Architectures. In T.K. Shih (Ed.): Distributed Multimedia Databases: Techniques and Applications, Idea Group Publishing, 2001, pp. 291-303
[6] T. Bretschneider, S. Geisler, O. Kao: Simulation-based Assessment of Parallel Architectures for Image Databases. Proceedings of the International Conference on Parallel Computing (ParCo 2001), Imperial College Press, 2002, pp. 401-408
[7] R. Chellappa, C.L. Wilson, S. Sirohey: Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5), 1995
[8] L. Wiskott, J.-M. Fellous, N. Kruger, C. v.d. Malsburg: Face recognition by elastic graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), pp. 775-779, 1997
[9] M. A. Turk, A. P. Pentland: Face recognition using eigenfaces. Proceedings of Conference on Computer Vision and Pattern Recognition, pp. 586-591, 1991
