TAMPERE UNIVERSITY OF TECHNOLOGY Department of Information Technology Institute of Signal Processing

Iftikhar Ahmad

A framework of content-based retrieval in mobile devices

Licentiate Thesis

The title has been accepted on May 7, 2003.
Reviewers: Prof. Moncef Gabbouj, Prof. Irek Defee, Dr. Roberto Castagno

Disclaimer The work presented in this thesis is the personal work of Mr. Iftikhar Ahmad. It is not the official opinion of Nokia.


Abstract
The rapid increase in available digital information (images, audio and video) requires efficient ways to browse that information and to retrieve the items of interest. Conventional keyword (text) based queries are not appropriate for audio-visual search. At the same time, mobile phone technology is progressing continuously, and it is now possible to create digital information on a mobile phone: phones can take pictures and record audio and video. A mobile phone can also connect to the Internet to exchange this information with other phones and devices. In this study, we have developed a new Java-based client-server application for content-based image retrieval over wireless networks. The client side runs on Personal Java [8] (JDK 1.1.8) [59] or on the Mobile Information Device Profile (MIDP) [46] of Java 2 Micro Edition, and the client is written in pure Java™. The client can make a query with image data, or it can make a random query from the database; once the client gets the results, it can query again from the given results. On the server side, we use C code for the query, as C code is faster than Java code and gives better access to native resources. MIDP [46] devices come in different shapes and sizes, so the user interface (the MIDP client) adapts to different screen sizes and network connections. The client passes its environment information to the server, and the server manages the query according to the client's requirements. MIDP devices support a limited number of image formats; hence, the server converts images to a format supported by the client. The client can query by texture, histogram and shape, and it selects the appropriate database for the query; there are separate databases for these query types.


Preface
This thesis was carried out at the Institute of Signal Processing, Tampere University of Technology, Finland. The work is part of the MUVI project; the objective is content-based information retrieval on mobile devices. In this study, a client-server architecture is used. The client side runs on Mobile Information Device Profile (MIDP) devices, while the server runs on a PC. The client requests the server to perform a query; the server then performs the query on the selected database and sends the results to the client.


LIST OF PUBLICATIONS

[P1] Ahmad Iftikhar, Faouzi Alaya Cheikh, Bogdan Cramariuc and Moncef Gabbouj, "Query by Image Content using Mobile Information Device Profile (MIDP)", Finsig03, May 19, 2003, Tampere University of Technology, Tampere, Finland.

[P2] Ahmad Iftikhar, Faouzi Alaya Cheikh, Bogdan Cramariuc and Moncef Gabbouj, "Query by Image Content using NOKIA 9210 Communicator", Workshop on Image Analysis for Multimedia Services (WIAMIS) 2001, COST 211, May 16-17, 2001, Tampere University of Technology, Tampere, Finland.

[P3] Iftikhar Ahmad, Azhar Quddus, Heikki-Jussi Laine, Olli Yli-Harja, "Image segmentation of the CT-scans of hip endoprostheses", NORSIG 2000, IEEE Nordic Signal Processing Symposium, Vildmarkshotellet, Kolmården, Norrköping, Sweden, June 13-15, 2000. http://www.es.isy.liu.se/norsig2000/publ/page271_id129.pdf


Contents

1 Background
  1.1 Content-based Information Retrieval (CBIR)
  1.2 Content-based Information Retrieval (CBIR) Design
  1.3 Feature
    1.3.1 Histogram
    1.3.2 Shape
    1.3.3 Texture
    1.3.4 Color layout
  1.4 MUVIS on 9210 communicator
2 Introduction
3 Design and architecture
  3.1 Classes description
  3.2 Client-Server architecture
  3.3 Client side
  3.4 Server side
  3.5 Client architecture
    3.5.1 Main component view
    3.5.2 Client's engine component view
    3.5.3 Client's user interface component view
    3.5.4 Client utility component view
    3.5.5 Client's user interface component composition
  3.6 Servlet architecture
  3.7 Localization and internationalization
    3.7.1 Internationalization
    3.7.2 Localization
  3.8 Sharing query information
  3.9 Sequence diagrams
    3.9.1 Sequence diagram of the user interface
    3.9.2 Sequence diagram of the engine
    3.9.3 Query by image data
    3.9.4 SMS query results
    3.9.5 Sequence diagram of the server
    3.9.6 Retrieve images from the server
4 Wireless connectivity technologies
  4.1 Application download and installation
5 Software
  5.1 Tomcat
    5.1.1 Servlet
    5.1.2 Initializing a servlet
    5.1.3 Writing service methods
    5.1.4 Getting information from requests
    5.1.5 Servlet and CGI
    5.1.6 Servlet logging
  5.2 Java native interface
    5.2.1 Loading and linking native methods
  5.3 ImageMagick
  5.4 Java Advanced Imaging APIs
    5.4.1 Java Advanced Imaging usage
  5.5 Java 2 Micro Edition
    5.5.1 CLDC
    5.5.2 CDC
    5.5.3 Profile
      5.5.3.1 Mobile Information Device Profile
      5.5.3.2 Optional packages
  5.6 Mobile Media API
    5.6.1 Mobile Media API architecture
    5.6.2 Using the Mobile Media API
    5.6.3 Using a player
    5.6.4 Supported media types
  5.7 Wireless Messaging API
    5.7.1 Representation of a message
    5.7.2 Message
    5.7.3 Binary message
    5.7.4 Text message
    5.7.5 Sending and receiving messages
    5.7.6 MessageListener
    5.7.7 Security
  5.8 What is wrong with wireless Java?
6 Results and assessments
7 Conclusions and future work
8 Summary of the publications
  8.1 Author's contribution to the publications
9 Acknowledgements
10 References


Abbreviations

TUT - Tampere University of Technology
JDK - Java Development Kit
MIDP - Mobile Information Device Profile
J2ME - Java 2 Micro Edition
MUVI - Multimedia Video Indexing and Retrieval
MUVIS - Multimedia Video Indexing and Retrieval System
PC - Personal Computer
CSD - Circuit Switched Data
HSCSD - High Speed Circuit Switched Data
GPRS - General Packet Radio Service
WAP - Wireless Application Protocol
CBIR - Content-Based Indexing and Retrieval
QBIC - CBIR system designed by IBM
SQUID - Shape Queries Using Image Databases (CBIR system, University of Surrey, UK)
Netra - CBIR system, University of California at Santa Barbara
UCSB - University of California at Santa Barbara
VisualSEEK - CBIR system, Columbia University
RGB - Color model (red, green, blue)
GSM - Global System for Mobile communications
SMS - Short Message Service
iMode - NTT DoCoMo's mobile Internet access system
3G - Third-generation mobile network system
RAM - Random Access Memory
TOMCAT - Servlet container, used as the server
SERVLET - Java program that extends server capabilities
EPOC - Operating system from Symbian
MIDlet - Application designed to run on wireless Java-enabled devices such as phones or PDAs
UI - User Interface
MID - Mobile Information Device
C/C++ - Programming languages
PDA - Personal Digital Assistant
CSS - Curvature Scale Space
J2EE - Java 2 Enterprise Edition
CGI - Common Gateway Interface
API - Application Programming Interface
HTTP - Hypertext Transfer Protocol
JNI - Java Native Interface
APACHE - Apache server software
VM - Virtual Machine
MS Windows - Microsoft Windows
JPG - Joint Photographic Experts Group
GIF - Graphics Interchange Format
TIFF - Tagged Image File Format
PNG - Portable Network Graphics
DLL - Dynamic Link Library
Java 2D - Two-dimensional Java API
RMI - Remote Method Invocation
JAI - Java Advanced Imaging
I/O - Input and Output
CDC - Connected Device Configuration
CLDC - Connected Limited Device Configuration
KVM - K Virtual Machine
CPU - Central Processing Unit
TV - Television
FP - The Foundation Profile
GUI - Graphical User Interface
AWT - Abstract Window Toolkit
PP - Personal Profile
PBP - Personal Basis Profile
MMAPI - Mobile Media API
MP3 - Audio encoding format (MPEG-1 Audio Layer III)
PDF - Portable Document Format
WMA - Wireless Messaging API


1 Background
There has been a swift increase in the available digital information (images, audio and video) over the last decade, and it continues to grow at an enormous rate. To use that information, we need to index it so that it can be browsed. From the early 1990s, content-based indexing and retrieval (CBIR) of digital images became an active area of research, and commercial and academic systems for image retrieval have been built on computers. Most of these systems (e.g. QBIC [54] from IBM, Netra [4, 55] from UCSB, VisualSEEK [60], SQUID [66], AMORE [67], LEIDEN [68], Surimage [69], Virage [70], WebSeek [71], ISTORAMA [72], Fichlar [73]) support browsing and searching by image. The search is based on visual features such as color, shape, texture and the spatial layout of objects in the query scene, as well as keywords. These systems are only supported on machines such as PCs and workstations. The MUVIS [6] [57] system was developed in the late 1990s at Tampere University of Technology [56] (TUT) [58]. It supports indexing and retrieval in large image databases on PCs, using visual and semantic features such as color, texture and shape; the whole application runs on the PC, and one can make a query in a specific database.

1.1 Content-based Information Retrieval (CBIR)
In a content-based information retrieval system, the query is made with an image, audio clip or video from a given database, and similar information is retrieved from the selected database. A query is accomplished in several steps. First, we extract the features from the query media (image). Secondly, the feature vector of the query media is compared to those of the media items (images) in the selected database. Then a list of media items is created according to the similarity measure, and the results are saved in a file on the server side. The saved file contains one entry per line: first the index number of the image in the selected database is written, then the full path of the image, and then the last

entry is the match value with the query media. Finally, the Java side reads the result file, creates a single string from the result entries, and sends it to the client.
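The result-file handling described above can be sketched in Java. The entry fields (index, path, match value) follow the description in the text, but the exact separator and the class and field names below are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: parse the server's result text, one entry per line, holding
// the database index, the image path and the match value.
public class ResultParser {
    public static class Entry {
        public final int index;
        public final String path;
        public final double match;
        Entry(int index, String path, double match) {
            this.index = index; this.path = path; this.match = match;
        }
    }

    // Assumes whitespace-separated fields: "<index> <path> <match>".
    public static List<Entry> parse(String resultText) {
        List<Entry> entries = new ArrayList<>();
        for (String line : resultText.split("\n")) {
            line = line.trim();
            if (line.isEmpty()) continue;          // skip blank lines
            String[] f = line.split("\\s+");
            entries.add(new Entry(Integer.parseInt(f[0]), f[1],
                                  Double.parseDouble(f[2])));
        }
        return entries;
    }
}
```

On the client side, a parsed entry list like this can then be fed directly to the user interface objects that display the results.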

1.2 Content-based Information Retrieval (CBIR) Design
In content-based querying, we first create an offline database containing the features of the selected media items. For that, we first select a number of images; after that, we extract the features of the selected images and save them in the database. For a query, we first extract the features of the query image. Secondly, we compare the feature vector of the query image with the feature vectors of the database images. Then we take the first ten closest matches and send them as query results to the client.
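The comparison step can be illustrated with a small sketch. Euclidean distance is assumed here as the similarity measure; the text does not fix a particular metric at this point, and the class and method names are illustrative only:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: compare a query feature vector against the database vectors
// with Euclidean distance and keep the k closest matches.
public class FeatureMatcher {
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the indices of the k closest database vectors,
    // closest first (k = 10 in the scheme described above).
    public static int[] closest(double[] query, double[][] database, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < database.length; i++) order.add(i);
        order.sort(Comparator.comparingDouble(i -> distance(query, database[i])));
        int n = Math.min(k, order.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) result[i] = order.get(i);
        return result;
    }
}
```

Sorting the whole index list is the simplest correct approach; for a large database, a bounded priority queue of size k would avoid the full sort.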

[Figure: query flow between the MUVIS client and server. Offline, the server gets the media (images), extracts the features from the media and creates the database. Online, the client captures an image and sends it to the server; the server receives the image and passes it to the native side, extracts the features from the image, compares them with the features in the database, finds the first ten matches and sends the result to the client.]

1.3 Feature
Features are information contained in the images. At a low level, they can be color, texture, shape and spatial layout; high-level features, on the other hand, are concepts and keywords.


A visual query is not as simple as a text-based query. Human beings have a very good image recognition system, and it works efficiently. Modern computers have the processing power to do image recognition, but lack efficient algorithms for it. The human visual system uses low-level and high-level information from the image to recognize it in an efficient way. We have some information about how the human visual system works, but we do not have the full picture yet; therefore, we cannot reproduce it in computers. In MUVIS, we use low-level information (color, texture and shape) to make queries and match images. The performance of a content-based retrieval system depends heavily on the selection of the features: the type and length of the feature vector play a crucial role, as does the design of the similarity measure [64].

1.3.1 Histogram
Color is one of the dominant features of an image. The human perception system differentiates images by the presence of different colors, which is why color is used so broadly as a visual attribute in image retrieval. It is easy to compare and makes queries faster; in fact, most existing image retrieval systems, such as QBIC [54], Netra [55] and VisualSEEK [60], are most efficient in color retrieval. Retrieval by color similarity requires models of the color stimuli whose color space matches the way human beings perceive colors. The color histogram is commonly used in visual search. The histogram of an image represents the relative frequency of occurrence of the various color levels in the image. It gives the statistical distribution, as well as the joint probability of the intensities, of the three color channels. The human color visual system depends on three types of receptors, each of which responds differently to varying wavelengths of light. The trichromacy of color sensation means that different spectral distributions can produce the same


perception of color. Color stimuli are commonly represented as points in a three-dimensional color space. Before forming the histogram, the RGB color space is usually converted to a perceptually uniform color space, such as the HSV (hue, saturation, value) space. Hue describes the actual wavelength of the color, saturation indicates the amount of white light present in the color, and value (brightness) represents the intensity of the color [5].
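As an illustration, a quantized color histogram can be computed roughly as follows. This is a sketch only: it bins directly in RGB for brevity, whereas, as noted above, a conversion to a perceptually uniform space such as HSV is usually performed first:

```java
// Sketch: build a normalized color histogram from packed 0xRRGGBB pixels
// by quantizing each channel to `binsPerChannel` levels.
public class ColorHistogram {
    public static double[] histogram(int[] rgbPixels, int binsPerChannel) {
        double[] hist = new double[binsPerChannel * binsPerChannel * binsPerChannel];
        for (int p : rgbPixels) {
            // Map each 0..255 channel value to a bin index 0..binsPerChannel-1.
            int r = ((p >> 16) & 0xFF) * binsPerChannel / 256;
            int g = ((p >> 8) & 0xFF) * binsPerChannel / 256;
            int b = (p & 0xFF) * binsPerChannel / 256;
            hist[(r * binsPerChannel + g) * binsPerChannel + b]++;
        }
        // Normalize to relative frequencies so images of different
        // sizes become comparable.
        for (int i = 0; i < hist.length; i++) hist[i] /= rgbPixels.length;
        return hist;
    }
}
```

The resulting vector (length binsPerChannel cubed) is the feature vector compared between the query image and the database images.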

1.3.2 Shape
"There is no universal definition of what shape is. Impressions of shape can be conveyed by color or intensity patterns (texture), from which a geometrical representation can be derived" [65]. In our work, we consider the contour-based representation of shapes. Moreover, our features are extracted at a discrete set of points on the contour, namely the high-curvature points. Edges and high-curvature points are of prime importance to our visual system. Our eyes have a built-in mechanism that emphasizes edges and changes in the scene we observe and throws away redundant information, such as that in smooth areas. Therefore, our brain is used to recognizing objects and shapes from little information, such as a few points or lines; computers are still very far behind in their recognition capabilities in this sense. Shape features play an important role in content-based information retrieval. Shape can be represented by regenerative features (such as boundaries, regions, moments, etc.) and measurement features (perimeter, area, corners, roundness, bending energy, symmetry, orientation, eccentricity, etc.). In the literature, shape features are grouped into two categories, region-based and boundary-based. Region-based methods consider the shape to be composed of a set of two-dimensional regions, whereas boundary-based methods describe the shape by its outline. Region-based descriptors usually result in shorter feature vectors and are faster in matching; however, they generally fail to give proficient similarity retrieval. Feature vectors extracted from boundary-based representations, on the other hand, provide a richer description of the shape. This scheme has led to the development of a multi-resolution shape representation called the Curvature Scale Space (CSS) image, which has proved very useful in similarity


assessment [61]. The feature vector extracted from the CSS image contains the scale and position of the lobe maxima. The CSS image represents the evolution across scales of the positions of the zero-crossings of the curvature function of the contour. Matching two CSS images consists of finding the optimal horizontal shift of the maxima in one of the CSS images that yields the best possible overlap with the maxima of the other CSS image. The matching cost is defined as the sum of pairwise distances (in CSS) between corresponding pairs of maxima [74].
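A much simplified sketch of this matching idea follows, with each CSS maximum taken as a (position, scale) pair and equal-length maxima lists matched in order. This is an assumption-laden illustration only; the real CSS matching algorithm [74] also handles unequal numbers of maxima and partial matches:

```java
// Sketch of CSS-image matching: try every circular shift of one
// maxima list and keep the shift with the lowest summed distance.
// Positions are normalized to [0, 1); scales are the lobe heights.
public class CssMatcher {
    public static double matchCost(double[][] a, double[][] b) {
        double best = Double.MAX_VALUE;
        int n = a.length;                       // simplification: a.length == b.length
        for (int s = 0; s < n; s++) {           // align a[0] with each candidate b[s]
            double shift = b[s][0] - a[0][0];
            double cost = 0;
            for (int i = 0; i < n; i++) {
                double[] bi = b[(s + i) % n];
                double shifted = ((a[i][0] + shift) % 1.0 + 1.0) % 1.0;
                double dp = Math.abs(shifted - bi[0]);
                dp = Math.min(dp, 1.0 - dp);    // circular distance along the contour
                double ds = a[i][1] - bi[1];    // difference in scale
                cost += Math.sqrt(dp * dp + ds * ds);
            }
            best = Math.min(best, cost);
        }
        return best;
    }
}
```

The circular distance makes the cost invariant to the arbitrary starting point of the contour parameterization, which is exactly what the horizontal shift in the CSS image compensates for.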

1.3.3 Texture
Texture is a small elementary pattern that repeats periodically in an image. The elementary pattern and its periodicity may change across the image. It is hard to define exactly what the elementary pattern is and what periodicity means, and different researchers give different definitions; therefore, it is hard for a computer to recognize texture. The human visual system, in contrast, is capable of recognizing and distinguishing textures quite easily [63]. Psychophysical studies have given evidence that the human visual system recognizes texture in the frequency domain. Texture is an important feature for a CBIR system, as texture is found almost everywhere in nature, and there are established techniques for extracting features from it. In MUVIS, textural features are divided into three categories: spatial, frequency and moment-based attributes [62]. In MUVIS, texture-based queries are faster.

1.3.4 Color layout
Color-based queries may not give very good results in a large database, since one image may contain many colors. In color layout, large images are divided into smaller parts, and the color features are then extracted from these smaller parts. There are sophisticated techniques for partitioning the images.
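A minimal block-averaging version of this idea can be sketched as follows; real systems use more sophisticated partitioning and per-block descriptors, and the grid-based scheme here is an illustrative assumption:

```java
// Sketch: a simple color-layout descriptor. Split the image into an
// nx-by-ny grid and store the mean of each RGB channel per block.
public class ColorLayout {
    // pixels: packed 0xRRGGBB values, row-major, width*height entries.
    public static double[] describe(int[] pixels, int width, int height,
                                    int nx, int ny) {
        double[] feat = new double[nx * ny * 3];
        int[] counts = new int[nx * ny];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int block = (y * ny / height) * nx + (x * nx / width);
                int p = pixels[y * width + x];
                feat[block * 3]     += (p >> 16) & 0xFF;  // red sum
                feat[block * 3 + 1] += (p >> 8) & 0xFF;   // green sum
                feat[block * 3 + 2] += p & 0xFF;          // blue sum
                counts[block]++;
            }
        }
        for (int b = 0; b < counts.length; b++)           // sums -> means
            for (int c = 0; c < 3; c++)
                feat[b * 3 + c] /= counts[b];
        return feat;
    }
}
```

Because the descriptor keeps one mean color per block, two images with the same overall color mix but different spatial arrangements now produce different feature vectors, which is the point of color layout.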

1.4 MUVIS on 9210 communicator


The Communicator 9210 [11, 12] was the first mobile phone to make content-based queries in MUVIS. MUVIS on the 9210 uses a client-server architecture [3]. The client runs on the 9210 in Personal Java [8], which in turn runs on Symbian OS v6.0 [13, 14, 15]. The server runs on a PC in J2SE. We use a servlet [16] on the server side for querying; servlets are used for dynamic content [16]. On the server side, the servlet uses native code for the query. The 9210 does not have a built-in camera, so we use a separate camera to take the picture, transfer the image to the phone and make the query with the captured image. The Communicator uses a CSD or HSCSD data call to connect to the server. The Communicator (9210) has a big screen, so it can display ten small result images and one query image. Using the high-speed link (HSCSD), we can download ten images in 2 min 45 s. Download time should be considered carefully: it depends on many dynamic parameters, i.e. the free memory in the device (9210), the traffic on the network, the load on the server, etc. For 9210 clients, we resize the images on the server side; later, the client can fetch the original image.

Figure 2-1: Menu to select the database


Figure 2-2: Start a random query in the selected database

Figure 2-3: Histogram based query

Figure 2-4: Texture based query

The Communicator 9210 supports Personal Java, which is compatible with JDK 1.1.8 [9] (without Swing). Personal Java is rich in APIs compared with the Mobile Information Device Profile (MIDP) [46]. However, not all devices support Personal Java; it is only available for the Communicator family, whereas MIDP targets the low-end phones.

In Personal Java, the MUVI client runs as an application, so it has access to the local file system: we can read images from the local file system and send them to the MUVI server. In MIDP, however, the (MUVI) client does not have access to the local file system, so we cannot read images from it. If the phone has Java APIs to capture an image, we can send the image to the server; otherwise we


cannot send images to the server. Nokia's [18] phone 3650 supports the Mobile Media API, so we can capture an image and send it to the server. MUVIS on Personal Java was presented at the Workshop on Image Analysis for Multimedia Services (WIAMIS), May 16-17, 2001, Tampere, Finland.


2 Introduction
The way people communicate is changing very fast. A few years ago, mobile phones were luxury items restricted to a very small community of rich executives and government agents, and they were used exclusively for voice calls. Today, mobile terminal penetration is growing steadily and continuously, and their use is no longer restricted to voice communication. In Finland, it is widely accepted among youngsters to use a GSM phone for sending SMS messages, chatting with friends or playing games, while adults may be more interested in checking their stocks or paying a bill using their wireless terminal and the Wireless Application Protocol (WAP) [17]. In Japan, a phenomenal change in the use of mobile phones came with the introduction of the "iMode" [1] system; the number of users has risen to 17 million in the two years since its introduction. Java technology has established itself as the leading third-party application development platform for downloadable mass-market applications on mobile devices. According to analyst estimates, approximately 80-100 million Java-enabled handsets had been shipped to the market by the beginning of 2003, and the installed base is growing rapidly as over 15 mobile device manufacturers support the Java platform. By the beginning of 2003, tens of operators all around the world had deployed or trialed Java services. The Java application development platform consists of both a programming language and an execution environment for mobile devices. One of the key benefits of the Java platform is that it can run on top of several different operating systems; the proven scalability and portability of Java technology enable it to be implemented across all kinds of mobile devices, ranging from the basic mass market to high-end devices. In addition, the Java platform has a robust security model that protects the device from harmful applications.
Almost everyone has downloaded ring tones and screen icons to their mobile phones. Now, with Java applications, personalization is no longer limited to the outer side of


the phone: there will be many third-party applications that enhance the capabilities of Java-enabled mobile phones. The early results from many of these Java service launches are encouraging; mobile users have welcomed graphical, easy-to-use Java services. Furthermore, the Java language has been one of the fastest-growing programming languages. The third-generation, or 3G [2], phones will create new opportunities for content providers by providing a way of transmitting text, voice, images and streamed video. Moreover, their ability to remain connected to the Internet all the time will give users overwhelming access to a huge amount of information. Users will then face the problem of how to retrieve the information of interest to them in an efficient manner. The goal is to allow the user to search and navigate this wealth of data without the need to make text-based queries, for the following obvious reasons:

• The user may be unable to type the commands.

• The keyboards of portable devices are not very comfortable for text-based commands.

• Text-based queries may not be very appropriate in the case of images, video or music.

• Users of handheld devices will not be computer programmers.

• Handheld devices will be limited in resources (small keyboard, smaller screen, less RAM, less computational power, etc.).

Therefore, a content-based indexing and retrieval engine coupled to a speech recognition engine could be the ultimate interface to such a system. In this study, we introduce a content-based search engine and its graphical user interface; the speech recognition part is not considered here. The objective of this study is to enable fast, reliable, content-based queries on mobile devices. Even though the newly introduced pervasive devices have faster processors and larger memories, and their available communication bandwidth is getting wider, they


remain far behind PCs in capability. Therefore, a major challenge in designing such a system is to understand the characteristics of these devices and their hardware and software limitations.


3 Design and architecture
The MUVIS client, a Mobile Information Device Profile (MIDP) [46] application (MIDlet), has three parts: the user interface, the engine and the utility part. The user interface part contains the objects that display the results and invoke queries. The engine part is responsible for maintaining the state of the engine, sending the query data and retrieving the query results from the server. The utility part contains the objects used by the engine part to manipulate the server's result data.

3.1 Classes description

Class - Description

MultimediaSearchEngine - Main class that starts the MIDlet.

J2MEUserInterface - Abstract interface for the user interface; the engine uses it to update the query results.

AbstractUserInterface - Abstract base class for the user interface; handles the query result array and contains a default implementation of J2MEUserInterface.

QuartzUserInterface - Adaptable class for the user interface.

UIManager - User interface manager class that calculates the device adaptation parameters.

J2MEDisplayEntry - Retrieved result entry; contains the image data, the image path in the database, the match number and the index of the image in the database.

MultimediaServerConnection - Handles the connection to the server.

MediaCommandListener - Commands from the user interface are reported in this object.

J2MEServletResult - Holds the client's query results.

J2MESearchEntry - Contains the search values on the connection side: the image name (for query by image name), the image index (for an internal query) and the image data (for query by image data).

J2MEQueryInfo - The current query object on the user interface side.

ImageMediaControl - MediaCommandListener passes commands to ImageMediaControl, which constructs the query and passes it to MultimediaServerConnection.

StringTokenizer - Parses the query string on the client side.

MidletLog - Used to send the log (sequence and intermediate status values) and query timing to the log server, which is separate from the MUVI server.

MediaMessageConnection - Used to send and receive SMS messages.


3.2 Client-server architecture
Wireless devices lag far behind PCs: resources on the device are limited, so most of the work is performed on the server side. The server runs on a personal computer and has all the resources required to manage the query, whereas the client only displays the query results and issues queries with the given parameters. The client-server communication is designed to transfer the least possible amount of information to and from the server. The client sends the query as plain text (a human-readable format). On the server side, a servlet performs the query in native code and sends the query results back to the client, again as plain text. The client can then request the images. The server resizes the images and sends small images to the client device; in some cases the server also converts the image format to make it suitable for the client.
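As an illustration of this plain-text exchange, a query could be encoded and parsed as sketched below. The field names and the ';' delimiter are assumptions made for the sketch; the actual MUVIS wire format is not reproduced here.

```java
// Hypothetical sketch of a plain-text query protocol in the spirit of the
// one described above. The "key=value" fields and the ';' delimiter are
// assumptions, not the actual MUVIS wire format.
public class QueryProtocol {

    // Build a human-readable query string from its parts.
    public static String buildQuery(String queryType, String database, int nResults) {
        return "type=" + queryType + ";db=" + database + ";n=" + nResults;
    }

    // Extract one "key=value" field from a query string.
    public static String getField(String query, String key) {
        java.util.StringTokenizer st = new java.util.StringTokenizer(query, ";");
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            int eq = token.indexOf('=');
            if (eq > 0 && token.substring(0, eq).equals(key)) {
                return token.substring(eq + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String q = buildQuery("histogram", "corel", 12);
        System.out.println(q);                  // type=histogram;db=corel;n=12
        System.out.println(getField(q, "db"));  // corel
    }
}
```

Keeping the format human-readable makes the traffic easy to debug, while the short key names keep the transferred byte count small.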

3.3 Client side
Wireless devices are provided by different vendors and run different operating systems; even a single vendor may use more than one operating system (Nokia [18], Motorola [75], Sony-Ericsson [77] and Siemens [76] are examples). Vendors also provide different interfaces for programming their devices, so it is hard for a developer to write the same application for different devices, and maintaining and supporting such an application would be a nightmare. As Java 2 Micro Edition (J2ME) [7] is becoming the de facto standard for wireless devices, we selected the J2ME platform for our client side. The J2ME platform is not a single specification for a piece of software; instead, it is a collection of technologies and specifications designed for different parts of the small-device market. Because Java is an interpreted language, it is slow on mobile devices, which have little processing power; hence we do as little as possible on the resource-limited client side.

The client-side software is divided into three parts: the user interface package, the engine part and the utility part. In the user interface part we have an abstract interface (J2MEUserInterface), which is implemented by concrete implementations. The abstract interface defines the APIs used from the engine part; the engine part knows nothing about the user interface. We provide a default implementation of J2MEUserInterface, AbstractUserInterface, which provides basic information about the device. To port the client to another device, we can subclass AbstractUserInterface and override the methods that are not appropriate for the new device. The default implementation is very useful for unknown devices, as it tries to adapt to the device configuration with default settings. The UIManager class is responsible for selecting the proper user interface, based on the screen size and the system property microedition.platform. The user interface part is also responsible for displaying the resulting images of a query when the engine passes the image objects to it.

The engine part is independent of the user interface and is a common component that needs no change or adaptation on different devices. The user interface passes commands to the command listener, which forms the query and passes it to the ImageMediaControl. ImageMediaControl maintains the query status; the database is selected in this object, and all the query information is stored in the query information object. ImageMediaControl passes the query information object to the MultimediaServerConnection, which sends the query to the server and retrieves the results. The query results are saved in a J2MEServletResult object. MultimediaServerConnection gets the entries from the J2MEServletResult and fetches the images from the server; later it uses AbstractUserInterface to update the query results.

The utility package contains StringTokenizer, which is used on the client side to parse the tokenized strings received from the server. StringTokenizer is not part of MIDP, so we have added it to the MIDlet classes.
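The adaptation pattern described above (an abstract interface, a default implementation, and a device-specific subclass that overrides only what differs) can be sketched in plain Java. The class names echo the thesis, but the method signatures below are illustrative assumptions:

```java
// Illustrative sketch of the user interface adaptation pattern. The
// structure mirrors J2MEUserInterface / AbstractUserInterface in the text,
// but the method signatures are assumptions, not the actual MUVIS APIs.
interface UserInterface {
    int columns();                 // how many thumbnails fit per row
    void showStatus(String text);  // called by the engine to report progress
}

abstract class DefaultUserInterface implements UserInterface {
    // Conservative defaults that should work on an unknown device.
    public int columns() { return 2; }
    public void showStatus(String text) {
        System.out.println("[status] " + text);
    }
}

// Porting to a device with a wider screen: subclass and override only
// the methods that are not appropriate for the new device.
class WideScreenUserInterface extends DefaultUserInterface {
    public int columns() { return 4; }
}

public class AdaptationDemo {
    public static void main(String[] args) {
        UserInterface ui = new WideScreenUserInterface();
        System.out.println(ui.columns());   // 4
        ui.showStatus("query sent");
    }
}
```

Because the engine only sees the interface type, swapping in a new device subclass requires no change to the engine part.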


3.4 Server side
The server runs on a desktop computer. We use Tomcat [22] as the server software; Tomcat provides a servlet [10] container, to which we add our servlet. The servlet uses native calls to perform the queries. When a client sends a query to the server, the service method of MidpMultimediaServer is called. There we extract the query information from the client request and check the client type and the query type. If the query is by image data, we read the image data from the client request. If the client type is J2ME, we pass the query information to MultimediaMidpClient, which parses it and creates a QueryInformation object. The QueryInformation object is passed to ImageMediaQuery, which makes the native calls for the query. In the case of Personal Java we use ImageMediaQuery directly, as the QueryInformation object exists on the client side: Personal Java supports object serialization, which MIDP does not, so in MIDP we cannot use ImageMediaQuery directly. For a MIDP client, ImageMediaQuery passes the query result and the query status to MultimediaMidpClient, which creates a tokenized string from the query results and sends it to the MIDP client. The MIDP client parses the resulting string and creates the J2MEServletResult object on the client side; later this object is used to fetch the similar images found by the query. StringTokenizer is used on both the client and the server side to parse the tokenized strings. The Java Advanced Imaging (JAI) API is used on the server side to resize the images to the size the client device requests in the query information, and also to convert JPEG and GIF images to the PNG format for MIDP devices.


3.5 Client architecture
This section shows how the different packages interact with each other and how the different objects access each other.

3.5.1 Main component view
This view shows the components (the package-level view).

Figure 3-1 Package level interaction

There are three main packages in the client. The user interface (UI) package contains the classes related to the user interface. Action is the engine part; it is independent of the user interface. The relationship between the UI package and the action package is bidirectional: the user interface part submits queries to the action part, and the action (engine) package updates the query results through the abstract interface.


3.5.2 Client’s engine component view

Figure 3-2 Engine’s component interaction

The engine handles the activity behind the user interface. It connects to the server, sends the query, retrieves the query results, and passes the images to the user interface for display. The engine runs in a separate thread and waits for requests from the user interface. The user interface submits query requests to the engine, which collects them in a queue and serves them on a first come, first served basis.


3.5.3 Client’s user interface component view

Figure 3-3 User Interface’s component interaction

The user interface package contains the classes related to displaying images. The UI part starts a query by issuing a command. The engine sends that query to the server, gets the results, and later updates them through an interface (J2MEUserInterface). UIManager creates the displayable and sets it as the current displayable. Since we normally cannot access the file system of mobile devices, we use MidletLog to collect the query timing information; MidletLog sends it to the log server.


3.5.4 Client utility component view

Figure 3-4 Utility component view

The utility component is used to parse the query results from the server. The server sends the query results as a doubly tokenized string. First we parse it to get the query result entries; then we parse each entry to get the query values (match, path in the database, internal image index).
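The two-level parse described above can be sketched with StringTokenizer. The '#' and '|' delimiters are assumptions for the sketch; the actual MUVIS delimiters are not shown in the text.

```java
// Sketch of parsing a doubly tokenized result string, as described above.
// Entries are assumed to be separated by '#', and the fields inside each
// entry (match, database path, internal index) by '|'. Both delimiters are
// assumptions, not the actual MUVIS format.
import java.util.StringTokenizer;
import java.util.Vector;

public class ResultParser {
    public static Vector parse(String results) {
        Vector entries = new Vector();            // Vector, as on MIDP-era Java
        StringTokenizer outer = new StringTokenizer(results, "#");
        while (outer.hasMoreTokens()) {
            StringTokenizer inner = new StringTokenizer(outer.nextToken(), "|");
            String[] entry = new String[3];
            entry[0] = inner.nextToken();         // match value
            entry[1] = inner.nextToken();         // path in the database
            entry[2] = inner.nextToken();         // internal image index
            entries.addElement(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        Vector v = parse("0.91|flowers/rose.jpg|17#0.85|cars/bmw.jpg|42");
        String[] first = (String[]) v.elementAt(0);
        System.out.println(v.size() + " entries, first path: " + first[1]);
    }
}
```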


3.5.5 Client’s user interface component composition

Figure 3-5 Client’s (MIDlet’s) component composition

When the MultimediaSearchEngine MIDlet starts, it initializes itself and creates the ImageMediaControl. ImageMediaControl creates the MultimediaServerConnection and starts the connection thread. Later, MultimediaSearchEngine creates the MediaCommandListener, which initializes the engine part. After that, MultimediaSearchEngine initializes the user interface part by calling UIManager. As mobile devices have limited memory, the image capture and message sending objects are not initialized at start-up; the application initializes them when they are first used.


3.6 Servlet architecture

Figure 3-6 Server Architecture

When the client sends a request to the server, it arrives in the service method of MidpMultimediaServer. The server checks the client type and passes a MIDP request to the MultimediaMidpClient, which creates the QueryInformation object from the query information given by the client. MultimediaMidpClient then passes the QueryInformation object to ImageMediaQuery, which performs the query on the native side to get the query results. ImageMediaQuery creates a ServletSearchResult object and passes it to MultimediaMidpClient, which creates a tokenized string from the ServletSearchResult and sends it back to the client.


3.7 Localization and internationalization
Mobile devices are personal devices, and their users are usually not computer programmers, so mobile devices require personalization. Personalization does not only mean a logo and a ringing tone; it covers many things: language, date and time formats, currency, text formatting and so on. Localization means adaptation to different languages and cultures. Applications cannot be localized if the platform does not support localization. All new operating systems support localization, mobile devices support it as well, and in particular Java supports localization [29].

3.7.1 Internationalization
Internationalization is a way of designing software so that it can be localized with minimum effort; in broader terms, it means designing a system so that no software changes are required to translate it into another language.

3.7.2 Localization
Localization is the technical term for the process of adapting a software package to different cultures according to their requirements. Contrary to popular misconception, this adaptation does not only consist of translation work; it also includes user interface design and usability testing. Currently we support Finnish, German and French versions of the MUVI Client.
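MIDP has no java.util.ResourceBundle, so a MIDlet typically carries its own string table and selects a language at start-up. The following is an illustrative sketch under that assumption; the keys and translations shown are examples, not the actual MUVI Client resources.

```java
// Illustrative sketch of locale-keyed string lookup for a MIDlet. The keys
// and translations are examples; on a real device the locale would come
// from System.getProperty("microedition.locale").
import java.util.Hashtable;

public class Strings {
    private static final Hashtable TABLE = new Hashtable();
    static {
        TABLE.put("en.query", "Query");
        TABLE.put("fi.query", "Haku");
        TABLE.put("de.query", "Abfrage");
        TABLE.put("fr.query", "Requ\u00eate");
    }

    // Look up a key for a locale, falling back to English when missing.
    public static String get(String locale, String key) {
        String s = (String) TABLE.get(locale + "." + key);
        return s != null ? s : (String) TABLE.get("en." + key);
    }

    public static void main(String[] args) {
        System.out.println(get("fi", "query"));  // Haku
        System.out.println(get("sv", "query"));  // Query (English fallback)
    }
}
```

Keeping all user-visible strings behind such a lookup is the internationalization step; adding a new language then only means adding table entries.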


3.8 Sharing query information
Mobile devices are not alone in this connected world; they can connect in many different ways. A user can make a voice call or a data call to connect to the network, and can send and receive short messages on GSM phones. With the help of the Wireless Messaging API (WMA), a Java application can send and receive short messages on a mobile device. After querying, the MIDlet can share the results with other WMA-supported devices. WMA has two types of message connections: a client connection and a server connection. With a client message connection, a MIDlet can only send messages, not receive them; with a server message connection, it can both send and receive. We use a server message connection to send and receive messages. When the device receives a message, it reads the entries from it; later the user can make a query from the received message.
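Sending one result entry over a WMA (JSR 120) server message connection could look roughly like the sketch below. The port number and the payload format are assumptions, and this only runs inside a MIDP runtime with WMA support.

```java
// Sketch of sharing a query result entry over SMS with the WMA (JSR 120)
// API. The port 5000 and the entry payload format are assumptions; error
// handling is reduced to propagating IOException.
import javax.microedition.io.Connector;
import javax.wireless.messaging.MessageConnection;
import javax.wireless.messaging.TextMessage;

public class ResultSharing {
    // A server connection ("sms://:port") can both send and receive,
    // which is why the text above prefers it over a client connection.
    public static void sendEntry(String phoneNumber, String entry)
            throws java.io.IOException {
        MessageConnection conn =
                (MessageConnection) Connector.open("sms://:5000");
        try {
            TextMessage msg = (TextMessage)
                    conn.newMessage(MessageConnection.TEXT_MESSAGE);
            msg.setAddress("sms://" + phoneNumber + ":5000");
            msg.setPayloadText(entry);   // e.g. one tokenized result entry
            conn.send(msg);
        } finally {
            conn.close();
        }
    }
}
```

The receiving MIDlet would open the same server connection and call receive(), then parse the entry out of the payload before issuing a query from it.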


3.9 Sequence diagrams
The sequence diagrams are component-level descriptions of how the different components of the system interact with each other; they explain the behavioral relations between the components. They do not describe the program flow or the state machine.

3.9.1 Sequence diagram of the user interface

Operation: Create the user interface
Preconditions: The MIDlet is started and the Java application descriptor file has valid tags
Description: Checks the device parameters for the UI adaptation
Inputs: Tag values from the Java application descriptor
Outputs: A user interface is displayed
Exceptions: NullPointerException

Figure 3-7 Sequence Diagram of the MIDlet initiation


3.9.2 Sequence diagram of the engine

Operation: Find similar images
Preconditions: The user interface is up and running
Description: The engine waits for the query object; when it gets it, it starts the query and passes the results to the user interface
Inputs: Query object
Outputs: Result from the server
Related to: Retrieval of the query results
Exceptions: IOException, NullPointerException


3.9.3 Query by image data

Operation: Query by image data
Preconditions: The user interface is up and running
Description: The engine waits for the query object; when it gets it, it starts the query and passes the results to the user interface
Inputs: Query object
Outputs: Result from the server
Related to: Retrieval of the query results
Exceptions: IOException, NullPointerException, MediaException

Figure 3-8 Query by image Data


3.9.4 SMS query results

Operation: Send a query result entry to another mobile phone
Preconditions: The user interface is up and running; some results have been retrieved from the server
Description: After querying, the client wants to share the query results
Inputs: One query result entry
Outputs: A message is sent to the GSM phone
Exceptions: IOException, IllegalArgumentException

Figure 3-9 Send Query result entry as SMS


3.9.5 Sequence diagram of the server

Operation: Query on the server side
Preconditions: The server is up and running
Description: The query is performed on the server side
Inputs: Query object
Outputs: Result of the query
Related to: Query on the server side
Exceptions: IOException

Figure 3-10 Query on the server side


3.9.6 Retrieve images from the server

Operation: Retrieve images from the server
Preconditions: The server is up and running
Description: An image is retrieved from the server side
Inputs: Query object
Outputs: A scaled image
Related to: Retrieval of the image from the server
Exceptions: IOException

Figure 3-11 Retrieve images from the server side


4 Wireless connectivity technologies
Java-enabled phones support Circuit Switched Data (CSD), High Speed Circuit Switched Data (HSCSD) and the General Packet Radio Service (GPRS); their operational frequencies are GSM 900/1800, and a few optionally support 1900. The client makes a data call to connect to the server. If the phone supports 2.5G (GPRS), the user pays for the information exchanged; with CSD and HSCSD the user pays for the online time, regardless of whether the channel is actually used. A GPRS connection gives better results with the MUVI search engine, as we send and receive the information in small packets and later download the images in one go.

4.1 Application download and installation
Although many Java applications are available, an important issue is their installation on mobile devices. The problem starts even earlier: how does the user get the Java application? Normally a web browser is used to find it. The browser detects the MIME type of the Java Application Descriptor (JAD) file and starts the Java application installer. The installer reads the URL of the JAR (Java ARchive) file from the JAD file, downloads the JAR file, and installs the application. On some devices, another option is to transfer the JAD and JAR files to the device inbox and start the installation from there. Different vendors provide different devices with different interfaces for installing a MIDlet; it is difficult to install a MIDlet unless the vendor provides proper instructions on how to install the application.
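A JAD file is a plain-text list of attributes. A minimal descriptor for a client like this one could look as follows; all values (names, URL, size) are illustrative, not the actual MUVIS descriptor:

```
MIDlet-1: MUVIS, /icon.png, MultimediaSearchEngine
MIDlet-Name: MUVIS Client
MIDlet-Vendor: TUT
MIDlet-Version: 1.0
MIDlet-Jar-URL: http://example.com/muvis.jar
MIDlet-Jar-Size: 48213
MicroEdition-Configuration: CLDC-1.0
MicroEdition-Profile: MIDP-1.0
```

The installer reads MIDlet-Jar-URL and MIDlet-Jar-Size from this file before downloading the JAR, which lets the device warn the user about the download size in advance.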


5 Software
Handheld devices are diverse in functionality, size and shape. They are made by different vendors with different operating systems and different APIs for the same functionality; some vendors even use different APIs for the same functionality on different phones, or different operating systems on different devices. This makes it impossible to develop a native (C/C++) client application that would be portable across devices from different vendors, or even from the same vendor. Writing the MUVI client was therefore a challenge. Clients come in many sizes, ranging from wireless mobile devices and PDAs all the way up to desktop computers, and the functionality they support varies. Simple clients can only query the database. More sophisticated clients, known as rich clients, can offer services with multimedia content: sampled audio and video. Rich clients can be built on the J2ME™ [47] or J2SE™ [48] platforms; in J2ME, rich clients support many extra APIs that simple clients do not. In the wireless and wire-line world, managing, upgrading, and tracking the usage of rich clients presents new challenges. J2EE [49] addresses these problems, and we have a scalable server that can handle requests from different clients. If a client needs extra help, the server provides it; for example, if a client supports only a single image format, the server converts the image to the required format before sending it.


5.1 Tomcat
Tomcat [22] is a standard servlet container. Servlets are the Java platform technology of choice for extending and enhancing web servers. A servlet is a component-based, platform-independent method for building web-based applications without the performance limitations of Common Gateway Interface (CGI) [20] programs. Moreover, unlike proprietary server extensions (such as the Netscape Server API [21] or Apache [19] modules), servlets are server- and platform-independent. Servlets can also access a library of HTTP-specific calls and receive all the benefits of the mature Java language, including portability, performance, reusability, and crash protection. A servlet can make Java Native Interface (JNI) [23, 40] calls to use native code on the server side. As native code is much faster than Java code, the image indexing and retrieval are handled in C code. We use the ImageMagick [42] libraries in the native code to read images in different formats; the Cygwin [41] DLL is used to load the ImageMagick libraries.

5.1.1 Servlet
Servlet technology offers web developers a simple, consistent mechanism for extending the functionality of a web server. A servlet can be thought of as an applet [24] [25] that runs on the server side, without a face: with servlets we can run a Java program on the server, without a user interface, to provide services that are dynamic in nature. A Java developer can write a Java program and activate it from a servlet to provide dynamic content.

With the start of the web, service providers recognized the need for dynamic content. Applets were the first attempt toward this objective, using the client platform to provide dynamic user experiences. Efforts were then made to use the server platform for this purpose. Initially, Common Gateway Interface (CGI) [20] scripts were the main tool used to generate dynamic content, but CGI scripting has a number of limitations: it is platform dependent and lacks scalability. To overcome these limitations, Java servlet technology was created as a scalable and portable way to provide dynamic content.

A servlet is a Java programming language class used to extend the capabilities of servers that host applications accessed through a request-response programming model.

The javax.servlet [26] and javax.servlet.http [26] packages provide interfaces and classes for writing servlets. All servlets must implement the Servlet interface, which defines life-cycle methods. The HttpServlet [30] class provides methods, such as doGet and doPost, for handling HTTP-specific services. We use HttpServlet in our server, MidpMultimediaServer.

The life cycle of a servlet is controlled by the container (Tomcat) in which the servlet is deployed. When a request is mapped to a servlet, the container performs the following steps.

If an instance of the servlet does not exist, the web container:
•	loads the servlet class and, if required, the native library,
•	creates an instance of the servlet class,
•	initializes the servlet instance by calling the init method; here we initialize the databases (shape, texture and histogram).
The container then invokes the service method, passing a request and a response object.

If the container needs to remove the servlet, it finalizes the servlet by calling the servlet's destroy method.
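The life cycle above maps directly onto three overridable methods of HttpServlet. The following skeleton is a sketch of how our servlet plugs into them; the native library name comes from Section 5.2.1, but the parameter name and response body are placeholders, not the actual MidpMultimediaServer code:

```java
// Skeleton servlet following the container life cycle described above.
// The query parameter name and the response body are placeholders.
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class QueryServlet extends HttpServlet {

    public void init() throws ServletException {
        // Called once: load the native query library and open the
        // shape, texture and histogram databases.
        System.loadLibrary("NativeIPL");
    }

    protected void service(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Called for every request, possibly from several threads at once.
        String queryType = req.getParameter("type");
        resp.setContentType("text/plain");   // headers before the body
        resp.getWriter().println("results for " + queryType);
    }

    public void destroy() {
        // Called once at unload: release everything allocated in init().
    }
}
```

Because the container may call service concurrently on a single instance, any state shared between requests must be protected, which is the thread-safety concern discussed below for destroy.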


[Figure: servlet timeline — init and database initialization, then concurrent service calls (queries and image retrievals) on threads 1–3, and finally destroy]

Figure 5-1 Life cycle of a servlet

When the servlet needs to be unloaded (e.g. because a new version should be loaded or the server is shutting down), the destroy method is called. There may still be threads executing the service method when destroy is called, so destroy has to be thread-safe. All resources allocated in init should be released in destroy. This method is called only once during the servlet's life cycle.
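The thread-safety requirement on destroy can be illustrated with a small pure-Java pattern: count the threads currently inside service, refuse new work during shutdown, and make destroy wait until the count reaches zero. This is an illustrative pattern, not the actual MUVIS servlet code:

```java
// Sketch of a thread-safe shutdown: destroy() waits until no thread is
// still inside service(). Illustrative pattern only.
public class SafeShutdown {
    private int inService = 0;
    private boolean shuttingDown = false;

    public synchronized boolean enterService() {
        if (shuttingDown) return false;   // refuse new work during shutdown
        inService++;
        return true;
    }

    public synchronized void exitService() {
        inService--;
        notifyAll();                      // wake a waiting destroy()
    }

    public synchronized void destroy() throws InterruptedException {
        shuttingDown = true;
        while (inService > 0) {
            wait();                       // let in-flight requests finish
        }
        // now it is safe to release resources allocated in init()
    }

    public static void main(String[] args) throws InterruptedException {
        SafeShutdown s = new SafeShutdown();
        s.enterService();
        s.exitService();
        s.destroy();
        System.out.println("clean shutdown, accepting=" + s.enterService());
    }
}
```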

5.1.2 Initializing a servlet
After the web container loads and instantiates the servlet class, and before it delivers requests from clients, it initializes the servlet. We read the MUVIS image database path here and initialize the MUVIS database during servlet initialization, creating a list of all the images in the database; that list is later used to make random queries from the database. If the servlet cannot complete its initialization process, it throws UnavailableException [27].

5.1.3 Writing service methods
The service provided by a servlet is implemented in the service method of a GenericServlet [28]. The term service method is used for any method in a servlet class that provides a service to a client. The general pattern for a service method is to extract information from the request, access resources, and then populate the response based on that information. For Hypertext Transfer Protocol (HTTP) [31] servlets, the correct way to populate the response is to first fill in the response headers, then retrieve an output stream from the response, and finally write any body content to the output stream. Response headers must always be set before a PrintWriter [32] or ServletOutputStream [33] is retrieved, because the HTTP protocol expects to receive all headers before the body content. The next two topics explain how to get information from requests and produce responses.

5.1.4 Getting information from requests
A request contains data passed between a client and the servlet. All requests implement the ServletRequest [34] interface, which defines methods for accessing the request information.

5.1.5 Servlets and CGI
The traditional way of adding functionality to a web server is the Common Gateway Interface (CGI) [20], a language-independent interface that allows a server to start an external process. This process gets information about a request through environment variables, the command line and its standard input stream, and writes response data to its standard output stream. Each request is answered in a separate process by a separate instance of the CGI program. CGI programs are usually written in interpreted languages like Perl [35]. Servlets have several advantages over CGI:
•	A servlet does not run in a separate process. This removes the overhead of creating a new process for each request.
•	A servlet stays in memory between requests. A CGI program (and in some systems an extensive runtime system or interpreter) needs to be loaded and started for each CGI request.
•	There is only a single instance, which answers all requests concurrently. This saves memory and allows a servlet to manage persistent data.
•	A servlet can be run by a servlet engine in a restrictive sandbox [36] (just as an applet runs in a web browser's sandbox), which allows secure use of untrusted servlets.

5.1.6 Servlet logging
A web server normally runs as a background process without connected standard input/output streams. Even if the server runs in a console, a servlet cannot expect to access that console through the System.in [37], System.out [38] and System.err [38] streams. Instead, all messages should be sent to a server log file with one of ServletContext's log methods:

public void log(String msg) writes the specified message to the log file.

public void log(String msg, Throwable t) writes the specified message and a stack trace of the Throwable object to the log file. The message will usually be an explanation of an error, and the Throwable an exception that caused the error.

Two convenience methods are implemented in GenericServlet; they automatically prefix the messages with the right servlet name and then call the corresponding ServletContext method. We write the entire MUVIS servlet log to the server log files and use it to produce the query timing information.


5.2 Java native interface
We use the Java Native Interface (JNI) [23] on the server side to perform the query in native (C/C++) code. As Java is an interpreted language, it is slow, whereas C/C++ code executes much faster; the JNI is therefore used to make native calls into the C code. Although applications can be written entirely in Java, there are circumstances where Java alone does not meet the requirements of an application. Programmers use the JNI [23] to write Java native methods to handle those situations. The following cases illustrate when you need Java native methods:
•	The standard Java class library does not support the platform-dependent features needed by the application.
•	You already have a library written in another language and want to use it through the JNI.
•	You want to implement a small portion of time-critical code in a lower-level language for faster execution.

5.2.1 Loading and linking native methods
Native methods are loaded with the System.loadLibrary [39] method. In the following example, the class initialization block loads a platform-specific native library in which the native method InterfaceDBDirectDataHistQuery is defined:

package muvis;

class NativeIPL {
    public native int InterfaceDBDirectDataHistQuery(
            String DBFname,
            int NBins1, int NBins2, int NBins3,
            float Bins[], float Weights[],
            int DistType, int NResults,
            String ResultsFname);

    static {
        System.loadLibrary("NativeIPL");
    }
}

The argument to System.loadLibrary is a library name chosen arbitrarily by the programmer. The system follows a platform-specific approach to convert the library name to a native library name: for example, a Solaris system converts the name NativeIPL to libNativeIPL.so, while a Win32 system converts the same name to NativeIPL.dll. The programmer may use a single library to store all the native methods needed by any number of classes, as long as those classes are loaded with the same class loader. The VM internally maintains a list of loaded native libraries for each class loader. Developers should choose native library names that minimize the chance of name clashes. If the underlying operating system does not support dynamic linking, all native methods must be pre-linked with the VM; in this case, the VM completes the System.loadLibrary call without actually loading a library.


5.3 ImageMagick
ImageMagick is a collection of tools and libraries to read, write, and manipulate images in many formats, including popular ones like TIFF, JPEG, PNG, PDF, PhotoCD, and GIF. With ImageMagick you can create images dynamically, making it suitable for web applications. You can also resize, rotate, sharpen, color-reduce, or add special effects to an image or image sequence, and save your completed work in the same or a different image format. We use the ImageMagick software to read the different image formats in the native code on the server side.


5.4 Java Advanced Imaging APIs
The Java Advanced Imaging APIs [43] extend the Java platform by allowing sophisticated, high-performance image processing in Java programs. Java Advanced Imaging is a set of classes providing imaging functionality beyond that of Java 2D [44] and the Java Foundation Classes, although it is compatible with those APIs. The Java Advanced Imaging (JAI) API implements a set of image processing features including image tiling, regions of interest, threading and deferred execution. JAI also offers a set of image processing operators, including many common point, area and frequency-domain operators. Java Advanced Imaging encapsulates image data formats and remote method invocations within a reusable image data object, allowing an image file, a network image object or a real-time data stream to be processed identically. JAI is compatible with the Java runtime library model, providing platform independence with the "write once, run anywhere" paradigm. Client-server imaging is supported by way of the Java platform's networking architecture and remote execution technologies. Remote execution is based on Java remote method invocation (RMI) [45], which allows Java code on a client to invoke method calls on objects that reside on another computer without having to move those objects to the client. Java Advanced Imaging follows an object model where both images and image operators are defined as objects subclassed from a common parent. An operator object is instantiated with one or more image sources and other parameters; it may then become an image source for the next operator object. The connections between the objects define the flow of processed data, and the resulting editable graphs of image processing operations may be defined and instantiated as needed. JAI also provides an extensible framework that allows customized solutions to be added to the core API.


5.4.1 Java Advanced Imaging usage
We use the Java Advanced Imaging API on the server side to resize the images, reducing the image size and improving the query speed. The JAI APIs are also used to convert the image format: MIDP supports only the PNG image format, whereas our database contains GIF and JPG images, so the server converts GIF and JPG images to PNG on the fly and then sends them to the client.
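The resize-and-convert step could be sketched with the JAI operator chain below. The method name, file paths and scale factor are illustrative, and error handling is omitted; this requires the JAI library on the server.

```java
// Sketch of the server-side resize and PNG conversion with JAI. The
// ThumbnailMaker name and all arguments are illustrative placeholders.
import java.awt.image.renderable.ParameterBlock;
import javax.media.jai.Interpolation;
import javax.media.jai.JAI;
import javax.media.jai.RenderedOp;

public class ThumbnailMaker {
    public static void makeThumbnail(String inPath, String outPath, float scale) {
        RenderedOp src = JAI.create("fileload", inPath);  // JPEG or GIF source

        ParameterBlock pb = new ParameterBlock();
        pb.addSource(src);
        pb.add(scale);   // x scale factor
        pb.add(scale);   // y scale factor
        pb.add(0.0f);    // x translation
        pb.add(0.0f);    // y translation
        pb.add(Interpolation.getInstance(Interpolation.INTERP_BILINEAR));
        RenderedOp scaled = JAI.create("scale", pb);

        // Store as PNG, the format MIDP devices are required to decode.
        JAI.create("filestore", scaled, outPath, "PNG");
    }
}
```

Because JAI defers execution, the scale operation is only evaluated when the filestore sink pulls the pixels, so chaining further operators adds no intermediate copies.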


5.5 Java 2 Micro Edition
Java 2 Micro Edition (J2ME) [47] is not a single specification for a piece of software; instead, it is a collection of technologies and specifications designed for different parts of the small-device market. Consumer devices (mobile phones, personal digital assistants, pagers etc.) are diverse in form, functionality and features. The J2ME architecture defines configurations, profiles and optional packages as the elements for building a complete Java runtime environment. These elements meet the requirements of a broad range of consumer devices and target markets; each combination is optimized for the memory, processing power and I/O capabilities of a related category of devices. The result is a common Java platform that provides standard behavior to the end user with varying functionality. If a device provides some extra functionality that is not common to other devices, there can be an extra API for that functionality. Applications are advised to check the presence of extra APIs programmatically before using them; in this way the application remains portable across different configurations. Configurations are composed of a virtual machine and a minimal set of class libraries; they provide the base functionality for a particular range of devices that share similar form, features and functionality, such as network connectivity and memory footprint. The APIs are parameterized based on configurations and profiles: a configuration defines the minimum set of class libraries available for a variety of devices, while a profile is a set of APIs available for a family of devices. For example, the profile for a mobile phone is separate from the profile for a personal digital assistant, but both profiles work with the same configuration. The two common configurations are the Connected Device Configuration (CDC) [50] and the Connected Limited Device Configuration (CLDC) [51].
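Checking for an optional API programmatically, as recommended above, is commonly done by probing for one of its classes; the helper name below is an assumption made for the sketch.

```java
// Sketch of checking for an optional API before using it. Class.forName
// throws ClassNotFoundException when the package is absent on the device,
// so the application can degrade gracefully instead of crashing.
public class ApiCheck {
    public static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // WMA is optional; java.util.Vector is part of every configuration.
        System.out.println(isPresent("java.util.Vector"));   // true
        System.out.println(isPresent("javax.wireless.messaging.MessageConnection"));
    }
}
```

A MIDlet would run such a check once at start-up and only enable, for example, the SMS sharing feature when the probe succeeds.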
The Connected Device Configuration (CDC) specification is based on the full Java virtual machine specification and defines a full-featured runtime environment that includes all of the functionality of J2SE. This configuration targets high-end wireless devices (such as communicators and personal digital assistants) with at least a few megabytes of available memory. The Connected Limited Device Configuration (CLDC) consists of the K virtual machine (KVM) [51] and a set of class libraries appropriate for low-end wireless devices. This configuration targets small wireless devices with very simple user interfaces, memory starting at around 128 KB, and low-bandwidth network connections. Within CLDC there are two categories of devices: high-end devices, which contain extra APIs for their extra functionality, and low-end, resource-limited devices, which support only the core functionality. The line between the two is fuzzy now, but it will become more prominent as high-end devices gain extra APIs. Java 2 Micro Edition is intended to be modular and scalable so that it can support a large variety of consumer devices. The KVM is a runtime implementation of the Java Virtual Machine (JVM); the K in KVM stands for kilo [51]. Currently, there are two Java 2 Micro Edition configurations:

• Connected Limited Device Configuration (CLDC)

• Connected Device Configuration (CDC)

CDC is a superset of CLDC, so an application written for CLDC can also run on CDC.

5.5.1 CLDC
CLDC (Connected Limited Device Configuration) is the smaller of the two configurations. It is designed for devices with slow processors, limited memory and slow network connections, such as mobile phones, two-way pagers and PDAs. These devices normally have 16- or 32-bit CPUs and a minimum of 128 KB to 512 KB of memory available for the Java platform and applications (MIDlets). CLDC does not support many of the features that J2SE offers on desktop computers. In particular, there is no floating-point support in CLDC: it is omitted from the virtual machine because most of the target devices have no hardware floating-point support. A number of further features are removed from the virtual machine to reduce its footprint on handheld devices: the Java Native Interface, user-defined class loaders, reflection, thread groups and daemon threads, finalization, and weak references.
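Because CLDC 1.0 omits floating point, MIDlets that need fractional arithmetic commonly fall back to fixed-point integer math. The sketch below uses a 16.16 format (the format choice is ours, for illustration; the thesis does not prescribe one): the upper 16 bits hold the integer part and the lower 16 bits the fraction, so only integer operations are needed.

```java
public class Fixed {
    public static final int SHIFT = 16;        // 16.16 fixed-point format
    public static final int ONE = 1 << SHIFT;  // the value 1.0

    public static int toFixed(int i) { return i << SHIFT; }
    public static int toInt(int f)   { return f >> SHIFT; }

    // Widen to long before shifting back so the intermediate product/quotient
    // does not overflow 32 bits.
    public static int mul(int a, int b) { return (int) (((long) a * b) >> SHIFT); }
    public static int div(int a, int b) { return (int) (((long) a << SHIFT) / b); }

    public static void main(String[] args) {
        int threeHalves = div(toFixed(3), toFixed(2));  // 1.5 in 16.16
        int product = mul(threeHalves, toFixed(4));     // 1.5 * 4 = 6
        System.out.println(toInt(product));
    }
}
```

The precision is fixed at 1/65536, which is enough for screen-coordinate and similarity-score arithmetic on a phone; CLDC 1.1 later restored real floating-point support.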


A new two-phase class file verification scheme is introduced in J2ME. First, the Java source files are compiled to class files. Second, these class files are run through a special pre-verifier tool that checks them and adds some attributes. This pre-verification is done on the development workstation. At runtime, the verifier only checks the extra attributes that were added during the offline verification. In figure 6-1, the configuration runs on top of the host operating system together with some core libraries, and the profile runs on top of the configuration.

[Figure: software layer diagram. From bottom to top: Host Operating System; Java Virtual Machine and libraries (the configuration); profiles and proprietary APIs on top.]

Figure 6-1: Software layers of J2ME in devices

5.5.2 CDC
CDC (Connected Device Configuration) is designed for high-end devices with more memory, faster processors and greater network bandwidth, such as communicators, TV set-top boxes, residential gateways and high-end PDAs. CDC includes a full-featured Java virtual machine and a much larger subset of the J2SE platform than CLDC. Accordingly, most CDC-targeted devices have 32-bit CPUs and a minimum of 2 MB of memory available for the Java platform and applications.


[Figure: overview of the Java platforms. Servers and enterprise computers run the Java 2 Platform, Enterprise Edition (J2EE), and servers and personal computers run the Java 2 Platform, Standard Edition (J2SE), both with optional packages on the full-featured JVM. Within Java 2 Micro Edition, high-end devices, PDAs and TV set-top boxes use CDC with the Foundation, Personal Basis and Personal profiles plus optional packages on a JVM, while mobile phones, PDAs and pagers use CLDC with MIDP and optional packages on the KVM. Smart cards use Java Card on the Card VM.]

Figure 6-7: Different configurations of Java virtual machines in different environments.

5.5.3 Profile
To form a complete runtime environment (a Java virtual machine and a set of APIs) targeted at a specific device family, a configuration is combined with a set of higher-level APIs: a profile. The profile defines the application life cycle model, the user interface, and access to device-specific properties.

5.5.3.1 Mobile Information Device Profile
The Mobile Information Device Profile (MIDP) is designed for mobile phones. It provides the core functionality required by mobile applications, including the user interface, network connectivity, local data storage, and application management. Combined with CLDC, MIDP provides a complete Java runtime environment that leverages the capabilities of handheld devices and minimizes both memory and power consumption. For more details, see the MIDP specification.

5.5.3.2 Optional packages
The J2ME platform can be further extended by combining various optional packages with CLDC, CDC, and their corresponding profiles. Optional packages address very specific market requirements; they offer standard APIs for using both existing and emerging technologies such as Bluetooth, Web services, wireless messaging, multimedia, and database connectivity. Because optional packages are modular, device manufacturers can include them as needed to support the features of each device.


5.6 Mobile Media API
The Mobile Media API (MMAPI) [52] extends the functionality of the J2ME platform by providing audio, video and image capture and multimedia playback support to resource-constrained mobile devices. As an optional package, it allows Java developers to access the native multimedia resources available on a mobile device. MMAPI provides a powerful, flexible and simple interface to the device's multimedia capabilities in a standard way; accessing these capabilities in native code on each different device would be a nightmare. MMAPI exposes a clean interface to the MIDlet for playing and recording audio and video data. For more details, see the Mobile Media API specification.

5.6.1 Mobile Media API architecture
The Mobile Media API is based on four fundamental concepts:
1. A player knows how to interpret media data. One type of player, for example, might know how to produce sound from MP3 audio data; another might be capable of showing a QuickTime [53] movie. Players are represented by implementations of the javax.microedition.media.Player interface.
2. One or more controls can be used to modify the behavior of a player. Controls are obtained from a Player instance and used while the player is rendering media data; for example, a VolumeControl can modify the volume of a sampled audio Player. Controls are represented by implementations of the javax.microedition.media.Control interface; specific control sub-interfaces are in the javax.microedition.media.control package.
3. A data source knows how to get media data from its original location to a player. Media data can be stored in a variety of locations, from remote servers to resource files or RMS databases, and may be transported to the player using HTTP, a streaming protocol such as RTP, or some other mechanism. javax.microedition.media.protocol.DataSource is the abstract parent class for all data sources in the Mobile Media API.
4. Finally, a manager ties everything together and serves as the entry point to the API. The javax.microedition.media.Manager class contains static methods for obtaining Players or DataSources.

5.6.2 Using the Mobile Media API
The usual way to use MMAPI is through the Manager class, which has several factory methods for performing multimedia tasks. Manager can create a Player from a DataSource object, from an input stream of known content type, or from a URI string. The three variants are:

    public static Player createPlayer(String locator) throws IOException, MediaException
    public static Player createPlayer(DataSource source) throws IOException, MediaException
    public static Player createPlayer(InputStream stream, String type) throws IOException, MediaException

The easiest way to obtain a Player is to use the first version of createPlayer() and pass in a string that locates the media data. For instance, you might specify an audio file on a web server:

    Player p = Manager.createPlayer("http://webserver/music.mp3");

The other createPlayer() methods allow you to create a Player from a DataSource or an InputStream, whichever you have available. These three methods are really just three different ways of getting at the media data, the actual bits. An InputStream is the simplest object, just a byte stream; a DataSource is the next level up, an object that speaks a protocol to get access to media data; and passing a locator string is the ultimate shortcut, letting MMAPI figure out which protocol to use and deliver the media data to the Player. Normally the implementation first downloads the media and then starts playing it.

5.6.3 Using a player
Once you have successfully created a Player, the simplest action is to begin playback with the start() method. To understand the functionality of MMAPI, it helps to understand the life cycle of a Player, which consists of four states. When a Player is first created, it is in the unrealized state. After a Player has located its data, it is in the realized state: if a Player is rendering an audio file from an HTTP connection to a server, for example, it reaches realized after the HTTP request has been sent, the HTTP response has been received, and the DataSource is ready to begin retrieving audio data. The next state, prefetched, is reached when the Player has read enough data to begin rendering. Finally, when the data is being rendered, the Player is in the started state. The Player interface provides methods for state transitions, both forwards and backwards through this cycle. The reason is to give the application control over operations that might take a long time: you might, for example, want to push a Player through the realized and prefetched states in advance so that a sound can be played immediately in response to a user action.
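The life cycle above can be modeled as a small state machine. The desktop-Java sketch below is an illustration of the four states, not the real javax.microedition.media.Player interface, and it is simplified: the real start() performs the earlier transitions implicitly if they have not been done yet.

```java
/** Toy model of the MMAPI Player life cycle: unrealized -> realized ->
 *  prefetched -> started. Not the real MMAPI interface. */
public class PlayerLifecycle {
    public enum State { UNREALIZED, REALIZED, PREFETCHED, STARTED }

    private State state = State.UNREALIZED;

    public void realize()  { require(State.UNREALIZED); state = State.REALIZED; }   // media data located
    public void prefetch() { require(State.REALIZED);  state = State.PREFETCHED; }  // enough data buffered
    public void start()    { require(State.PREFETCHED); state = State.STARTED; }    // rendering begins

    public State state() { return state; }

    private void require(State expected) {
        if (state != expected)
            throw new IllegalStateException("in " + state + ", expected " + expected);
    }

    public static void main(String[] args) {
        PlayerLifecycle p = new PlayerLifecycle();
        p.realize();    // e.g. HTTP request sent, response received
        p.prefetch();   // enough audio data read to start immediately
        p.start();
        System.out.println(p.state());
    }
}
```

Pushing the player to PREFETCHED ahead of time, as the text suggests, is exactly what makes start() cheap when the user finally presses play.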

5.6.4 Supported media types
Given a device that supports the Mobile Media API, what kinds of data can it play, and which data transfer protocols are supported? The Mobile Media API does not require any specific content types or protocols, but you can find out at runtime what is supported by calling Manager's getSupportedContentTypes() and getSupportedProtocols() methods. What is the worst that can happen? If you ask Manager for a Player with a content type or protocol that is not supported, it will throw an exception. Applications should attempt to recover gracefully from such an exception, perhaps by using a different content type or by displaying a message to the user.
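The "recover gracefully" advice amounts to negotiating a content type before creating a Player. The sketch below (plain Java; in a MIDlet the supported list would come from Manager.getSupportedContentTypes(), and the example lists are made up) walks a preference list and picks the first type the device reports supporting.

```java
import java.util.Arrays;
import java.util.List;

public class ContentTypePicker {
    /** Returns the first preferred content type that the device supports,
     *  or null if none match (the caller should then inform the user). */
    public static String pick(List<String> preferred, List<String> supported) {
        for (String type : preferred) {
            if (supported.contains(type)) return type;
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical device: no MP3 support, so we fall back to WAV.
        List<String> supported = Arrays.asList("audio/x-wav", "audio/midi");
        String choice = pick(Arrays.asList("audio/mpeg", "audio/x-wav"), supported);
        System.out.println(choice);
    }
}
```

Checking up front avoids provoking the MediaException that createPlayer() would otherwise throw for an unsupported type.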


5.7 Wireless Messaging API
The Wireless Messaging API is based on the Generic Connection Framework (GCF), which is defined in the Connected Limited Device Configuration (CLDC) 1.0 specification. The package javax.microedition.io defines the framework and supports input/output and networking functionality in J2ME profiles; it provides a coherent way to access and organize data in a resource-constrained environment. The design of the messaging functionality is similar to the datagram functionality used for UDP in the Generic Connection Framework: like datagrams, messaging provides the notion of opening a connection based on a string address, and the connection can be opened in either client or server mode. However, there are differences between messages and datagrams, so the messaging interfaces do not inherit from the datagram interfaces; it might also be confusing to use the same interfaces for both. The interfaces of the messaging API are defined in the javax.wireless.messaging package.

5.7.1 Representation of a message
A message can be considered as having an address part and a data part. A message is represented by a class that implements the message interface defined in the API, which provides the methods common to all messages. In the javax.wireless.messaging package, the base interface implemented by all messages is named Message; it provides methods for addresses and time stamps. For the data part, the API is designed to handle both text and binary messages, represented by two sub-interfaces of Message: TextMessage and BinaryMessage. These sub-interfaces provide ways to manipulate the payload of the message as strings and byte arrays, respectively. Unlike network-layer datagrams, the wireless messaging protocols accessed through this API are typically of a store-and-forward nature: messages will usually reach the recipient even if the recipient is not connected at the time of sending, although delivery may happen significantly later if the recipient is disconnected for a long period of time.

5.7.2 Message
Message is the base interface from which the various message types are derived. The package is designed to work with Message objects that may contain different elements depending on the underlying messaging protocol, in contrast to datagrams, which are assumed always to be just blocks of binary data. An adapter specification for a given messaging protocol may define further interfaces derived from the Message interfaces included in this generic specification. The Message interface contains the functionality common to all messages; concrete object instances representing a message will typically implement other sub-interfaces that provide access to the content and other information in the message, which depends on the type of the message. Object instances implementing this interface are just containers for the data that is passed in: the setAddress() method, for example, simply sets the value of the address in the data container without checking whether the value is valid in any way.

5.7.3 Binary message
BinaryMessage is a sub-interface of Message that represents a binary message. It contains methods to get and set the binary payload; the setPayloadData() method sets the value of the payload in the data container without checking whether the value is valid in any way. Methods for manipulating the address portion of the message are inherited from Message. We do not use binary messages to send the query results.


5.7.4 Text message
TextMessage is a sub-interface of Message that represents a text message. It contains methods to get and set the text payload; the setPayloadText() method sets the value of the payload in the data container without checking whether the value is valid in any way. Methods for manipulating the address portion of the message, as well as getTime(), are inherited from Message. We use text messages to send the query entry. Sending the query entry in text format has the advantage that it is human readable: if the message is sent to a device that does not support the WMA API, the user of that device can still read the message and guess where it came from. On such a device, however, the SMS cannot be used to make the query or to view the image; the user sees the text message in the device inbox but cannot use it for querying.
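A human-readable query entry also makes parsing on the receiving side trivial. The thesis does not specify the actual payload format, so the sketch below invents a hypothetical one ("MUVIS:&lt;query type&gt;:&lt;image index&gt;") purely to illustrate the round trip from SMS text to query parameters.

```java
public class QueryEntry {
    public final String type;   // histogram, texture or shape
    public final int imageId;   // internal index of the image in the database

    public QueryEntry(String type, int imageId) {
        this.type = type;
        this.imageId = imageId;
    }

    /** Parses a payload such as "MUVIS:histogram:42". The format is a
     *  hypothetical illustration, not the one used by the real client. */
    public static QueryEntry parse(String payload) {
        String[] parts = payload.split(":");
        if (parts.length != 3 || !parts[0].equals("MUVIS"))
            throw new IllegalArgumentException("not a MUVIS query entry: " + payload);
        return new QueryEntry(parts[1], Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        QueryEntry q = parse("MUVIS:histogram:42");
        System.out.println(q.type + " " + q.imageId);
    }
}
```

A recipient without WMA support still sees the plain string in the inbox, which is exactly the readability property described above.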

5.7.5 Sending and receiving messages
As defined by the Generic Connection Framework, the message sending and receiving functionality is implemented by a Connection interface, in this case MessageConnection. To make a connection, the application obtains an object implementing MessageConnection from the Connector class by providing a URL connection string that identifies the address. If the application specifies a full destination address that defines a recipient, it gets a MessageConnection that works in client mode; this kind of connection can only be used for sending messages to the address specified when creating it. The application can create a server-mode MessageConnection by providing a URL connection string that includes only an identifier specifying which messages are intended to be received by this application; it can then use this MessageConnection object for both receiving and sending messages. The format of the URL connection string that identifies the address is specific to the messaging protocol used. For sending messages, the MessageConnection object provides factory methods for creating Message objects. For receiving messages, the MessageConnection supports an event listener-based receive mechanism in addition to a synchronous blocking receive() method. The methods for sending and receiving messages can throw a SecurityException if the application does not have permission to perform these operations. The Generic Connection Framework includes convenience methods for getting InputStream and OutputStream handles for stream connections, but the MessageConnection does not support stream-based operations: if an application calls the Connector.open methods to create a stream, it will receive an IllegalArgumentException.
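The client/server distinction is encoded in the URL itself: for SMS, "sms://+358401234567:5000" addresses a recipient (client mode), while "sms://:5000" names only a local port (server mode). The parsing logic below is our own illustration of that convention, not code from the WMA implementation.

```java
public class SmsUrl {
    public final String address;  // empty for a server-mode connection string
    public final String port;

    public SmsUrl(String address, String port) {
        this.address = address;
        this.port = port;
    }

    public boolean isServerMode() { return address.isEmpty(); }

    /** Parses "sms://+358401234567:5000" (client) or "sms://:5000" (server). */
    public static SmsUrl parse(String url) {
        if (!url.startsWith("sms://"))
            throw new IllegalArgumentException("not an sms URL: " + url);
        String rest = url.substring("sms://".length());
        int colon = rest.lastIndexOf(':');
        if (colon < 0) return new SmsUrl(rest, "");   // no port given
        return new SmsUrl(rest.substring(0, colon), rest.substring(colon + 1));
    }

    public static void main(String[] args) {
        System.out.println(SmsUrl.parse("sms://:5000").isServerMode());
        System.out.println(SmsUrl.parse("sms://+358401234567:5000").address);
    }
}
```

A MIDlet never parses these strings itself; it just passes them to Connector.open() and gets back a client- or server-mode MessageConnection accordingly.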

5.7.6 MessageListener
Listeners are normally used to receive a non-blocking callback when an associated event occurs, and they are normally supported in a multi-threaded environment. The MessageListener interface provides a mechanism for the MIDlet to be notified of incoming messages. When an incoming message arrives, the notifyIncomingMessage() method is called, and the MIDlet must then retrieve the message using the receive() method of the MessageConnection. The MessageListener should not call receive() directly; instead, it can start a new thread that will receive the message, or call another method of the MIDlet (outside of the listener) that will call receive(). The listener mechanism allows applications to receive incoming messages without needing to keep a thread blocked in the receive() method call. A MessageListener object is registered so that the platform can notify it when a message has been received on the MessageConnection. If there are incoming messages in the queue of this MessageConnection that have not yet been retrieved by the application, the newly registered listener object is notified immediately, once for each such queued message. There can be at most one listener object registered for a MessageConnection object at any given point in time: setting a new listener de-registers any previously set listener, and passing null as the parameter de-registers any currently registered listener.
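The notify-then-receive contract above can be illustrated with a small desktop-Java analogy (single-threaded for brevity; the real API is multi-threaded, and this is not the real javax.wireless.messaging API): the listener is only told that a message arrived and must fetch it with receive(), and queued messages are replayed when a listener is registered late.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy analogy of MessageConnection/MessageListener semantics. */
public class ToyConnection {
    public interface Listener { void notifyIncomingMessage(ToyConnection conn); }

    private final Queue<String> queue = new ArrayDeque<>();
    private Listener listener;   // at most one listener at a time

    public void setMessageListener(Listener l) {
        listener = l;            // replaces (or, with null, removes) any previous listener
        if (l != null)           // replay notifications for messages queued before registration
            for (int i = queue.size(); i > 0; i--) l.notifyIncomingMessage(this);
    }

    public String receive() { return queue.poll(); }

    public void deliver(String msg) {   // simulates the network stack
        queue.add(msg);
        if (listener != null) listener.notifyIncomingMessage(this);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        ToyConnection conn = new ToyConnection();
        conn.deliver("early");   // arrives before any listener is registered
        conn.setMessageListener(c -> log.append(c.receive()).append(';'));
        conn.deliver("late");
        System.out.println(log);
    }
}
```

In a real MIDlet the callback would hand the work to a separate thread rather than calling receive() inline, as the text above recommends.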

5.7.7 Security
To send and receive messages using this API, applications (MIDlets) must be granted permission to perform the requested operation. The mechanisms for granting permission are implementation and device dependent; on Nokia's 3650, for example, permissions are granted in the settings of the MIDlet suite, and if the connection type is allowed then sending messages is also allowed. The permissions for sending and receiving may depend on the type of messages and the addresses being used: some port numbers are restricted, and an implementation may restrict an application's ability to send some types of messages. Addresses can include device addresses and/or identifiers, such as port numbers, within a device. An implementation may restrict certain types of messages or connection addresses such that the permission is never available to an application on that device. Applications should therefore not assume that successfully sending one message means that they have permission to send all kinds of messages to all addresses. An application should handle SecurityExceptions when a connection handle is obtained from Connector.open(url) and for any message receive() or send() operation that potentially engages the network.

5.8 What is wrong with wireless Java?
Wireless Java is still in its infancy in terms of maturity. This is an area of the Java specification that is expected to continue slow but steady growth for at least the next year; later this year, expect to see much more wireless Java integrated with and/or used alongside the J2EE platform. Java 2 Micro Edition is an emerging technology, and different vendors are porting it to their devices. Sun's promise for Java is "write once, run everywhere"; in the real world, however, it often means write once and debug everywhere. Sun has different virtual machines that differ in features, and different core APIs for the different Java technologies. Floating-point support is not provided in CLDC, and many core APIs are dropped to reduce the memory footprint. There are clear branches between the virtual machines, and these machines are not compatible. There are specifications for Java 2 Micro Edition, but they leave many behavioral details open to the implementation. The result is different implementation behavior for the same technology: one MIDlet works properly on one device but crashes on another, even though both devices support the same profile, because they have different implementations (from the same vendor or from different vendors). Another problem is bugs in the implementations. It is impossible to write bug-free software, and this is also true for MIDP implementations. If you write a MIDlet for several devices and they have different bugs in their implementations, you have to spend extra time finding workarounds for those bugs. In the end you have a lot of workarounds, which increase your code size and reduce speed on the platforms that do not have the particular bug. Sun's reference implementation and the various mobile implementations are incompatible in many areas. For example, suppose you read a PNG image of size N from a stream into an array of size N+1 and then use the Image class to create an image object; this works fine on Sun's reference implementation but fails on some mobile implementations. In the current market situation, all vendors are pushing their devices out to generate revenue: they meet the minimum requirements and ship, and when they later find bugs they fix them and ship the fixes in new releases. That makes life even worse: a MIDlet that works on one device may not work on later releases of the same device.


6 Results and assessments
Figure 7-1 shows the graphical user interface (GUI) on Nokia's 3650 mobile phone, which implements the proposed content-based image indexing and retrieval system. Figures 7-2 to 7-6 show the results of the different types of queries made to the image database, namely color histogram, shape and texture queries. In each case, one similar image is retrieved and displayed on the phone's screen. Due to the limitations imposed by the wireless device, the operating system and the communication channel, a rather slow query response is achieved. Table 1 shows the timing of the different queries and of image retrieval in Sun's reference implementation on a PII system, whereas tables 2, 3 and 4 show the query and image retrieval timings on the 3650, 7650 and 7250. The query time on the server starts when a query arrives at the servlet; the servlet extracts the image features, makes the query in native code, and saves the results on the server, all of which is included in the measured time. The servlet then reads the query results and creates a result string, which is sent back to the client; this string contains the image names, the internal index of each image in the database, and the similarity scores for ten images. The image retrieval time starts when the server (Tomcat) passes the image retrieval request to the servlet. It includes the time needed to retrieve the actual images from the server's file system and resize them on the server side, as well as the time to convert JPG or GIF images to the PNG format; the screen size of the mobile device therefore has an impact on the image retrieval time. Since the content-based information retrieval system on mobile devices uses a client-server architecture, many dynamic factors affect the performance of the system dramatically. The most important is the mobile network: if the network is too busy, the query and image retrieval times become very long, and during rush hours image retrieval is painfully slow. When the network is not overloaded, the query and image retrieval times are reasonable. In this study, we varied several of these dynamic factors to improve the query and image retrieval times, running the server on different PCs and using different network connections.


The first server runs on a PII PC (300 MHz, 124 MB RAM, Windows XP), whereas the second server is a modern P4 system (1.8 GHz, 512 MB RAM, Windows XP). Tables 1 to 4 show the query time on the server and on the client side; the client time is the time at which the client receives the query results. Clients are connected to the server with a Circuit Switched Data (CSD) call, a High-Speed Circuit Switched Data (HSCSD) call, or GPRS. Tables 1 to 4 also show the image retrieval time from the server: the server first resizes the image to the size requested by the client and then converts it to the Portable Network Graphics (PNG) format, so the server always sends a PNG image to the client. The results in tables 1 to 4 depend on many dynamic factors, such as the load on the network, the server status, and the available memory on the client.

Query on Sun's reference implementation with PII server

  Query database    Query time on server    Query time on client
  Histogram         930 ms                  1420 ms
  Texture           549 ms                  990 ms
  Shape             1872 ms                 7580 ms

Image retrieval on Sun's reference implementation with PII server

  Image database    Time on server    Time on client    Image size
  Histogram         141 ms            1050 ms           7114 B
  Texture           78 ms             2920 ms           17145 B
  Shape             76 ms             990 ms            459 B

Table 1: Query and image retrieval times using Sun's reference implementation

3650 using HSCSD and P4 server

  Query database    Query time on server    Query time on client
  Histogram         156 ms                  5328 ms
  Texture           78 ms                   5140 ms
  Shape             125 ms                  12031 ms

Image retrieval on 3650 using HSCSD and P4 server

  Image database    Time on server    Time on client    Image size
  Histogram         141 ms            16094 ms          50912 B
  Texture           78 ms             15023 ms          48315 B
  Shape             76 ms             2650 ms           878 B

Table 2: Query and image retrieval times on the 3650

In table 2, the histogram image retrieval is the first query, so it takes more time than the texture one; subsequent queries take less time because all the required libraries are already in memory.

7650 using HSCSD and P4 server

  Query database    Query time on server    Query time on client
  Histogram         156 ms                  8328 ms
  Texture           78 ms                   7140 ms
  Shape             6594 ms                 10580 ms

Image retrieval on 7650 using HSCSD and P4 server

  Image database    Time on server    Time on client    Image size
  Histogram         142 ms            18094 ms          51814 B
  Texture           79 ms             16023 ms          48218 B
  Shape             78 ms             2603 ms           570 B

Table 3: Query and image retrieval times on the 7650

7250 using GPRS and P4 server

  Query database    Query time on server    Query time on client
  Histogram         110 ms                  47483 ms
  Texture           80 ms                   46231 ms
  Shape             125 ms                  62580 ms

Image retrieval on 7250 using GPRS and P4 server

  Image database    Time on server    Time on client    Image size
  Histogram         139 ms            45385 ms          22115 B
  Texture           78 ms             47295 ms          25741 B
  Shape             78 ms             2603 ms           678 B

Table 4: Query and image retrieval times on the 7250

The timings in the tables above should be interpreted with care: they varied quite a lot during testing, as they depend on a number of rather dynamic factors, such as the network traffic (load), the load on the server, the state of the server (i.e., the servlet must be loaded again if no one has recently requested it), and the available memory on the device. The performance of the client also depends on the memory available in the device. The MUVIS client contains workarounds for different bugs in the MIDP implementations of different devices; testing on Sun's reference implementation shows that these workarounds degrade performance by 12% to 15%, and since mobile devices have slow processors, the cost is higher for them. The demonstration shows that such an implementation is feasible; however, due to the limiting factors in both the hardware and software of the wireless terminal as well as the communication channel, limited results have been obtained, namely reduced sizes of image query results, a small number of images, and long processing and access times. The good news is that with the advent of 3G networks, offering higher data rates, more processing power in wireless devices and more memory, such an application will become practical. This implementation targets small wireless devices with limited capabilities, and the results show that CBIR is possible on mobile devices; the bottleneck, however, is the limited bandwidth available to them. Other limitations should also be considered, for example the server: initially we used a PII as the server, which gave poor results, and once the system became stable we moved the server to a P4-based system, which gives approximately eight times better results than the PII-based server. Using a high-speed dedicated server for MUVIS would give even better results.


Figure 7-1: Client user interface on the 3650

Figure 7-2: Query image and results shown on the 3650


Figure 7-3: Query images shown on the 3650 and 7650


Figure 7-4: Query results shown on the 3650 and 7650

Figure 7-5: Client user interface


Figure 7-6: Client running on the 6800


7 Conclusions and future work As describe above media indexing and retrieval is a complex problem. Extending it to the mobile devices not only inherits all those problems but it also introduces new challenges. Mobile devices are resource-constrained. New generation of mobile devices has many advantages but still they are lacking in many areas, user interface, computational power, battery power, and connectivity. Another important thing is user of the mobile devices is not a computer programmer. Rather he is an ordinary person. Therefore, we need a robust implementation of CBIR on the mobile devices. Studies prove that such an implementation is feasible. However, there are some limitations. Mobile devices are resource limited. They have less processing power, less memory and limited device storage capacity to save information. The bottleneck is the network bandwidth. Mobile devices have very limited bandwidth that is not sufficient for multimedia application. The good news is mobile devices are improving their resources. Modern mobiles are not only having new resources but are also improving the existing resources. Future devices will have more memory, faster processors and better bandwidth for network connectivity. 3G is the promise for the high band in the mobile network. Java 2 Micro Edition is providing us an environment to run the application securely. J2ME is also providing new APIs to access the different device resources. In the future mobile devices, we can capture the audio and video. 3G technologies give us opportunity to exchange the multimedia information at a reasonable bandwidth. Human visual system is ideal for image recognition and matching. It may take forever or very long to build such a reliable system. There are many CBIR systems; they are working with reasonable accuracy in restricted domains. CBIR system can give good results if it gets extra information about the image for querying. 
Modern mobile phones can provide audio and video information for the query. Audio information can be used to assist the image-based query: on the server side, we can use a speech-to-text converter and use the resulting annotations to help the query.


Querying with the help of audio and video gives us many options to assist the visual query. We can use the audio part to support the visual part of the query: the audio is converted to text by speech-to-text APIs, and the resulting annotations can then be used to narrow down the area for the visual search or to adjust the grading of the visual query results.

Mobile devices are limited in input and output. The user of a mobile device usually has no keyboard to enter commands, and the user interface is limited, so the user has few options for providing extra information for the query. A CBIR application on mobile devices therefore requires a simple and efficient user interface.

MIDP 2.0 adds extra APIs for the user interface, provides a new security model for mobile applications, and has support for secure networking. MIDP 2.0 also has limited media API support for mobile devices. MIDP 2.0, together with these extra APIs, will provide a good framework for multimedia applications: a CBIR client can capture audio and video on the device and send it to the server for querying, and 3G networks will provide the bandwidth for the multimedia content.


8 Summary of the publications

The thesis consists of three publications. Two publications are in the area of content-based information retrieval and the third one is in the area of medical image processing.

Query by Image Content using Mobile Information Device Profile (MIDP)
We present a Java-based client-server application for content-based image retrieval. The application on the client side runs on Mobile Information Device Profile (MIDP) devices and is written in Java.

Query by Image Content using NOKIA 9210 Communicator
We present a new Java-based client-server application for content-based image retrieval over wireless networks. The application on the client side runs on Nokia's 9210 Communicator and is written in Personal Java.

Image segmentation of the CT-scans of hip endoprostheses

3D femoral endosteal cavity shapes are modeled based on CT-scans (Computed Tomography scans). An interactive image-processing tool is developed. The system detects the femoral component and the femoral medullary canal in order to study how the components fit and fill the canal. In the CT imaging, thirty axial slices are taken above and below the lesser trochanter area from each femur. Different image analysis methods are used for femoral cavity detection, depending on the structure of the processed slice.

8.1 Author’s contribution to the publications The work described in this thesis has been carried out in collaboration with the co-authors. The author’s contribution has been essential: he designed and developed the client on the mobile devices as well as the server on the PC side.


In Publications 1 and 2, the author designed and developed the Java client on the phone side as well as the server on the PC side using the servlet APIs; the servlet makes native calls to use the code from the co-authors. In the third publication, the author designed the user interface, wrote several methods for edge enhancement and for contact detection between bone and implant, and was responsible for the testing and result verification.


PUBLICATIONS

[Publication P1] Ahmad Iftikhar, Faouzi Alaya Cheikh, Bogdan Cramariuc and Moncef Gabbouj, “Query by Image Content using Mobile Information Device Profile (MIDP)”, FINSIG'03, May 19, 2003, Tampere University of Technology, Tampere, Finland.


QUERY BY IMAGE CONTENT USING MOBILE INFORMATION DEVICE PROFILE (MIDP)

Ahmad Iftikhar(1), Faouzi Alaya Cheikh(2), Bogdan Cramariuc(2) and Moncef Gabbouj(2)

(1) Nokia Mobile Phones, P.O. Box 1000 (Visiokatu 3, Tampere 33720), FIN-33721 Tampere, Finland
(2) Signal Processing Laboratory, Tampere University of Technology, P.O. Box 553, FIN-33101 Tampere, Finland
[email protected]

1. ABSTRACT

In this paper we present a Java-based client-server application for content-based image retrieval over wireless networks. The application on the client side runs on the Mobile Information Device Profile (MIDP) and is written in Java™.

Index Terms- Content, Retrieval, Indexing, Multimedia, Image, Search, Wireless, Mobile.

2. INTRODUCTION

2.1. Wireless Communications and Terminals

The way people communicate is changing very fast. A few years ago, mobile phones were used exclusively for voice calls. Today the mobile terminal usage is no longer restricted to voice communication. In Finland, it is widely accepted among youngsters to use a GSM phone for sending SMS messages, to chat with friends or to play games. Adults may be more interested in checking their stocks or paying a bill using their wireless terminal and the Wireless Application Protocol (WAP). In Japan a phenomenal change in the use of mobile phones happened with the introduction of the “iMode” [1] system: the number of users has risen to 20 million in the two years since its introduction. The third generation, or 3G [2], phones will create new opportunities for content providers by providing a way of transmitting text, voice, images, and streamed video. Moreover, their ability to be connected to the Internet all the time will provide users with access to a huge amount of information. Users will then face the problem of how to retrieve the information of interest to them in an efficient manner. The goal is to allow for searching and navigation in this wealth of data without the need to make text-based queries, for three obvious reasons:

• The user may be unable to type in commands,

• The keyboards of portable devices are not very comfortable for text-based commands,

• Text-based queries may not be very appropriate in the case of images, video or music.

In this paper we will introduce a content-based search engine and its graphical user interface. Even though the newly introduced pervasive devices have faster processors, larger memories and ever wider communication bandwidth, they remain far behind PC capabilities. Therefore, a major challenge in designing such a system is to understand the characteristics of such devices and their hardware and software limitations.

2.2. Content-based Indexing and Retrieval

Since the early 1990s, content-based indexing and retrieval (CBIR) of digital images has been a very active area of research. Both industrial and academic systems for image retrieval have been proposed. Most of these systems (e.g. QBIC [3] from IBM, NETRA [4] from UCSB, Virage [5] from Virage Inc., MUVIS [6] from TUT) support one or more of the following options: browse, search by example, and search based on a single or a combination of low-level features. These features can be extracted from the image itself, such as color, shape, texture, and the spatial layout of objects in the scene, or added to it after its capture, such as contextual information and keywords.
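As a concrete illustration of the simplest of these low-level features, the color histogram, the following sketch compares two images by the L1 distance between their normalized histograms. The class and method names are illustrative only, not the actual MUVIS implementation.

```java
// Sketch: image similarity by normalized color-histogram comparison.
// Names are illustrative, not taken from MUVIS or QBIC.
public class HistogramQuery {

    // Normalize a raw bin-count histogram so the bins sum to 1.
    public static double[] normalize(int[] counts) {
        double total = 0;
        for (int c : counts) total += c;
        double[] h = new double[counts.length];
        for (int i = 0; i < counts.length; i++) h[i] = counts[i] / total;
        return h;
    }

    // L1 (city-block) distance between two normalized histograms;
    // 0 means identical distributions, 2 is the maximum.
    public static double l1Distance(double[] a, double[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
        return d;
    }

    public static void main(String[] args) {
        double[] query = normalize(new int[]{8, 2, 0, 0});
        double[] image = normalize(new int[]{4, 4, 2, 0});
        System.out.println("distance = " + l1Distance(query, image));
    }
}
```

Normalizing first makes the comparison independent of image size, which matters here because the server scales images to the client's screen.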

3. CLIENT-SERVER ARCHITECTURE

3.1. The Client Side: The Mobile Information Device Profile (MIDP)

3.1.1. Introduction

As Java 2 Micro Edition is becoming the de facto standard for wireless devices, we have selected the J2ME platform for our client side. We do as little processing of the information as possible on the client side due to its resource limitations. MIDP clients come in many sizes, ranging from wireless mobile devices and Personal Digital Assistants (PDA) all the way up to the desktop. The functionality supported by clients can vary as well: simple clients deliver only web pages, while more sophisticated clients, known as rich clients, can deliver services with multimedia content: sampled audio, synthetic tones, MIDI, and video.

3.1.2. Hardware Details

The objective of the MIDP is to establish an open, third-party application development environment for Mobile Information Devices (MID). A MID should have the following hardware specifications:

Screen: size 96 x 54, display depth 1-bit, pixel shape (aspect ratio) approximately 1:1.

Input through one or more of the following user-input mechanisms:
• One-handed keyboard,
• Two-handed keyboard,
• Touch screen.

Memory:
• 128 kilobytes of non-volatile memory for the MIDP components,
• 8 kilobytes of non-volatile memory for application-created persistent data,
• 32 kilobytes of volatile memory for the Java runtime (e.g., the Java heap).

Networking is two-way, wireless, possibly intermittent, and with limited bandwidth.

3.1.3. Operating System

A minimal kernel manages the underlying hardware (i.e., handling of interrupts, exceptions, and minimal scheduling). This kernel must provide at least one schedulable entity to run the Java Virtual Machine (JVM).

3.1.4. The Server Side

On the server side we are using a servlet [7]. The client sends the query to the servlet, which checks the query media type and passes it to the appropriate query handler. The heavy processing required for the feature extraction, similarity estimation and results presentation is done on the server through calls from the Java side to methods implemented in native code. In this way we take advantage of the more efficient native code as compared to a pure Java implementation.

3.2. Communication Protocol

A communication protocol is defined between the client and the server. This protocol specifies the media type (Image media, Video media, or Audio media; currently we are using Image media only), the query type (random query from the database, query by image data, or query with an image from the database) and the query data (image data if the image is not in the database, or the image's index in the database). The server sends back to the client the status of query execution and the results of the query, which consist of the list of names of the images and their similarity scores with respect to the query image. The client later fetches scaled versions of the images to be presented to the user. Scaling is done on the server side in order to reduce the traffic.

4. THE USER INTERFACE AND SCREEN SIZE CONSIDERATION

In this application, in addition to the processing and memory issues, the designer has to consider the screen size of the wireless devices. Normally MIDP devices can show only one image at a time.

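The server-side dispatch described in Section 3.1.4 can be sketched in plain Java as follows. The handler interface and class names are hypothetical; in the real system the image handler would forward to the native C query code rather than to a Java lambda.

```java
// Sketch of the server-side dispatch: the servlet reads the media type
// from the request and forwards it to the matching query handler.
// All names here are illustrative.
import java.util.HashMap;
import java.util.Map;

public class QueryDispatcher {

    public interface QueryHandler {
        String handle(int queryType, byte[] queryData);
    }

    private final Map<String, QueryHandler> handlers = new HashMap<>();

    public void register(String mediaType, QueryHandler h) {
        handlers.put(mediaType, h);
    }

    // Mirrors the servlet logic: pick the handler by media type.
    public String dispatch(String mediaType, int queryType, byte[] data) {
        QueryHandler h = handlers.get(mediaType);
        if (h == null) return "ERROR: unsupported media type " + mediaType;
        return h.handle(queryType, data);
    }

    public static void main(String[] args) {
        QueryDispatcher d = new QueryDispatcher();
        // Only image queries are supported in this system.
        d.register("image", (type, data) ->
                "image query type " + type + ", " + data.length + " bytes");
        System.out.println(d.dispatch("image", 1, new byte[]{1, 2, 3}));
        System.out.println(d.dispatch("audio", 1, new byte[0]));
    }
}
```

Registering handlers by media type keeps the servlet itself thin, which matches the design goal of pushing all heavy processing into native code behind a single entry point.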
5. RESULTS AND ASSESSMENTS

Figures 1-3 show the results of different types of queries made to the image database, namely, color histogram, shape and texture queries. In each case, one similar image is retrieved and displayed on the Mobile Information Device (MID) screen. The server runs on a PC (Pentium II [8], 300 MHz, 124 MB RAM, Windows XP [9] operating system). Tables 1 and 3 show the query time on the server side. The server time includes only the query time on the server; it does not include the time to read the query information from the client or the time to send the results to the client (a 7650 [10] and a 3650 [11]). The client time shows the time to send the query to the server, the query time on the server, and the retrieval of the query results. The clients are connected to the server with a circuit-switched data (CSD) call [12]. Tables 2 and 4 show the image retrieval time from the server. The server first resizes the image to the size requested by the client and then

Table 1. Time taken to make a query on the server side and on the client 7650

Query DataBase   Server Time   Client Time
Histogram        851 ms        38650 ms
Texture          481 ms        7159 ms
Shape            1392 ms       32153 ms

converts the image to the Portable Network Graphics (PNG) [13] image format; the server always sends a PNG image to the client. The results in Tables 1 to 4 depend on many dynamic factors: the load on the network, the server status, and the available memory on the client.
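The resize-and-convert step just described might look as follows using the modern javax.imageio API; the original server did this in native code, so this is only an illustrative sketch.

```java
// Sketch of the server-side image preparation: resize the result image
// to the dimensions requested by the client, then encode it as PNG,
// the one format every MIDP client must support.
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class PngScaler {

    public static byte[] scaleToPng(BufferedImage src, int w, int h)
            throws IOException {
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.drawImage(src, 0, 0, w, h, null);   // scale to the client's size
        g.dispose();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(dst, "png", out);       // always send PNG to the client
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage big = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        byte[] png = scaleToPng(big, 80, 60); // thumbnail size used in the thesis
        System.out.println("PNG bytes: " + png.length);
    }
}
```

Doing the scaling and format conversion on the server is what keeps the wireless traffic low: the client only ever downloads a small PNG.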

Fig. 2. Shape based (a) query and (b) retrieved images.

6. CONCLUSIONS AND FUTURE WORK

A novel implementation of TUT's MUVIS image query system has been proposed and tested on the MIDP emulator using a Java-based client-server paradigm. A functional GUI was implemented taking into account the small size of mobile devices. However, due to the limiting factors in both the hardware and software of the wireless terminal as well as the communication channel, very limited results have been obtained, namely, reduced sizes of image query results, a small number of images, and long processing and access times. The good news is that with the advent of 3G networks, offering higher data rates, and with more processing power and memory in wireless devices, such an application will become practical. Given the restrictions imposed by the mobile devices' technical specifications, especially the screen size, and by the communication cost, the usage of pseudo relevance feedback [14], [15] will enhance the performance of our retrieval process without increasing the communication cost or crowding the user interface with additional check boxes or radio buttons.
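The relevance feedback cited above goes back to Rocchio's query update [14]: the query feature vector is moved toward relevant results and away from non-relevant ones. The following sketch uses illustrative weights and plain feature vectors; it is not the MARS or MUVIS implementation.

```java
// Sketch of the Rocchio relevance-feedback update:
//   q' = alpha*q + beta*mean(relevant) - gamma*mean(nonRelevant)
// The weights are illustrative defaults, not values from the papers.
public class Rocchio {

    public static double[] update(double[] query, double[][] relevant,
                                  double[][] nonRelevant,
                                  double alpha, double beta, double gamma) {
        double[] q = new double[query.length];
        for (int i = 0; i < q.length; i++) {
            double rel = 0, non = 0;
            for (double[] d : relevant) rel += d[i];
            for (double[] d : nonRelevant) non += d[i];
            if (relevant.length > 0) rel /= relevant.length;
            if (nonRelevant.length > 0) non /= nonRelevant.length;
            q[i] = alpha * query[i] + beta * rel - gamma * non;
        }
        return q;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 0.0};
        double[][] rel = {{0.0, 1.0}};
        double[][] non = {{1.0, 0.0}};
        double[] q2 = update(q, rel, non, 1.0, 0.75, 0.25);
        System.out.println(q2[0] + ", " + q2[1]);
    }
}
```

With pseudo relevance feedback, the top-ranked results stand in for user-marked relevant images, which is why no extra check boxes are needed on the small screen.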

Fig. 1. Histogram based (a) query and (b) retrieved images.

Fig. 3. Texture based (a) query and (b) retrieved images.

Table 2. Image retrieval time on 7650

Query DataBase   Server Time   Client Time   Image Size
Histogram        8331 ms       38594 ms      41936 B
Texture          641 ms        40625 ms      44122 B
Shape            1692 ms       2656 ms       878 B

Table 3. Time taken to make a query on 3650

Query DataBase   Server Time   Client Time
Histogram        711 ms        38703 ms
Texture          411 ms        7109 ms
Shape            1495 ms       32203 ms

7. REFERENCES

[1] http://www.ntt.docomo.com/i/
[2] http://www.3gpp.org/
[3] http://wwwqbic.almaden.ibm.com/~qbic/
[4] http://maya.ece.ucsb.edu/netra
[5] http://www.virage.com
[6] http://www.iva.cs.tut.fi/homepage/mainmuvi.html
[7] http://java.sun.com/products/servlets
[8] http://www.intel.com/
[9] http://www.microsoft.com/windowsxp/
[10] http://www.nokia.com/nokia/0,5184,137,00.html
[11] http://www.nokia.com/nokia/0,5184,2273,00.html
[12] http://www.forum.nokia.com
[13] http://www.libpng.org/pub/png/
[14] J.J. Rocchio, “Relevance feedback in information retrieval”, in The SMART Retrieval System: Experiments in Automatic Document Processing, pp. 313-323, Prentice Hall, Englewood Cliffs, New Jersey, USA, 1971.
[15] Y. Rui, T.S. Huang, and S. Mehrotra, “Content-based image retrieval with relevance feedback in MARS”.

[Publication P2] Ahmad Iftikhar, Faouzi Alaya Cheikh, Bogdan Cramariuc and Moncef Gabbouj, “Query by Image Content using NOKIA 9210 Communicator”, Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) 2001, COST 211, May 16-17, 2001, Tampere University of Technology, Tampere, Finland.


Query by Image Content using NOKIA 9210 Communicator

Ahmad Iftikhar(1), Faouzi Alaya Cheikh(2), Bogdan Cramariuc(2) and Moncef Gabbouj(2)

Abstract— In this paper we present a new Java-based client-server application for content-based image retrieval over wireless networks. The application on the client side is running on the NOKIA's 9210 Communicator and is written in pure Java™.



• The user may be unable to type in commands.
• The keyboards of portable devices are not very comfortable for text-based commands.
• Text-based queries may not be very appropriate in the case of images, video or music.

Index Terms— Content, Retrieval, Indexing, Multimedia, Image, Search, Wireless, Mobile, Communicator.

Therefore, a content-based indexing and retrieval engine coupled to a speech recognition engine could be the ultimate interface to such a system.

I. INTRODUCTION

In this paper we will introduce a content-based search engine and its graphical user interface. A demo of the system will be given during the presentation. The speech recognition part is not considered in this paper.

A. Wireless Communications and Terminals

The way people communicate is changing very fast. A few years ago, mobile phones were luxury items restricted to a very small community of rich businessmen and government agents. Moreover, they were used exclusively for voice calls. Today the mobile terminal penetration is growing steadily and continuously, and their use is no longer restricted to voice communication. In Finland, it is widely accepted among youngsters to use a GSM phone for sending SMS messages, to chat with friends or to play games. Adults may be more interested in checking their stocks or paying a bill using their wireless terminal and the Wireless Application Protocol (WAP). In Japan a phenomenal change in the use of mobile phones happened with the introduction of the “iMode” [IMODE] system: the number of users has risen to 17 million in the two years since its introduction. The third generation, or 3G [3G], phones will create new opportunities for content providers, by providing a way of transmitting text, voice, images, and streamed video. Moreover, their ability to be connected to the Internet all the time will provide users with access to a huge amount of information. Users will then face the problem of how to retrieve the information of interest to them in an efficient manner. The goal is to allow for searching and navigation in this wealth of data without the need to make text-based queries, for three obvious reasons:

Even though the newly introduced pervasive devices have faster processors, larger memories and ever wider communication bandwidth, they remain far behind PC capabilities. Therefore, a major challenge in designing such a system is to understand the characteristics of such devices and their hardware and software limitations.

B. Content-based Indexing and Retrieval

Since the early 1990s, content-based indexing and retrieval (CBIR) of digital images has been a very active area of research. Both industrial and academic systems for image retrieval have been built. Most of these systems (e.g. QBIC™ [QBIC] from IBM, NETRA [NETRA] from UCSB, Virage [VIRAGE] from Virage Inc., MUVIS [MUVIS] from TUT) support one or more of the following options: browse, search by example, and search based on a single or a combination of low-level features. These features can be extracted from the image itself, such as color, shape, texture, and the spatial layout of objects in the scene, or added to it after its capture, such as contextual information and keywords.

II. CLIENT-SERVER ARCHITECTURE

A. The Client Side: The Nokia 9210 Communicator

1) Introduction

(1) Nokia Mobile Phones, P.O. Box 1000 (Visiokatu 3, Tampere 33720), FIN-33721 Tampere, Finland
(2) Tampere University of Technology, P.O. Box 553, FIN-33101, Tampere, Finland

The Nokia 9210 Communicator [NOKIA] is a major step forward on the road to the Mobile Internet environment. This pioneering product showcases the key elements in future mobile communications, such as easy navigation and input, a high-quality color display, mobile messaging with high data speed, imaging and video clips. Additionally, Java support and Symbian's OS (operating system) [EPOC] bring open development interfaces to the Nokia 9210 Communicator, allowing numerous additional applications to be provided by third-party developers.

2) Hardware Details

The Nokia 9210 Communicator [9210F] contains a 32-bit ARM-based RISC processor. It has 8 MB (SD-RAM) of execution memory and its C drive (serial flash) is 4 MB. It can hold a multimedia card of up to 64 MB; see Figure 1.

Figure 1: Memory configuration of the Nokia 9210 (RAM: 8 MB execution area; serial flash: 4 MB communicator memory; 16 MB MMC card; device ROM)

It has a color display of 4096 colors. The display size is 640 x 200 pixels. In addition, it has a relatively large keyboard and is capable of making high-speed data calls. It uses Symbian's operating system Crystal 6.0. The Java virtual machine consumes about 2.1 MB, and the proposed Java application consumes approximately 397 KB of memory.

3) Operating System

Symbian's platform [SYMB] is a robust, object-oriented operating system for devices with limited capabilities (small memory, little computing power, sensitivity to power consumption). Devices using this system do not need to reboot often, as the OS is stable, does not leak memory (or very little) and manages the system resources efficiently. Since Symbian platform devices have little memory, small secondary storage and little computational power, applications written for this system must be efficient. This is especially the case for Symbian Crystal release 6.0 [CRYST], intended for wireless media. The proposed image search engine deals with images and thus consumes a large amount of memory; the system must therefore be well managed to avoid memory-related problems.

A high-speed data link is used. The 9210 supports data links up to 43.2 kbit/s (High Speed Circuit Switched Data, HSCSD) [9210F], but we are using 38.4 kbit/s. A high-speed data call reduces airtime, but it is a costly option. Airtime will be reduced in 3G systems, and thus queries can be made without using a high-speed data call; only a high-speed connection for data transfer will be needed.

Personal Java [PJAVA] is ported on Crystal 6.0 and is compatible with JDK 1.1.8 [JDK], but the current implementation of Personal Java does not support Swing (the pure-Java graphics APIs). Personal Java consumes 2.1 MB of memory when just the VM is up (without a Java application). The Java application (image search engine) takes an additional 397 KB of RAM, leaving very little memory for images. In this implementation, images are fetched when requested to be displayed and discarded when not needed.

B. The Server Side

On the server side we are using a servlet [SERV]. Servlets are Java programs that extend the capabilities of the server; they are similar to applets in a browser. The client sends the query to the servlet, which checks the query media type and passes it to the appropriate query handler.

The heavy processing required for the feature extraction, similarity estimation and results presentation is done on the server through calls from the Java side to methods implemented in native code. In this way we take advantage of the more efficient native code as compared to a pure Java implementation.

C. Communication Protocol

A communication protocol is defined between the client and the server. This protocol specifies the media type (Image media, Video media, or Audio media; currently we are using Image media only), the query type (random query from the database, query by image data, or query with an image from the database) and the query data (image data if the image is not in the database, or the image's index in the database, or an image location URL). The server sends back to the client the status of query execution and the results of the query, which consist of the list of names of the images and their similarity scores with respect to the query image. The client later fetches scaled versions (80 x 60 pixels) of the images to be presented to the user (in our experiments we requested 10 images). Scaling is done on the server side in order to reduce the traffic. Only on the request of the user is the full-size image fetched from the server.

III. THE USER INTERFACE AND SCREEN SIZE CONSIDERATION

As mentioned earlier, wireless devices have limited resources. In this application, in addition to the processing and memory issues, the designer has to consider the screen size of the wireless device. As the 9210 belongs to the communicator class, it has a relatively large screen (640 x 200) [9210F]. When displaying the query results, images are resized to fit the available display. As can be seen in the examples given in Figures 2-5, the image content is still legible. Furthermore, on the 9210 we take advantage of the command button area and place the four most used commands there. The other commands are placed in the menu. The menu is displayed only when the user presses the menu button, and hence it does not consume screen space when it is not active.
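The query message defined by the protocol in Section II.C (media type, query type, query data) could be serialized as a simple byte stream, for instance as below. The numeric codes and field order are assumptions for illustration, not the exact wire format of the original system.

```java
// Sketch of the client-to-server query message: media type, query type,
// and query data, written as a length-prefixed byte stream.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class QueryMessage {

    public static final int MEDIA_IMAGE = 1;        // audio/video reserved
    public static final int QUERY_RANDOM = 0;
    public static final int QUERY_BY_IMAGE_DATA = 1;
    public static final int QUERY_BY_DB_INDEX = 2;

    // Encode one query as the byte stream sent over the data call.
    public static byte[] encode(int mediaType, int queryType, byte[] data)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(mediaType);
        out.writeByte(queryType);
        out.writeInt(data.length);  // image bytes, a DB index, or a URL
        out.write(data);
        out.flush();
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] msg = encode(MEDIA_IMAGE, QUERY_BY_DB_INDEX, new byte[]{0, 0, 0, 7});
        System.out.println("message length: " + msg.length + " bytes");
    }
}
```

A fixed, compact header like this keeps the per-query overhead negligible compared to the image payload, which matters on a 38.4 kbit/s data call.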

IV. RESULTS AND ASSESSMENTS

Figure 2 shows the GUI and the menu toolbar on the Nokia 9210 Communicator, which have been implemented for the proposed content-based image indexing and retrieval system. As can be seen, the screen space is fully utilized and the important features are displayed. Remember that the menu is not accessed very often, and thus the space is used to display the query results. The most commonly used buttons are assigned to the command button area on the right side of the screen. Figures 3-5 show the results of different types of queries made to the image database, namely, color histogram, shape and texture queries. In each case, the top ten similar images are retrieved and displayed on the Communicator screen. Due to the limitations imposed by the wireless device, the operating system and the communication channel, a rather slow query response has been achieved. Table 1 below shows the timing obtained in different queries made with the Nokia 9210 Communicator. The query time on the server starts when a query arrives at the servlet: the servlet extracts the image features and makes the query in the native code. It also includes the time to save the results on the server, create a Java result object and pass it to the servlet to be sent to the client. Sending the result object to the client is the time to send the Java object containing the image names and similarity scores for 50 images. The image retrieval time starts when the server passes the image retrieval request to the servlet; it includes the time needed to retrieve the actual images from the server's file system and resize them on the server side. Finally, the image transfer time is the time it takes the server to transmit the resized images to the client.

Query type                    Shape   Histogram   Texture
Query time on server (sec)    26      17          16
Sending result object (ms)    100     110         108
Image retrieval (ms)          230     245         260
Image transfer time (ms)      430     470         490

Table 1: Timing Results for Image Query

The timings provided in Table 1 should be interpreted with care. They varied quite a lot during the testing, as they depend on a number of rather dynamic factors, such as the network traffic (load), the load on the server (as well as the state of the server, i.e., the servlet must be loaded again if no one has requested it), and the available memory on the device.

V. CONCLUSIONS AND FUTURE WORK

A novel implementation of TUT's MUVIS image query system has been proposed and tested on the new Nokia 9210 Communicator using a Java-based client-server

paradigm. A functional GUI was implemented taking into account the small size of the Communicator. The demo shows that such an implementation is feasible; however, due to the limiting factors in both the hardware and software of the wireless terminal as well as the communication channel, very limited results have been obtained, namely, reduced sizes of image query results, a small number of images, and long processing and access times. The good news is that with the advent of 3G networks, offering higher data rates, and with more processing power and memory in wireless devices, such an application will become practical. Furthermore, a more efficient Java platform, called J2ME (Java 2 Micro Edition) [J2ME], is under development. This platform is targeted, among other applications, at small wireless devices with limited capabilities.

VI. ACKNOWLEDGEMENTS

We gratefully acknowledge the support of Mr. Timo Ulmanen, from Nokia Mobile Phones, for his efforts and support on behalf of this work.

VII. REFERENCES

[IMODE]

http://www.ntt.docomo.com/i/
[3G] http://www.3gpp.org/
[QBIC] http://wwwqbic.almaden.ibm.com/~qbic/
[NETRA] http://maya.ece.ucsb.edu/Netra/
[VIRAGE] http://www.virage.com/
[MUVIS] M. Trimeche, F. Alaya Cheikh, M. Gabbouj and B. Cramariuc, "Content-based Description of Images for Retrieval in Large Databases: MUVIS," X European Signal Processing Conference, Eusipco-2000, Tampere, Finland, September 5-8, 2000, pp. 139-142.
[J2ME] http://java.sun.com/j2me/
[PJAVA] http://java.sun.com/products/personaljava/
[JDK] http://java.sun.com/products/jdk/1.1/
[SERV] http://java.sun.com/products/servlets/
[NOKIA] http://www.nokia.com/phones/9210/index.html
[9210F] http://www.nokia.com/phones/9210/features.html
[SYMB] http://www.symbian.com/
[CRYST] http://www.symbian.com/technology/v6-papers/v6papers.html
[EPOC] http://www.epocworld.com

Figure 2: GUI on the Nokia 9210 Communicator

Figure 3: Results of a color histogram-based image query

Figure 4: Results of a shape-based image query

Figure 5: Results of a texture-based image query

[Publication P3] Iftikhar Ahmad, Azhar Quddus, Heikki-Jussi Laine, Olli Yli-Harja, “Image segmentation of the CT-scans of hip endoprostheses”, NORSIG2000, IEEE Nordic Signal Processing Symposium 2000, held at Vildmarkshotellet, Kolmården, Norrköping, Sweden, June 13-15, 2000. http://www.es.isy.liu.se/norsig2000/publ/page271_id129.pdf


IMAGE SEGMENTATION OF THE CT-SCANS OF HIP ENDOPROSTHESES

Iftikhar Ahmad(1), Azhar Quddus(2), Heikki-Jussi Laine(3), Olli Yli-Harja(4)

(1) Tampere University of Technology, P.O. Box 553, FIN-33101, Tampere, Finland, [email protected]
(2) Tampere University of Technology, P.O. Box 553, FIN-33101, Tampere, Finland, [email protected]
(3) Tampere University Hospital, Department of Surgery, P.O. Box 2000, SF-33521, Tampere, Finland, [email protected]
(4) Tampere University of Technology, P.O. Box 553, FIN-33101, Tampere, Finland, [email protected]

ABSTRACT

3D femoral endosteal cavity shapes are modelled based on CT-scans (Computed Tomography scans). An interactive image-processing tool is developed. The system detects the femoral component and the femoral medullary canal in order to study how the components fit and fill the canal. In the CT imaging, thirty axial slices are taken above and below the lesser trochanter area from each femur. Different image analysis methods are used for femoral cavity detection, depending on the structure of the processed slice. The results are saved in a file for further use. In the femoral shaft area a simple technique works fine, but in the problem areas of the femoral bone a more sophisticated method is required. In this paper we propose a model-based approach.

1. INTRODUCTION

Total hip replacement (endoprosthesis, artificial hip joint) is one of the most commonly performed orthopedic procedures. Every year about three thousand patients in Finland, and two million around the world, are operated on (artificial hip joint or endoprosthesis operations) because of degenerative hip joint disease. Previously, a computed tomography based image-processing program was developed and tested for detection of the femoral cavity in order to study the anatomy [0, 1, 2]. In the present study, twenty plastic replicas of femoral components of hip endoprostheses have been implanted into cadaver femora (thighbones). These specimens were scanned cross-sectionally by computed tomography (CT). The aim of the present study is to develop a system which detects the femoral component and the femoral cavity in order to study how the components fit and fill the femoral cavity. Epoxy plastic was selected as the implant material in order to avoid the artefacting effect of the original metal implants in computed tomography; an epoxy-plastic stem can also be appropriately inserted into the femoral canal. The fit and exact contact of the cementless femoral component to the supporting cortical bone and the optimal fill of the endosteal cavity reduce micromotion, so that bony ingrowth and stable fixation can be achieved [21, 22, 23, 24, 27]. The importance of fit and fill in achieving an optimal implant-bone load transfer, minimizing stress-shielding and disadvantageous bone remodelling, has also been emphasized [25, 26, 28, 29]. The remarkable variability, especially in the upper region of the femoral canal, was successfully described in [2]. Comparison of the fit and fill of different designs of cementless femoral stems would be valuable in order to predict their long-term performance [30]. Permission for the use of cadaver femora was obtained from the Institutional Ethical Committee of Tampere University Hospital. Nobel P. C. [12] demonstrated that the femoral cavity has no universal shape and described the great variability of the femoral endosteal anatomy, as well as the poor correlation between different endosteal dimensions, using standard radiographs of 200 cadaver femora. Studies using computed tomography combined with image processing methods have revealed the accuracy of CT compared with radiographs. Three-dimensional anatomic reconstruction has been used as the basis of femoral component design. Moreover, systems for individual optimal-fit component design, i.e. custom-made stems, have been published (Robertson [17], Bargar W. L. [6], Rhodes M. [15], Bargar W. L. [5], Brien W. W. [8], Hua J. and Walker [9]). Robertson D. D. [16] concluded that a second-order correction algorithm significantly improves the accuracy of CT sizing methods of the medullary canal. Mankovich N. J. [11] presented an edge-detection method based on sampling radial lines and spline interpolation. The method was tested using plastic bone models.
CT images are gray-scale images, and both region-based and edge-based segmentation methods can be used. Theoretically, the results of the region-based and edge-based methods will be equal. The simplest region-based method, thresholding, is widely used in image processing, and a number of ways to calculate suitable threshold values have been proposed (Wezka J. [19], Pun T. [14], Wu A. Y. [20], Sahoo P. [18], Luijendijk H. [10], BenHajel-Boujemaa H. [7], Pal N. R. and Pal S. K. [13]), but simple thresholding alone does not give good results here.

2. OBJECTIVE OF THE STUDY

The goal was to measure the fit and fill of the implant in the femoral cavity. To do so, one has to find the inner border of the femoral cavity and the outer border of the implant. The fill of the implant is defined as the proportion of the implant area to the cavity area in a cross-sectional image. The fit of the implant is defined as the proportion of the implant-bone contact surface to the whole implant surface in a cross-sectional image. The more touching pixels the implant has, the better its fit; fewer touching pixels mean a poorer fit. Since the femoral cavity expands with aging, a poor fit may cause problems.
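On binary cross-sectional masks, the two measures defined above can be sketched as follows. This is an illustrative reading of the definitions, not the actual program: `implant` and `cavity` are assumed to be 0/1 grids, the implant surface is taken as its 4-connected boundary pixels, and a boundary pixel is counted as "in contact" when a 4-neighbour leaves the cavity (i.e. touches hard bone).

```python
def boundary(mask):
    """4-connected boundary pixels of a 0/1 mask."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if ny < 0 or ny >= h or nx < 0 or nx >= w or not mask[ny][nx]:
                    pts.append((y, x))
                    break
    return pts

def fit_and_fill(implant, cavity):
    """Fill: implant area / cavity area.  Fit: share of implant boundary
    pixels whose 4-neighbourhood leaves the cavity (touches hard bone)."""
    area = lambda m: sum(map(sum, m))
    fill = area(implant) / area(cavity)
    surf = boundary(implant)
    h, w = len(cavity), len(cavity[0])
    touching = 0
    for y, x in surf:
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not cavity[ny][nx]:
                touching += 1
                break
    fit = touching / len(surf)
    return fit, fill
```

An implant mask occupying a corner of the cavity mask, for example, yields a fill below one and a fit below one, since part of its boundary faces the open canal rather than bone.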

3. CROSS-SECTIONAL MODEL OF THE FEMORAL BONE WITH IMPLANT

The images have a large variety of shapes and sizes, and the implant and the cavity likewise vary greatly in shape and size. In some CT-scan images the external bone is missing. In the upper part of the bone there may be soft bone containing cavities; these cavities appear as misleading edges in the cross-sectional images.

Figure 1 Examples of the implant and hard bone

In fig. 1 one can see the large variety of images: the implant changes its shape and size, and the external bone also changes its shape and size. In the proposed solution we use a model-based adaptive technique. Despite the great variety of images shown above, we can identify some a priori information in the image.

Figure 2 Simple model of the scanned image.

In fig. 2, a cross-section of the image, the implant has the highest gray value and the inner hard bone the second highest level. The model-based technique first searches for the implant and then looks for the inner hard bone. First we pass the image through a low-pass filter, detecting the maximum DC level; we assume that the maximum DC level lies in the implant. Once we have a point inside the implant, we track the external boundary of the implant using a boundary-tracking method. After that we detect the inner hard bone by the same procedure. In many cases the external bone is missing or too dark to find, and we use spline interpolation to complete the missing pixels. The user can also intervene in the automatically computed results by direct manual editing; the implant points and the inner hard bone points can be edited separately. After editing the implant pixels, the user can continue with either the manually edited results or the original results computed by the program for the rest of the processing.

There are some problematic images which do not obey this model. These are preprocessed using nonlinear cellular neural filtering [3], the LUM filter [4], or simple thresholding. With the help of these filters we first improve the edges and then process the image in the manner described above. Nonlinear cellular neural filtering works on bit planes and is based on Boolean operations; as we have 12-bit images, the method is very slow, but it gives good results.

4. PREPROCESSING FILTERS

4.1. LUM Filter
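The seed-and-track idea above can be sketched as follows. This is a simplified, hypothetical stand-in for the tool's detector: a 3x3 mean filter plays the role of the low-pass filter, the filter's maximum response is taken as a point inside the implant, and instead of explicit contour tracking the bright connected region is grown from that seed. The function names and the relative threshold are assumptions of this sketch, not the thesis code.

```python
def box_blur(img):
    """3x3 mean (low-pass) filter with edge replication."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    s += img[ny][nx]
            out[y][x] = s / 9
    return out

def extract_implant(img, rel_thresh=0.8):
    """Seed at the low-pass maximum (assumed inside the implant), then grow
    a 4-connected region of pixels above rel_thresh * seed response."""
    lp = box_blur(img)
    h, w = len(img), len(img[0])
    sy, sx = max(((y, x) for y in range(h) for x in range(w)),
                 key=lambda p: lp[p[0]][p[1]])
    level = rel_thresh * lp[sy][sx]
    region = {(sy, sx)}
    stack = [(sy, sx)]
    while stack:
        y, x = stack.pop()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in region \
                    and lp[ny][nx] >= level:
                region.add((ny, nx))
                stack.append((ny, nx))
    return region
```

The boundary of the returned region would then stand in for the tracked implant contour; the same seed-and-grow step could be repeated at a lower level for the inner hard bone.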

The LUM filter [4] can be used for smoothing or sharpening. The LUM smoother, like other rank-order smoothers, obtains its smoothing characteristics by shifting samples toward the median; to obtain sharpening characteristics, samples must instead be moved away from the median toward more extreme order statistics. First we improve the edges with the LUM (sharpening) filter. Then we threshold the image to remove the hard bone, which has the highest gray level; the resulting image contains the implant and some parts of the hard bone. We pass that image through a low-pass filter, whose maximum DC level falls in the implant, and then track the implant. Once we have extracted the implant, we scale it to a higher gray level and add the implant image to the original image. The net result of preprocessing with the LUM filter [4] is an image that fits our model, that is, the implant has the higher gray level and the hard bone a lower gray level. The image can then be processed according to the original model. The LUM filter [4] is reasonably fast, but its performance is not very good.
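The smoothing/sharpening behaviour of the LUM filter can be sketched in one dimension. This is a hypothetical illustration based on the rank-order description above and the usual formulation in [4], not the thesis code; the window length, the parameter `k`, and the border handling are choices of this sketch.

```python
def lum_filter(x, k, sharpen=False, win=5):
    """1-D LUM filter sketch for an odd window length `win`.

    Smoother: clamp the centre sample into [t_l, t_u], the k-th lower
    and k-th upper order statistics of the window.
    Sharpener: push the centre sample away from the midpoint of
    [t_l, t_u] toward t_l or t_u, enhancing edges.
    Border samples are left unchanged.
    """
    r = win // 2
    out = list(x)
    for i in range(r, len(x) - r):
        w = sorted(x[i - r:i + r + 1])
        tl, tu = w[k - 1], w[win - k]   # lower / upper order statistics
        c = x[i]
        if sharpen:
            if tl < c <= (tl + tu) / 2:
                out[i] = tl
            elif (tl + tu) / 2 < c < tu:
                out[i] = tu
        else:
            out[i] = min(max(c, tl), tu)
    return out
```

With `k` near `win // 2` the smoother approaches a median filter and suppresses impulses; with `sharpen=True` and small `k` the filter steepens gradual transitions into edges, which is the behaviour exploited before the thresholding step.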

4.2. Non-Linear Cellular Neural Filtering

Fig. 3 Image of the complete GUI of the software.

In this nonlinear cellular neural filter we use the West-East and North-South filters. The filter detects the hard bone very well: its output contains the hard bone while the implant is masked. After some simple thresholding and processing we obtain the implant. Once we have the implant, we can re-scale its gray level and add it to the original image, which then fits the original model.

4.3. Simple Thresholding

In normal images the implant has the higher gray level and the hard bone a lower gray level. In some problem images, however, the hard bone has a very high gray level with respect to the implant (due to a hardware problem). For these images we use simple thresholding as preprocessing: we define a thresholding window and use it to reduce the higher gray levels to lower ones. This simple thresholding and scaling works well for most of the problem images.

5. CONCLUSION

It has been demonstrated that the model-based technique
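The thresholding window described above can be sketched as a simple gray-level remap. The window limits and the replacement level here are hypothetical; in practice they would be read off the image histogram (for a 12-bit image, the window would typically cover the over-bright hard-bone range at the top of the histogram).

```python
def window_remap(img, lo, hi, new_level):
    """Map pixels whose gray value falls in the window [lo, hi] down to
    new_level, so an over-bright structure no longer dominates the image."""
    return [[new_level if lo <= p <= hi else p for p in row] for row in img]
```

After the remap, the implant again carries the highest gray level and the image can be processed according to the original model.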

works very well in this application. We developed an image segmentation technique which detects the inner border of the femoral cavity (endosteum) and the outer border of the implant. From the inner hard bone and the implant we compute the fit and fill of the implant. We also provide a facility for manual editing in case the user does not agree with the segmentation results produced by the system. Some images, in which the inner hard bone has a higher gray level than the implant, require preprocessing. The accuracy of the system was found to be well within the range required by the clinical experts.

6. REFERENCES

[0] Kontola K: Image Processing Methods for Femoral Cavity 3D-Shape Modelling, Diploma thesis, Tampere University of Technology, Tampere, 1993
[1] Laine H-J, Kontola K, Lehto MUK, Pitkänen M, Jarske P, Lindholm TS: Image processing for femoral endosteal anatomy detection: description and

testing of a computed tomography based program. Phys Med Biol 42: 673-689, 1997
[2] Laine H-J, Lehto MUK, Moilanen T: Diversity of proximal femoral medullary canal. In press. J Arthro
[3] Aizenberg IN, Aizenberg NN, Agaian S, Astola JT, Egiazarian K: Nonlinear cellular neural filtering for noise reduction and extraction of image details. Proceedings of SPIE, Vol. 3646, San Jose, California, 25-26 January 1999
[4] Hardie RC, Boncelet CG: LUM filters: a class of rank-order-based filters for smoothing and sharpening. IEEE Transactions on Signal Processing, Vol. 41, No. 3, March 1993
[5] Bargar WL 1989 Shape the implant to the patient. Clin. Orthop. 249 73-8
[6] Bargar WL, Taylor JK, Gross TP, Hayes DEE and Pulido V 1993 The first 100 CT-based custom cementless primary total hip replacements. J. Bone Joint Surg. 75-B suppl 258
[7] BenHajel-Boujemaa N, Stamon G and Lemoine J 1992 Fuzzy iterative image segmentation with recursive merging. Visual Communications and Image Processing '92, ed P Maragos (Bellingham: SPIE) pp 1271-9
[8] Brien WW 1993 Design aspects of custom hips. J. Bone Joint Surg. 75-B suppl 251
[9] Hua J and Walker PS 1993 A versatile hip design workstation - scientific rationale. J. Bone Joint Surg. 75-B suppl 251
[10] Luijendijk H 1991 Automatic threshold selection using histograms based on the count of 4-connected regions. Pattern Recog. Lett. 12 219-28
[11] Mankovich NJ, Robertson DD, Essinger J 1991 Measuring the accuracy of CT bone geometry for orthopedic implants. Computer Assisted Radiology, ed HU Lemke (Berlin: Springer) pp 336-43
[12] Noble PC, Alexander JW, Lindahl LJ, Yew DT, Granberry WM and Tullos HS 1988 The anatomic basis of femoral component design. Clin. Orthop. 287 135-41
[13] Pal NR and Pal SK 1993 A review on image segmentation techniques. Pattern Recog. 26 1277-94
[14] Pun T 1980 A new method for gray-level picture thresholding using the entropy of the histogram. Signal Process. 2 223-37
[15] Rhodes M, Kuo YM, Rothman S and Woznick C 1987 An application of computer graphics and networks to anatomic model and prosthesis manufacturing. IEEE Comput. Graphics Appl. 7 2-12
[16] Robertson DD, Walker PS, Granholm JW, Nelson PC, Weiss PJ, Fischman EK and Magid D 1987 Design of custom hip stem prosthesis using 3-D CT modelling. J. Comput. Assist. Tomogr. 11 804-9
[17] Robertson DD, Walker PS, Granholm JW, Nelson PC, Weiss PJ, Fischman EK and Magid D 1987 Design of custom hip stem prosthesis using 3-D CT modelling. J. Comput. Assist. Tomogr. 11 804-9
[18] Sahoo P, Soltani S and Wong A 1988 A survey of thresholding techniques. Comput. Vision, Graphics Image Process. 41 233-60
[19] Wezka J 1978 A survey of threshold selection techniques. Comput. Vision, Graphics Image Process. 7 259-65
[20] Wu AY, Hong T-H and Rosenfeld A 1982 Threshold selection using quadtrees. IEEE Trans. Pattern Anal. Machine Intell. 4 90-4
[21] Burke DW, O'Connor DO, Zalenski EB, Jasty M, Harris WH (1991) Micromotion of cemented and uncemented femoral components. J Bone Joint Surg 73-B: 33-37
[22] Callaghan JJ, Fulghum CS, Glisson RR, Stranne SK (1992) The effect of femoral stem geometry on interface motion in uncemented porous-coated total hip prostheses. J Bone Joint Surg 74-A: 839-848
[23] Cameron HU, Pilliar RM, MacNab I (1973) The effect of movement on the bonding of porous metal to bone. J Biomed Mater Res 7: 301-311
[24] Hua J, Walker PS (1994) Relative motion of hip stems under load. An in vitro study of symmetrical, asymmetrical and custom asymmetrical designs. J Bone Joint Surg 76-A: 95-103
[25] Hua J, Walker PS (1995) Closeness of fit of uncemented stems improves the strain distribution in the femur. J Orthop Res 13: 339-346
[26] Huiskes R, van Rietbergen B (1995) Preclinical testing of total hip stems. The effects of coating placement. Clin Orthop 319: 64-76
[27] Pilliar RM, Lee JM, Maniatopoulos C (1985) Observations on the effect of movement on bone ingrowth into porous-surfaced implants. Clin Orthop 208: 108-113
[28] Walker PS, Robertson DD (1988) Design and fabrication of cementless hip stems. Clin Orthop 235: 25-34
[29] Weinans H, Huiskes R, Grootenboer HJ (1994) Effects of fit and bonding characteristics of femoral stems on adaptive bone remodelling. J Biomech Eng 116: 393-400
[30] Laine H-J, Puolakka TJS, Moilanen T, Pajamäki KJ, Wirta J, Lehto MUK: Effect of cementless femoral stem shape and proximal surface texture on "fit-and-fill" characteristics, five-year clinical outcome and bone remodelling. In press. Int Orthop.


Acknowledgements

First, I wish to express my deep gratitude to my thesis supervisor, Professor Moncef Gabbouj, head of the Department of Information Technology, for introducing me to the challenging area of Content-Based Information Retrieval (CBIR), and for his encouragement, continuous support, guidance and expertise. I would like to thank Timo Ulmanen, Jrki Yli-Nokari, Satu Makela and Hannu Honkala from Nokia for their help. Special thanks go to Ali Hazmi, Faouzi Alaya Cheikh, Bogdan Cramariuc, Asif Azhar, Akbar Javed, Muhammad Waqas, Kashif Haseeb, Muhammad Omair Javed, Usama Rauf and Farooq Ahmad. I would like to thank the whole personnel of my group for the pleasant work environment and all the help I received during my work. I am much obliged to the families Tanvir, Zahid, Ali Khan and Javed, to Rasheed Khan, Sabhan Khan, Qumreen Khan, Saleem Khan, Bakht Rasheed, Hammad Kahlun, Permindar Chuhan, Mohammad Nadir, Hakeem Khan, Aleem Baig, Mohammad Arif and Mohammad Ali, and to the small Pakistani community in Tampere. My sincere and warmest thanks go to my brothers and sisters for their love, support and patience during my studies. Last but not least, I thank my wife and daughter for their help, support and love.

Tampere, 01-06-2003 Iftikhar Ahmad Opiskelijankatu 38D 58 FIN - 33720, Tampere FINLAND


10 References [1] http://www.ntt.docomo.com/i/ [2] http://www.3gpp.org/ [3] Query by Image Content using NOKIA 9210 Communicator, Ahmad Iftikhar, Faouzi Alaya Cheikh, Bogdan Cramariuc and Moncef Gabbouj, Workshop on Image Analysis for Multimedia Services (WIAMIS) 2001, COST 211, From May 16-17, 2001, Technical University of Tampere, Tampere, Finland. [4] http://maya.ece.ucsb.edu/Netra/ [5] Mari Partio, “Content-based Image Retrieval using Shape and Texture Attributes”, M.Sc. Thesis, Tampere University of Technology, Tampere, Finland, November 2002. [6] M.Trimeche, F.Alaya Cheikh, M.Gabbouj and Bogdan Cramariuc, "Content-based Description of Images for Retrieval in Large Databases:MUVIS," X European Signal Processing Conference, Eusipco-2000, Tampere, Finland, September 5-8, 2000, pp. 139-142. [7] Programming Wireless Devices with the JAVA 2 Platform, Micro Edition Roger Riggs, Antero Taivalsaari, Mark VandenBrink, Jim Holliday, Editor; P7-17 [8] Wireless Java for Symbian Devices by Jonathan Allin, P99 [9] Mastering Java 2, John Zukowski, P28 [10] Mastering Java 2, John Zukowski; P990-996 [11] http://www.nokia.com/phones/9210/index.html [12] http://www.nokia.com/phones/9210/features.html [13] http://www.symbian.com/ [14] Professional Symbian Programming Mobile Solutions on the EPOC platform, by Martin Tasker, Jonathan Allin, Jonathan Dixon, John Forrest, Mark Heath, Tim Richardson, Mark Shackman, P29-35 [15] http://www.epocworld.com [16] Java Servlet programming, by Jason Hunter with William Crawford, P7 [17] http://www.wapforum.org/ [18] http://www.nokia.com [19] http://www.apache.org/


[20] Mastering Java 2, by John Zukowski; P991 [21] http://www.cknow.com/ckinfo/acro_n/nsapi_1.shtml [22] http://jakarta.apache.org/tomcat/index.html [23] http://java.sun.com/products/jdk/1.2/docs/guide/jni/ [24] Mastering Java 2, John Zukowski, P733 [25] Mastering Java 2, John Zukowski, P32 [26] Java Servlet programming, by Jason Hunter with William Crawford, Appendix A: Servlet API Quick reference, P425 [27] Java Servlet programming, by Jason Hunter with William Crawford, Appendix A: Servlet API Quick reference, P434 [28] Java Servlet programming, by Jason Hunter with William Crawford, Appendix A: Servlet API Quick reference, P426 [29] “Internationalisation in Operating Systems for Handheld Devices”, Jere Käpyaho, University of Tampere, Department of Computer and Information Sciences, Master's thesis, December 2001 [30] Java Servlet programming, by Jason Hunter with William Crawford, Appendix B: HTTP Servlet API Quick reference, P447 [31] http://www.w3.org/Protocols/rfc2616/rfc2616.html [32] http://java.sun.com/j2se/1.3/docs/api/java/io/PrintWriter.html [33] Java Servlet programming, by Jason Hunter with William Crawford, Appendix A: Servlet API Quick reference, P436 [34] Java Servlet programming, by Jason Hunter with William Crawford, Appendix A: Servlet API Quick reference, P438 [35] http://www.perl.com [36] http://java.sun.com/marketing/collateral/security.html [37] http://java.sun.com/products/jdk/1.2/docs/api/java/io/InputStream.html [38] http://java.sun.com/products/jdk/1.2/docs/api/java/io/PrintStream.html [39] http://java.sun.com/j2se/1.3/docs/api/java/lang/System.html [40] http://java.sun.com/docs/books/jni/html/design.html [41] http://www.cygwin.com/ [42] http://www.imagemagick.org/ [43] http://java.sun.com/products/java-media/jai/ [44] http://java.sun.com/products/java-media/2D/ [45] Mastering Java 2, John Zukowski, P1100

[46] http://java.sun.com/products/midp/ [47] Programming Wireless Devices with the Java 2 Platform, Micro Edition, by Roger Riggs, Antero Taivalsaari, Mark VandenBrink; ISBN 0-201-74627-1; P8 http://java.sun.com/j2me/ [48] http://java.sun.com/j2se/1.4/ [49] http://java.sun.com/j2ee/ [50] http://java.sun.com/products/cdc/ [51] Programming Wireless Devices with the Java 2 Platform, Micro Edition, by Roger Riggs, Antero Taivalsaari, Mark VandenBrink; ISBN 0-201-74627-1; P717 [52] http://java.sun.com/products/mmapi/ [53] http://www.apple.com/quicktime/ [54] Niblack, W. et al, “The QBIC project; querying images by content using color, texture and shape”, SPIE, Vol. 1908, 1993. [55] W.Y.Ma and B.S.Manjunath, "NeTra: a toolbox for navigating large image databases", Multimedia Systems, vol.7, (no.3), Springer-Verlag, Berlin, Germany, pp.184-98, May 1999. [56] Alaya Cheikh, F., Cramariuc, B. and Gabbouj, M., "MUVIS: A System for Content-Based Indexing and Retrieval in Large Image Databases," in Proc. of the VLBV98 workshop, pp.41-44, Urbana, IL, USA, October 8-9, 1998. [57] Alaya Cheikh, F., Cramariuc, B., Reynaud, C., Quinghong, M., Dragos-Adrian, B., Hnich, B., Gabbouj, M., Kerminen, P., Mäkinen, T. and Jaakkola, H., "MUVIS: A System for Content-Based Indexing and Retrieval in Large Image Databases," Proceedings of the SPIE/EI'99 Conference on Storage and Retrieval for Image and Video Databases VII, Vol. 3656, pp.98-106, San Jose, California, 26-29 January 1999. [58] http://www.tut.fi [59] PersonalJavaTM Technology White Paper, August 1998, http://java.sun.com/products/personaljava/pj_white.pdf [60] J.R. Smith, and S.-F. Chang, “Local color and texture extraction and spatial query”, IEEE Int. Conf. On Image Processing, ICIP’96, Lausanne, Switzerland, September 1996.


[61] M. Trimeche, “Shape Representations for Image Indexing and Retrieval”, Master of Science Thesis, Tampere University of Technology, May 2000. [62] A. Del Bimbo, Visual Information Retrieval, 270 pages, Morgan Kaufmann Publishers, San Francisco, California, 1999. [63] Digital Image Processing: Concepts, Algorithms, and Scientific Applications, 4th Edition, by Bernd Jahne, chapter 12, P383-P387 [64] "Learning the kernel" through examples: an application to shape classification, Trouve, A.; Yong Yu; Image Processing, 2001. Proceedings. 2001 International Conference on, Volume: 2, 7-10 Oct 2001, Page(s): 121-124 vol.2 [65] Shape Matching: Similarity Measures and Algorithms, Remco C. Veltkamp, Dept. Computing Science, Utrecht University, Padualaan 14, 3584 CH Utrecht, The Netherlands, [email protected], January 25, 2001 [66] University of Surrey, “Search for similar shapes in the SQUID system: Shape Queries Using Image Databases”, March 2001, referred at August 30, 2001, http://www.ee.surrey.ac.uk/CVSSP/imagedb/demo.html [67] NEC USA Inc., “Amore CBIR demo”, referred at August 22, 2001, http://www.ccrl.com/amore [68] Leiden University, “Content-based image retrieval in LCPD: the Leiden 19th-century portrait database”, June 19, 2000, referred at August 15, 2001, http://ind156b.wi.leidenuniv.nl:2000/ [69] INRIA, “Surfimage, retrieve images by content”, referred at August 29, 2001, http://www-rocq.inria.fr/cgi-bin/imedia/surfimage.cgi [70] Virage, Inc., “Virage CBIR system”, referred at August 29, 2001, http://www.virage.com [71] Columbia University, “A content based image and video search and catalog tool for the web”, referred at August 09, 2001, http://www.ctr.columbia.edu/webseek/ [72] University of Thessaloniki, “ISTORAMA demo”, referred at August 10, 2001, http://uranus.ee.auth.gr/istorama/Home/home_en.html [73] Dublin City University, “Fischlar”, referred at August 15, 2001, http://lorca.compapp.dcu.ie/Video/


[74] Mokhtarian, F., S. Abbasi and J. Kittler, ``Robust and Efficient Shape Indexing through Curvature Scale Space,'' Proc. British Machine Vision Conference, pp. 53-62, Edinburgh, UK, 1996. [75] http://www.motorola.com [76] http://www.siemens.com/ [77] http://www.sonyericsson.com/

