Semantic Web/Grid Services for Distance Learning



Jian Wu (1), Min Cai (2), Shuiguang Deng (1), and Kai Hwang (1, 2)

(1) Zhejiang University, China

(2) University of Southern California, USA

Abstract: We report a scalable distance learning (DL) system built by integrating semantic networks, Grid infrastructure, and ontology metadata. This system upgrades online course delivery with fast service discovery. We propose a new semantic service matchmaking scheme to promote the quality of service (QoS) in the DL process. Our experimental results suggest a high service recall rate and good user scalability with low overhead.

Keywords: Semantic networks, web services, distance education, service discovery, semantic matchmaking, information Grids, and e-learning.

1. Introduction

With pervasive Internet access, distance learning (DL) is emerging as a viable alternative to traditional educational platforms [1][7]. DL has brought new vitality to personal education and knowledge delivery, affecting almost all levels of education, from K-12 to higher education and lifelong learning. Recent advances in computing and network technologies have enabled a global platform on which organizations and individuals communicate with one another, conduct various activities, and provide value-added services. Grid and service-oriented computing technologies offer an opportunity to build DL systems in a scalable and flexible way [8][10]. With Grid computing [8], DL can draw on massive computing and storage capacity as well as various Grid services. More and more organizations are integrating Grid computing with DL services for business, academic, and individual use.

Previous work such as EleGI (European Learning Grid Infrastructure Project, http://www.elegi.org/) and APPLE [3] has suggested the use of Grid and P2P technologies to implement scalable DL infrastructures. This paper proposes a semantic web service discovery method that helps DL organizations and learners achieve the best matchmaking. The aim of bringing semantics to learning content and services is to enable large-scale collaboration in DL activities over the Grid infrastructure. As computers and software applications are ubiquitously involved in this collaboration, it is essential to have a common understanding of the DL domain, in particular at the conceptual level. Both computers and human participants are then able to understand and communicate with each other through a common conceptualization of the domain. In this context, an ontology is required as a language to enrich resources with semantics.

∗ Manuscript submitted to IEEE Internet Computing, Special Issue on Distance Learning, Sept. 30, 2006. Cai and Hwang were supported in part by US NSF Grant ITR-0325409 at the University of Southern California. The work by Wu and Deng was supported by Zhejiang University, China. All rights reserved by the authors and publisher. Corresponding author is Kai Hwang, Email: [email protected], Tel: (213) 740-4470 and Fax: (213) 740-4418.

2. Semantic Web Services over Grid Infrastructure

Learning contents and services are two interacting resources that are manipulated in DL activities. Semantics are introduced to enrich these resources so that they can be better understood and precisely processed by third parties. An ontology offers a specification of a conceptualization. It is the backbone for realizing the vision of having resources on the Grid semantically enriched and linked for more effective discovery, automation, integration, and reuse by learners and educators [11].

In the Grid-based DL infrastructure, Grid services are the basic building blocks. A Grid service is an extended web service that conforms to the Open Grid Services Infrastructure (OGSI) specification. Hardware and software resources in DL can be abstracted as Grid services with semantic descriptions. According to the E-Learning framework, the functionality of an E-Learning system can be mapped onto various services that fall into two layers.

Common Services: These provide the base-level system functionality on which other services rely and which is not specific to the learning domain. Examples are logging, alert, authentication, authorization, and metadata management.

Learning Domain Services: These are services that are specific to the formal learning domain. They are split broadly into four sub-themes: (1) course creation, e.g. curriculum and course management, (2) course delivery, e.g. activity management and learning flow, (3) assessment, e.g. marking and grading, and (4) record keeping, e.g. reporting and ePortfolio.

Thus, a Grid-based DL system can be viewed as a combination of different Grid services. It is not necessary to start from scratch to build a Grid-based DL system. We can use existing services in communication, coordination, and evaluation to serve various DL organizations under different QoS and cost constraints. This speeds up construction and lowers the development cost. Retrieving and selecting proper services is the main task in building a Grid-based DL system [4][9]. Major universities are now trying to establish their own DL services and compete for enrollees, so distance learners face many choices. Therefore, how to discover and select the best choice, one that is personalized and most cost-effective, becomes the most important concern for distance learners [12].

3. Service-Oriented Distance Learning (SODL)

In this section, we introduce the SODL architecture. Concrete examples of courseware and discussion services illustrate the ideas behind web/Grid-based DL systems. Commercially available software tool sets and DL packages are summarized with a qualitative evaluation.

3.1 The SODL System Architecture

As shown in Fig. 1, the SODL architecture consists of five layers. The Grid resources layer provides the underlying computation and communication infrastructure for the whole DL system. Example Grid resources include high-availability clusters for hosting learning services, terabyte storage systems for archiving learning objects, and high-speed networks for delivering podcast streams. On top of the Grid resources, various learning services are implemented as semantic Web or Grid services. Figure 1 illustrates the nine essential learning services for a DL system, such as courseware, grading, discussion, calendar, and podcast services. Other learning services can also be easily plugged into the SODL framework.

All learning services publish their properties and interfaces to a service repository implemented in the service composition layer. The metadata repository indexes all metadata information of learning objects [2]. A learning portal first discovers the appropriate learning services by using the semantic service discovery component detailed in Sec. 4. It then invokes these services for the desired functionality. Heterogeneous learning services from different suppliers are mediated by a service mediation module.

The learning portal provides a unified point of access for three different groups of users, i.e. learners, instructors, and administrators. Through the portal interface, they can access the learning services and their compositions according to their roles in the DL system. For example, instructors can publish the grades of a class via the grading service or create a classroom lecture via the podcast service. Learners have the flexibility of using different portals for their personal learning objectives. In the SODL framework, learning services from different providers can interoperate according to their service properties, interfaces, and quality of service.
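The publish-then-discover flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `ServiceEntry`, `ServiceRepository`, `publish`, and `discover` are hypothetical, and plain keyword overlap stands in for the semantic service discovery component, which is considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceEntry:
    """One published learning service (all field names are illustrative)."""
    name: str
    provider: str
    keywords: set = field(default_factory=set)


class ServiceRepository:
    """Toy in-memory stand-in for the service repository of Fig. 1."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry):
        # A learning service registers its properties under a unique name.
        self._entries[entry.name] = entry

    def discover(self, wanted):
        # Rank services by keyword overlap with the request (best first,
        # ties broken alphabetically) and drop services with no overlap.
        scored = [(len(e.keywords & wanted), e.name) for e in self._entries.values()]
        ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
        return [name for hits, name in ranked if hits > 0]


repo = ServiceRepository()
repo.publish(ServiceEntry("GradingService", "univ-a", {"grading", "assessment"}))
repo.publish(ServiceEntry("PodcastService", "univ-b", {"lecture", "podcast"}))
print(repo.discover({"grading"}))
```

A portal would call `discover` first and then invoke the returned services; the invocation and mediation steps are outside the scope of this sketch.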


Figure 1: Architecture of service-oriented distance learning using public-domain resources over community Grid infrastructure

3.2 Example Courseware and Discussion Services

Figure 2 shows the implementation of two example learning services, i.e. the courseware and discussion services. The courseware service consists of course announcement, curriculum management, learning object management, and so on. The course announcement module notifies all learners registered for a given class of messages from the instructors. When a learner registers for a course, the curriculum management module automatically tracks the prerequisites of the course and her progress towards her degree requirements. The learning object management module stores and indexes all learning objects in a Grid storage system.

Discussion services include thread creation, asynchronous posting, synchronized discussion, subscription and annotation, and so on. When an instructor starts a new topic for discussion, she creates a thread in the discussion service. The discussion has two modes, i.e. asynchronous posting and synchronized discussion. The former allows users to leave messages on a discussion board, while the latter enables two or more users to chat synchronously. Users can also subscribe to and annotate the discussion threads that interest them.

Both services register their properties and interfaces in the service repository. The learning portals use the semantic service discovery to find matched services. In addition, the metadata of learning objects and annotated posts are published in a metadata repository. The ontology service thus resolves the semantic queries from learning portals.

Figure 2: Learning object repository for courseware and discussion services in distance learning using semantic network and metadata resources

In Table 1, we categorize DL services into 6 classes. The courseware repository and discussion support were illustrated in Fig. 2. On-line testing and grading are needed to evaluate DL students and provide automated correction facilities. Course management, communications tools, and administration support are all indispensable to a successful DL package. Under each category, we have also identified the software tools marketed by leading online course-delivery systems. Each system has its own strengths and weaknesses. Like the clinical testing of new drugs, these software tools must be applied and tested for a sufficiently long time to show their relative merits in serving the general public. Our semantic service discovery tools are meant to help optimize these selection and evaluation processes.

4. Semantic Learning Service Discovery

From a business perspective, the composition of semantic services offers several distinctive advantages. First, composing existing services yields virtual classes that are highly adapted to a learner's special needs. Second, dedicated services minimize the amount of work required to develop new courseware, ensuring a rapid time-to-market. Third, DL application development based on services reduces business risks. With the explosive growth of semantic services, service discovery becomes extremely important for marketing any new DL software package.


Table 1: Typical Distance Learning Services and Some Software Products

| Category | Service Objectives and Typical Contents | Online Course Delivery Software Products |
|---|---|---|
| Courseware Repository | Course creation, topic modules, learning objects, curriculum design, instructor helpdesk, content sharing, course delivery | Blackboard 6.2, Prometheus, ATutor 1.5, WebCT Campus ed. 6.0, Desire2Learn 7.4, WebMentor, etc. |
| Discussion Support | Discussion forums including thread creation, posting, synchronization for online sessions, and for off-line readings | ANGEL 6.3, Blackboard 6.2, WebCT Campus ed. 6.0, Bazaar 7, Desire2Learn 7.4, LUVIT, Prometheus, etc. |
| Testing and Grading | On-line testing and assignments, grading, recording, self assessment, course evaluation, student skill building, etc. | WebCT Campus ed. 6.0, ANGEL 6.3, Blackboard 6.2, Learning Manager, Bazaar 7, FirstClass 7.0, etc. |
| Course Management | Course information, scheduling, calendar, groupwork, progress review, student tracking, course monitoring, etc. | Blackboard 6.2, Learning Manager, Prometheus, WebCT Campus ed. 6.0, Desire2Learn 7.4, etc. |
| Communications Tools | File exchanges, online journals, notes, and library, real-time chat, whiteboard, on-line assistance, student presentations, etc. | Internet Course Assistant 2.0, Blackboard 6.2, WebCT Campus ed. 6.0, Learning Manager, FirstClass 7.0, etc. |
| Distance-Learning Administration | Authentication, authorization, registration, hosted services, server support, resource monitoring, crash recovery, etc. | Blackboard 6.2, Asymetrix Librarian, WebCT Campus ed. 6.0, Virtual-U, Desire2Learn 7.4, FirstClass 7.0, etc. |

4.1 Building Virtual Class Through Service Discovery

Consider a concrete DL example. A student, John, wants to take a “Chinese Herb Medicine” class. The teacher, Andy, assesses that John has insufficient background to take the course and creates a custom-designed “Virtual Class” for him. First, the teacher helps John analyze pathology by studying patient symptoms case by case to accumulate sufficient experience. Second, John selects herb medicines by analyzing their pharmacological and toxicological factors. Third, John iterates the second step and fills a prescription that may potentially work. Finally, the prescription is evaluated against the patient's needs.

In this example, building a special virtual class for learning Chinese Herb Medicine requires understanding the exact contents and semantics of the messages exchanged between student and teacher. Both have to distinguish the similarities and compositions of those medical services. Furthermore, they need to learn how to invoke those services to meet the special demands. This tedious process should be automated with minimal human intervention. Semantic discovery and matchmaking of DL services therefore need to reveal the semantic similarities among the available services.


Figure 3: Building up a virtual class for distance learning of Chinese Herb Medicine using semantic service discovery

4.2 Semantic Similarity in Matchmaking

We present below a conceptual model to classify Web/Grid services into 4-tuples:

<Common Properties, Special Properties, Service Interface, Quality of Service>    (1)
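The 4-tuple in (1) maps naturally onto a small record type. The following Python sketch shows one possible encoding; the class names, field names, and example values are illustrative assumptions and are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Operation:
    """One interface operation: a name plus input and output parameter types."""
    name: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class ServiceDescription:
    """The four components of tuple (1); the dictionary keys are hypothetical."""
    common: Dict[str, str]      # e.g. service name, key, description, owner, URL
    special: Dict[str, str]     # domain-specific properties
    interface: List[Operation]  # operations offered by the service
    qos: Dict[str, float]       # quality-of-service attributes


courseware = ServiceDescription(
    common={"name": "CoursewareService", "owner": "example-university"},
    special={"subject": "Chinese Herb Medicine"},
    interface=[Operation("getLearningObject", ["string"], ["document"])],
    qos={"reliability": 0.95},
)
print(courseware.interface[0].name)
```

Keeping the four components as separate fields lets each similarity test read only the part of the description it needs.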

The common properties refer to the service name, service key, service description, service owner, service URL, etc. To measure the similarity of semantic terms, WordNet [5] and HowNet [6] are suggested to power the lexical analysis.

Special properties refer mainly to the properties used for searching web services by class. This demands more domain knowledge, which provides the relations between properties.

The Web service interface consists of a set of operations, each a 3-tuple <SN, IM, OM>. Here, SN is the service name, and IM and OM refer to all input and output parameters, respectively. We use lexical analysis to assess the similarity of service names. We compare IM/OM data types to distinguish various I/O combinations. Often, the similarity of service interfaces can be measured by a similarity matrix that assesses the matching score of all I/O combinations.

Quality of Service (QoS): QoS measures a Web service's usability, reliability, and fidelity. In real-life DL applications, it is often the case that the selected services functionally meet our needs but cannot sustain a required minimum performance level. We define the QoS of DL semantics by the following 4-tuple:

QoS = <Learning Time, Reliability, Fidelity, Security>    (2)
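The similarity matrix for I/O combinations mentioned above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the type-compatibility scores, the `NUMERIC` set, and the brute-force search for the best one-to-one pairing are all choices made for this example, not the scoring used in the paper.

```python
from itertools import permutations

# Hypothetical rule: numeric types are treated as partially compatible.
NUMERIC = {"int", "float", "double"}


def type_similarity(a, b):
    """Score two parameter types: identical 1.0, both numeric 0.5, otherwise 0.0."""
    if a == b:
        return 1.0
    if a in NUMERIC and b in NUMERIC:
        return 0.5
    return 0.0


def similarity_matrix(requested, offered):
    """Rows are requested parameter types, columns are offered parameter types."""
    return [[type_similarity(r, o) for o in offered] for r in requested]


def best_match_score(requested, offered):
    """Best one-to-one pairing of parameters, normalized by the longer list."""
    if not requested and not offered:
        return 1.0
    if not requested or not offered:
        return 0.0
    # type_similarity is symmetric, so the shorter list can always index the rows.
    if len(requested) > len(offered):
        requested, offered = offered, requested
    matrix = similarity_matrix(requested, offered)
    best = 0.0
    # Exhaustive search is acceptable for the short parameter lists of one operation.
    for cols in permutations(range(len(offered)), len(requested)):
        total = sum(matrix[row][col] for row, col in enumerate(cols))
        best = max(best, total)
    return best / len(offered)


print(best_match_score(["string", "int"], ["float", "string"]))
```

For the two parameter lists above, the best pairing matches `string` with `string` (1.0) and `int` with `float` (0.5), so the score is 0.75. Normalizing by the longer list penalizes unmatched parameters.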

Learning time is related to learning efficiency, reliability refers to the robustness of the virtual class, fidelity reflects the quality of the course content, and security is related to privacy and copyright issues.

We introduce 4 similarity tests to distinguish semantic terms or contexts. First, lexical similarity, which is often used with ontologies, is calculated from the minimal distance between two words or two terms in a conceptual hierarchy. Intuitively, the lexical similarity is sensitive to the shortest path that connects the concepts and to their depth in the hierarchy. Second, we evaluate the attribute similarity based on domain knowledge. This requires revealing hyperonymy or hyponymy relations. Third, interface similarity makes sure that the selected services can be combined to form a conglomeration of services, working in tandem to offer value-added services. Fourth, we consider the QoS similarity specified by the following weighted sum of QoS attributes:

$$\mathrm{Sim}_{QoS}(S_1, S_2) = w_T \cdot \mathrm{sim}_{Time}(S_1, S_2) + w_R \cdot \mathrm{sim}_{Reliability}(S_1, S_2) + w_F \cdot \mathrm{sim}_{Fidelity}(S_1, S_2) + w_S \cdot \mathrm{sim}_{Security}(S_1, S_2) \tag{3}$$

where $\mathrm{sim}_{Time}()$, $\mathrm{sim}_{Reliability}()$, $\mathrm{sim}_{Fidelity}()$, and $\mathrm{sim}_{Security}()$ are the similarity measures of the 4 QoS attributes and $w_T$, $w_R$, $w_F$, and $w_S$ are their weights, respectively.
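The first and fourth similarity tests can be sketched in a few lines of Python. Both parts rest on assumptions that the text does not fix: the lexical score uses the well-known Wu-Palmer measure over a small hand-made hierarchy (a stand-in for WordNet or HowNet), and the per-attribute QoS similarity is taken to be one minus the absolute difference of values normalized to [0, 1]. The hierarchy, the equal weights, and all names are hypothetical.

```python
# Toy concept hierarchy (child -> parent); a hypothetical stand-in for WordNet/HowNet.
PARENT = {
    "service": None,
    "learning_service": "service",
    "common_service": "service",
    "courseware": "learning_service",
    "grading": "learning_service",
    "logging": "common_service",
}


def path_to_root(concept):
    """Return the concept followed by its ancestors, ending at the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def lexical_similarity(a, b):
    """Wu-Palmer style score: 2 * depth(lcs) / (depth(a) + depth(b)), root depth 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # The first shared node on the way up from a is the deepest common ancestor.
    lcs = next(c for c in path_a if c in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def attribute_similarity(x, y):
    """Similarity of two QoS values normalized to [0, 1] (assumed measure)."""
    return 1.0 - abs(x - y)


def sim_qos(s1, s2, weights):
    """Weighted sum of per-attribute similarities, following the form of Eq. (3)."""
    return sum(w * attribute_similarity(s1[k], s2[k]) for k, w in weights.items())


# Equal weights are an arbitrary choice for this example.
WEIGHTS = {"time": 0.25, "reliability": 0.25, "fidelity": 0.25, "security": 0.25}
s1 = {"time": 0.5, "reliability": 1.0, "fidelity": 0.75, "security": 0.5}
s2 = {"time": 0.5, "reliability": 0.5, "fidelity": 0.75, "security": 1.0}

print(lexical_similarity("courseware", "grading"))
print(sim_qos(s1, s2, WEIGHTS))
```

In this toy hierarchy, `courseware` and `grading` share the parent `learning_service` and score 2/3, while `courseware` and `logging` meet only at the root and score 1/3. The two example services agree on time and fidelity and differ by 0.5 on reliability and security, giving a QoS similarity of 0.75.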

5. Experiment Results and Analysis

We evaluate the performance of our service matchmaking method by using three well-recognized metrics, namely the service recall rate, matchmaking precision, and user scalability. The recall rate is the proportion of relevant services that are successfully retrieved. The precision rate is the proportion of retrieved services that are accurately matched. Scalability refers to how the matchmaking overhead grows with the number of services.

5.1 Experimental Setup

We have implemented the matchmaking method at Zhejiang University as a Java program named ServMat. The system has five different matchmaking functions according to different similarity computation methods. The first four functions compute the similarity of common properties, special properties, interface, and QoS, whereas the last combines the above four similarities.

To generate the test sets for the matchmaking experiments, we developed a tool based on the IBM XML Generator. We generate 100 service files in five categories and fill each category with 20 service files. The categories are courseware service (CS), grading service (GS), discussion service (DS), whiteboard service (WS), and library service (LS). For the special properties of each service, we used the OpenCyc knowledge base (http://www.opencyc.org), one of the largest ontology databases available today.

For each matchmaking function of ServMat, we conduct five sets of experiments. In each experiment, five discovery tasks were performed to find the appropriate DL services from the whole test set. There are 25 discovery tasks in total, and each one has 100 iterations of service matchmaking. The matched services are ranked according to their similarity scores to the requests. The services with similarity larger than a certain threshold are considered matched. The default threshold is 60%. We run our experiments on an IBM X260 server with a 2.0-GHz Intel Xeon MP processor and 1 GB of RAM, running the Red Hat Linux operating system.

5.2 Recall Rate and Precision of Service Matchmaking

Figure 4 shows the recall rate and precision for each experiment. The first experiment evaluates the similarity of Web services by summing up the similarity of their common properties through the lexical similarity assessment method. The second one calculates the similarity of their special properties by using the attribute similarity assessment method. The third one estimates the similarity of Web services by comparing their interfaces. The fourth one measures the service similarity by comparing their QoS parameters. Since UDDI does not provide dynamically updated QoS parameters for Web services, we used a set of simulated data for this experiment. The last one combines all four similarities defined in the Web service conceptual model.
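As a worked illustration of how threshold matching feeds the two metrics, the Python sketch below ranks candidate services by similarity, keeps those above the threshold, and computes recall and precision. It assumes the standard information-retrieval definitions (recall as the fraction of relevant services retrieved, precision as the fraction of retrieved services that are relevant); the service identifiers and scores are made up for the example and are not experimental data.

```python
def match(scores, threshold=0.6):
    """Rank services by similarity and keep those scoring above the threshold."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [service_id for service_id, score in ranked if score > threshold]


def recall_and_precision(retrieved, relevant):
    """Standard IR metrics over a retrieved list and a set of relevant services."""
    hits = len(set(retrieved) & set(relevant))
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical similarity scores of four services against a courseware request.
scores = {"cs1": 0.9, "cs2": 0.7, "gs1": 0.65, "cs3": 0.4}
relevant = {"cs1", "cs2", "cs3"}

retrieved = match(scores)
recall, precision = recall_and_precision(retrieved, relevant)
print(retrieved, recall, precision)
```

With the default 60% threshold, three services are retrieved and two of them are relevant, so recall and precision are both 2/3. Raising the threshold to 80% keeps only `cs1`, which lifts precision to 1.0 and lowers recall to 1/3.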


[Figure 4(a): Service recall rate of 5 distance learning service classes. Y-axis: recall rate (%), 0 to 100; x-axis: service category (CS, GS, DS, WS, LS). Series: common property, special property, interface, QoS, and overall similarity.]

[Figure 4(b): Precision of semantic service matchmaking. Y-axis: precision (%), 0 to 100; x-axis: service category (CS, GS, DS, WS, LS). Series: common property, special property, interface, QoS, and overall similarity.]

Figure 4: Service recall rate and precision of five distance learning classes: courseware (CS), grading (GS), discussion (DS), whiteboard (WS), and library services (LS)

5.3 Scalability of Service Matchmaking

A good service matchmaking method should scale well with the number of services being accessed. After accepting a service discovery request, it must return the target services quickly even for a repository with a large number of services. We evaluate the service matchmaking time for service repositories of different scales. We generate 5 test sets with 50, 100, 200, 500, and 1000 service request files. Each file is randomly assigned one of the five aforementioned categories of DL services. For each set of services, two discovery tasks with thresholds of 60% and 80% are performed to retrieve Courseware Services. Figure 5 shows the service matchmaking overheads for the 5 similarity tests using the ServMat system.
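The shape of such a scalability measurement can be sketched with a small Python timing harness. It is only an outline under loud assumptions: the repositories are synthetic, each service carries a random precomputed score in place of a real similarity computation, and the function names are hypothetical. The timings it produces say nothing about ServMat.

```python
import random
import time

CATEGORIES = ["CS", "GS", "DS", "WS", "LS"]


def make_repository(size, seed=0):
    """Build a synthetic repository; category and score are random placeholders."""
    rng = random.Random(seed)
    return [
        {"id": i, "category": rng.choice(CATEGORIES), "score": rng.random()}
        for i in range(size)
    ]


def discover(repository, category, threshold):
    """Placeholder matchmaking: keep services of one category above the threshold."""
    return [
        s["id"] for s in repository
        if s["category"] == category and s["score"] > threshold
    ]


def time_discovery(sizes, threshold):
    """Return the wall-clock discovery time in milliseconds for each repository size."""
    timings = {}
    for size in sizes:
        repository = make_repository(size)
        start = time.perf_counter()
        discover(repository, "CS", threshold)
        timings[size] = (time.perf_counter() - start) * 1000.0
    return timings


for threshold in (0.6, 0.8):
    print(threshold, time_discovery([50, 100, 200, 500, 1000], threshold))
```

A real measurement would replace `discover` with the actual similarity functions and average over repeated runs, since single timings of short operations are noisy.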

[Figure 5(a): Threshold of acceptable matchmaking = 60%. Y-axis: service matchmaking time (ms), 0 to 3000; x-axis: number of services (50, 100, 200, 500, 1000). Series: common property, special property, interface, QoS, and overall similarity.]

[Figure 5(b): Threshold of acceptable matchmaking = 80%. Y-axis: service matchmaking time (ms), 0 to 3300; x-axis: number of services (50, 100, 200, 500, 1000). Series: common property, special property, interface, QoS, and overall similarity.]

Figure 5: Service matchmaking overhead of five semantic testing experiments

From the plotted results in Fig. 5, we make three observations:

(1) The matchmaking overhead of each individual function grows piecewise linearly with the number of requests. This implies good scalability of the four matchmaking methods. Only a polynomial increase in matchmaking overhead is observed.

(2) The overhead of the overall similarity test (top curve) is less than the sum of the 4 individual similarity tests (lower 4 curves). This is because the individual tests are not totally independent. Some of the tests can be aborted early based on the decisions of other tests.

(3) The matchmaking overhead of the overall similarity test increases by about 9% as the threshold changes from 60% to 80%. However, the results of the 4 component tests are independent of the threshold applied. This implies that the threshold only affects the matchmaking precision shown in Fig. 4(b), not the overhead involved.

6. Conclusions

Semantic web services are embraced by many distance learning (DL) promoters, but very little was done in the past to apply semantic networks, Grid infrastructure, or ontology metadata to facilitate DL services. We build a scalable DL system by integrating various web/Grid services and e-learning tools. To upgrade online course delivery, we explore courseware repositories, user communications, web-service discovery, and ontology metadata in various e-learning services. Fast discovery of DL services is essential to distance learners, instructors, and administrators. To discover services automatically and accurately, we propose a new semantic service matchmaking scheme, which distinguishes the similarity of learning services to promote the quality of service (QoS) in the DL process. Our similarity assessment covers lexical, attribute, interface, and QoS requirements in DL. Our experimental results suggest a high service recall rate and high precision in matchmaking. The scheme scales well with low overhead as the number of services increases.

References:

[1] R. T. Abler and I. G. Wells, “Distributed Engineering Education: Evolution of the Telecollaboration Stations for Individualized Distance Learning”, IEEE Trans. on Education, Vol. No.3, Aug. 2005.

[2] M. Cai, M. Frank, B. Yan, and R. MacGregor, “A Subscribable Peer-to-Peer RDF Repository for Distributed Metadata Management”, Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 2(2), 2005.

[3] J. Hai, et al, “APPLE: A Novel P2P Based e-Learning Environment”, Proc. of IWDC, 2004, pp. 52-62.

[4] H. Hoschek, “The Web Service Discovery Architecture”, Proc. of the Int'l IEEE/ACM Supercomputing Conference (SC 2002), Baltimore, USA, November 2002.

[5] G. A. Miller, “WordNet: A Lexical Database for English”, Communications of the ACM, Vol. 38, No. 11, 1995.

[6] Z. Dong, “Knowledge Description: What, How and, Who?”, Proc. of the International Symposium on Electronic Dictionaries, 1988.

[7] V. M. Milutinovic and N. Skundric, “Will Distance Learning Create a Global University?”, IEEE Computer, 36(3), 2003, pp. 98-100.

[8] V. Pankratius and G. Vossen, “Towards E-Learning Grids: Using Grid Computing in Electronic Learning”, Proc. IEEE Workshop on Knowledge Grid and Grid Intelligence, 2003, pp. 4-15.

[9] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara, “Semantic Matchmaking of Web Services Capabilities”, First International Semantic Web Conference on The Semantic Web, LNCS 2342, Springer-Verlag, 2002, pp. 333-347.

[10] X. Qiu and A. Jooloor, “Web Service Architecture for e-Learning”, Journal of Systemics, Cybernetics and Informatics, Volume 3, Issue 5, 2006.

[11] Tao, et al, “The Semantic Aspects of e-Learning: Using the Knowledge Life Cycle to Manage Semantics for Grid and Service Oriented Systems (Speech)”, Proc. of 1st Int'l Conference on Advanced Technology for Enhanced Learning, 2005.

[12] J. Wu and Z. Wu, “Similarity-based Web Service Matchmaking”, Proc. of the IEEE International Conf. on Service Computing (SCC'05), 2005.

Biographical Sketches

Jian Wu is an assistant professor in the College of Computer Science, Zhejiang University, China, where he received the BS and PhD degrees in 1999 and 2003, respectively. His research interests are in Semantic Web, Web Service, and Data Mining. Contact him at [email protected].

Min Cai received his BS and MS degrees in Computer Science from Southeast University, China, in 1998 and 2001, respectively. He is currently a Ph.D. candidate in Computer Science at the University of Southern California. His research interests include intrusion detection, P2P and grid computing, and web services technologies. His email address is [email protected].

Shuiguang Deng received the BS degree in 2002 from the College of Computer Science, Zhejiang University, China, where he is currently a Ph.D. student. He is a recipient of the Microsoft Fellowship 2005, and his research focuses on Workflow, Web service, and Semantic Web.

Kai Hwang is a Professor of Electrical Engineering and Computer Science at the University of Southern California. He received the Ph.D. degree from the University of California, Berkeley. The work reported here was done during his visit to Zhejiang University in 2006. An IEEE Fellow, Dr. Hwang specializes in computer architecture, parallel processing, Internet and wireless security, P2P and Grid computing, and distributed computing systems. Contact him at [email protected] or visit the web site: http://gridsec.usc.edu/Hwang.html.
