Defining the grid: a snapshot on the current view

J Supercomput (2007) 42: 3–17 DOI 10.1007/s11227-006-0037-9 Defining the grid: a snapshot on the current view Heinz Stockinger Published online: 17 ...

Author: Terence Wells

J Supercomput (2007) 42: 3–17 DOI 10.1007/s11227-006-0037-9

Defining the grid: a snapshot on the current view Heinz Stockinger

Published online: 17 March 2007 © Springer Science+Business Media, LLC 2007

Abstract The term “Grid” was introduced in early 1998 with the launch of the book “The Grid. Blueprint for a new computing infrastructure”. Since that time many technological changes have occurred in both hardware and software. One of the most important ones seems to be the wide acceptance of Web services. Although the basic Grid idea has not changed much in the last decade, many people have different ideas about what a Grid really is. In the following article we report on a survey where we invited many people in the field of Grid computing to give us their current understanding.

Keywords Grid computing · Service oriented architecture · Web services · Standards · Survey

“Computational Grids are the equivalent to the electrical power Grid” Foster I, Kesselman C (1998) The Grid. Blueprint for a new computing infrastructure. Morgan Kaufmann.

“With Web Services we allow a thousand flowers to bloom. With a Grid we organize the planting and growth of a crop of plants to make harvesting easier” [MA].

H. Stockinger
Swiss Institute of Bioinformatics (Vital-IT Group), Bâtiment Génopode, Quartier Sorge, CH-1015, Lausanne, Switzerland
e-mail: [email protected]

1 Introduction

The ideas of Grid computing have been around for much longer than the advent of the book [4] edited by Ian Foster and Carl Kesselman. However, the launch of the book started a new era in computing which created an entire new research field: Grid computing. The original ideas and definitions compare a computing Grid with the electric power Grid [4]. Actually, the names are even reflected in European companies such as the Austrian Power Grid, swissgrid, etc. representing national electricity Grids. In addition to this original vision, Ian Foster gave the following checklist [5] that was widely accepted:

(1) coordinates resources that are not subject to centralized control . . .
(2) . . . using standard, open, general-purpose protocols and interfaces . . .
(3) . . . to deliver nontrivial qualities of service

More recently, Ian Foster and Steve Tuecke gave a clear description of what they mean by Grid and service-oriented architecture [7]. The article also gives definitions for utility computing and on-demand computing and the differences to Grid computing. Grid definitions from other authors can be found in [1, 2, 6, 8].

In general, computer science and software engineering sometimes do not have definitions as strict as those in the fields of physics or mathematics. As a result of this “lack of definitions” many Grid researchers or people working with Grid technology have different views on what a Grid is. The most common discrepancies are in the definition of the hardware. (For some a local cluster with a middleware system on top is a Grid whereas others believe that a wide-area network connection has to be involved.) Other main discrepancies are on the software side: what actually makes a piece of software a “Grid software”? Is any kind of middleware using Grid security already Grid software? etc. Most of us have thought about (and discussed) similar questions probably without finding a conclusive answer. Due to the recent advances in Web and Grid service technologies, it is often not clear where to draw the line between Web services and Grid services [10].

We are particularly interested in the current view of Grid researchers on what constitutes a Grid. We therefore conducted a survey in early 2006 and invited people to express their views on Grids. The following article reports on opinions collected from many researchers in the world-wide Grid community and tries to focus on the basic characteristics.
Having an idea about the current view one can get an impression of how researchers in the Grid domain perceive Grid concepts and how they can be applied to other science domains.

2 Background on the survey

In spring 2006 we started a survey where we contacted more than 170 Grid researchers all over the globe to give us their current views on how they define the Grid. The criteria for the survey were not to influence the answers, i.e. we refrained from giving questions or definitions to either agree or disagree with. Contacted people should have the maximum freedom in their definitions. The main guideline was the following:

Try to define what are the important aspects that build a Grid, what is distinctive, and where are the borders to distributed computing, Internet computing etc.

Additionally, people were asked to give precise answers of a maximum of 0.5–1 page. More than 40 people responded to this call, and a distilled summary can be found in this article. We are aware that the freedom we gave to people results in difficulties in summarizing all the opinions that were submitted. However, it also reflects reality since many researchers have different views. Given the pool of answers, we classify the responses according to a few categories that are characteristic for computational Grids.

3 Survey results

One of the main interests of the survey was to find out if people have a more-or-less common understanding of a Grid or if there are many conflicting opinions. The result is, of course, biased in the sense that we mainly asked researchers actively working in the field. This has the advantage that we get a more condensed view of what the community thinks rather than the general public. There is also no obvious way to evaluate the received answers. In a first pass, we highlighted all the main keywords that were used to describe the Grid. Not surprisingly, many similar words and phrases were used to describe the vision as well as main characteristics. Therefore, in a second pass a classification method was used to categorize the answers according to:

(1) Grid vision, i.e. what is the basic idea behind Grid computing and what are the main goals to be achieved
(2) Differences with respect to other computing domains such as distributed computing, Internet and Web computing
(3) Grid characteristics, i.e. what actually “makes” a Grid? It is sometimes easier to give a set of characteristics and features to describe a concept.

In the following subsections we describe the results of the survey based on the answers we received. Often, we cite people directly, indicated by their initials. For details on the actual person represented by the initials refer to Sect. 5. Sometimes people agree with certain definitions by GGF [11] or CoreGRID [3]. In this case, no direct citations are used. In more detail, several people describe parts of the Grid, describe characteristics and what is different with respect to traditional approaches. There are many overlaps and hardly any contradictions. We consider this as a main message of the paper: the Grid community survey results are rather coherent on what constitutes a Grid.

3.1 The Grid vision

3.1.1 Overview

The overall vision that was given in [4] has not changed but a few more additions were given, such as the ones below. For instance, “in the Grid vision there is a distinction between (a) the Grid approach, or paradigm, that represents a general concept and idea to promote a vision for sophisticated international scientific and business-oriented collaborations and (b) the physical instantiation of a production Grid based on available resources and services to enable the vision for sophisticated international scientific and business-oriented collaborations” [GvL].


A Grid infrastructure must provide a set of technical capabilities, as follows [7]:

• “Resource modeling. Describes available resources, their capabilities, and the relationships between them to facilitate discovery, provisioning, and quality of service management.
• Monitoring and notification. Provides visibility into the state of resources (and notifies applications and infrastructure management services of changes in state) to enable discovery and maintain quality of service. Logging of significant events and state transitions is also needed to support accounting and auditing functions.
• Allocation. Assures quality of service across an entire set of resources for the lifetime of their use by an application. This is enabled by negotiating the required level(s) of service and ensuring the availability of appropriate resources through some form of reservation; essentially, the dynamic creation of a service-level agreement.
• Provisioning, life-cycle management, and decommissioning. Enables an allocated resource to be configured automatically for application use, manages the resource for the duration of the task at hand, and restores the resource to its original state for future use.
• Accounting and auditing. Tracks the usage of shared resources and provides mechanisms for transferring cost among user communities and for charging for resource use by applications and users.”

In addition to that, security is an important aspect [GA].

“We can consider the grid as the combination of distributed, high-throughput and collaborative systems for the effective sharing and distributed coordination of resources which belong to different control domains” [MP]. Generally, a Grid provides a “distributed computing power infrastructure. It is supposed to provide researchers (users) with a single entry point to launch jobs” [LF]. Simply put, Grid means “distributed computing across multiple administrative domains” [DS].
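As a rough illustration of the allocation and accounting capabilities listed above, the following Python sketch reserves CPUs on the first resource with enough free capacity and records usage per user. It is only a toy model under stated assumptions: the names `Resource`, `Broker`, `reserve` and `release` are hypothetical and do not come from [7] or from any Grid middleware, and a real allocation service would also negotiate service levels and reservation lifetimes.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical resource description (a stand-in for resource modeling)."""
    name: str
    cpus: int          # total capacity advertised by the resource
    reserved: int = 0  # CPUs currently promised to applications

    def free(self) -> int:
        return self.cpus - self.reserved


@dataclass
class Broker:
    """Hypothetical broker: reservation-based allocation plus simple accounting."""
    resources: list
    usage: dict = field(default_factory=dict)  # user -> total CPUs reserved so far

    def reserve(self, user: str, cpus: int):
        """Reserve CPUs on the first resource with enough free capacity.

        Returns the name of that resource, or None if no resource can
        satisfy the request.
        """
        for resource in self.resources:
            if resource.free() >= cpus:
                resource.reserved += cpus
                self.usage[user] = self.usage.get(user, 0) + cpus
                return resource.name
        return None

    def release(self, name: str, cpus: int) -> None:
        """Return previously reserved CPUs to the named resource."""
        for resource in self.resources:
            if resource.name == name:
                resource.reserved = max(0, resource.reserved - cpus)
```

With two resources of 4 and 8 CPUs, a request for 6 CPUs would land on the larger one, and a later request that no resource can satisfy returns `None` until capacity is released. The `usage` dictionary is the minimal accounting record from which charging could be derived.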
Sometimes the Grid is also said to be the “software environment” [GA] that integrates, virtualizes, and manages distributed resources (software and hardware). Another view is that a Grid is “a very large scale resource management system” [AD].

It is important to point out that a Grid can be built with different technologies, which generally means that there is no such thing as a “typical Grid technology.” “Web services are merely a single mechanism (out of many possible mechanisms) that can be used to build a schedulable grid” [AH]. This is further stressed by the statement that “It is very important to notice that Grid services are the ‘current approach’ since other technologies not related to services could be employed in order to build Grid infrastructures (e.g. software components)” [MB]. “Therefore, a multiplicity of technologies is desirable and may need to be employed concurrently in a heterogeneous Grid” [JM]. Consequently, other technologies that are not Web services based can and will be used to build Grids.

Others consider the Grid “more a concept or movement rather than a system” [ML] which brings people together. It is an enabling factor, much “more sociological or cultural rather than technical” [ML]. Along the same line is the following opinion: “I think one should completely dissociate the Grid definition which is rather a concept to be defined from a user’s point of view, from technical implementation of the architecture, protocols, services and technology. So definitely, defining the Grid would rather be to define a set of features. If we observe a given unidentified system that can achieve these features, then this system can be defined as a Grid” [JS].

To conclude, we also present already commonly agreed definitions by GGF and the CoreGRID Network of Excellence since they were suggested in the survey:

GGF [11]: A system that is concerned with the integration, virtualization, and management of services and resources in a distributed, heterogeneous environment that supports collections of users and resources (virtual organizations) across traditional administrative and organizational domains (real organizations).

CoreGRID (submitted by [TP] for the CoreGRID executive committee): A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide location independent, pervasive, reliable, secure and efficient access to a coordinated set of services encapsulating and virtualizing resources (computing power, storage, instruments, data, etc.) in order to generate knowledge.

3.1.2 Classification

Often, people try to classify different “Grid types” according to their main functionalities but this classification is not always agreed upon. However, we try to convey the main ideas. In principle, most people distinguish between pure Computational Grids and the more enhanced Data Grids. However, there are also additional classifications such as [DS]:

(1) “Collaboration Grids: These Grids involve multiple organizations (institutions) and individuals, security domains, protocols, discovery mechanisms, etc.” Important aspects are:
• Widely distributed, virtual organizations (VOs)
• Service level agreements & commercial partnerships
• Business model: increase overall revenue

(2) Enterprise Grids: These Grids are in most ways as technically complex as in item 1) above and involve the complete life cycle of service deployment, provision, management, and decommissioning, just like Collaboration Grids. However, the multiple domains are either absent or highly integrated, at least at a political level. These are the production Grids of major data centers. Important aspects are:
• Virtualization of enterprise resources and applications
• Aggregation and centralization of management
• Business model: reduce total cost of ownership
“In the enterprise security and auditing is even of greater importance” [GA].

(3) “Cluster Grids: Aimed at high performance/throughput computing, these Grids are mostly workload scheduling environments. They tend to be static, rather than dynamic like the above. The services are either generic in nature, e.g. a job submission service, or provide the same service all the time. They do not typically support the whole service life cycle” [DS]. However, clusters themselves (if not connected to other clusters) are typically not called a Grid.


Another way of categorizing Grids is according to their “geographical distribution, their organizational scope and resource ownership” [GM]. We can then distinguish between cluster Grids, campus Grids, enterprise Grids, and global Grids. “A cluster Grid (also called department Grid) contains resources located at one site within one organization, and belonging to a single owner. A campus Grid differs from a cluster Grid in that its resources belong to multiple owners. Unlike campus Grids, enterprise Grids contain resources located at multiple sites. Finally, global Grids contain resources from multiple organizations” [GM]. Collaboration Grids are sometimes also called “Beyond Firewall Grids” [DT]. An alternative way of naming different Grids is “IntraGrid, ExtraGrid and InterGrid” [DT].

Clusters and Grids are sometimes used in the same context but a majority of people surveyed make a clear distinction between Grids and clusters: “The key distinction between clusters and Grids is mainly in the way resources are managed. In case of clusters, the resource allocation is performed by a centralized resource manager and often many nodes cooperatively work together as a single unified resource. In case of Grids, each node has its own resource manager and does not aim for providing a single system view” [RB]. And “nodes in Grids are ‘autonomous’ whereas nodes in clusters are non-autonomous. As a result, Grids use decentralized resource management whereas clusters are centralized and hence have a single system image” [RB]. Another distinction is that “a Grid is composed by different administrative domains, whose resources are managed by dynamic virtual organizations” [MP].

3.1.3 Hardware vs. software

The physical instantiation of a Grid relies on hardware and software components.
Whereas on the hardware side no particular features are identified (sometimes wide-area network connections are considered to be an important part of a Grid [10]), the software side is more distinct. For instance, “the key distinction between the Grid and other distributed computing is the use of Grid middleware” [DK]. However, the definition of middleware is also not always commonly agreed on, so others suggest to “avoid the definitions tied to resource or middleware level” [AM].

3.1.4 Basic services

At least with the advent of the Open Grid Services Architecture it became clear that basically any conventional “service” (in the meaning of Service Oriented Architecture) can be provided by or via a Grid. However, the most basic ones are the following: resource selection, scheduling, secure execution, data management, data integrity and privacy, authentication, and fault recovery.

3.2 Differences/commonalities with other computing domains

In the late 1990s Grid computing emerged as a new domain in computer science, although standard techniques and protocols were taken from related domains such as the Internet, distributed computing or the database community. However, Grid computing cannot be discussed in isolation and has many overlaps with “traditional” domains. Nevertheless, a considerable part of our surveyed contributors make clear distinctions between Grid computing, distributed computing and Internet computing, which is perhaps the most controversial part of the entire survey. A short insight is given here.

3.2.1 Grid vs. distributed computing

Some people consider the Grid as a “general” [DT], some as a “special” form of distributed computing [10] whereas others think that a distinctive feature is the complexity of the Grid in several ways, characterized by scalability and transparency:

• Scalability:
  • “The borders with distributed computing might be defined as the point at which 2-way contexts begin to be replaced by N-way contexts. By context I think we primarily mean security, architecture and programming models” [BC].
  • Number of organizations involved: “the main differences are the (potential) inter-organizational characteristics and the looser dependence between the participating partners (either services or institutions)” [JP].
• Transparency: a Grid should further be “platform agnostic” [SF] and be able to utilize heterogeneous resources (both hardware and software).

However, we can also find opinions that go in the opposite direction such that there is no “line” between distributed and Grid computing, rather “they complement each other and are part of each other” [MT]. In summary, the most important characteristics that might make “the” difference between Grids and distributed computing are the ones stated in Sect. 3.3.

3.2.2 Grid vs. Internet (Web)

The Grid community has adopted and enhanced many Internet and Web service technologies. For instance, a Grid service is a Web service with additional features. However, there is no common agreement about where the border is (if it exists at all) between Internet and Grid computing. Some argue: “whereas in the Internet (Web) messages are exchanged between two points, a Grid provides a higher level of abstraction” [DKr].
Furthermore, the vision goes even beyond that and extends the Internet even more: “similar to today’s World Wide Web as our global information platform, we are building the World Wide Grid to become our global collaboration platform, connecting computers and storage, applications and data, experiments, instruments, sensors and other digital devices” [WG].

Along the lines of the metaphor used on the first page, [MA] makes the following distinction: “I see web services as a computing subsystem that is independently developed and independently deployed across heterogeneous platforms. They are independently managed services that may be composed by other services, e.g. via a workflow enactment. There need not be any a priori design and implementation consistency among the web services to make such composition easy over and above adherence to WSI-like standards. There is no a priori arrangement to permit distributed WS management. In the case of a Grid the available services are independently developed and independently deployed across heterogeneous platforms. However, the designers of a Grid choose to give up some independence between services; instead services comply with commonly agreed higher standards, implementing virtual homogeneity. This chosen consistency is intended to make it easier to deploy software and services across the Grid and easier to compose services offered by a Grid”.

A Grid typically provides a communication layer that enables services to communicate with each other which leads to a similar argumentation as the one above: “. . . discriminates Grids from the Web, which is a (large) set of independent servers” [AD].

A statement more commonly agreed to is that a Grid could be seen as an extension of the Internet. “Therefore, same basic rule can be applied in the Grid world: integration of heterogeneous resources can be achieved by using standardized protocols and services. Internet protocols provide a good basis for linking resources. However, a wider set of standards is needed for advanced functionalities, such as job execution, data management, security operations, etc.” [EI]. A representative argument to underline that the Grid extends the Internet: “Internet services are not a different research field but a part of the Grid research” [MT].

3.2.3 Grid, clusters and P2P systems

P2P technology is more and more used in the Grid domain. A set of characteristics that helps distinguish Grids, clusters and P2P systems is as follows [RB]:

Characteristic          | Cluster             | Grid                                   | P2P
------------------------|---------------------|----------------------------------------|-----------------------------
Population              | Commodity computers | High-end computers                     | Edge of network (desktop PC)
Ownership               | Single              | Multiple                               | Multiple
Discovery               | Membership services | Centralized index & decentralized info | Decentralized
User management         | Centralized         | Decentralized                          | Decentralized
Resource management     | Centralized         | Distributed                            | Distributed
Allocation/Scheduling   | Centralized         | Decentralized                          | Decentralized
Inter-operability       | VIA based           | Some progress (e.g., WSRF)             | No standards
Single system image     | Yes                 | No                                     | No
Scalability             | 100s                | 1000                                   | Millions
Capacity                | Guaranteed          | Varies, but high                       | Varies
Throughput              | Medium              | High                                   | Very high
Speed (Lat., Bandwidth) | Low, high           | High, low                              | High, low
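To make the comparison above easier to query, a few of its rows can be transcribed into a small data structure. The Python sketch below is an illustration only: the selection of rows, the dictionary `COMPARISON` and the helper `differing` are hypothetical choices made here, not part of [RB].

```python
# A subset of the rows of the comparison, as (cluster, grid, p2p) tuples.
SYSTEMS = ("cluster", "grid", "p2p")

COMPARISON = {
    "Ownership": ("Single", "Multiple", "Multiple"),
    "User management": ("Centralized", "Decentralized", "Decentralized"),
    "Resource management": ("Centralized", "Distributed", "Distributed"),
    "Allocation/scheduling": ("Centralized", "Decentralized", "Decentralized"),
    "Inter-operability": ("VIA based", "Some progress (e.g., WSRF)", "No standards"),
    "Single system image": ("Yes", "No", "No"),
    "Scalability": ("100s", "1000", "Millions"),
}


def differing(system_a: str, system_b: str) -> list:
    """Return the characteristics (in table order) on which two systems differ."""
    i, j = SYSTEMS.index(system_a), SYSTEMS.index(system_b)
    return [name for name, row in COMPARISON.items() if row[i] != row[j]]
```

For the rows selected here, `differing("cluster", "grid")` returns every characteristic, whereas `differing("grid", "p2p")` returns only inter-operability and scalability, which matches the impression that Grids and P2P systems share their decentralized management style and differ mainly in standards and scale.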

3.3 Grid characteristics

Grids typically have a set of characteristics. The most dominant ones that people generally agree on are the following ones:

• Collaboration
• Aggregation
• Virtualization
• Service orientation
• Heterogeneity
• Decentralized control
• Standardization and interoperability
• Access transparency
• Scalability
• Reconfigurability
• Security

In addition to that we can identify a set of important topics and aspects:

• Application support
• Computing model
• Licensing model
• Procedures and policies
• Auditing

Collaboration

A commonly agreed aspect of a Grid is sharing of resources in a distributed fashion. Furthermore, a Grid “spans multiple administrative domains seamlessly” [AM]. Some people even go as far as defining “collaboration Grids” [GF]. It is furthermore important that the collaboration provides positive synergies among users and service providers. “Done properly, it will result in synergistic, and potentially emergent, advantages that otherwise will remain unreachable” [JM]. Finally, the resources should be “shared in a fair way” [FvL].

Aggregation

A Grid is more than the sum of all parts: “A Grid aggregates many resources and therefore provides an aggregation of the capacity of the individual resources into a higher capacity virtual resource. The capability of individual resources is preserved. As a consequence, from a global standpoint the Grid enables running larger applications faster (aggregation capacity), while from a local standpoint the Grid enables running new applications” [GM]. The aggregation is also used for “improved performance, higher quality of service, better utilization, and easier access to data” [FD]. Finally, resources can (or should) be added dynamically or statically [DE].

Virtualization

Grid services are often provided with a certain interface that hides the complexity of the underlying resources. This is also known as virtualization, which also provides an abstract “layer” between clients and resources [GA]. Therefore, a Grid provides the “ability to virtualize the sum of parts into a singular wide-area programming model” [BC]. Virtualization covers both data (flat files, databases etc.) and computing resources [WG]. The list of resources to virtualize can be extended as follows [RM]:


• Grid as workflow virtualization: the use of Grid computing services to execute and manage processes across multiple compute platforms.
• Data Grid as data virtualization: the management of shared collections independently of the remote storage systems where the data is stored.
• Semantic Grid as information virtualization: the ability to reason on inferred attributes from multiple independent information repositories.

“Virtualization is based on the ability to manage naming conventions, state information, access methods, and remote operations independently of the remote resource. All of the grid environments require” [RM]:

• Name space virtualization, logical names for resources, users, files, and metadata that are independent of the name spaces used on the remote resource.
• Trust virtualization, the ability to manage authentication and authorization independently of the remote resource.
• Constraint virtualization, the ability to manage access controls independently of the remote resource.
• Access virtualization, the ability to port an arbitrary access mechanism on top of the Grid middleware. For Data Grids, this is the ability to support access through multiple loadable libraries (Windows, Perl, Python, C), Java, Digital libraries (DSpace, Fedora, OAI-PMH), workflow actors (Kepler), Web browsers, etc.
• Network virtualization, the ability to manage transport in the presence of network devices such as firewalls, load levelers, private virtual networks. This typically requires multiple protocols to support client-initiated versus server-initiated I/O, bulk operations versus single-file operations.
• Latency management, the ability to minimize the number of messages sent over wide area networks. Examples include execution of procedures at the remote resource when the complexity (ratio of operations to bytes transmitted) is sufficiently small. The standard case is data filtering or sub-setting.
• Federation, the ability to interoperate across multiple grid environments. This requires the ability to share logical name spaces, and Shibboleth-style authentication. Grids establish trust mechanisms to allow assertions about the authenticity of an individual to be verified from the “home” Grid.

Service orientation

Grids provide services, following the concept of a service-oriented architecture. In the widest sense “all large scale collections of services can be viewed as Grids” [GF].

Heterogeneity

A Grid typically consists of “heterogeneous computing resources” [RM], i.e. there is a variety of different hardware and software components with different performance and latency characteristics.
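Returning to the name space virtualization requirement in the list from [RM] above, the idea of logical names that are independent of the name spaces on the remote resource can be sketched with a small catalog. This is a minimal Python illustration under stated assumptions: the class `LogicalNamespace`, its methods, and the example paths are hypothetical and do not describe the interface of any particular Data Grid.

```python
class LogicalNamespace:
    """Hypothetical catalog mapping logical names to physical locations."""

    def __init__(self):
        # logical name -> list of physical locations (replicas)
        self._replicas = {}

    def register(self, logical: str, physical: str) -> None:
        """Record that a physical location holds a copy of the logical item."""
        self._replicas.setdefault(logical, []).append(physical)

    def resolve(self, logical: str) -> list:
        """Return all physical locations known for a logical name."""
        return list(self._replicas.get(logical, []))

    def move(self, logical: str, old: str, new: str) -> None:
        """Relocate one replica; the logical name seen by users is unchanged."""
        locations = self._replicas[logical]
        locations[locations.index(old)] = new
```

A user would keep referring to a name such as `/grid/run42/events.dat` (an invented example) while `move` changes where a copy physically lives. Decoupling the two name spaces in this way is what lets shared collections be managed independently of the remote storage systems.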


Decentralized control

We have seen this characteristic already in the 3-point checklist by [IF] but we list it here again since it was mentioned several times in the survey answers. In other words, “components are under control of multiple entities, i.e. the key difficulties in Grids lay exactly in not having a single ‘owner’ of the whole system” [WC], i.e. the resources are “under different ownerships” [JM]. “One of the requirements of a Grid is the use of distributed control mechanisms” [MP].

Standardization and interoperability

A Grid “promotes standard interface definitions for services that need to interoperate to create a general distributed infrastructure to fulfill users’ tasks and provide user level utilities” [FD]. “Grid systems that implement one standard must interoperate with Grids that adhere to the same standard” [DE]. “Grid is exposing the need for increased levels of integration of distinct technologies and for increased agreements in the standardization of services. The success of the implementation of the Grid very much depends on these aspects” [JC]. Furthermore, the Grid should provide uniform access to heterogeneous resources through virtualization [GM]. An even stronger statement on standards and interoperability is the following one: “Any Grid not based on standards is wasteful. If you consider Grid services that are not interoperable then the concept just doesn’t work and deliver the promised value. The rule in Grids for us in HP is ‘ruthless standardization’.” [GA]

Access transparency

The Grid “should allow its users to access the computing infrastructure without having to be intimately aware of the underlying architecture or network topology” [SF]. This is sometimes considered “the most distinctive aspect of Grid Computing, that is, the levels of transparency provided for the end-user, through the virtualization of resources” [JC].

Scalability

Even if Grid implementations and infrastructures sometimes do not solve a “new problem”, it is often the scale of data, resources and users that contributes to the additional complexity of a Grid. This is also expressed by the fact that a Grid should be “non-trivial in the sense of what a user was not able to solve earlier” [SV].

Reconfigurability

A Grid should be “dynamically reconfigurable” as it is specified in the definition from CoreGRID.


Security Secure access to resources an essential feature of a Grid. Therefore, “authorized users and applications have a limited number of operations (even none at all)” [JM] that they can run on services. Basically, Grid security is one of the first things that real Grid users have to deal with and therefore is essential for any Grid software system that spans multiple administrative domains. Application support In general, a Grid might support a large variety of different applications. “Applications should also be part of the Grid and the whole Grid environment (where for environment I mean the hardware, middleware, and applications) should be data-driven. In particular, it should be able to react to changes of the system and application behaviors captured by application and system data” [MT]. Computational model In general, a Grid supports “several computational models (e.g., batch, interactive, distributed and parallel computing. . . )” [AD]. Licensing model Since Grids originate from the academic community, there is a “global emphasis on open source software” [FK], which is also followed by several companies that are involved in Grid development. Procedures and policies Grid users and service providers interact with each other in a similar way like on the open market where certain rules have to be followed. Therefore, “procedure and polices” [FG] need to be in place to allow for (coordinated) sharing of resources. 3.4 Discussion Although there is enthusiasm in the Grid community, not every one believes that the high goals defined in the overall Grid vision are achieved satisfactory by today’s Grid implementations. This was also partly evident in the responses we received. The following section reflects on the current status as well as its relevance to the IT community. Status and trend “One of the biggest fears for Grid computing is that it might be seen as today’s sexy technology that will quickly get replaced by tomorrow’s sexy technology” [SF]. 
Grid researchers and technologists have to start pointing to results and applications that use the Grid to solve problems, or to new applications that would have been unachievable without the Grid. A similar opinion is as follows: “Contemporary Grid implementations are still far from initially described image and from being widely adopted” [EI].


Relevance to wider IT community
In a recent market survey we analyzed mainly the middle European IT market and looked at how Grid technologies are or can be applied in a business and/or commercial IT environment [9]. The major outcome was that many companies are using distributed computing technologies but are not yet ready to adopt a Grid computing model. This raises the question of why the Grid is not yet more widespread in the commercial world. Another question from our survey: “Is there something that mainstream corporate IT can gain from the Grid, or is it just reserved for the boffins running nuclear simulations, protein folding experiments, or whatever? How can an IT manager of a bank or insurance company utilize grid technologies to solve his/her business and technical problems?” [SF].

4 Conclusion
The presented survey is one of the first attempts to get an overall view of what the Grid research community thinks about the definition of a Grid. We previously interviewed a set of companies on their perception of Grid usability in a business environment [9] and found that the opinions of IT leaders in industry are rather diverse with respect to Grid computing. Therefore, it was of major interest to see what the Grid community itself thinks about the topic. An interesting result is that there are hardly any big discrepancies within the research community. One might argue that the survey did not yield much new information since Grid researchers agree on the main points. However, the important point is that there is a common understanding of the Grid (vision) even though there have been many technological changes and advances in recent years.

5 Contributors

Init.  Name                        Organization                          Country
AD     Andrea Domenici             University of Pisa                    Italy
AH     Andrew Hanushevsky          Stanford Linear Accelerator Center    USA
AM     André Merzky                University of Amsterdam               The Netherlands
BC     Brian Coghlan               Trinity College Dublin                Ireland
DE     Dietmar Erwin               Research Centre Jülich                Germany
DK     Dan Katz                    Louisiana State University/JPL        USA
DKr    Dieter Kranzlmüller         University of Linz/CERN               Austria/Switzerland
DL     Domenico Laforenza          CNR Pisa                              Italy
DS     Dave Snelling               Fujitsu                               UK
DT     Domenico Talia              University of Calabria                Italy
EI     Emir Imamagic               University of Zagreb                  Croatia
EL     Erwin Laure                 CERN                                  Switzerland
FD     Flavia Donno                CERN                                  Switzerland
FG     Fabrizio Gagliardi          Microsoft                             Switzerland
FK     Fotis Karayannis            GRNet                                 Greece
FvL    Frank van Lingen            California Institute of Technology    USA
GA     Greg Astfalk                HP                                    USA
GF     Geoffrey Fox                Indiana University                    USA
GM     Gabriel Mateescu            National Research Council Canada      Canada
GvL    Gregor von Laszewski        Argonne National Lab                  USA
IF     Ian Foster                  Argonne National Lab/U. Chicago       USA
JC     Jose Cunha                  University of Lisbon                  Portugal
JM     John Morrison               University College Cork               Ireland
JP     Jean-Marc Pierson           INSA Lyon                             France
JS     Jean Salzemann              CNRS Clermont-Ferrand                 France
LC     Lorenzo Cerutti             Swiss Institute of Bioinformatics     Switzerland
LF     Laurent Falquet             Swiss Institute of Bioinformatics     Switzerland
MA     Malcolm Atkinson            National e-Science Centre             UK
MB     Miguel Bote-Lorenzo         University of Valladolid              Spain
ML     Max Lemke                   European Commission                   Belgium
ML     Miron Livny                 University of Wisconsin               USA
MP     Maria S. Perez              Technical University of Madrid        Spain
MT     Michela Taufer              University of Texas at El Paso        USA
RB     Rajkumar Buyya              University of Melbourne               Australia
RM     Reagan Moore                San Diego Supercomputing Center       USA
RM     Rodrigo Fernandes de Mello  University of São Paulo               Brazil
RP     Ron Perrot                  Queen’s University Belfast            UK
SF     Stephen Flinter             Science Foundation Ireland            Ireland
SV     Sathish Vadhiyar            Indian Institute of Science           India
TP     Thierry Priol               CoreGRID                              France
WC     Walfredo Cirne              Federal University of Campina Grande  Brazil
WG     Wolfgang Gentzsch           D-Grid Initiative                     Germany


Acknowledgement
HS is supported by the EU project EMBRACE Grid which is funded by the European Commission within its FP6 Programme, under the thematic area “Life sciences, genomics and biotechnology for health,” contract number LUNG-CT-2004-512092.

Active contributions from: Greg Astfalk, Malcolm Atkinson, Miguel Bote-Lorenzo, Rajkumar Buyya, Lorenzo Cerutti, Walfredo Cirne, Brian Coghlan, Jose Cunha, Andrea Domenici, Flavia Donno, Dietmar Erwin, Laurent Falquet, Stephen Flinter, Ian Foster, Geoffrey Fox, Fabrizio Gagliardi, Wolfgang Gentzsch, Andrew Hanushevsky, Emir Imamagic, Fotis Karayannis, Daniel S. Katz, Dieter Kranzlmüller, Domenico Laforenza, Erwin Laure, Max Lemke, Rodrigo Fernandes de Mello, Miron Livny, Gabriel Mateescu, Rodrigo Mello, André Merzky, Reagan Moore, John Morrison, Maria S. Perez, Ron Perrot, Jean-Marc Pierson, Thierry Priol, Jean Salzemann, Dave Snelling, Michela Taufer, Domenico Talia, Sathish Vadhiyar, Frank van Lingen, Gregor von Laszewski.

References
1. Bote-Lorenzo ML, Dimitriadis YA, Gómez-Sánchez E (2004) Grid characteristics and uses: a grid definition. In: First European across grids conference, Santiago de Compostela, Spain, February 13–14, 2004
2. Chetty M, Buyya R (2002) Weaving computational grids: how analogous are they with electrical grids? Comput Sci Eng 4(4):61–71, IEEE Computer Society Press and American Institute of Physics, USA, July–August 2002
3. CoreGRID network of excellence (2006) http://www.coregrid.org
4. Foster I, Kesselman C (1998) The Grid. Blueprint for a new computing infrastructure. Morgan Kaufmann
5. Foster I (2002) What is the Grid? A three point checklist. http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf
6. Foster I, Kesselman C, Tuecke S (2001) The anatomy of the grid: enabling scalable virtual organizations. Int J Supercomput Appl 15(3):200–222
7. Foster I, Tuecke S (2005) Describing the elephant: the different faces of IT as services. ACM Queue 3(6):26–29
8. Grimshaw A (2002) What is a Grid. Grid Today 1(26). http://www.gridtoday.com/02/1209/021209.html
9. Schikuta E, Donno F, Stockinger H, Vinek E, Wanek H, Weishäupl T, Witzany C (2005) Business in the Grid: project results. In: 1st Austrian grid symposium, OCG Verlag, Hagenberg, Austria, December 1–2, 2005
10. Stockinger H (2006) Grid computing: a critical discussion on business applicability. IEEE DS Online 7(6), art. no. 0606-o6002
11. Treadwell J (ed) (2005) Open grid services architecture glossary of terms. GFD-I.44, Jan 25, 2005. http://www.ggf.org/documents/GFD.44.pdf

Heinz Stockinger
Heinz Stockinger’s main research interests are Grid and distributed computing. He has worked on several Grid projects in Europe (mainly at CERN) and in the USA. Affiliated with the Swiss Institute of Bioinformatics, he works for the EMBRACE Grid project. He has been appointed “Privatdozent” at the University of Vienna, Austria. Additionally, he is a lecturer at the Swiss Federal Institute of Technology (EPFL, Lausanne). Heinz holds a Ph.D. degree in Computer Science and Business Administration from the University of Vienna.