A Survey on Cloud Storage

Author: Chloe Norman

JOURNAL OF COMPUTERS, VOL. 6, NO. 8, AUGUST 2011

Jiehui JU1
1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou, China
Email: [email protected]

Jiyi WU2,3, Jianqing FU3, Zhijie LIN1,3
2. Key Lab of E-Business and Information Security, Hangzhou Normal University, Hangzhou, China
3. School of Computer Science and Technology, Zhejiang University, Hangzhou, China
Email: [email protected]; [email protected]; [email protected]

Abstract: As interest in the cloud increases, there has been much discussion about the maturity and trustworthiness of cloud storage technologies. Is it still hype, or is it real? Many end users and IT managers are excited about the potential benefits of cloud storage, such as storing and manipulating data in the cloud and capitalizing on the promise of higher-performance, more scalable, and cheaper storage. In this paper, we present a typical Cloud Storage system architecture, a reference Cloud Storage model, and a Multi-Tenancy Cloud Storage model; survey the past and the state of the art of Cloud Storage; and discuss the advantages and the challenges that must be addressed to implement Cloud Storage. Use cases in various Cloud Storage offerings are also summarized.

Index Terms: Cloud Storage, Cloud Computing, reference model, Multi-Tenancy, survey

Corresponding Author: Jiyi WU, [email protected]

© 2011 ACADEMY PUBLISHER doi:10.4304/jcp.6.8.1764-1771

I. INTRODUCTION

One of IT’s biggest expenses is disk storage. ComputerWorld estimates that in many enterprises storage is responsible for almost 30% of capital expenditures, as average data growth approaches 50% annually in most enterprises. Amid this milieu, there is strong concern that enterprises will drown in the expense of storing data, especially unstructured data. To address this need, Cloud storage services have started to become popular. Ranging from offerings aimed at the enterprise to those aimed at end users, Cloud storage providers offer large reductions in capacity cost, the elimination of labor required for storage management and maintenance, and immediate provisioning of capacity at a very low cost per terabyte.

Cloud storage, though, is not a brand new concept. Its central ideas are related to past service bureau computing paradigms and to those of the application service providers and storage service providers of the late 1990s. This time, however, the economic situation and the advent of new technologies have sparked strong interest in the Cloud storage provider model. With on-premises storage costs already high and rising in many IT departments, Cloud storage providers can lower cost by off-loading the burden of storage management and shielding enterprises from other costs as well, such as storage and network hardware changes. Cloud storage providers deliver economies of scale by using the same storage capacity to meet the needs of many organizations, passing the cost savings on to their customer base.

Cloud Storage is part of a wider concept called Cloud Computing which, according to the National Institute of Standards and Technology, is “a model for enabling convenient, on demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction”. The service models are divided into Cloud Software as a Service (SaaS), Cloud Platform as a Service (PaaS) and Cloud Infrastructure as a Service (IaaS).

Computing resources such as servers and networks can be replaced, but the core of most organizations is their information, usually stored in data centers. For this reason, security and availability are the first issues companies raise when deciding to migrate part of their data to the cloud, generally over the Internet. These precautions are not so different from those taken when data is stored in private data centers, but some analyses specific to migration to a public cloud still need to be done by corporations and service providers.

II. CLOUD STORAGE INFRASTRUCTURE REQUIREMENTS

When you combine technology trends such as virtualization with increased economic pressure, the exploding growth of unstructured data, and regulatory environments that require enterprises to keep data for longer periods of time, it is easy to see the need for a trustworthy and appropriate storage infrastructure.
Whether a cloud is public or private, the key to success is creating a storage infrastructure in which all resources can be efficiently utilized and shared. Because all data resides on the storage systems, data storage becomes even more crucial in a shared infrastructure model. There are ten critical common denominators that must be considered to make cloud storage valuable. These include:

A. Elasticity
Cloud storage must be elastic, so that the underlying infrastructure can be adjusted rapidly to changing subscriber demands while complying with Service Level Agreements (SLAs).

B. Automatic
Cloud storage must be able to be automated, so that policies can be leveraged to make underlying infrastructure changes, such as placing user and content management in different storage tiers and geographic locations, quickly and without human intervention.

C. Scalability
Cloud storage needs to scale quickly and to tremendous capacities. This translates into scalability across objects, performance, users, clients, and capacity, with a single name space across all storage capacity being critical for keeping Opex low.

D. Data Security
For private clouds, security is assumed to be tightly controlled. For public clouds, data should either be stored on a partition of a shared storage system, or cloud storage providers must establish multi-tenancy policies that allow multiple business units or separate companies to securely share the same storage hardware.

E. Performance
A proven storage infrastructure providing fast, robust data recovery is an essential element of a cloud service.

F. Reliability
Enterprise users also want to make sure that their data is reliably backed up for disaster recovery purposes and that it meets pertinent compliance guidelines.

G. Ease of Management
Improved manageability in the face of exploding storage capacity and costs is a major benefit enterprises expect from cloud storage deployment.

H. Ease of Data Access
Ease of access to data in the cloud is critical for integrating cloud storage seamlessly into existing enterprise workflows and for minimizing the learning curve of cloud storage adoption.

I. Energy Efficiency
IT datacenters are becoming bottlenecks and are approaching ceilings on available power, cooling and floor space. Green storage technology enables energy efficiency and waste reduction in storage solutions, leading to an overall lower carbon footprint.

J. Latency
Not all applications are suitable for a Cloud storage model. It is important to measure and test network latency before committing to a migration. Virtual machines can introduce additional latency through the time-sharing nature of the underlying hardware, and unanticipated sharing and reallocation of machines can significantly affect run times.

III. MULTI-TENANCY CLOUD STORAGE REFERENCE MODEL

A. Typical cloud storage system architecture
A typical cloud storage system architecture includes a master control server and several storage servers, as shown in Fig 1.

Figure 1. A typical Cloud Storage system architecture

For some computer owners, finding enough storage space to hold all the data they've acquired is a real challenge. Some people invest in larger hard drives. Others prefer external storage devices like thumb drives or compact discs. Desperate computer owners might delete entire folders worth of old files in order to make space for new information. But some are choosing to rely on a growing trend: cloud storage. While cloud storage sounds like it has something to do with weather fronts and storm systems, it really refers to saving data to an off-site storage system maintained by a third party. Instead of storing information to your computer's hard drive or other local storage device, you save it to a remote database. The Internet provides the connection between your computer and the database. On the surface, cloud storage has several advantages over traditional data storage. For example, if you store your data on a cloud storage system, you'll be able to get to that data from any location that has Internet access. You wouldn't need to carry around a physical storage device or use the same computer to save and retrieve your information. With the right storage system, you could even allow other people to access the data, turning a personal project into a collaborative effort. So cloud storage is convenient and offers more flexibility, but how does it work? Find out in the next section.


B. Cloud Storage reference model
The appeal of cloud storage is due to some of the same attributes that define other cloud services: pay as you go, the illusion of infinite capacity (elasticity), and simplicity of use and management. It is therefore important that any interface for cloud storage support these attributes, while allowing for a multitude of business cases and offerings, long into the future.

The model created and published by the Storage Networking Industry Association (SNIA) shows multiple types of cloud data storage interfaces able to support both legacy and new applications. All of the interfaces allow storage to be provided on demand, drawn from a pool of storage capacity provided by storage services. The data services are applied to individual data elements as determined by the data system metadata. Metadata specifies the data requirements on the basis of individual data elements or of groups of data elements (containers).

As shown in Fig 2, the SNIA Cloud Data Management Interface (CDMI) is the functional interface that applications use to create, retrieve, update and delete data elements in the cloud. Through this interface the client can discover the capabilities of the cloud storage offering and manage containers and the data placed in them. In addition, metadata can be set on containers and their contained data elements through this interface.

Figure 2. Cloud Storage reference model

It is expected that the interface can be implemented by the majority of existing cloud storage offerings. This can be done with an adapter to their existing proprietary interface, or by implementing the interface directly. In addition, existing client libraries such as XAM can be adapted to this interface, as shown in Fig 2.
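The operations described above (capability discovery, container management, and create, retrieve, update and delete on data elements) can be sketched with a small in-memory model. This is only an illustration of the interaction pattern and not the CDMI wire format: the class `ToyCloudStore`, its method names, and the capability and metadata keys are all hypothetical.

```python
class ToyCloudStore:
    """In-memory stand-in for a cloud storage offering (illustrative only)."""

    def __init__(self, capabilities):
        # Capabilities the offering advertises to clients (hypothetical keys).
        self.capabilities = dict(capabilities)
        # container name -> {"metadata": {...}, "objects": {key: {...}}}
        self.containers = {}

    def discover(self):
        """Return a copy of the advertised capabilities."""
        return dict(self.capabilities)

    def create_container(self, name, metadata=None):
        self.containers[name] = {"metadata": dict(metadata or {}), "objects": {}}

    def put(self, container, key, value, metadata=None):
        """Create the data element, or update it if the key already exists."""
        self.containers[container]["objects"][key] = {
            "value": value,
            "metadata": dict(metadata or {}),
        }

    def get(self, container, key):
        return self.containers[container]["objects"][key]["value"]

    def delete(self, container, key):
        del self.containers[container]["objects"][key]


store = ToyCloudStore({"max_object_bytes": 1 << 20, "versioning": False})
store.create_container("reports", metadata={"retention_days": 30})
store.put("reports", "q1.txt", b"draft")    # create
store.put("reports", "q1.txt", b"final")    # update
latest = store.get("reports", "q1.txt")     # retrieve
store.delete("reports", "q1.txt")           # delete
```

A client would typically call `discover()` first and adapt its behaviour to whatever subset of features the offering exposes.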


This interface is also used by administrative and management applications to manage containers, accounts, security access and monitoring/billing information, even for storage that is accessible by other protocols. The capabilities of the underlying storage and data services are exposed so that clients can understand the offering. Conformant cloud offerings may offer a subset of either interface as long as they expose the limitations in the capabilities part of the interface.

C. Multi-Tenancy Cloud Storage
The terms multi-tenant and multi-tenancy are not new; both have been used for many years to describe application architectures designed to support multiple users or “tenants”. With the advent of cloud computing, this terminology has simply been extended to include any cloud architecture, or infrastructure element within that architecture (application, server, network, storage), that supports multiple tenants. Tenants could be separate companies, departments within a company, or even just different applications.

To provide “secure” multi-tenancy and address the concerns of cloud skeptics, a mechanism to enforce separation at one or more layers within the infrastructure is required:
• Application layer. A specially written multi-tenant application, or multiple separate instances of the same application, can provide multi-tenancy at this level.
• Server layer. Server virtualization and operating systems provide a means of separating tenants and application instances on servers and of controlling utilization of and access to server resources.
• Network layer. Various mechanisms, including zoning and VLANs, can be used to enforce network separation. IP security (IPsec) also provides application-independent network encryption at the IP layer for additional security.
• Storage layer. Mechanisms such as LUN masking and SAN zoning can be used to control storage access. Physical storage partitions segregate and assign resources (CPU, memory, disks, interfaces, etc.) into fixed containers.

Achieving secure multi-tenancy may require the use of one or more mechanisms at each infrastructure layer. While mechanisms to support multi-tenancy and enforce separation exist at every infrastructure layer, this paper is primarily concerned with storage and with the requirements for secure and effective storage multi-tenancy in a cloud environment. To understand the full set of storage requirements, it is necessary to consider cloud storage from the perspective of both the tenant (user) and the provider of cloud services.

Cloud computing services can be broken down into a variety of types, ranging from Software as a Service (SaaS), in which the provider delivers specific application services to each tenant, to Data storage as a Service (DaaS), which is virtualized storage on demand over a network. Regardless of the type of cloud service, from a tenant perspective there will be specific requirements that apply directly or indirectly to data storage. Tenant requirements are typically defined in terms of service level agreements (SLAs), which cover a variety of capabilities including:
• Security
• Performance
• Data protection and availability
• Data management

From the provider’s perspective, multi-tenant storage should provide convenient mechanisms for satisfying these and other tenant SLAs, as well as supporting additional capabilities such as:
• Accounting. The ability to monitor usage by each tenant for billing or other purposes.
• Self service. The ability to allow a tenant to perform a defined set of management tasks on their data and the storage they use, thereby offloading these functions from the provider.
• Non-disruptive upgrades and repairs. Downtime in multi-tenant environments may be difficult or impossible to schedule, so maintenance activities must be possible without incurring downtime from the point of view of the tenant.
• Performance management. The ability to balance cost and performance as the lifecycle requirements of data change over time.

Designed to enable multi-tenant storage offerings, the SNIA’s Cloud Data Management Interface (CDMI) for cloud storage and data management integrates and is interoperable with various types of client applications. CDMI offers a standard approach to data portability, compliance and security, as well as the ability to connect one cloud provider to another, enabling compatibility between cloud vendors. Using this approach, a client can discover the capabilities of cloud storage and use the interface to manage data containers and the data elements placed in them. CDMI makes extensive use of metadata to simplify application access and to enable multiple levels of service as required by a diverse set of users. In the storage layer, the CDMI interface can simplify management, since data system metadata can be applied to container hierarchies.
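As a minimal sketch of how metadata applied to a container hierarchy might be resolved for a single data element, the function below merges settings from the root container down to the element's path, with deeper containers overriding shallower ones. The inheritance rule, the path layout, and the metadata keys (`copies`, `encrypt`, `retention_days`) are assumptions made for illustration; they are not taken from the CDMI specification.

```python
def effective_metadata(path, metadata_by_path):
    """Merge metadata from the root container down to `path`.

    Settings on deeper containers override those inherited from above.
    """
    merged = dict(metadata_by_path.get("/", {}))
    current = ""
    for part in (p for p in path.strip("/").split("/") if p):
        current += "/" + part
        merged.update(metadata_by_path.get(current, {}))
    return merged


# Hypothetical per-container settings for two tenants.
policies = {
    "/": {"copies": 2},
    "/tenant-a": {"encrypt": True},
    "/tenant-a/archive": {"copies": 3, "retention_days": 365},
}

archived = effective_metadata("/tenant-a/archive/2011/report.pdf", policies)
other = effective_metadata("/tenant-b/notes.txt", policies)
```

Setting a policy once on a tenant's top-level container is then enough to cover every element beneath it, which is the management simplification the paragraph above refers to.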
For the functional data path interface for data storage, CDMI assigns each data object a separate URI (Uniform Resource Identifier). Since objects can be fetched over standard HTTP using RESTful (REpresentational State Transfer) operations, each data element can be managed as a separate resource. In this way, it is possible to separate and classify data elements and containers for secure access as well as for service levels. The result is a level of isolation suitable for tenant-based, on-demand data access.

IV. CLOUD STORAGE USE CASES

In this section we summarize the use cases in various Cloud Storage offerings.


A. Web Facing Applications
Web facing applications typically use a Cloud Storage offering that provides the data directly to the user’s browser using a URL. The data is typed (MIME) and the browser invokes the appropriate application to view the data. Media (audio, video) files are served as a stream of data, allowing use of parts of the data within the file without requiring all data in the file to have been received by the client.
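The two mechanisms mentioned above, typing the data and serving only part of a file, can be sketched with Python's standard library. The helper names `content_type` and `byte_range` are hypothetical; the inclusive start and end offsets mirror the way HTTP byte ranges are usually expressed, but no real HTTP handling is shown.

```python
import mimetypes


def content_type(name):
    """Guess a MIME type from the file name, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(name)
    return guessed or "application/octet-stream"


def byte_range(data, start, end):
    """Return bytes start..end inclusive, as a partial-content response would."""
    return data[start:end + 1]


photo_type = content_type("photo.jpg")
chunk = byte_range(b"0123456789", 2, 5)
```

A media player can request successive ranges in this way and begin playback before the whole file has arrived.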

Figure 3. Data Storage as a Service

Social media sites include Myspace, Facebook, Twitter, blogs, etc. Cloud Storage is used as an auxiliary storage space augmenting the web facing social application. Pictures and content are stored in Cloud Storage (typically URL based). A content management system is used to keep track of additional metadata associated with the data. Smugmug is an example of this.

B. Unstructured Data Storage
This is a pre-allocated storage space (LUN, filesystem) that is exported via standard client protocols (e.g. WebDAV, NFS, CIFS) and “mounted” on a local machine. Normal POSIX semantics are then available for creating, reading, writing and deleting files. A number of vendors have offerings in this space.

A sub-case is the cloud desktop; examples include iCloud and ThinkGrid. This is the ability to synchronize local client data, from multiple clients, with a Cloud Storage version. Changes are detected, and synchronization is then done asynchronously and opportunistically. Access may or may not be through standard file protocols and URIs. Clients and servers have a way of sharing state describing what has been, or still needs to be, synced.
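One simple way to share state about what needs to be synced is to exchange manifests of content digests and compare them. The sketch below assumes a one-directional sync in which the local copy is authoritative; the function names and the manifest layout are hypothetical and do not describe any particular product.

```python
import hashlib


def manifest(files):
    """Map each path to the SHA-256 digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def plan_sync(local, remote):
    """Compare two manifests; return (paths to upload, paths to delete remotely)."""
    upload = sorted(p for p, digest in local.items() if remote.get(p) != digest)
    delete = sorted(p for p in remote if p not in local)
    return upload, delete


local_files = {"a.txt": b"1", "b.txt": b"new"}
remote_files = {"b.txt": b"old", "c.txt": b"3"}
to_upload, to_delete = plan_sync(manifest(local_files), manifest(remote_files))
```

Because only digests are compared, the plan can be computed cheaply and the actual transfers can be deferred until bandwidth is available, which fits the asynchronous and opportunistic behaviour described above.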

Figure 4. Backup to the Cloud


C. Backup to the Cloud
1) Backup software running on some local machines, with Cloud Storage as the destination
This is local backup software, or a backup server, using Cloud Storage as the destination of backup data.
• Backup software on each machine (e.g. TimeFinder): This is a backup application that only backs up a single machine. Iron Mountain’s Connected Backup for PC is an example.
• Backup server for multiple local machines: This is a local, central backup server that aggregates the use of Cloud Storage for one location. It generally takes the form of an appliance that gives the user an interface for managing it. The appliance also backs up its own metadata.
• File server appliance with embedded backup to the Cloud: This is a local NAS server with integrated backup to the Cloud. Examples include Datto, Seagate Free Agent, Iomega’s StoreCenter IX2, HP’s MediaSmart Server, Cachengo, etc.
A common technique used by some local servers is to have the client computers turn on data sharing (i.e. become CIFS servers), and then have the local backup server act as a CIFS client of each backup client and back up the data in that manner. This is an elegant way to avoid installing third-party backup software on the backup clients.
2) Backup of Cloud Computing data
Backup of the data used in Cloud Computing (IaaS). Example(s): vSphere (includes de-duplication as well).
3) Backup from one cloud provider to another
This is the case of using a second cloud provider as the target of backup data from the first cloud provider.
4) Restore (e.g. “Give me back all my laptop data”)
Restoring is the obvious reason for doing backups in the first place. Most solutions allow both online restores and physical shipment of media to the customer. Examples: Mozy ships DVDs of data; R1Soft offers bare-metal restore.

D. Archive/Retention to the Cloud
This is the use case of using Cloud Storage for archiving data. Theoretically, XAM should be an ideal interface for this. Iron Mountain VFS (not XAM based) is an example.

Figure 5. Archive as a Service


Considerations for the use case:
• Does the user maintain a local copy?
• Does the Cloud provider do virus scans or other operations on your behalf? It’s not useful to have to pull the files back over the wire.
• Retention period: “Keep my files for X amount of time”. This is the case where you define the period of time for which files are guaranteed to be retained.
• Secure deletion: “When it’s gone, it’s REALLY gone”. This is the case where the service provider provides a means of deleting data in such a way that it is truly gone, i.e. not recoverable by any means. A common method is to encrypt the data at rest and then shred the encryption keys.
• eDiscovery: “Satisfy my subpoenas”. This is a service such that, when certain documents must be produced for a court case, the appropriate documents are produced without undue time or cost. Email archiving may be a specialization of this case.

E. Preservation in the Cloud
Preservation is distinguished from Archive/Retention in that the goal of preservation is to actively maintain the upkeep of information, most likely for long periods of time. Many of these repositories leverage RDF as a way to describe the data and its relation to other data. Fedora Commons and other content managers are used to keep track of the metadata. Machine images may be preserved along with the data so that a user can, at a minimum, display the information using the application that generated the data.

F. Databases in the Cloud
Cloud Table Storage falls into the following categories:
1) Horizontally scalable, object-relational: Examples include Microsoft Azure Tables, Google BigTable, Hypertable and SimpleDB.
2) Vertically scalable, traditional relational: SQL services are an example.
3) Document model: CouchDB is an example.
The Cloud Data Storage Interface for Cloud Table Storage was not standardized at the time of writing, but the Cloud Data Management Interface is applicable regardless.

G. Storage for Cloud Computing
1) Storage for IaaS
Traditional data storage which is accessible as part of the computing infrastructure. EC2 leverages S3 as if it were a private cloud.
• Image storage: The image of the guest OS, which is made available to hypervisors for starting a VM.
• Guest auxiliary storage: Provisions the storage space, at a given QoS, which the guest needs beyond the boot storage.

• Application image storage: The application is maintained in the cloud and invoked locally. Maintenance of the application is done centrally. The application is streamed in using a statistically probable execution order, so that not all bits need be present to start executing. This is a function of the distribution network and can be layered on top of the interfaces we are defining.

Figure 6. Storage for Cloud Computing

2) Storage for PaaS
This type of storage is not usually surfaced or manageable.
3) Storage for SaaS (Software as a Service)
This type of storage is not usually surfaced or manageable. A typical example is Salesforce.com, which uses the storage service from Amazon S3.

H. Content Distribution
Distributing data globally for the purposes of decreasing latency and increasing scalability. Examples include:
• “Hot” media serving: move to a point of presence, replicate out to caches, then recover the resources when unused.
• Data transformation en route (e.g. localization, NTSC->PAL).
This use case is a layering on top of the standard interfaces.

I. Cloud Storage Peering (i.e. “Intercloud” Storage)
This is the concept of having the Storage Clouds of different Cloud Storage Providers interoperate with each other (in other words, doing for Storage Clouds what the Internet did for separate, proprietary networks). Possible characteristics:
• Shared storage and replication between cloud storage offerings.
• Distribution of the data across cloud storage providers (possibly via a storage broker that provides a blended rate).
• Data may be erasure encoded as well as replicated.
• Caching and distribution between the client and a “dumb storage” provider, with geographic staging and replication.
• Activation and de-activation relative to some trust model: activation requires assembly from the erasure-coded blocks, decoding and decryption; de-activation involves encryption, encoding and distribution.
• The topology of the network is more nuanced than the typical two-tier processing model, and more dynamic as well.
Examples include “Federated” Cloud Storage, “Cloud Exchange”, Cloud “Bursting”, offloading, and hybrid internal/external clouds.

V. ADVANTAGES AND CHALLENGES

A. Advantages of cloud storage
As with everything, the devil is in the details. Certainly, there are ever more examples of the growing popularity of cloud storage, and valid business reasons for that popularity. Here are five key benefits of using cloud storage and of applications that take advantage of storage in the cloud.
• Ease of management: The maintenance of the software, hardware and general infrastructure needed to support storage is drastically simplified by an application in the cloud. Applications that take advantage of storage in the cloud are often far easier to set up and maintain than an equivalent service deployed on premise. At the customer site, often all that is needed to manage your storage implementation is a simple web browser, leaving the headaches to the service provider.
• Cost effectiveness: For total cost of ownership, cloud storage is a clear winner. Eliminating the costly systems and the people required to maintain them typically provides organizations with significant cost savings that more than offset the fees for cloud storage. The cost of providing the high levels of availability and the scalability an organization needs is also unmatched. The economies of scale achieved by data centers can be matched only by the very largest organizations.
• Lower impact outages and upgrades: Cloud computing typically provides cost-effective redundancy in storage hardware. This translates into uninterrupted service during a planned or unplanned outage. The same is true for hardware upgrades, which are no longer visible to the end user.
• Disaster preparedness: Off site storage isn’t new. Keeping important data backed up off site has been the foundation of disaster recovery since the inception of the tape drive. Cloud storage services not only keep your data off premise, but also make their living by ensuring that they have redundancy and systems in place for disaster recovery.
• Simplified planning: Cloud storage solutions free the IT manager from detailed capacity planning. Cloud-based solutions are flexible and provide storage as needed. This eliminates the need to over provision for storage that may be needed to meet

B. Challenges in the implementation
However, with every type of cloud storage, there are challenges in the implementation (i.e. the devil is in the details).

1) Physical Security
First, understand some things about the data center that hosts the cloud where your data is stored:
• Is the data center physically secure?
• What about its ability to withstand power outages, and for how long?
• Are there multiple, independent (on different grids) electrical power paths?
• How are communications facilities enabled, and where does the fiber enter the facility?
• How many communications providers have a POP (point of presence) at the facility?
• How is the data center certified (SAS 70 Type II)?
World class data centers are expensive, and they are also well understood. What is the tier rating of the data center? (Tier IV is best.) Make sure you do business with a cloud storage service provider that makes use of such facilities.

2) Data encryption
Encryption is a key technology for data security. Understand encryption of data in motion and of data at rest. Remember, security can range from simple (easy to manage, low cost and, quite frankly, not very secure) all the way to highly secure (very complex, expensive to manage, and quite limiting in terms of access). You and the provider of your Cloud Storage solution have many decisions and options to consider. For example, do the Web services APIs that you use to access the cloud, either programmatically or with clients written to those APIs, provide SSL encryption for access? This is generally considered standard. Once the object arrives at the cloud, it is decrypted and stored. Is there an option to encrypt it prior to storing? Do you want to handle encryption yourself before you upload the file, or do you prefer that the cloud storage service do it for you automatically? These are options; understand your cloud storage solution and make your decisions based on the desired level of security.

3) Access Controls
Authentication and identity management are more important than ever, yet they are not really all that different from before. What level of enforcement of password strength and change frequency does the service provider apply? What is the recovery methodology for passwords and account names? How are passwords delivered to users upon a change? What about logs and the ability to audit access? This is not all that different from how you secure your internal systems and data, and it works the same way: if you use strong passwords, changed frequently, together with typical IT security processes, you will protect that element of access.
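The questions about password strength and change frequency can be made concrete with a small policy check. The thresholds used here (a minimum of 12 characters, a mix of letters and digits, and a maximum age of 90 days) are assumptions chosen purely for illustration; a real provider's policy may differ, and the function name `password_ok` is hypothetical.

```python
from datetime import date


def password_ok(password, last_changed, today, min_length=12, max_age_days=90):
    """Check a password against a hypothetical strength and age policy."""
    long_enough = len(password) >= min_length
    mixed = any(c.isdigit() for c in password) and any(c.isalpha() for c in password)
    fresh = (today - last_changed).days <= max_age_days
    return long_enough and mixed and fresh


today = date(2011, 8, 1)
recent_strong = password_ok("correct7horse", date(2011, 6, 1), today)
too_short = password_ok("short1", date(2011, 6, 1), today)
too_old = password_ok("correct7horse", date(2011, 1, 1), today)
```

Asking a provider which of these parameters it enforces, and whether they can be tuned per account, is one practical way to compare offerings.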

4) Service Level Agreements (SLA)
What kind of service commitment is your provider willing to offer you? Are they going to be up 99.9% of the time or 99.99% of the time? And how does that difference affect your ability to conduct your business? What backup strategy does your cloud provider use, and does it include alternative site replication? Do they use one at all, or is backup something you have to provide for? Is there any SLA associated with backup, archive, or preservation of data? If your account becomes inactive (say you don't pay your bill), do they keep your data? For how long? Once again, realize that there are different services, with different features, at different costs, and you get what you pay for.

5) Trusted Service Provider
The trusted service provider is a critical link. Unlike with your in-house IT department, you are now putting your trust in a third party. You must feel confident that they will do what they say they will do. Can they demonstrate that the safeguards they claim are indeed delivered? What is their record? Do you have a successful business relationship with them already, and if not, do you know of others who do? Consider whether they are in business to serve business, or whether storage is simply another service that they offer, focused first on cost per gigabyte rather than on service and support. This is where many IT service providers have made their living: providing world class service and support, along with effective, efficient, low cost infrastructure.
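To make the difference between 99.9% and 99.99% in the SLA item concrete, the permitted downtime can be computed directly. The sketch assumes a 365-day year and availability measured over the whole year; actual SLAs often measure per month and exclude scheduled maintenance.

```python
def yearly_downtime_hours(availability_percent, hours_per_year=365 * 24):
    """Hours per year a service may be down while still meeting the SLA."""
    return (1 - availability_percent / 100) * hours_per_year


three_nines = yearly_downtime_hours(99.9)
four_nines = yearly_downtime_hours(99.99)
four_nines_minutes = four_nines * 60
```

Under these assumptions, 99.9% allows about 8.76 hours of downtime per year, while 99.99% allows about 53 minutes, a tenfold difference that may or may not justify the higher price for a given business.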

Figure 7. Survey on concerns with cloud storage services

As shown in Fig 7, the concerns users have with cloud storage services include security, control, performance, support and vendor lock-in. These challenges include:
• Security (always an issue, and not necessarily specific to cloud storage)
• Data integrity (making sure the stored data is “correct”)
• Power (since you keep copies, you will have extra storage, which consumes additional power)
• Replication time and costs (how fast can you replicate data, since this can be important to data resiliency)
• Cost (how much extra money do you have to pay to buy the extra storage for copies)
• Reliability

VI. CONCLUSIONS AND FUTURE WORK

Cloud Storage systems, which hold a great deal of promise, are not designed to be high-performing file systems but rather extremely scalable, easy-to-manage storage systems. They use a different approach to data resiliency: a redundant array of inexpensive nodes, coupled with object-based or object-like file systems and data replication (multiple copies of the data), to create a very scalable storage system.

This article gives a quick introduction to cloud storage. It covers the key technologies in Cloud Computing and Cloud Storage and several different types of cloud services, and describes the advantages and challenges of Cloud Storage after introducing the Cloud Storage reference model.

ACKNOWLEDGMENT

Funding for this research was provided in part by the Scientific Research Program of Zhejiang Educational Department under Grant No.20071371. We would like to thank the anonymous reviewers for their valuable comments.

REFERENCES

[1] Luis M. Vaquero, Luis Rodero-Merino, Juan Caceres, Maik Lindner. A Break in the Clouds: Toward a Cloud Definition. ACM SIGCOMM Computer Communication Review, 2009, 39(1):50-55.
[2] Wu Jiyi, Ping Lingdi, Pan Xuezeng. Cloud Computing: Concept and Platform. Telecommunications Science, 12:23-30, 2009.
[3] Jonathan Strickland. How Cloud Storage Works [OL], http://communication.howstuffworks.com/cloudstorage.htm, 2010.


[4] Storage Networking Industry Association. Cloud Storage Reference Model, Jun. 2009.
[5] Storage Networking Industry Association. Cloud Storage for Cloud Computing, Jun. 2009.
[6] Luiz Andre Barroso, Jeffrey Dean, Urs Holzle. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003, 23(2):22-28.
[7] Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. The Google file system. In: Proc. of the 19th ACM SOSP. New York: ACM Press, 2003. 29-43.
[8] Robert L. Grossman, Yunhong Gu, Michael Sabala, Wanzhi Zhang. Compute and storage clouds using wide area high performance networks. Future Generation Computer Systems, 2009, 25(2):179-183.
[9] Yunhong Gu, Robert L. Grossman. Sector and Sphere: the design and implementation of a high-performance data cloud. Philosophical Transactions of the Royal Society A, 2009, 367:2429-2445.
[10] Robert L. Grossman, Yunhong Gu. Data Mining Using High Performance Data Clouds: Experimental Studies Using Sector and Sphere. In: Proc. of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, 920-927.
[11] Daniel J. Abadi. Data Management in the Cloud: Limitations and Opportunities. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2009, 32(1):3-12.
[12] Peter Mell, Tim Grance. NIST. Retrieved from http://csrc.nist.gov/groups/SNS/cloud-computing/clouddef-v15.doc, 2010.
[13] S. Lesem. Cloud Storage Strategy. Retrieved from http://cloudstoragestrategy.com/2009/07/security-andcloud-storage-everybody-talksabout-it-but-is-it-really-allthat-different.html, 2010.

Jiehui JU is a lecturer at the School of Information and Electronic Engineering, Zhejiang University of Science and Technology. She was born in 1977, received a B.S. degree from Ningbo University in 1999, and received a Master's degree in computer science and technology from Zhejiang University in 2006. Her research interests include Cloud Computing, SaaS and information security.
