OpenStack: The OpenSource Cloud’s Application in High Energy Physics
That Title’s Overstated
OpenStack: The OpenSource Cloud’s Potential Application in Data Intensive Research
Not as Catchy...
Caveats » I am not a storage or network engineer » I am not a scientist » despite illusions of grandeur.
I am: » a Technical Product Manager. » Dashboard Developer » working for piston{cloud}computing » Pragmatic.
What is OpenStack? » Founded by NASA and Rackspace » The open source cloud computing platform » Feature-rich and massively scalable » Powers cloud storage, compute, and networking » A world-wide open source collaboration
OpenStack as a Cloud OS
» Connects to apps via APIs
» Self-service portals for users and admins
» Cloud operating system: creates pools of resources
» Automates the network
Benefits of OpenStack as a Common Platform
» Easy to migrate data and applications across clouds, based on: security policies, economics, research needs
» No vendor lock-in
» Common layer of data exchange
» Less exposed to security issues than public cloud, but still interoperable.
3 Major OpenStack Components
» OpenStack Compute/Nova: provision and manage large networks of virtual machines
» OpenStack Object Store/Swift: create petabytes of reliable storage using standard servers
» OpenStack Image Service/Glance: catalog and manage large libraries of server images
» Other components: Dashboard, Load Balancing, Authentication...
Compute/Nova Key Features
1. REST-based API
2. Horizontally and massively scalable
3. Hardware agnostic: supports a variety of standard commodity hardware
4. Hypervisor agnostic: support for Xen, Citrix XenServer, Microsoft Hyper-V, KVM, UML, LXC and ESX
[Diagram: HOST 1, HOST 2, HOST 3, HOST 4, etc., each running VMs]
» Hypervisor: turns 1 server into many “virtual machines” (instances or VMs) (VMware ESX, Citrix XenServer, KVM, etc.)
» Hypervisors provide an abstraction layer between apps and hardware (servers)
» OpenStack pools servers; you run operating systems and applications on VMs instead of physical computers
Nova close up
» nova-api daemon » endpoint for all OpenStack or EC2 API queries
» nova-scheduler process » takes a virtual machine instance request from the queue and determines which compute server host it should run on » a pluggable architecture allowing custom scheduling algorithms
» nova-compute process » worker daemon that creates and terminates virtual machine instances
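To make the "pluggable scheduler" idea concrete, here is a minimal sketch in plain Python. It is not Nova's code or API: `Host`, `InstanceRequest`, `most_free_ram`, `first_fit` and `drain_queue` are hypothetical names invented for illustration, and RAM is assumed to be the only resource that matters.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Host:
    name: str
    free_ram_mb: int


@dataclass
class InstanceRequest:
    instance_id: str
    ram_mb: int


# Hypothetical plug-in point: any callable that picks a host for a request.
Scheduler = Callable[[InstanceRequest, List[Host]], Host]


def most_free_ram(request: InstanceRequest, hosts: List[Host]) -> Host:
    """Pick the host with the most free RAM that can still fit the request."""
    candidates = [h for h in hosts if h.free_ram_mb >= request.ram_mb]
    if not candidates:
        raise RuntimeError(f"no host can fit {request.instance_id}")
    return max(candidates, key=lambda h: h.free_ram_mb)


def first_fit(request: InstanceRequest, hosts: List[Host]) -> Host:
    """Pick the first host in list order that can fit the request."""
    for host in hosts:
        if host.free_ram_mb >= request.ram_mb:
            return host
    raise RuntimeError(f"no host can fit {request.instance_id}")


def drain_queue(queue: Deque[InstanceRequest], hosts: List[Host],
                pick_host: Scheduler) -> Dict[str, str]:
    """Take requests off the queue one by one and record where each lands."""
    placements: Dict[str, str] = {}
    while queue:
        request = queue.popleft()
        host = pick_host(request, hosts)
        host.free_ram_mb -= request.ram_mb  # reserve capacity on the chosen host
        placements[request.instance_id] = host.name
    return placements


if __name__ == "__main__":
    hosts = [Host("host1", 4096), Host("host2", 8192)]
    queue = deque([InstanceRequest("vm-a", 2048), InstanceRequest("vm-b", 2048)])
    print(drain_queue(queue, hosts, most_free_ram))
```

Swapping `most_free_ram` for `first_fit` changes placement without touching the queue-handling loop, which is the point of a pluggable design.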
We mentioned Commodity. How Commodity?
Commodity Hardware » Piston Silicon Mechanics
» 2 Intel Xeon processors 5600 Series
» 96GB of DDR3 RAM
» 24TB of SATA storage
» Redundant 1200W power supplies
» 2U rackmount chassis
» That’s what our clients get; we’re on: » 32GB, 16TB, 2 Intel Xeon E5645 processors » DevOps borrowed the rest for other machines
Performance: 500 VM Spin Up
» Assuming: » 500 copies of one 8GB image » Image warm on the nodes » 50 VMs/Server
» Based on NASA’s experience in regular use: less than 30 seconds
» Worst case: » Image is still in Glance » VM image has to be copied via HTTP
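The sizing behind those assumptions is simple arithmetic, sketched below. The image size is read as 8 GB. The worst-case figure assumes each server pulls the image from Glance exactly once and reuses it for its local VMs, which is an assumption of this sketch rather than something the slide states.

```python
import math


def servers_needed(total_vms: int, vms_per_server: int) -> int:
    """Number of hosts required to run total_vms at the given density."""
    return math.ceil(total_vms / vms_per_server)


def cold_copy_volume_gb(servers: int, image_gb: int) -> int:
    """Data moved over HTTP if every server fetches the image once (assumption)."""
    return servers * image_gb


servers = servers_needed(total_vms=500, vms_per_server=50)
volume_gb = cold_copy_volume_gb(servers, image_gb=8)
print(f"{servers} servers, {volume_gb} GB copied in the cold-image case")
```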
Image Service/Glance
1. Store & retrieve VM images
2. REST-based API
3. Compatible with all common image formats
4. Storage agnostic: store images locally, or use OpenStack Object Storage, HTTP, or S3
Storage/Swift Key Features
1. REST-based API
2. Data distributed evenly throughout system
3. Runs on commodity hardware
4. Scalable to multiple petabytes, billions of objects
5. No central database required
6. Account/Container/Object structure (not file system, no nesting) plus Replication (N copies of accounts, containers, objects)
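The account/container/object structure is easiest to see as a URL path. The sketch below builds one with the standard library only. `object_path` is a hypothetical helper, the `/v1/` prefix follows the layout commonly used by Swift endpoints, and `AUTH_demo` is an invented account name.

```python
from urllib.parse import quote


def object_path(account: str, container: str, obj: str) -> str:
    """Build the three-level path account/container/object.

    Containers cannot nest, so '/' is rejected in account and container
    names. A '/' inside the object name is allowed, but it is just part
    of the name, not a directory.
    """
    if "/" in account or "/" in container:
        raise ValueError("account and container names cannot contain '/'")
    return f"/v1/{quote(account, safe='')}/{quote(container, safe='')}/{quote(obj)}"


print(object_path("AUTH_demo", "runs", "2011/run 42.dat"))
```

An object named `2011/run 42.dat` looks nested, but the store sees one flat name inside the `runs` container.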
The Storage Story: Nova » Nova/Compute has its own storage » Block Storage, or nova-volume » an iSCSI solution » uses the Logical Volume Manager (LVM) for Linux » intended for read/write workloads (databases, logs, etc.) » basically an LVM/iSCSI implementation for mounting block devices in VMs.
The Storage Story: Swift » Swift: Object Storage
» Fully Distributed
» Commodity Hardware (Linux/x86)
» Data Protection in Software
» Not a File System
» Not SAN/NAS/DAS... or any attached storage
» Optimized for Scale - Petabytes
Swift in Production » Swift has been running in production at Rackspace for over a year with nearly 100% uptime. » Rackspace’s Swift clusters store billions of objects and petabytes of data. » Internap, KT, SDSC, and HP are also running Swift in production.
Sharing the Research
» A common software platform makes federation possible through a shared API.
[Diagram: a Private Cloud at Location A and a Private Cloud at Location B, linked through Swift and the OS or EC2 API]
To federate Swift across locations, you write a scheduler within OpenStack and drive it through the API.
Swift Components
» Clients
» Proxy Servers
» Rings
» Account Servers
» Container Servers
» Object Servers
Swift Components » Proxy Server » Tie together the Swift architecture » Request routing » Exposes the public API
Swift Components » The Ring: Maps names of entities (accounts, containers, objects) to their location on disk. » Places data based on zones, devices, partitions, and replicas » Weights can be used to balance the distribution of partitions » Used by the Proxy Server and by many background processes
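A toy version of the ring shows how names, partitions, zones, replicas and weights fit together. This is a sketch under stated assumptions, not Swift's ring builder: `partition_for`, `build_ring` and `devices_for` are hypothetical names, the partition count is tiny, and the placement rule (greedy, one replica per zone, favoring devices with the fewest partitions per unit of weight) is a simplification chosen for readability.

```python
import hashlib
from collections import Counter

PART_POWER = 4  # illustrative: 2**4 = 16 partitions
REPLICAS = 3


def partition_for(name: str, part_power: int = PART_POWER) -> int:
    """Map a name to a partition using the top bits of its MD5 hash."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)


def build_ring(devices, part_power: int = PART_POWER, replicas: int = REPLICAS):
    """Assign each partition to `replicas` devices in distinct zones.

    devices: list of dicts with 'id', 'zone' and a positive 'weight'.
    Requires at least `replicas` distinct zones.
    """
    assigned = {d["id"]: 0 for d in devices}
    ring = []
    for _ in range(2 ** part_power):
        used_zones = set()
        row = []
        for _ in range(replicas):
            candidates = [d for d in devices if d["zone"] not in used_zones]
            # Prefer the device carrying the least load relative to its weight.
            dev = min(candidates, key=lambda d: assigned[d["id"]] / d["weight"])
            assigned[dev["id"]] += 1
            used_zones.add(dev["zone"])
            row.append(dev["id"])
        ring.append(row)
    return ring


def devices_for(ring, name: str, part_power: int = PART_POWER):
    """Look up the devices holding the replicas for a given name."""
    return ring[partition_for(name, part_power)]


if __name__ == "__main__":
    devices = [
        {"id": 0, "zone": 1, "weight": 1.0},
        {"id": 1, "zone": 2, "weight": 1.0},
        {"id": 2, "zone": 3, "weight": 1.0},
        {"id": 3, "zone": 4, "weight": 2.0},  # bigger disk, higher weight
    ]
    ring = build_ring(devices)
    print(devices_for(ring, "/AUTH_demo/runs/event-0001"))
    print(Counter(dev for row in ring for dev in row))
```

Because the lookup is a pure function of the name and the ring, any proxy or background process holding a copy of the ring can find an object without asking a central database.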
Swift Components... » Object Server:
» Blob storage server
» Metadata kept in xattrs
» Data in binary format
» Object location based on name hash & timestamp
Swift & Large Object Storage
» Default 5GB limit on the size of an uploaded object
» Segmentation makes the download size of a single object virtually unlimited
» Segments of the large object are uploaded and a special manifest file is created
» When downloaded, all segments are concatenated as a single object
» Greater upload speed is possible through parallel uploads of segments
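The segment-and-manifest flow can be sketched with an in-memory dict standing in for the object store. Nothing here is Swift's client API: `split_into_segments`, `upload_segmented`, `download` and the segment naming scheme are hypothetical, and the manifest is modeled as a plain list of segment names.

```python
from typing import Dict, List, Tuple, Union

Store = Dict[Tuple[str, str], Union[bytes, List[str]]]


def split_into_segments(data: bytes, segment_size: int) -> List[bytes]:
    """Cut data into consecutive chunks of at most segment_size bytes."""
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def upload_segmented(store: Store, container: str, name: str,
                     data: bytes, segment_size: int) -> List[str]:
    """Store each segment as its own object, then store a manifest under `name`."""
    manifest: List[str] = []
    for index, segment in enumerate(split_into_segments(data, segment_size)):
        segment_name = f"{name}/{index:08d}"  # zero-padded so names sort in order
        store[(container, segment_name)] = segment
        manifest.append(segment_name)
    store[(container, name)] = manifest
    return manifest


def download(store: Store, container: str, name: str) -> bytes:
    """Return the object; if it is a manifest, concatenate its segments."""
    entry = store[(container, name)]
    if isinstance(entry, list):
        return b"".join(store[(container, seg)] for seg in entry)
    return entry


if __name__ == "__main__":
    store: Store = {}
    payload = bytes(range(256)) * 10
    upload_segmented(store, "runs", "big.dat", payload, segment_size=1000)
    assert download(store, "runs", "big.dat") == payload
```

Since each segment is an independent object, the loop in `upload_segmented` could hand segments to a thread pool, which is where the parallel-upload speedup would come from.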
But Wait, Swift... » Doesn’t load balance for frequently requested objects. » put Varnish Cache or Squid Proxy in front of Swift » Has a “simple” RESTful API » Wasn't intended for storing unknown data » Isn’t searchable » Is like Amazon’s S3
Potential Solutions for Those Needing to Search Data » Or wait... » Swift’s Blueprints Include Searchable Metadata » https://blueprints.launchpad.net/swift/+spec/future-searchable-metadata » Contribute to the greater community
What’s Piston Doing Differently? » Piston Enterprise OS™: » A hardened cloud operating system built on OpenStack » Optimized for secure and easy operation of enterprise private clouds » Fully supports interoperability with other OpenStack-powered public and private cloud solutions.
{pent } OS™ features
{CloudKey}™
» Two-factor capable physical authentication
» Minimizes security risk of administrative logins
» Hands-free install in under 5 minutes
Null-Tier [Architecture]™
» Storage, compute and networking on every node
» Massively scalable
» Automated scaling
[Diagram: {pent } OS™ Null-Tier [Architecture]™. Labels: Top of Rack Switch; {CloudKey}™; Hands-Free OS Install and Configuration; Servers, each running Networking, Storage, Compute and Management; Highly available controllers; Highly available Virtual Machines; Highly available Virtual Storage]
Contact
» Neil Johnston » email: [email protected] » twitter: @neiljohnston
Or my co-authors:
» Joshua McKenty » email: [email protected]
» Christopher MacGown » email: [email protected]