Deploying HPE Vertica Within the Microsoft Azure Cloud

Author: Neil Harrell
HPE Vertica Analytic Database
January, 2016


Legal Notices

Warranty

The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted Rights Legend

Confidential computer software. Valid license from Hewlett Packard Enterprise required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright Notice

© Copyright 2016 Hewlett Packard Enterprise Development L.P.

Trademark Notices

Adobe® is a trademark of Adobe Systems Incorporated. Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation. UNIX® is a registered trademark of The Open Group. This product includes an interface of the 'zlib' general purpose compression library, which is Copyright © 1995-2002 Jean-loup Gailly and Mark Adler.

© Copyright 2016 Hewlett Packard Enterprise Development LP.


Table of Contents

Document Overview
Configuring the Microsoft Azure Environment
    Virtual Machine Configurations
    Region Availability
    Enable I/O Throughput Using Azure Premium Storage
    Operating System Requirements for Linux Virtual Machines with Premium Storage
Configure the VM Instance to Optimize for Excellent Performance
    Step 1: Create an Availability Group for the Cluster
    Step 2: Install Microsoft LIS 4.0.11 Drivers
    Step 3: Create, Mount, and Configure Disks for Data
Virtual Machine Performance
    CPU Performance
    Network Performance
    Disk I/O Performance
For More Information
Appendix A: VNetPerf Results


Document Overview

This document explores running HPE Vertica in Microsoft Azure, a Hyper-V based cloud infrastructure, and provides specific recommendations for deploying Vertica on that platform. Hewlett Packard Enterprise recommends that customers who are beginning to explore Vertica in the Microsoft Azure environment use this document as a guideline for architecting and configuring the Vertica cluster.

The document reports results from the Vertica VPerf test tool on a three-node cluster built on GS5, GS4, and GS3 instance types. The VPerf test tool is bundled with the Vertica software and is also available on the Big Data Marketplace.

The document also discusses the different virtual machines (VMs) offered in an Azure environment. Hewlett Packard Enterprise has identified a specific class of VMs that best fits the needs of a Vertica cluster configuration.

Configuring the Microsoft Azure Environment

Virtual Machine Configurations

Choose the GS virtual machine series on Azure for Vertica implementations. Although Azure offers other VM types, the GS series is the best choice for Vertica because it offers the ratio of CPU to I/O throughput required to run a Vertica cluster effectively. This table shows a breakdown of the virtual machine configurations.

VM Type   # of Cores   RAM (GB)   Max Disk IOPS   Max Disk Bandwidth (MB/s)
GS3       8            112        20000           500
GS4       16           224        40000           1000
GS5       32           448        80000           2000


Region Availability

The GS instance type is available only in the West US, East US-2, and West Europe Azure regions. When you create your Vertica cluster, you must create your Resource Group and Virtual Network in one of these regions, and both must exist in the same region. You can access Vertica on Azure from anywhere with an established Internet connection, but the cluster itself must reside in a region that supports the GS instance type.

Enable I/O Throughput Using Azure Premium Storage

To achieve the necessary I/O throughput for each VM, you must create the data disk containers in an Azure Premium Storage account. Azure Premium Storage offers high-performance, low-latency disks for VMs running I/O-intensive workloads such as Vertica. VM disks that use Premium Storage store data on solid state drives, which allow much higher I/O transfer rates than the other storage account options. For more information, see Premium Storage: High Performance Storage for Azure Virtual Machine Workloads.

Operating System Requirements for Linux Virtual Machines with Premium Storage

Not all distributions of Linux have been validated by Microsoft for use with Premium Storage. The following table shows the operating systems that have been validated for use with Premium Storage and that are also supported by Vertica.

Distribution   Version              Supported Kernel   Supported Image
Ubuntu         12.04                3.2.0-75.110       Ubuntu-12_04_5-LTS-amd64-server20150119-en-us-30GB
Ubuntu         14.04                3.13.0-44.73       Ubuntu-14_04_1-LTS-amd64-server20150123-en-us-30GB
CentOS         6.5, 6.6, 6.7, 7.0   LIS 4.0 Required

Configure the VM Instance to Optimize for Excellent Performance

After creating VMs for your cluster, follow the steps in this section to configure them.


Note: Use this configuration information when deploying Vertica on Azure only. These recommendations are not appropriate for other platforms.

Step 1: Create an Availability Group for the Cluster

You can use the Azure Availability Set option to make sure your Vertica nodes are deployed in close proximity to each other within the Azure datacenter. You can configure this option for each VM within the Azure Portal interface. For more information, see the Azure documentation on Availability Sets.

Note: If you create the availability set after creating the VM, the VM automatically reboots.

Step 2: Install Microsoft LIS 4.0.11 Drivers

Install the latest Linux Integration Services (LIS) for Hyper-V. These drivers enable synthetic device support for Linux VMs for optimal performance and compatibility. You can install the drivers from Microsoft's downloads site.

Step 3: Create, Mount, and Configure Disks for Data

The Azure environment has a limit of 1023 GB per VHD container file. If your Vertica database requires more than 1 TB of storage space, combine multiple VHDs with software RAID. Using multiple VHD containers in a RAID 0 configuration also increases write I/O performance.

Base the I/O throughput on the number of CPU cores within the VM. You need a minimum of 20 MB/s of throughput per CPU core; Hewlett Packard Enterprise recommends 40 MB/s per core. For example, the minimum I/O throughput on a GS4 instance with 16 cores is 320 MB/s, and the recommended amount for that instance is 640 MB/s.

To achieve optimal throughput, disks should be:

- Attached to the VM with a cache setting of NONE on the Azure Portal.
- Formatted with EXT4.
- Mounted in the OS with the barrier=0 option.
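The sizing arithmetic in this step can be sketched in a few lines of Python. This is an illustrative sketch only: the function and constant names are hypothetical, the core counts and bandwidth limits are copied from the VM table in this document, and the disk count considers storage capacity alone (per-disk throughput limits, which this document does not list, may call for more disks).

```python
import math

# Cores and max disk bandwidth (MB/s) per VM type, taken from the VM table
# in this document. The dictionary layout itself is an assumption.
GS_SIZES = {
    "GS3": {"cores": 8, "max_bandwidth_mb_s": 500},
    "GS4": {"cores": 16, "max_bandwidth_mb_s": 1000},
    "GS5": {"cores": 32, "max_bandwidth_mb_s": 2000},
}

MIN_MB_S_PER_CORE = 20          # minimum throughput per CPU core
RECOMMENDED_MB_S_PER_CORE = 40  # recommended throughput per CPU core
MAX_VHD_GB = 1023               # Azure limit per VHD container file


def throughput_targets(vm_type):
    """Return minimum and recommended I/O throughput (MB/s) for a VM type."""
    size = GS_SIZES[vm_type]
    minimum = size["cores"] * MIN_MB_S_PER_CORE
    recommended = size["cores"] * RECOMMENDED_MB_S_PER_CORE
    return {
        "minimum": minimum,
        "recommended": recommended,
        # True when the recommended target is within the VM's bandwidth cap.
        "fits_vm_limit": recommended <= size["max_bandwidth_mb_s"],
    }


def vhds_for_capacity(required_gb):
    """Return how many VHDs are needed to hold required_gb of data.

    Capacity only; RAID 0 across these VHDs provides the combined space.
    """
    return math.ceil(required_gb / MAX_VHD_GB)


if __name__ == "__main__":
    for vm_type in GS_SIZES:
        print(vm_type, throughput_targets(vm_type))
    print("VHDs for 3000 GB:", vhds_for_capacity(3000))
```

For a GS4 instance this reproduces the figures in the text: a 320 MB/s minimum and a 640 MB/s recommended target, both below the 1000 MB/s bandwidth limit listed for that VM type.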

Virtual Machine Performance

CPU Performance

The following table contains the results of the Vertica VCPUPerf test tool when run on each of the Azure GS VM instance types. This tool measures the maximum performance of a single CPU core on the server.


Output line                            Azure GS3    Azure GS4    Azure GS5
Expected time on Core 2, 2.53GHz       ~9.5s        ~9.5s        ~9.5s
Expected time on Nehalem, 2.67GHz      ~9.0s        ~9.0s        ~9.0s
Expected time on Xeon 5670, 2.93GHz    ~8.0s        ~8.0s        ~8.0s
This machine's time: CPU Time          10.450000s   10.460000s   10.460000s
This machine's time: Real Time         10.460000s   10.460000s   10.460000s

The output on all three instance types also includes the following lines:

Some machines automatically throttle the CPU to save power.

This test can be done in
