Virtualization of Video Streaming Functions

Saarland University
Faculty of Natural Sciences and Technology I
Department of Computer Science
Master Thesis

Virtualization of Video Streaming Functions

Submitted by: Birhan Tadele Teklehaimanot
Advisor: Göran Appelquist
Supervisor: Prof. Dr.-Ing. Thorsten Herfet
Reviewers: Prof. Dr.-Ing. Thorsten Herfet, Prof. Dr. Dietrich Klakow

April 25, 2016

Eidesstattliche Erklärung

Ich erkläre hiermit an Eides Statt, dass ich die vorliegende Arbeit selbstständig verfasst und keine anderen als die angegebenen Quellen und Hilfsmittel verwendet habe. Ich erkläre hiermit an Eides Statt, dass die vorliegende Arbeit mit der elektronischen Version übereinstimmt.

Statement in Lieu of an Oath

I hereby confirm that I have written this thesis on my own and that I have not used any other media or materials than the ones referred to in this thesis. I hereby confirm the congruence of the contents of the printed data and the electronic version of the thesis.

Saarbrücken, on April 25, 2016

Birhan Tadele Teklehaimanot

Einverständniserklärung

Ich bin damit einverstanden, dass meine (bestandene) Arbeit in beiden Versionen in die Bibliothek der Informatik aufgenommen und damit veröffentlicht wird.

Declaration of Consent

I agree to make both versions of my thesis (with a passing grade) accessible to the public by having them added to the library of the Computer Science Department.

Saarbrücken, on April 25, 2016

Birhan Tadele Teklehaimanot

Abstract

Edgeware is a leading provider of video streaming solutions to network and service operators. The Edgeware Video Consolidation Platform (VCP) is a complete video streaming solution consisting of the Convoy management system and Orbit streaming servers. The Orbit streaming servers are purpose-designed hardware platforms composed of a dedicated hardware streaming engine and a purpose-designed flash storage system. The Orbit streaming server is an accelerated HTTP streaming cache server which provides up to 80 Gbps of bandwidth and can stream to 128,000 clients from a single rack unit. In line with the trend of moving more and more functionality into virtualized or software environments, the main goal of this thesis is to make a performance comparison between Edgeware's Orbit streaming server and one of the best generic HTTP accelerators (reverse proxy servers) after implementing the Orbit's logging functionality on top of it. This is achieved by implementing test cases for the use cases that can help to evaluate those servers. Finally, after evaluating the candidate proxy servers, Varnish is selected, and the modified Varnish and the Orbit are compared to investigate the performance difference.


Acknowledgements

First and foremost, I would like to express my heartfelt gratitude to my supervisor Prof. Dr.-Ing. Thorsten Herfet for providing me the opportunity to write my thesis with him. My sincere thanks to Göran Appelquist for his patience and invaluable guidance. His constructive suggestions during our periodic discussions made this thesis a wonderful experience, and I have learned a lot while working with him. Furthermore, I would like to thank my immediate family, specifically my father and my mother, for helping me to reach this point even though the culture in our village makes it difficult to send girls to school. Last but not least, my wholehearted gratitude to my brothers, sisters and my husband for their love, encouragement, endless motivation and support.

Contents

Abbreviations                                              ix

1 Introduction                                              1
  1.1 Multimedia Streaming                                  1
  1.2 Streaming Technologies                                2
      1.2.1 Traditional HTTP Download Technologies          2
      1.2.2 True Streaming Technologies                     3
      1.2.3 HTTP Adaptive Bitrate Streaming Technologies    4

2 Background and Related Works                              6
  2.1 Content Delivery Networks (CDN)                       6
      2.1.1 Components of CDN                               7
  2.2 Proxy servers                                         7
      2.2.1 Forward proxy servers                           8
      2.2.2 Transparent proxy servers                       8
      2.2.3 Reverse proxy servers                           8

3 The Orbit streaming server                               12
  3.1 Edgeware solutions                                   12
  3.2 Edgeware Orbit server                                13
  3.3 Proposed solution                                    14

4 General comparisons of HTTP reverse proxy servers        15
  4.1 Squid                                                15
      4.1.1 Pros and cons                                  16
  4.2 Apache traffic server                                16
      4.2.1 Pros and cons                                  17
  4.3 Nginx                                                17
      4.3.1 Pros and cons                                  18
  4.4 Varnish                                              18
      4.4.1 Pros and cons                                  19
  4.5 Aiscaler                                             20
      4.5.1 Pros and cons                                  20
  4.6 Conclusion                                           20

5 Test methodology                                         21
  5.1 Definition of test cases                             22
      5.1.1 Live test case                                 22
      5.1.2 100% cache hit                                 23
      5.1.3 90% cache hit                                  24
  5.2 Implementation of test cases                         25
  5.3 Configuration of proxy servers                       25
  5.4 Test Environment                                     26
      5.4.1 Test setup                                     26
      5.4.2 Parameters                                     28

6 Performance comparison of Orbit and proxy servers        29
  6.1 Live test case                                       30
  6.2 100% cache hit                                       33
  6.3 90% cached                                           39
      6.3.1 Nginx                                          39
      6.3.2 Varnish                                        39
  6.4 Summary                                              42
  6.5 Implementation of logger                             42
  6.6 Orbit and modified Varnish performance comparison    43
      6.6.1 Varnish results                                44
      6.6.2 Orbit results                                  47

7 Conclusion                                               49

List of Figures                                            52
List of Tables                                             55
Bibliography                                               57

List of Abbreviations

CDN     Content Delivery Network
HLS     HTTP Live Streaming
HDS     HTTP Dynamic Streaming
SSL     Secure Sockets Layer
ATS     Apache Traffic Server
QoS     Quality of Service
ASF     Apache Software Foundation
FMS     Flash Media Server
VoD     Video on Demand
RTMP    Real-Time Messaging Protocol
WMS     Windows Media Services
TCP     Transmission Control Protocol
IIS     Internet Information Services
CAPEX   Capital Expenditure
OPEX    Operational Expenditure
VCP     Video Consolidation Platform
QoE     Quality of Experience
GPL     General Public License
VCL     Varnish Configuration Language
MSE     Massive Storage Engine

Chapter 1

Introduction

1.1 Multimedia Streaming

The Internet was originally designed to support data traffic transmission. Later, in the early 1990s, the need for multimedia transmission emerged due to the growth of the Internet in terms of users, applications and nodes. Nowadays multimedia content makes up a large portion of Internet traffic, as more and more users access multimedia content over the Internet. For instance, the share of mobile video traffic is expected to reach up to 67% in 2017 [7], and it is estimated that multimedia transmission will account for up to 90% of Internet traffic within the next few years [6]. Multimedia includes text, still images, audio, animation and video in an integrated manner.

Multimedia streaming refers to the transmission of multimedia content from a streaming sender to a streaming receiver in compressed form, without downloading the whole content to the receiver device. The basic difference between multimedia streaming and textual data transfer is that multimedia streaming requires real-time delivery but can tolerate a certain amount of data loss. The main components of a multimedia streaming system are the encoder, the streaming server, the streaming client, the media transfer protocol and the underlying physical network.

The minimal set of actions performed by a streaming system is as follows. First, a camera captures and produces either still images or video. The output from the camera can be raw media without any compression, which would require a very large bandwidth since its size is very large compared to a compressed version. Therefore, the media should be compressed using appropriate compression techniques at the sender end before transmission over the Internet. The compressed media is then stored on the server together with its metadata, which is a description of the media such as location and timing information. When the receiver requests certain media, the sender sends that media and its metadata. The media is received in the form of packets and reassembled into the original compressed stream at the receiver end. Then the decoder takes this compressed stream and decodes the media. Finally, the decoded media is passed to the renderer for display.

In general, depending on the media, streaming can be classified into on-demand and live streaming. In on-demand streaming, media is pre-recorded, compressed, stored on the streaming server and delivered to clients when requested. In live streaming, on the other hand, media is captured, compressed and transmitted on the fly.

1.2 Streaming Technologies

In this section the most widely used streaming technologies are briefly described.

1.2.1 Traditional HTTP Download Technologies

Traditional HTTP download is the basic technology for transmission of content over the Internet. It uses the HTTP protocol to download the content to the receiver device, which then plays it out locally. The most commonly used traditional HTTP download methods are the following:

1.2.1.1 HTTP Download

HTTP download is a widely used technology for data transfer. With HTTP download, the content is downloaded and stored on the receiver's device, and playback does not begin until the media is downloaded completely, which may cause delays for large media.

1.2.1.2 HTTP Progressive Download

HTTP progressive download is a widely used media streaming technology based on HTTP/TCP. In progressive download, content is downloaded partially and progressively stored on the user's device [14], which improves on the HTTP download method by reducing the delay before playback begins. First, the metadata, which tells the player how to play the media, is downloaded. Playback then begins once the metadata and sufficient data have been buffered on the receiver's device, while the rest of the content continues to be downloaded and saved as the player plays the already downloaded data. However, there is no bandwidth adaptation, since the variation of network conditions between client and server is not considered. In addition, it cannot be used to stream live media, as it needs offline preparation, and it can be inefficient in controlling bandwidth from the ISP's point of view [9].

1.2.1.3 HTTP Pseudo Streaming

HTTP pseudo streaming is very similar to HTTP progressive download, except that the player can seek forward or backward even if that part of the content has not yet been downloaded. The player uses a byte offset or the number of seconds from the start of the video to find the desired part of the video. The player can buffer the content without saving it on the receiver's device, and it is not mandatory to download the video from start to finish, which means the player can stop the stream and jump to a different point.
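The byte-offset seek described above can be sketched as follows. This is a minimal illustration, not a real player: the function name is invented and the constant-bitrate assumption is a simplification, since actual players derive offsets from container metadata and keyframe positions.

```python
def range_header_for_seek(seconds, bitrate_bps):
    """Approximate the byte offset of a seek point in a constant-bitrate
    file and build the HTTP Range header a player could send.
    Illustrative only: real players look up the nearest keyframe."""
    offset = seconds * bitrate_bps // 8  # bits per second -> bytes
    return {"Range": "bytes=%d-" % offset}

# Seeking 60 seconds into a 2 Mbit/s stream:
print(range_header_for_seek(60, 2_000_000))  # {'Range': 'bytes=15000000-'}
```

The server answers such a request with `206 Partial Content`, so the player can resume playback at the new position without downloading the skipped range.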

1.2.2 True Streaming Technologies

True streaming technologies are the most popular streaming protocols for Flash and Windows media streaming. These technologies create a connection to a dedicated media server, and the content is sent to the end-user device as a series of small packets. True streaming technologies use a stateful protocol: from the first time a client connects to the streaming server until the time it disconnects, the server keeps track of the client's state. Commands like PLAY, PAUSE and STOP can be issued by the end user for playback control, and multi-bitrate delivery is supported; however, it is not common to switch between bitrates once streaming has started. The two most common products are Adobe Flash Media Server and Microsoft Windows Media Services.

1.2.2.1 Adobe Flash Media Server

Adobe Flash Media Server (FMS) uses proprietary technology from Adobe Systems (formerly Macromedia) and is a hugely popular streaming platform. FMS is commonly installed on a media or origin server running the Linux operating system, although it is also supported on the Windows Server operating system. FMS supports stored Video on Demand (VoD) and live media delivery [14].

1.2.2.2 Microsoft Windows Media Services

Microsoft Windows Media Services (WMS) supports both VoD and live media delivery. WMS is normally installed on a media or origin server running the Windows Server 2003 or 2008 operating system; however, there are proprietary variants which run on non-Windows servers. WMS has the ability to enforce authentication and impose connection limits. The preferred protocol for WMS is the Real Time Streaming Protocol (RTSP) [14].

1.2.3 HTTP Adaptive Bitrate Streaming Technologies

HTTP adaptive bitrate streaming is currently the most sophisticated method for streaming media delivery [13]. The content is encoded at multiple bit rates, allowing selection among the different encoded versions during streaming based on the available network bandwidth and client resources. For each encoded version, the content is divided into a series of smaller segments or chunks, each 2-10 seconds in length, which are reassembled and played back as a single continuous stream. This makes it very easy for the receiving player to jump forward or backward in the video. Finally, a manifest file is created to act as a table of contents for the segments.

The quality of the segments can be adapted during streaming based on network conditions, which means the player is able to switch seamlessly between the different fragments at any time during playback: the player can select the desired quality level and adjust automatically based on real-time network conditions. Therefore, during streaming only the relevant segments are requested by the player and sent for reassembly and playback on the receiver device. An additional benefit of HTTP adaptive bitrate streaming is the ability to utilize CDNs to cache video content closer to end viewers.
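The selection logic described above can be sketched as a simple throughput-based rule. The bitrate ladder and the 0.8 safety factor are illustrative assumptions, not values from any particular player implementation:

```python
# Candidate encoded bitrates, lowest to highest (illustrative ladder).
BITRATE_LADDER_BPS = [400_000, 1_000_000, 2_500_000, 5_000_000]

def select_bitrate(measured_throughput_bps, safety_factor=0.8):
    """Pick the highest encoded bitrate that fits under a conservative
    estimate of the currently available bandwidth."""
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in BITRATE_LADDER_BPS if b <= budget]
    # Fall back to the lowest rung when even that exceeds the budget.
    return candidates[-1] if candidates else BITRATE_LADDER_BPS[0]

# 3.2 Mbit/s measured -> 2.56 Mbit/s budget -> the 2.5 Mbit/s version:
print(select_bitrate(3_200_000))  # 2500000
```

A real player re-runs such a decision before each segment request, which is what allows the seamless mid-stream quality switches described above.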

Apple HTTP Live Streaming, Adobe HTTP Dynamic Streaming, Microsoft IIS Smooth Streaming and MPEG Dynamic Adaptive Streaming over HTTP are the most widely used HTTP adaptive bitrate streaming technologies. Each of them is described briefly below.

1.2.3.1 Apple HTTP Live Streaming

Apple HTTP Live Streaming (HLS) is an HTTP-based media streaming technology introduced by Apple in 2009 [15]. In HLS, video and/or audio inputs are typically encoded using H.264/Advanced Audio Coding (AAC), and a stream segmenter then breaks the stream into a series of short segments that are saved as transport stream files (TS files), along with an index file (.m3u8) which indicates the order of the TS files. These TS files are stored on a standard HTTP web server for distribution, along with the URL for the index file. The receiving player begins by fetching the index file using its URL, then reads it to request the appropriate segments, and displays the content without any pauses or gaps as a continuous stream. HLS is optimized for delivery to iOS devices and the Safari browser, but there are solutions of varying quality on all other platforms as well.
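An HLS index file is a plain-text playlist. A minimal example for a short on-demand asset might look like the following; the segment file names and durations are illustrative:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.0,
segment0.ts
#EXTINF:10.0,
segment1.ts
#EXTINF:10.0,
segment2.ts
#EXT-X-ENDLIST
```

Each `#EXTINF` line gives the duration of the segment named on the next line; `#EXT-X-ENDLIST` marks the playlist as complete (a live playlist omits it and keeps appending segments).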

1.2.3.2 Adobe HTTP Dynamic Streaming

HTTP Dynamic Streaming (HDS) is Adobe's method for adaptive bitrate streaming; it supports live and on-demand delivery of MP4 media over regular HTTP connections. HDS allows for adaptive streaming over HTTP to any device that is compatible with Adobe Flash or Adobe Integrated Runtime (AIR). It converts the content into a fragmented MP4 file format and delivers high-definition video/audio using any Flash-compatible codec (H.264/AAC, VP6/MP3). On-demand content is converted using the File Packager utility as a post-processing step, whereas live streams are converted in real time using Adobe's Live Packager utility. An open file specification is used for fragmentation, with .F4M for the manifest (index) file and .F4F for the media segments, which are stored in a single file. Moreover, Adobe's HTTP Origin Module is installed on a standard HTTP web server to handle fragment requests, and the streams are played using Adobe Flash Player 10.1 or AIR.

1.2.3.3 Microsoft IIS Smooth Streaming

IIS (Internet Information Services) Smooth Streaming is Microsoft's HTTP adaptive streaming technology, which uses Microsoft Silverlight as an application framework similar to Adobe Flash. It supports multiple audio and video codecs. IIS is installed on an origin server running Windows Server 2008, and Smooth Streaming is a plug-in software module for IIS. In IIS, media content is stored in a single file format with the segments stored as fragmented MP4 within the file [14].

1.2.3.4 MPEG Dynamic Adaptive Streaming over HTTP

MPEG Dynamic Adaptive Streaming over HTTP (DASH) is a standard defined by MPEG to enable interoperability between servers and clients of different vendors. It is a generic solution based on HLS, HDS and Microsoft IIS Smooth Streaming, introduced in order to standardize these proprietary solutions and make them interoperable. As a result, client applications can use all the streaming formats of the different proprietary solutions [10]. DASH, like HLS, HDS and Microsoft IIS Smooth Streaming, uses the concept of segments and the equivalent of a playlist or manifest file, known as a Media Presentation Description (MPD) file. Moreover, DASH can treat the video stream as a single file (it does not need to create segment files), in which case the MPD file points to offsets in the origin file rather than to segment files [13].
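The MPD is an XML document. A heavily simplified, illustrative sketch is shown below; the attribute values, segment naming template and bitrates are assumptions for illustration, not taken from the standard's examples or any real service:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     mediaPresentationDuration="PT30S">
  <Period>
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <SegmentTemplate media="video_$RepresentationID$_$Number$.m4s"
                       initialization="video_$RepresentationID$_init.mp4"
                       duration="4" startNumber="1"/>
      <!-- One Representation per encoded bitrate; the player switches
           between them based on measured bandwidth. -->
      <Representation id="low"  bandwidth="400000"  width="640"  height="360"/>
      <Representation id="high" bandwidth="2500000" width="1280" height="720"/>
    </AdaptationSet>
  </Period>
</MPD>
```

The `Representation` elements correspond to the multiple encoded versions discussed in Section 1.2.3, and the `SegmentTemplate` tells the client how to construct segment URLs without listing every segment explicitly.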

Chapter 2

Background and Related Works

In this chapter the concept of content delivery networks (CDN) is explained along with its components. Furthermore, the concepts of proxy servers and how caching works with reverse proxy servers are introduced briefly to ensure a better understanding of this work.

2.1 Content Delivery Networks (CDN)

The high bandwidth requirements and rate variation of videos in compressed format introduce challenging issues for end-to-end delivery over wide area networks. As a consequence, Content Delivery Networks have evolved to overcome these challenges and improve the accessibility of the Internet. A CDN is a large, geographically distributed network of specialized servers that accelerate the delivery of web content and rich media to Internet-connected devices [8]. The main concept behind this technology is delivery at edge points of the network, in proximity to the request areas, to improve the user's perceived Quality of Service (QoS) when accessing Web content. CDNs use edge caching, which entails storing replicas of static text, image, audio and video content, including various forms of interactive media streaming, on multiple servers around the "edges" of the Internet, so that user requests can be served by a nearby edge server rather than by a far-off origin server. The purpose of caching is to reduce network traffic to a minimum; this is achieved by delivering content from caches as close to the requesting user as possible, but also by ensuring the delivery device has effectively cached the content from previous requests [14]. CDNs typically host static content including images, video, media clips, advertisements and other embedded objects for dynamic Web content. Typical customers of a CDN are media and Internet advertisement companies, data centers, Internet Service Providers (ISPs), online music retailers, mobile operators, consumer electronics manufacturers and other carrier companies.


Figure 2.1: CDN Architecture [8].

2.1.1 Components of CDN

Most CDN architectures are constructed from the following key components:

• Content delivery component: contains the origin server and a set of edge (cache) servers which replicate the content and are deployed as near as possible to the users. The origin servers are the master sources for the content and can be deployed within the operator's network or, more commonly, within a content owner's infrastructure. The primary purpose of the content delivery component is to deliver data to end users.

• Content distribution component: moves content from the origin server to the cache servers and ensures consistency. These can be deployed in a hierarchical model to allow tiered caching and to protect the origin servers.

• Request-routing component: directs user requests to cache servers and interacts with the distribution component to keep the content fresh.

• Accounting component: maintains logs of client accesses and records usage of the servers, assisting in traffic reporting and usage-based billing.
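The request-routing component can be illustrated with a toy proximity-based sketch. The server names, coordinates and squared-distance metric are illustrative assumptions; real CDNs route via DNS or anycast using latency and load measurements rather than geographic coordinates:

```python
# Hypothetical edge caches with rough geographic positions.
EDGE_SERVERS = {
    "edge-eu": {"lat": 50.1, "lon": 8.7},    # near Frankfurt
    "edge-us": {"lat": 40.7, "lon": -74.0},  # near New York
}

def route_request(client_lat, client_lon):
    """Return the edge server with the smallest squared coordinate
    distance to the client (a crude stand-in for network proximity)."""
    def dist2(pos):
        return (pos["lat"] - client_lat) ** 2 + (pos["lon"] - client_lon) ** 2
    return min(EDGE_SERVERS, key=lambda name: dist2(EDGE_SERVERS[name]))

print(route_request(48.8, 2.3))  # a client near Paris -> edge-eu
```

In practice the routing decision also weighs edge-server load and content availability, not just proximity.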

2.2 Proxy servers

A proxy server is an intermediary server that intercepts requests from clients seeking resources from different servers across the Internet. Those resources can be images, files, web pages, video, audio, etc. A proxy server facilitates communication between clients and servers and can filter requests based on various rules: it can allow or reject communications by validating the requests against the available rules. There are different kinds of proxy servers; here we describe three basic types, although the focus of this work is on reverse proxy servers.

2.2.1 Forward proxy servers

A forward proxy server intermediates traffic between a client and the destination chosen by the client. It enables a client to connect to a remote network to which it normally does not have access. It can also be used to cache data, reducing the load on the networks between the forward proxy and the remote web server. A forward proxy cache needs explicit configuration of the browser to direct all requests to the proxy cache rather than the target web server.

2.2.2 Transparent proxy servers

A transparent proxy cache achieves the same goal as a forward proxy cache, but operates transparently to the browser: the browser does not need to be explicitly configured to access the cache. Transparent caches are especially useful to ISPs, because they require no browser setup modification. They are also the simplest way to use a cache internally on a network, because they do not require explicit coordination with other caches. However, many services like YouTube are currently trying to prevent the use of transparent proxies, since they want full control of the communication between service providers and clients.

2.2.3 Reverse proxy servers

A reverse proxy, also known as a web server accelerator, is an intermediary server which stores responses from the origin server (a server that holds the content) in its cache and serves subsequent requests for the same content from this cache. It proxies on behalf of servers and appears to end users as the origin server. The origin servers are never accessed directly from outside, since every request for the origin server passes through the reverse proxies. When a client requests some content, DNS routes the request to the reverse proxy server instead of the origin server. The reverse proxy checks for the content in its cache; if it is not there, it connects to the origin server, fetches the requested content into its cache and serves the user. Requested content can be fetched from one or more origin servers, but to the user it looks like the content of one server. Reverse proxy servers check the validity of the stored data using additional HTTP headers received from the origin server. In addition, the origin server controls, via HTTP headers, whether a given piece of content should be cached by the proxy server.

When a reverse proxy receives a request on behalf of a server, it checks whether the requested data is in the cache and still valid. If the content is not in the cache, it forwards the request to the origin server. If the data is in the cache but no longer valid, it deletes the content from the cache and forwards the request to the origin server. On the other hand, if the data is in the cache and still valid, the reverse proxy serves the requested data to the client from its cache. When it receives a response from an origin server, it also checks whether the response is cacheable before storing the content in its cache.
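The hit/miss/stale decision just described can be condensed into a short sketch. `CacheEntry`, `fetch_from_origin` and the fixed 60-second freshness window are illustrative stand-ins, not part of any real proxy's API:

```python
import time

class CacheEntry:
    def __init__(self, body, expires_at):
        self.body = body
        self.expires_at = expires_at  # absolute time, e.g. from Expires

CACHE = {}

def fetch_from_origin(url):
    # Placeholder for the real upstream HTTP request.
    return "body-of-" + url

def handle_request(url, now=None):
    """Serve from cache when fresh; refetch on a miss or stale entry."""
    now = time.time() if now is None else now
    entry = CACHE.get(url)
    if entry is not None and now < entry.expires_at:
        return entry.body, "hit"             # fresh: answer from cache
    body = fetch_from_origin(url)            # miss or stale: go to origin
    CACHE[url] = CacheEntry(body, now + 60)  # assume 60 s freshness
    return body, ("stale" if entry else "miss")

print(handle_request("/clip.ts"))  # first request -> ('body-of-/clip.ts', 'miss')
```

A production proxy would additionally revalidate stale entries with a conditional request (If-Modified-Since) instead of always refetching the full body.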

A reverse proxy reduces the load on the origin server, rather than reducing upstream network bandwidth on the client side as forward and transparent proxies do. Sitting between the client and the origin server, the reverse proxy handles all traffic before it can reach the origin server. Reverse proxy servers are used to reduce bandwidth usage and improve performance by storing static content such as images, video and audio in their cache and then serving users without going to the origin server. This can also help to offload a very busy server, reduce response time and enhance the customer's browsing experience. Moreover, proxy servers protect origin servers and act as an additional defence against security attacks, because they intercept requests to the origin servers.

2.2.3.1 How caching works on a reverse proxy

Clients always use HTTP when talking to a caching proxy (reverse proxy), even if the application is an FTP transfer.

• Is it cacheable? A response is called cacheable if it can be used to answer a future request. A cache decides whether a particular response is cacheable by checking some parts of the request and response, in particular: the response status code, the request method, the response cache-control directives, a response validator and request authentication. Moreover, in some caches, valuable (frequently requested) responses are more likely to be cached than those requested only once. The most important HTTP headers used by a reverse proxy to check the validity of cached content and whether a response from the origin is cacheable are:

  – Last-Modified: tells the proxy when the file was last modified.
  – Expires: tells the proxy when to drop the file from the cache.
  – Cache-Control: tells the proxy whether the file should be cached.
  – Pragma: also tells the proxy whether the file should be cached.

• Definition of cache hit and miss. When a cache receives a request, it checks whether the response has been cached. If it is found in the cache we call it a cache hit; otherwise we call it a cache miss. When the object is found, the cache has to decide whether the stored copy is fresh or stale: a cached response is fresh if its expiration time has not yet been reached, otherwise it is stale. A fresh response is sent to the client immediately, whereas a stale response requires validation from the origin server.

  The hit ratio is used to measure the effectiveness of a cache. It refers to the percentage of requests that are satisfied as cache hits, and it usually includes both validated and non-validated hits.

• Cache replacement policies. Cache replacement refers to the process of removing old responses when the cache is full and space is needed for new ones. Usually the cache assigns some kind of value to each cached object, and the least valuable objects are removed first, although the definition of "valuable" differs from cache to cache. Typically, an object's value is related to the probability that it will be requested again, thus maximizing the hit ratio. The best-known cache replacement algorithms, according to caching researchers, are listed below.

1. Least Recently Used (LRU): This is the most popular replacement algorithm and provides high performance in almost all situations. It removes the objects that have not been used for a long time. The algorithm can be implemented with a simple list: every time an object is accessed, it is moved to the top of the list, so the least recently accessed object automatically sinks to the bottom.

2. First In First Out (FIFO): The FIFO replacement algorithm is even simpler to implement than LRU. Objects are removed in the order in which they were added to the cache.

3. Least Frequently Used (LFU): LFU is similar to LRU, but instead of selecting objects only by the time since their last access, it also considers the number of accesses as a significant parameter. LFU replaces objects with a small access count and keeps objects with a high access count.

4. Size: A size-based algorithm uses the object size as the primary removal criterion, which means the largest object is removed from the cache first. However, it needs an additional mechanism that measures how long an object has stayed in the cache, so that old objects are removed first; otherwise, the cache will end up holding only small objects.

5. GreedyDual-Size (GDS): GDS assigns a value to every object based on the cost of a cache miss and the size of the object. Since GDS does not specify what “cost” means, it offers a lot of flexibility to optimize for what you want. For example, cost can be defined as the latency, i.e. the time it takes to receive the response; it can also be defined as the number of packets transmitted over the network, or the number of hops between the origin server and the cache.

6. GreedyDual-Size Frequency (GDSF): The GreedyDual-Size Frequency policy was proposed to maximize hit and byte hit rates for WWW proxies. This caching strategy incorporates the main characteristics of a file, such as its size, its access frequency and the recentness of its last access. The algorithm is an improvement of GreedyDual-Size, the current champion among the replacement strategies proposed for web proxy caches. In general, GDSF-like replacement policies that emphasize frequency achieve a better byte hit ratio but a worse file hit ratio.
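To make the contrast between these policies concrete, LRU can be sketched in a few lines of Python using an ordered dictionary (an illustration only; production caches use more elaborate data structures):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: the least recently accessed object is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # iteration order tracks recency

    def get(self, key):
        if key not in self.store:
            return None  # cache miss
        self.store.move_to_end(key)  # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("/a.ts", b"...")
cache.put("/b.ts", b"...")
cache.get("/a.ts")          # touch /a.ts so /b.ts becomes least recent
cache.put("/c.ts", b"...")  # evicts /b.ts
```

The other policies differ only in the value they assign to an object: FIFO ignores accesses entirely, LFU replaces the recency ordering with an access counter, and GDS/GDSF fold size and miss cost into the priority.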

Chapter 3

The Orbit streaming server

This chapter starts with an introduction to Edgeware solutions, then explains the Orbit server's features and functionalities, based mainly on Edgeware's marketing material. Finally, we describe the solution proposed in this thesis.

3.1 Edgeware solutions

Edgeware is a leading provider of video streaming solutions to network and service operators. It has a special platform for providing these solutions, called the Edgeware Video Consolidation Platform (VCP). VCP is a highly accelerated and consolidated platform which significantly reduces the high infrastructure costs required for the delivery of TV and video services [11]. It is highly scalable and delivers high-quality video services to any screen, across any network topology. It supports all major adaptive streaming frameworks, such as those from Microsoft, Apple and Adobe, and it also provides CDN (described in 2.1) video delivery management functions. VCP reduces the capital expenditure of a company by at least 50% [11]. As shown in Figure 3.1 below, VCP contains a highly accelerated video origin, called the VCP Origin, and a widely deployed operator Content Delivery Network (CDN) solution, called the VCP Edge. The VCP Origin solution

Figure 3.1: Components of Edgeware Video Consolidation Platform [11]


reduces the complexity, performance requirements and cost of the origin servers by offloading all recording, ingest, re-packaging and play-out capacity [11]. Therefore, load balancers or complex file systems are not required. The VCP Edge is based on Edgeware's widely deployed Distributed Video Delivery Network (D-VDN) solution, incorporated into the new Video Consolidation Platform [11]. VCP Edge is the main part of the Edgeware Video Consolidation Platform, and it addresses the network infrastructure costs of network service providers and operators, which are growing due to the rapid increase of TV and video demand over the Internet. Moreover, VCP Edge is a highly optimized CDN caching and distribution solution for a wide range of service applications. It is designed to deliver next-generation video services with the highest Quality of Experience (QoE) and scalability to any screen [11]. In addition, VCP Edge is easily integrated with any content management, middleware, conditional access and resource management system. The VCP Edge has two fully integrated components, the Orbit hardware platform and the Convoy management software. The Convoy management software allows an operator to set up and manage a complete video delivery network across any network topology. It ensures efficient and effective configuration, content management, license control, session management and monitoring, and provides an open integration framework in close integration with the optimized Orbit delivery servers.

3.2 Edgeware Orbit server

The Orbit servers are fully integrated with the Convoy management software, providing highly scalable asset propagation, session management and fault tolerance. They offer advanced capabilities for operators and content providers to offer a full range of Cloud TV and video services, irrespective of network topology and core bandwidth. These servers use a combination of a dedicated hardware streaming engine and a purpose-designed flash-based storage system, coupled with a Linux-based control plane, to deliver up to 80 Gbps or 128,000 streams from a single unit.

The main functionalities of the Orbit server are ingest, repackaging, encryption, caching and streaming. Repackaging and encryption are done just in time. The Orbit platform also has functionalities such as session handling, logging and a backend selector.

• Backend selector: in Edgeware, the origin servers that contain the content are organized into server groups. A server group models a data center and contains a set of nodes, which model physical computers; each node in turn contains a set of IP addresses that model network interfaces. The main functionalities of the backend selector are load balancing and fail-over. The load can be spread randomly over the servers in a group, or be based on the content requested, in order to optimize cache utilization.
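Content-based selection of this kind can be sketched as follows (a simplified illustration, not Edgeware's actual algorithm; the node structure and the hashing scheme are assumptions):

```python
import hashlib
import random

# A server group models a data center: a list of nodes,
# each node exposing one or more interface IP addresses.
SERVER_GROUP = [
    {"node": "origin-1", "ips": ["10.0.0.1", "10.0.1.1"]},
    {"node": "origin-2", "ips": ["10.0.0.2", "10.0.1.2"]},
    {"node": "origin-3", "ips": ["10.0.0.3", "10.0.1.3"]},
]

def select_backend(url, healthy_nodes, content_based=True):
    """Pick a backend node: hash the requested URL so that requests for the
    same content always land on the same node (better cache utilization),
    or spread the load randomly over the group."""
    if not healthy_nodes:
        raise RuntimeError("no healthy backends: fail-over exhausted")
    if content_based:
        digest = hashlib.sha1(url.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(healthy_nodes)
    else:
        index = random.randrange(len(healthy_nodes))
    return healthy_nodes[index]

node = select_backend("/vod/asset1/frag0001.ts", SERVER_GROUP)
```

Fail-over falls out naturally: unhealthy nodes are removed from `healthy_nodes` before selection, so the hash is taken over the remaining nodes only.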


• Session handling: a module which is used to limit access to the content, to set the maximum number of TCP connections per session (client), and to group related requests, i.e., requests belonging to one fragmented stream, where the client sends HTTP requests a few seconds apart. • Session logger: this logging is enabled when the session handling module is enabled, and it logs: – TIMESTAMP: time, relative to the epoch, when gathering of data for the sample started. – DURATION: duration from the first request to the last transfer. – SENDTIME: the time spent streaming from the video server. – IP: IP address of the client (remote host) that initiated the session. – SESSIONID: the identifier of the session. – CONTENT: URI of the initial request. – BYTES: number of bytes transferred from the video server. – REFERRER: the Referer HTTP request header, if provided by the client. – USERAGENT: the User-Agent HTTP request header, if provided by the client.
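A record carrying these fields could be assembled as in the following sketch (the field order and the tab separator are assumptions for illustration; the Orbit's actual on-disk log format is not specified here):

```python
import time

FIELDS = ["TIMESTAMP", "DURATION", "SENDTIME", "IP", "SESSIONID",
          "CONTENT", "BYTES", "REFERRER", "USERAGENT"]

def format_session_log(session):
    """Render one session log record as a tab-separated line, using the
    fields listed above (order assumed); absent fields are logged as '-'."""
    return "\t".join(str(session.get(f, "-")) for f in FIELDS)

record = {
    "TIMESTAMP": int(time.time()),
    "DURATION": 12.4,       # first request to last transfer, seconds
    "SENDTIME": 11.9,       # time spent streaming
    "IP": "192.0.2.10",
    "SESSIONID": "a1b2c3",
    "CONTENT": "/vod/asset1/manifest.m3u8",
    "BYTES": 483328,
    # REFERRER and USERAGENT are optional request headers
}
line = format_session_log(record)
```

Grouping requests into one such record is exactly what the session handling module provides: all requests sharing a SESSIONID contribute to DURATION, SENDTIME and BYTES.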

3.3 Proposed solution

The main aim of this thesis is to replace the Orbit server hardware with a software implementation and then evaluate the performance impact on the Edgeware solution. To do this, we first select three proxy servers based on their general behaviour and implement test cases for the use cases that help to evaluate them. Finally, after evaluating the three proxy servers, we select one to use as a cache server and implement some of the Orbit server functionality explained in Section 3.2 on top of this selected web server (reverse proxy server).

Chapter 4

General comparisons of HTTP reverse proxy servers

HTTP reverse proxy servers are proxy servers that intercept HTTP requests coming from clients, as described in 2.2.3. In this chapter the most commonly used HTTP reverse proxy servers, i.e., Squid, Apache Traffic Server, Nginx, Varnish and aiCache, are compared in general terms based on performance, flexibility and license. However, it is not easy to find up-to-date research papers on reverse proxy servers, and every proxy server adds new features and functionalities over time; comparing them is therefore not an easy task. Moreover, their performance depends on the implementation, the architecture and the room left for optimization. In this chapter the pros and cons of each reverse proxy server are described one by one. In addition, we have tried to find benchmarks on the performance of these servers from companies currently using them, where available. Finally, we select three reverse proxy servers for further evaluation and testing.

4.1 Squid

Squid originated from the Harvest project in the 1990s and is the oldest and best known of the popular HTTP reverse proxy servers [2]. It is open source software licensed under the GNU GPL, and it supports HTTP, HTTPS and FTP. Squid offers a rich access control, authorization and logging environment. It runs on many platforms, including Linux, FreeBSD and Microsoft Windows, and it typically runs as a single-process, single-threaded, asynchronous event processor. Squid stores content in RAM until the RAM is full, and then on disk; the RAM size and disk speed are therefore important factors for its performance. Thousands of web sites around the Internet use Squid to considerably accelerate their content delivery [2].


The cache replacement algorithms used in Squid are LRU, described in 2.2.3.1; GreedyDual-Size Frequency (GDSF), which keeps smaller objects in the cache; and Least Frequently Used with Dynamic Aging (LFUDA), which keeps popular objects in the cache regardless of their size and thus optimizes the byte hit rate at the expense of the hit rate, since one large, popular object will prevent many smaller, slightly less popular objects from being cached.

4.1.1 Pros and cons

Squid has the following advantages: • Caching of static objects: these are served much faster, assuming that the cache size is big enough to keep the most frequently requested objects in the cache. • Buffering of dynamic content. • Nonlinear URL space/server setup: Squid can be used to do some tricks with the URL space and/or domain-based virtual server support. • Features: Squid is richer in features than any other available reverse proxy server. The disadvantages are: • Buffering limit for log records: Squid cannot keep more than 64 KB of log records in its buffer. • Speed: Squid is not very fast compared with the other reverse proxy servers available today. Only if you are using a lot of dynamic features is there a reason to use Squid, and then only if the application and the server are designed with caching in mind. • Memory usage: Squid uses quite a bit of memory; it can grow three times bigger than the limit provided in the configuration file. • Stability: compared to the other reverse proxy servers, Squid is not the most stable. • Scalability: Squid is limited in scalability on modern multi-core systems, since it runs as a single-process, single-threaded, asynchronous event processor.

4.2 Apache Traffic Server

Apache Traffic Server was originally developed by Inktomi and later donated to the Apache Software Foundation (ASF) by Yahoo. It is a fast, scalable and feature-rich proxy server [1] with rich plugin APIs for developing extensions. As a multi-threaded event-driven server, it combines asynchronous event processing and multi-threading to deal with concurrency. Apache Traffic Server can draw benefits from both technologies, but this also makes the code and the technology complex and sometimes difficult to understand. Apache Traffic Server is free and open source software with robust plugin APIs to extend and modify its behaviour and functionality. It scales very well on modern multi-core systems because it is a multi-threaded event-driven proxy server.

There are a small number of "worker threads" in Apache Traffic Server; each such worker thread runs its own asynchronous event processor. In a typical setup, this means Traffic Server will run with only around 20-40 threads. This is configurable, but increasing the number of threads above the default (3 threads per CPU core) will yield worse performance due to the overhead caused by the additional threads [18]. In ATS, the cache eviction algorithms that the RAM cache supports are LRU, LFU and Clocked Least Frequently Used by Size (CLFUS), which balances recentness, frequency and size to maximize the hit rate. The default algorithm is CLFUS, but the user can select another in the ATS configuration. Besides, Apache Traffic Server uses a FIFO algorithm to update its disk cache.

In the Yahoo CDN, Apache Traffic Server delivers 350,000 requests per second and 30 Gbps (95th percentile) across around 100 servers distributed all over the world. Additionally, in their lab they measured 105,000 requests per second out of one cache for small content, and 3.6 Gbps out of one server for large content. Comcast also uses ATS in their CDN.

4.2.1 Pros and cons

The advantages of using Apache Traffic Server: • It is very scalable, needs little configuration and can work in many modes. • It easily adapts to your network. • It uses an efficient storage subsystem. The disadvantages are: • It has many configuration files. • It is not as stable as the others. • It needs a restart in some cases.

4.3 Nginx

Nginx is an HTTP web server that can also function as an HTTP reverse proxy server. It is free and open source software with a lot of plug-ins, released under a BSD-like license.


Nginx uses an event-driven multi-process model to solve the concurrency problem, which requires little CPU [3]. In addition to HTTP, it can proxy several other TCP protocols, and it also has a flexible plugin interface for extending and adding to its behaviour and functionality. It is also well documented and widely available compared to the other reverse proxy servers. Nginx uses a persistent disk-based cache, and the OS page cache keeps objects in RAM. Moreover, Nginx uses the LRU cache replacement policy to evict content from its cache when the specified cache size is exceeded.
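A minimal Nginx caching setup along these lines might look as follows (the paths, zone name and sizes are illustrative, not the values used in this thesis):

```nginx
# Cache path on disk; max_size bounds the cache, and nginx removes
# least-recently-used entries once the bound is exceeded.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=vod_cache:100m
                 max_size=50g inactive=60m use_temp_path=off;

server {
    listen 80;

    location / {
        proxy_pass http://origin.example.com;   # origin server (placeholder)
        proxy_cache vod_cache;
        proxy_cache_valid 200 10m;              # cache successful responses
    }
}
```

The `max_size` parameter is what triggers the LRU eviction described above; `inactive` additionally drops objects not accessed within the given period.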

4.3.1 Pros and cons

Advantages: • It has high performance and is stable, with simple configuration. • It consumes less CPU power and memory. Disadvantages: • Nginx requires a recompile of the entire application to add new plugins. • It has some latency in accepting new connections. • Storage time is unlimited.

4.4 Varnish

Varnish is free and open source software licensed under a two-clause BSD license. It was initiated in 2005, and the first version was released in 2006 [18]. It focuses mainly on performance and flexibility. Varnish was designed from the start as a reverse proxy, with the principles of solving real problems, optimizing for modern hardware (64-bit, multi-core, etc.) and modern workloads, working with the kernel rather than against it, and innovation rather than regurgitation [19]. It takes advantage of modern kernel features to simplify its code. Moreover, Varnish does not keep track of whether the cache is on disk or in memory; instead, it requests a large chunk of memory and leaves it to the operating system to figure out where that memory really is, which the operating system can generally do better than a user-space program. This gives a simpler design and reduces the amount of work Varnish needs to do, but it sacrifices portability. For example, on a 32-bit system the virtual memory address space is limited to 3 GB, which limits the size of the cache and the number of concurrent users [4].

Varnish is developed and tested on GNU/Linux and FreeBSD, and its development is governed by the Varnish Governance Board (VGB). Varnish moves a lot of complexity into the kernel by using advanced operating system features such as accept filters, epoll and kqueue. All caching is done using virtual memory provided by the operating system, and each active connection uses up a thread. Varnish uses the LRU cache replacement algorithm, both in RAM and on disk, to remove content from the caches.

Varnish uses its own domain-specific configuration language, the Varnish Configuration Language (VCL), which is translated to C code, compiled with a normal C compiler and then dynamically linked into Varnish at run-time. VCL is lightning fast and gives freedom to system administrators by allowing them to define their own policies rather than being constrained by the Varnish developers. Varnish also supports modules, called VMODs, which make it easy to extend Varnish, add new functionality, or integrate it with other software such as databases or other network services; examples are integration with GeoIP databases or device detection for mobile users.

Varnish has two processes, called the parent and the child process. The parent process starts the child process when the varnishd daemon starts, and restarts it if it dies for any reason. Varnish defines several subroutines, such as vcl_recv, vcl_fetch, vcl_pipe, vcl_pass, vcl_hit, vcl_miss and vcl_error, but most VCL tasks can be performed in two of them: • vcl_recv: receives the requests, parses them, and decides whether to serve from the cache or from a backend. It is also able to alter the request headers. • vcl_fetch: called when an object is retrieved from a backend. The basic operations here are to change the response headers, change the backend if the previous one was unhealthy, and so on. Varnish Plus is the commercial version of Varnish, which contains all the features of Varnish plus some additions. Varnish Plus has a measured performance of up to 20 Gbit per second on a single server for video and audio streaming, and it can stream to as many as 6500 users from one single server [20].
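As a concrete illustration, a minimal VCL fragment exercising these two hooks might look as follows (Varnish 4 syntax, in which vcl_fetch was renamed vcl_backend_response; the backend address, URL pattern and TTL are illustrative assumptions):

```vcl
vcl 4.0;

backend origin {
    .host = "192.0.2.50";   # origin server (placeholder address)
    .port = "80";
}

sub vcl_recv {
    # Decide cache vs. backend; strip cookies so video fragments are cacheable.
    if (req.url ~ "\.(ts|m4s|mp4)$") {
        unset req.http.Cookie;
        return (hash);
    }
}

sub vcl_backend_response {
    # Called when an object has been fetched from the backend
    # (vcl_fetch in Varnish 3 and earlier); adjust the cache TTL here.
    set beresp.ttl = 10m;
}
```

Because the whole file is compiled to C and linked in, policies like these run at native speed rather than being interpreted per request.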

4.4.1 Pros and cons

The main advantages of Varnish are: • It is very flexible compared to other reverse proxy servers, thanks to its own language, VCL. • Varnish gives you access to very detailed logs that are useful when debugging problems, at no extra cost. • Developers can implement their own policies using VCL. • It has modules, called VMODs, which are helpful for extending its functionality. Disadvantages of Varnish:


• It opens a new thread for every connection, relying on the operating system to manage the resulting concurrency. • It is not portable to all systems, because it is designed for modern hardware (64-bit, multi-core, etc.).

4.5 aiScaler

aiScaler is not a pure caching solution; rather, it is an all-in-one application delivery controller (ADC), normally installed as a reverse proxy on a dedicated machine. Some features of aiScaler are caching, SSL offloading, DDoS protection, multiplexed session management, mobile device detection and IP-based geo-content delivery. aiScaler is easy to configure and creates a better user experience by increasing the speed and availability of a web site: it offloads request processing from the web servers, reduces code complexity, and reduces the cost of servers, space, power and cooling. aiCache is a custom Linux application written in C. It is a "right-threaded" application, which means it uses a limited number of threads (processes).

4.5.1 Pros and cons

The advantages of aiCache, the caching component of aiScaler, are: • High performance: pages load faster, with over 250,000 requests per second served directly from aiScaler. • Good configuration flexibility. • Support for real-time alerting and responding. • Responses are cached in RAM, not on disk. • Low resource usage. Disadvantages of aiScaler: • aiScaler performs especially well with dynamic web sites, but in our case we need a reverse proxy for static content, and aiScaler is an all-in-one solution rather than a pure caching solution, so it is not of further interest here.

4.6 Conclusion

Based on the advantages and disadvantages of the HTTP reverse proxy servers above, we selected Apache Traffic Server, Nginx and Varnish for further evaluation and testing of their performance and behaviour.

Chapter 5

Test methodology

We need to define test cases to evaluate the performance of the generic HTTP acceleration servers (cache servers) selected in Chapter 4, as well as the Orbit server presented in Chapter 3. The performance of cache servers can be measured using cache hit, cache miss and live test cases. In the cache miss case, however, the cache server must fetch the content from the origin server (the server that holds the content), so the performance depends on both the cache server and the origin server. Due to this limitation, only the live, 100% cache hit and 90% cache hit test cases are implemented and evaluated as part of this thesis. Moreover, to compare the cache servers we need parameters (characteristics) to serve as performance measures. The characteristics we use to evaluate the cache servers and the Orbit server are response time, CPU usage and network traffic (bandwidth); they are described further in Chapter 6.

The video assets used in our test cases are stored on an origin server, from which the proxy servers and the Orbit server fetch them. All assets are chunked into small fragments of different lengths, and the fragments are grouped by content bitrate. In other words, an asset contains many fragments of different lengths and sizes. Therefore, the number of assets, the number of fragments, their length, and the quality (content bitrate) of the fragments must be stated as inputs in all test cases.

This chapter starts with the definition of the test cases used for our evaluation. Following the definitions, the implementation of the test cases is explained, and then the configuration of the generic proxy servers selected in Chapter 4 is demonstrated. Finally, the setup of the test environment is briefly summarized.


5.1 Definition of test cases

As mentioned above, we use three test cases, namely the live, 100% cache hit and 90% cache hit test cases, to evaluate the performance of the three reverse proxy servers selected in Chapter 4 and of the Orbit server. For Apache Traffic Server, however, the 90% cache hit test case was not run, due to the poor results collected from the live and 100% cache hit test cases. In the 100% cache hit and 90% cache hit test cases, the video transmission was not true video on demand (VoD), since all clients were synchronized: they were spread over the assets and requested the fragments of the assets sequentially, i.e., all clients requested the same portion of the assets at the same time, whereas in real VoD different clients can request different portions of the assets at any time. The size of each fragment used in our test cases was 132 KB.

5.1.1 Live test case

In this test case all clients request a single asset with 5-second fragments at 300 kbps. The asset is not in the cache before the test, so each server fetches and serves the first request for a fragment from the origin server, and serves subsequent requests for the same fragment from its cache. Proxy servers put data in memory to increase performance by reducing hard disk access. Every client requests the fragments sequentially and continuously, as represented in Figure 5.1. The fragment length is the time that a client waits for a fragment before requesting the next one: if the client does not receive the fragment within 5 seconds, it times out and the request is counted as a late request. In this test case, we determined the maximum number of clients (streams) each server can sustain. In addition, the response time, CPU usage and egress bandwidth were measured with the same number of clients for all servers.
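The behaviour of a single live client can be sketched as follows (a simplified illustration: the fragment URL scheme is an assumption, and the in-house rq tool implements this logic at much larger scale):

```python
import time
import urllib.request

def fetch_http(url, timeout):
    """Default fetcher: a plain HTTP GET for one fragment."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()

def stream_live(base_url, num_fragments, fragment_length=5.0, fetch=fetch_http):
    """Request fragments sequentially, pacing one request per fragment
    interval; a fragment not received within the interval is late."""
    late = 0
    for i in range(num_fragments):
        start = time.monotonic()
        url = f"{base_url}/frag{i:05d}.ts"  # hypothetical fragment naming
        try:
            fetch(url, timeout=fragment_length)
        except Exception:
            late += 1  # timeout or error counts as a late request
        # Pace requests: wait out the remainder of the fragment interval.
        elapsed = time.monotonic() - start
        if elapsed < fragment_length:
            time.sleep(fragment_length - elapsed)
    return late
```

Running thousands of such loops in parallel against one asset reproduces the live workload: the first loop to request a fragment causes the single origin fetch, and all others are served from the cache.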

Figure 5.1: Live fragments


5.1.2 100% cache hit

In this test case all assets were stored in the cache of the proxy servers and the Orbit server before the test, and all clients requested those assets to simulate video-on-demand streaming. In the proxy server tests, the clients were spread over the assets and requested the fragments of each asset sequentially. Proxy servers keep as much data as possible in the RAM cache to reduce disk access and serve content quickly; in ATS, the user can specify the size of this RAM cache in the configuration, while in Varnish and Nginx content is stored in RAM for as long as there is free space. The assets are chunked into 5-second fragments with a 300 kbps content bitrate; Figure 5.2 represents the fragments in this test case. The fragment length is the time that a client waits to receive a fragment, as stated in 5.1.1. The focus of this test case was to see how many assets each proxy server can serve without hitting any limitation, which depends on its resource usage. Then, with the maximum number of assets that fit in the RAM cache of each proxy server, we measured and compared the response time, CPU usage and bandwidth of all servers.

Figure 5.2: 100% cached fragments.


5.1.3 90% cache hit

To see the ingest performance of the proxy servers, the 90% cache hit test case was implemented. In this test case, all clients spread over the assets and request the fragments sequentially, as in the 100% cache hit test case. The reverse proxy servers serve 90% of the requests from their cache, and fetch and serve the rest from the origin. Figure 5.3 represents the 90% cache hit fragments: the fragments stored in the cache before the test are shown in green, while the others are fragments that are not in the cache server.

Figure 5.3: 90% cached fragments.


5.2 Implementation of test cases

The test cases are implemented using Python and Bash scripts. The implementations are based on an in-house tool called rq, which is used to generate TCP load. In these implementations, rq is configured with parameters such as the destination hosts and ports, the number of clients, the number of video assets, the number of fragments in each asset, the quality and the fragment length. A reporter module is used to generate the test results as a PDF. In addition, vmstat and network statistics are collected from the proxy server machine to measure CPU usage and network traffic, respectively. For each test case, the number of assets and fragments varies, as can be seen in Table 6.2. However, for the 100% and 90% cache hit test cases, the video-on-demand behaviour was not simulated exactly, as stated in Section 5.1.

5.3 Configuration of proxy servers

In this work, each proxy server was configured as a cache server. The configuration differs for each proxy server, and the default configurations could not give us reliable results for the test cases discussed above. Therefore, a lot of optimization was needed for each proxy server to obtain results comparable with the Orbit server.

In the case of ATS, we configured it as a reverse proxy server that can be used as a cache server; it was configured only for the live and 100% cache hit test cases. The main configuration files are called records, remap, storage and cache. The records configuration of ATS is used to make the server act as a reverse proxy and to set how much RAM should be used to store the most-accessed assets. It also sets the IP address and port on which ATS should be accessed, the IP address used to reach the origin server, and some further tunings. In the remap configuration file we defined both map and reverse-map rules. A map rule translates the URL in a client request into the URL of the origin server where the content is located: ATS constructs a complete request URL from the client URL and its headers, and then looks for a match in its list of target URLs in the remap rules. A reverse-map rule translates the URL of redirect responses from the origin server into the address of ATS, so that clients are redirected to ATS instead of accessing the origin server directly; therefore, clients cannot reach the origin server without going through Apache Traffic Server. Furthermore, the storage configuration sets how much hard disk space ATS may use, and some caching rules are set in the cache configuration file.
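For illustration, rules of this kind take roughly the following form in ATS's remap.config and storage.config (the hostnames and size are placeholders, not the values used in this work):

```
# remap.config: forward requests for the cache's hostname to the origin
map         http://cache.example.com/  http://origin.example.com/

# rewrite origin redirects so clients come back through ATS
reverse_map http://origin.example.com/ http://cache.example.com/

# storage.config: give ATS disk cache space (one line per path or device)
/var/cache/trafficserver 50G
```

The map rule handles the normal forward path; the reverse_map rule is what keeps clients from learning the origin's address via HTTP redirects.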

Varnish is configured using its own configuration language, the Varnish Configuration Language (VCL). This language was used to tell Varnish which origin server to use and to express all the required caching rules. Unlike the Apache Traffic Server configuration, VCL is a very flexible configuration language. Moreover, a separate configuration file was used to tweak Varnish and to set the IP address and port on which Varnish should listen.

Nginx, in turn, has its own configuration format, in which caching rules are defined to configure it as a reverse proxy server. In its configuration file we defined the address of the origin server, the storage to be used, the address and port on which Nginx listens, and many optimizations.

5.4 Test Environment

The test environment used in our test cases contains the following components: a client machine, the Orbit server, an origin server and a switch. Only the Orbit machine is replaced by the proxy server machine when the proxy servers are tested.

5.4.1 Test setup

The proxy servers were installed and configured one by one on the same machine, so that the test environment is the same for all test cases. The components of the test environment are briefly described as follows. • Client machine with Ubuntu 12.04: an application called rq is installed on this machine. rq is an in-house, purpose-built performance streaming application used to generate TCP load; it can do progressive streaming and adaptive bitrate streaming, and it can emulate thousands of concurrent HTTP clients. rq can be configured with the number of clients, number of assets, number of fragments, fragment length, ramp-up, timeout, duration, etc., and it creates TCP sockets and sends GET requests according to the implementation of the test cases. On this machine, two 10 Gbps network interfaces were used to communicate with the proxy server machine. • Proxy server with Ubuntu 12.04: the machine on which we installed the reverse proxy servers to be tested. These act as cache servers: when a request comes from a client, the proxy server checks whether the requested data is in the cache; if so, it serves the client from the cache, otherwise it fetches the data from the origin server and serves the client at the same time. This machine has 11 GB of RAM, and we allotted 50 GB of disk space for the tests. Two 10 Gbps interfaces were used: one shared between the client machine and the origin server, the other connected to the client machine only. In addition to the configured cache storage, all proxy servers use a RAM cache to serve objects as quickly as possible and to reduce the load on the disks. Therefore, memory and CPU are the basic constraints on the proxy servers. This machine is replaced by the Orbit server during the Orbit test runs. • Origin server with Ubuntu 12.04: to serve HTTP requests, the lighttpd web server is installed and configured on this machine. Video segments of different bitrates are stored on this server, and it is the server from which the reverse proxy servers fetch content on a cache miss. • Switch: the above components are connected through a switch, as shown in Figures 5.4 and 5.5, in both the Orbit and the proxy server test environments.

Figure 5.4: Orbit test.

Figure 5.5: Proxy servers test.


5.4.2 Parameters

The parameters used in all test cases are: the number of clients, the number of assets, the number of fragments in each asset, the ramp-up in milliseconds, the timeout in seconds, the quality, the fragment length, the duration in minutes, and the content type, which is video. The value of each parameter varies with the specific test case, as can be seen in Table 5.1 below. We used 10,000 fragments in the live test case to make sure the proxy servers download the first request for each fragment from the origin server and serve subsequent requests from their cache, simulating live streaming. In the 100% cache hit and 90% cache hit test cases, however, each asset has 180 fragments (the duration of the test, 900 seconds, divided by the fragment length, 5 seconds). The number of assets differs between the proxy servers in the 100% and 90% cache hit test cases, due to the disk access penalty and the memory limitation of the proxy server machine; in other words, the number of assets a proxy server can serve depends on its memory and CPU usage.

Parameter          Live test case   100% cache hit   90% cache hit
Clients            25000            25000            25000
Fragments          10000            180              180
Ramp-up            2 ms             4 ms             4 ms
Quality            300 kbps         300 kbps         300 kbps
Fragment length    5 s              5 s              5 s
Duration           15 min           15 min           15 min

Table 5.1: Parameters of test cases.

As part of the test setup, the different test cases were implemented on the client machine, and the proxy server machine was configured and optimized for each proxy server as described in sections 5.2 and 5.3.
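The fragment count used in the cache-hit test cases follows directly from the test duration and the fragment length; a one-line check (the function name is ours):

```python
def fragments_per_asset(duration_s, fragment_length_s):
    # One fragment is needed per fragment-length interval of the test.
    return duration_s // fragment_length_s

# 15-minute test with 5-second fragments -> the 180 fragments per asset of table 5.1
print(fragments_per_asset(15 * 60, 5))  # -> 180
```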

Chapter 6

Performance comparison of Orbit and proxy servers

The results of the above test cases for the generic proxy servers and the Orbit server are presented in this chapter, comparing the response time, CPU usage and bandwidth of all servers. The comparison is accompanied by an analysis of each result. The Orbit, client and proxy server machines each have two 10 Gbps interfaces, as shown in figures 5.4 and 5.5.

As can be seen in the figures below, each response-time figure is a heat map with a blue scale on its right side representing the number of requests as a percentage. 0-20% of the requests are shown in light blue, and the shade of blue darkens in proportion to the percentage of requests. The y-axis shows the response time in milliseconds and the x-axis the elapsed test duration as a percentage.

In the CPU usage figures, the red curve represents the time the CPU is idle, while the blue and green curves show the CPU time spent in system and user mode, respectively. The cyan curve is the CPU time spent waiting for I/O operations (disk access).

In the figures of the network streaming ports, red and blue are used in the proxy server tests for the egress bandwidth of the two interfaces, and green and cyan for the ingest bandwidth. In the Orbit test, blue and red are used for both the egress and ingest bandwidths.




6.1 Live test case

In this test case, 25000 clients requested a single asset containing 10000 fragments, each 5 seconds long at 300 kbps, with a ramp-up of 2 milliseconds between clients to simulate live streaming. Orbit, Varnish and Nginx create only one connection to the origin server for each cache miss, even when many clients request the same fragment at the same time. ATS, however, might send more than one request per fragment to the origin server if many client requests arrive simultaneously for that fragment, although it tries to reduce the number of connections to the origin server for the same fragment.
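The single-connection-per-miss behaviour described above is a form of request coalescing: the first requester fetches from the origin while concurrent requesters for the same fragment wait for that fetch to finish. The sketch below shows the mechanism in Python with threads; it is a toy model of the idea, not code from any of the servers.

```python
import threading

class CoalescingCache:
    """On a cache miss, only the first requester fetches from the origin;
    concurrent requesters for the same key wait and reuse the result."""
    def __init__(self, fetch):
        self._fetch = fetch          # callable key -> content (the "origin")
        self._cache = {}
        self._inflight = {}          # key -> Event signalled when the fetch completes
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._cache:               # cache hit
                return self._cache[key]
            event = self._inflight.get(key)
            if event is None:                    # first requester claims the fetch
                event = self._inflight[key] = threading.Event()
                leader = True
            else:
                leader = False
        if leader:
            value = self._fetch(key)             # single connection to the origin
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()                          # wake all waiting requesters
            return value
        event.wait()                             # follower: wait for the leader
        return self._cache[key]
```

However many threads call `get("frag1")` concurrently, the origin is contacted exactly once for that key, which is the behaviour Orbit, Varnish and Nginx exhibit per fragment.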

Proxy servers place data in a RAM cache to increase performance by reducing disk-cache access. Since the fragment size was 132 KB and this test case used a single asset, the overall size of the asset was 10000 times 132 KB, i.e. 1.32 GB. The RAM cache was therefore large enough to hold all the fragments, and the proxy servers served them from RAM throughout this test case.
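The working-set arithmetic generalizes to the other test cases. A small helper (the function name is ours, and it uses 1 KB = 1000 bytes, as the stated 1.32 GB figure implies):

```python
def working_set_gb(num_assets, fragments_per_asset, fragment_kb):
    # Total data volume the RAM cache must hold to avoid disk access,
    # assuming 1 KB = 1000 bytes as used in the text's 1.32 GB figure.
    return num_assets * fragments_per_asset * fragment_kb * 1000 / 1e9

# Live test case: one asset of 10000 fragments, 132 KB each -> 1.32 GB
print(working_set_gb(1, 10000, 132))
```

For the cache-hit test cases with many assets of 180 fragments each, the same formula shows why the 11 GB of RAM bounds the number of assets each proxy server can hold entirely in its RAM cache.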

In the live test case, we encountered a limitation in the number of concurrent requests each proxy server can support. The maximum number of concurrent requests that Orbit, ATS and Varnish can support is 32000. With more than 32000 clients, Varnish hits its memory limit and restarts in the middle of the test. Apache Traffic Server becomes very slow when the number of concurrent clients exceeds 32000, resulting in many late requests. Nginx, on the other hand, supports up to 45000 concurrent requests without any late requests, which implies that Nginx can handle many more concurrent requests than Varnish and Apache Traffic Server with the same resources. These different limits stem from the way each server handles incoming requests: Nginx uses an asynchronous, event-driven connection-handling model that does not create a new thread per request, whereas Varnish is a multi-threaded program that uses one thread per connection, and ATS handles incoming requests with a hybrid event-driven engine on top of a multi-threaded processing model.
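The difference between the event-driven and thread-per-connection models can be illustrated in Python: the asyncio sketch below multiplexes 10000 simulated connections on a single OS thread, where a thread-per-connection design would need 10000 threads (and their stacks). This illustrates the two models only; it is not a rendering of the servers' actual internals.

```python
import asyncio
import threading

async def handle(conn_id):
    # Yield to the event loop, as a server would while awaiting socket I/O.
    await asyncio.sleep(0)
    return conn_id

async def serve_all(n):
    # All n "connections" are interleaved cooperatively on one thread.
    return await asyncio.gather(*(handle(i) for i in range(n)))

threads_before = threading.active_count()
results = asyncio.run(serve_all(10000))
threads_after = threading.active_count()
```

The thread count is unchanged after serving all 10000 connections, which is why an event-driven server like Nginx can sustain more concurrent clients than a one-thread-per-connection server on the same memory budget.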

In addition to the number of concurrent requests, we used response time, CPU usage and network traffic as the main criteria for comparing the proxy servers and the Orbit server. A test run with 25000 clients is used for all proxy servers and the Orbit server in this section to evaluate the results against these criteria. The results of the live test case for all proxy servers and the Orbit server are presented as follows.

• Response time
Response time is the amount of time between a client's request and the receipt of the response. Given enough memory, Varnish is the best reverse proxy server in terms of response time; it is comparable to the Orbit server, as can be seen in figures 6.3 and 6.4. In Nginx (figure 6.2) and ATS (figure 6.1), however, the response time of 0-20% of the total requests reaches up to 1.1 and 2.4 seconds respectively. Varnish is thus very fast compared to Nginx and Apache Traffic Server in the live test case.
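Per request, the response time plotted in the heat maps is simply the wall-clock interval between issuing the GET and receiving the complete response. A minimal measurement helper (names ours, with a stub standing in for the real HTTP request):

```python
import time

def timed_request(send_request):
    """Return (response, response_time_ms): the wall-clock delta between
    issuing a request and receiving the complete response."""
    start = time.monotonic()
    response = send_request()
    return response, (time.monotonic() - start) * 1000.0

# Stub in place of a real HTTP GET to a proxy server:
body, ms = timed_request(lambda: "fragment-bytes")
```

A load generator like rq collects one such sample per request; binning the samples over the test duration yields heat maps like figures 6.1 to 6.4.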

Figure 6.1: ATS response time.

Figure 6.2: Nginx response time.

Figure 6.3: Varnish response time.

Figure 6.4: Orbit response time.

• CPU usage
As can be seen in the figures below, in Apache Traffic Server (figure 6.5) and Varnish (figure 6.7) the CPU is 75% idle, while in Nginx (figure 6.6) the CPU is idle around 80% of the time. Nginx therefore uses less CPU than Varnish and Apache Traffic Server in the live test case. In the Orbit server (figure 6.8), however, the CPU is 95% idle, since Orbit does not rely on the CPU and main memory for streaming: it uses an FPGA instead of the CPU, and RAM together with flash memory as storage.



Figure 6.5: ATS CPU usage.

Figure 6.7: Varnish CPU usage.

Figure 6.6: Nginx CPU usage.

Figure 6.8: Orbit CPU usage.

• Network traffic
As can be seen below, the ingest bandwidth on each interface is close to zero, since every server sends only a single request per fragment to fetch it from the origin server, while the egress bandwidth is 3.5 Gbps on each interface. The egress bandwidth starts at zero, grows gradually until all clients have joined, and then stays constant. The total egress bandwidth is around 7 Gbps for every server, since each server has two interfaces, as shown in figures 5.4 and 5.5 above.
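These figures are consistent with simple arithmetic: 25000 clients streaming at 300 kbps each require 7.5 Gbps of aggregate egress, close to the roughly 7 Gbps (about 3.5 Gbps per interface) measured once all clients have joined. A quick check (function name ours):

```python
def aggregate_egress_gbps(clients, bitrate_kbps):
    # Steady-state egress once every client is streaming: clients x bitrate.
    return clients * bitrate_kbps / 1e6

total = aggregate_egress_gbps(25000, 300)   # 7.5 Gbps in total
per_interface = total / 2                   # ~3.75 Gbps on each 10 Gbps port
```

The measured ~3.5 Gbps per interface sits slightly below this theoretical value, and the two 10 Gbps interfaces are far from saturated at this load.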


Figure 6.9: ATS network traffic.

Figure 6.10: Nginx network traffic.

Figure 6.11: Varnish network traffic.

Figure 6.12: Orbit network traffic


Results of all servers for the live test case are summarized in the table below.

Servers Orbit Varnish Nginx ATS

Response time < 1 ms < 1 ms