WHITE PAPER

GemFire™ on Intel® for Financial Services (GIFS)
Reference Architecture for Data Management in Grid Computing

Author: Lionel Fisher

Meeting New Market Demands

The last few years in financial markets have seen meteoric rises in trade volumes, increasingly sophisticated algorithmic trading models, drastically improved pricing engines for foreign exchange (F/X) and derivatives, and more complex financial instruments. Combined with the ever-increasing burden of new regulatory requirements such as Reg NMS and continued efforts to achieve full compliance with Sarbanes-Oxley and Basel II, these trends make profitability more dependent than ever on the arms race to achieve reduced latencies, scalable computing power, and cost-effective hardware and software infrastructure technologies in the data center and beyond. The GemFire™ on Intel® for Financial Services (GIFS) solution from GemStone® Systems and Intel® Corporation is an ensemble of software, hardware, tools, and services designed to address these growing challenges in the financial markets. Please refer to the white paper titled 'Powering the Next Generation Data Infrastructure: GemFire on Intel® for Financial Services (GIFS)' for more details about this solution offering.

The Business and Technology Challenge

Today, it is technology that drives the profitability of global and regional investment banks. Even a single inaccurate instrument pricing or position hedging event can lead to large trading losses and undesired regulatory consequences. Armies of quantitative analysts compete daily to apply the latest and greatest techniques to gain even the slightest competitive edge in diverse areas such as credit derivatives, equity derivatives, and proprietary trading strategies. The winners gain greater trading profits, happier customers, and efficient regulatory and operational compliance. Those that fail to address these difficult challenges risk reduced service profitability, lost customers, regulatory fines, and business disruptions. To provide quantitative analysts with the tools they need to underpin critical services, grid computing has emerged as a key strategic technology.

As banks have begun to deploy grid environments in support of computationally intensive pricing, hedging, and general value at risk (VaR) modeling, they have generally bolted these onto existing data repositories such as relational databases, file-system based solutions, or grid-vendor solutions. In the early phases of grid deployment these have sufficed to support the limited number of nodes and serviced business areas. Now that early deployments have proven their value, banks are attempting to grow their grids to support more asset classes, more business lines, and more accurate quantitative models. The result has been an (often unexpected) hard limit on the grid size that the initial data repositories can support. As the computational grid grows from 25 or 50 nodes into hundreds or even thousands, data bottlenecks have prevented banks from delivering the expected increases in computing power or deriving increased utilization from their hardware investments. The main reasons for the failure of grids to scale are directly related to failures in the data infrastructure on which they depend, specifically:

• Relational databases cannot scale to support thousands of concurrent high-activity clients. Open-form VaR techniques such as Monte Carlo simulations, or even less accurate approximations such as binomial and trinomial expansions, require huge amounts of transitory data to be passed between grid tasks. The very nature of virtualized grid deployments means that even this transitory data must be available to any grid node at any given point in time. Early solutions that used RDBMSs as data repositories simply cannot scale to either the required amount of concurrent activity or the huge volumes of rapidly produced data. The RDBMS vendors' solutions to this problem are, as of now, very expensive as well as inadequate. Even the simple task of writing an overnight batch's results, such as yield curves, volatility surfaces, and greeks, back to the database for intra-day availability to traders and analysts is a challenge, with the RDBMS unable to absorb terabytes of data as fast as the grid produces them.

• File-based "caching" solutions that write interim or final results to files face similar contention from a high number of grid nodes trying to read and write data concurrently. Additionally, the latency of direct disk interaction and the lack of fine-grained locking are performance killers that eliminate this option for all but the simplest use cases on small grids.

• Grid vendor caching solutions suffer from a lack of maturity, serious scalability limitations, and little or no high-availability support, and are sometimes also reliant on underlying file systems (with all of the deficits listed above). These vendors do a fantastic job of managing and virtualizing a large number of compute resources, but have generally treated the data grid as an afterthought. The data grid technology needed to support large grids is every bit as difficult (if not more so) to create, requiring years of specialized development and experience that compute grid vendors have not invested in.

• Lack of efficient and dynamic local reference data caching at the compute node level is also responsible for data I/O bottlenecks and the inability to scale and/or accelerate the compute grid.




GIFS for Grid Data Management: Solution Overview

The GIFS solution, powered by the GemFire™ Enterprise Data Fabric (EDF) and Intel's Xeon®-based hardware platform, provides key technical features that enable the deployment of high-performance grid architectures. GemFire EDF, from GemStone Systems, provides a scalable, distributed platform to manage increasing volumes of data and streaming events with very low latency. It is best envisioned as a middle-tier, operational data management layer that sits between information sources and consumers, as shown in Figure 1. Applications written in different languages and interfaces such as Java, C#, C++, XML, or SQL, deployed on different hardware nodes and platforms, can access, store, distribute, and analyze data and events represented within the GemFire™ fabric. With advanced data virtualization, distributed caching, and complex event processing capabilities, the GemFire EDF enables the delivery of actionable information to the right application, at the right time.
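As a rough illustration of how an application interacts with the fabric, the minimal Java sketch below connects an embedded client cache to a GemFire cache server cluster and reads and writes entries. It uses the API names of the open-source Apache Geode descendant of GemFire (ClientCacheFactory, Region); the locator host and port, region name, and value type are assumptions, and the API of the GemFire release current at the time of this paper may differ in detail.

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class FabricClientExample {
    public static void main(String[] args) {
        // Connect this process to the data fabric via a locator (host/port are assumptions).
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator-host", 10334)
                .create();

        // A local "near" cache backed by a server-side region named "YieldCurves" (hypothetical).
        Region<String, byte[]> yieldCurves = cache
                .<String, byte[]>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .create("YieldCurves");

        // Store an overnight batch result and read it back; both calls go at most
        // one hop to the cache server that owns the entry.
        yieldCurves.put("USD-LIBOR-2006-11-01", new byte[] { /* serialized curve */ });
        byte[] curve = yieldCurves.get("USD-LIBOR-2006-11-01");
        System.out.println("Curve bytes retrieved: " + (curve == null ? 0 : curve.length));

        cache.close();
    }
}
```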

Figure 1: GemFire™ Enterprise Data Fabric (EDF). The fabric sits between underlying data sources (disk storage and event streams) and consumers such as query/analytics, portals, grid computing, execution/SOA, reporting, and event-processing applications, which access it through Java, C++/C#, SQL, and XML interfaces.




GemFire's distributed caching technology solves the aforementioned grid problems through a set of targeted features which, together with Intel® Virtualization Technology (VT) and Intel® I/O Acceleration Technology (I/O AT), enables nearly limitless grid scalability, greater calculation complexity, and dramatically improved CPU utilization. GemFire's data partitioning mechanism virtualizes the memory pool of many processes and machines to build huge distributed caches. These caches guarantee one-hop, memory-speed access to data; highly concurrent, contention-free access for grid nodes; and massive aggregate data throughput into and out of the grid. Additionally, GemFire provides various update models to assure that reference data can be managed exactly as needed: pre-cached or distributed on demand, updated in real time, and isolated from ongoing dependent tasks so that parent-child calculations use the same base data.
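As an illustration of data partitioning with redundancy, the sketch below starts a data grid member that hosts a partitioned region: entries put into the region are hashed into buckets and spread, with one redundant copy, across the members that host the region. The code uses Apache Geode API names (CacheFactory, RegionShortcut.PARTITION_REDUNDANT, CacheServer); the locator address, port, property values, and region name are illustrative assumptions, and the contemporary GemFire product API may differ in detail.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.server.CacheServer;

public class PartitionedDataGridMember {
    public static void main(String[] args) throws Exception {
        // Start a peer member of the data grid; the locator address is an assumption.
        Cache cache = new CacheFactory()
                .set("locators", "locator-host[10334]")
                .set("log-level", "config")
                .create();

        // Accept connections from embedded client caches on the compute nodes.
        CacheServer server = cache.addCacheServer();
        server.setPort(40404);
        server.start();

        // A partitioned region: its entries are hashed into buckets that are spread
        // across every member hosting the region, with one redundant copy per bucket
        // so that the loss of a single member loses no data.
        Region<String, double[]> simResults = cache
                .<String, double[]>createRegionFactory(RegionShortcut.PARTITION_REDUNDANT)
                .create("SimulationResults");

        // Each member contributes its spare heap to the virtualized pool; any member
        // (or client) can put/get entries with at most one network hop to the owner.
        simResults.put("trade-42/tier-1", new double[] { 1.02, 0.98, 1.11 });

        // Keep the member alive so it continues to host its share of the buckets.
        Thread.sleep(Long.MAX_VALUE);
    }
}
```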

Key Technology Elements

Twenty years of object-based technology experience have culminated in the distributed caching features of the GemFire EDF that make these dramatic grid computing improvements possible. For many years GemFire has focused on using main-memory caching to achieve unparalleled performance and scalability gains without sacrificing the reliability benefits of more traditional solutions such as relational databases. To facilitate grid computing in the financial enterprise, GemFire EDF provides the following key features:

• Cache Server Topology: Multiple cache servers coordinate data partitioning and high availability. Just as a compute grid virtualizes and parallelizes a pool of computing resources, GemFire's cache server clusters virtualize large memory pools across many processes and machines. Redundancy of data for high availability, and distribution to data replication sites, are simple to configure.

• Cache Server Clients: These are embedded caches that run as part of the compute grid node process and efficiently connect into the cache server cloud. A grid-node cache of reference and recently used data is automatically maintained. Distribution policies assure that data updates are controlled according to the desired requirements, supporting everything from real-time push-based updates of all cached data to fine-grained distribution policies. Load balancing across the cache server cluster is automatic, and fail-over from downed cache servers is transparent.

• Partitioned Regions: Through this feature, a user or a client can dynamically allocate cache entries to GemFire cache server nodes and add capacity on the fly. Distributed hashing algorithms guarantee that any data lookup from a grid node into the data grid occurs in only one hop. Any number of machines can join a cache server cluster, each one contributing as much memory as it has to spare. When members of a partitioned region near their memory limits, buckets of data are dynamically re-allocated to other data grid nodes with more spare capacity.




• High Availability: Data is automatically made highly available simply by tuning the GemFire cache's redundancy-level property. The failure of any node is thus transparent to other compute grid nodes, and spare capacity on other (or dynamically added) servers re-establishes data redundancy concurrently with ongoing activity.

• High-Speed Transport: GemFire optimizes transport through the dynamic selection of TCP/IP, unicast, and multicast network protocol transports. Interoperability with Intel's I/O Acceleration Technology increases performance even further by eliminating low-level buffer copies and further parallelizing application and network I/O tasks.

• GemFire Serialization: Optimizes data marshalling and unmarshalling between client and server. GemFire eliminates reflection in serialization and optimizes object-to-byte conversion, XML parsing, storage, and access.

• Capacity Control: Protection from memory overflow can be ensured by off-loading entries to disk on a least-recently-used (LRU) basis. This provides an extra layer of protection in the data grid in case the total pool of virtualized memory reaches its limit. It can also be used to increase the capacity of compute grid node embedded caches beyond the available local memory resources.

• Cache Persistence: Fast write-behind disk persistence can be enabled to guarantee disaster recovery. Although partitioned regions (discussed above) support redundancy, there is often a need to have critical grid calculation results backed up to disk for an additional level of redundancy. GemFire's write-behind model ensures that disk persistence happens in the background, concurrently with all the other activities in the data grid.

• Cache APIs: These APIs allow any process to listen for data events. They also facilitate database interaction that is transparent to application logic. GemFire APIs can be used for database read-through (on cache misses, for example), database write-through (synchronous), database write-behind (asynchronous), and as a bridge to messaging middleware. Even processes outside of the data and compute grid can easily connect into a GemFire distributed system and register interest in almost any type of data event.

• Distributed Cache Transactions: GemFire's transaction service assures atomic updates across the entire data grid when necessary. Long-running calculations are thus prevented from using inconsistent reference data.

• Management Tools: The GemFire console offers sophisticated real-time statistics monitoring capabilities. In addition, JMX APIs are exposed to facilitate integration with system management tools such as Tivoli and OpenView. These provide a robust framework for profiling, tuning, and managing your data grid.
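The sketch below pulls a few of the features above together: a region whose least-recently-used entries overflow to disk when an entry limit is reached (capacity control), a cache loader that transparently fetches missing reference data from a backing store (read-through), and an atomic multi-entry update performed through the cache transaction service. It is a minimal illustration using Apache Geode API names (EvictionAttributes, CacheLoader, CacheTransactionManager); the loader's data source, the entry limit, and the region names are assumptions, and the GemFire release described in this paper may expose these capabilities with somewhat different signatures.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.CacheLoader;
import org.apache.geode.cache.CacheTransactionManager;
import org.apache.geode.cache.EvictionAction;
import org.apache.geode.cache.EvictionAttributes;
import org.apache.geode.cache.LoaderHelper;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class FeatureComboExample {

    /** Read-through: invoked on a cache miss to pull reference data from a backing store. */
    static class ReferenceDataLoader implements CacheLoader<String, String> {
        @Override
        public String load(LoaderHelper<String, String> helper) {
            // A real loader would query the reference database here; this stub
            // simply fabricates a value for the missing key (an assumption).
            return "loaded:" + helper.getKey();
        }

        @Override
        public void close() { }
    }

    public static void main(String[] args) {
        Cache cache = new CacheFactory().set("locators", "locator-host[10334]").create();

        // Capacity control: keep at most 100,000 entries in memory and overflow
        // the least recently used ones to disk instead of exhausting the heap.
        Region<String, String> refData = cache
                .<String, String>createRegionFactory(RegionShortcut.PARTITION)
                .setEvictionAttributes(EvictionAttributes.createLRUEntryAttributes(
                        100_000, EvictionAction.OVERFLOW_TO_DISK))
                .setCacheLoader(new ReferenceDataLoader())
                .create("ReferenceData");

        // A miss on this key triggers the loader transparently to the caller.
        String curve = refData.get("discount-curve/EUR");
        System.out.println("Fetched " + curve);

        // A small replicated control region; every member holds a full copy, so a
        // transaction can update several of its entries atomically.
        Region<String, String> batchControl = cache
                .<String, String>createRegionFactory(RegionShortcut.REPLICATE)
                .create("BatchControl");

        CacheTransactionManager txn = cache.getCacheTransactionManager();
        txn.begin();
        try {
            batchControl.put("overnight-batch/state", "PUBLISHED");
            batchControl.put("overnight-batch/as-of", "2006-11-01");
            txn.commit();
        } catch (RuntimeException e) {
            if (txn.exists()) {
                txn.rollback();
            }
            throw e;
        }

        cache.close();
    }
}
```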




Figure 2: GemFire™ on Intel® for Grid Data Management. A cluster of GemFire (GF) cache servers forms a highly parallelized, massively scalable data access layer, accelerated through Intel® I/O AT and InfiniBand, and serves the GF client caches embedded in the compute grid nodes, which are managed by the grid scheduler.

The cache server tier:
• Virtualizes memory resources across many processes or machines.
• Scales to huge cache sizes and thousands of compute grid nodes without compromising performance.
• Supports dynamic or static data partitioning policies.
• Allows capacity to be added on the fly.
• Provides easily managed high availability and data redundancy.
• Supports transactional updates.

The client cache tier:
• Provides fast access to any data grid node.
• Uses a distributed hashing algorithm to assure efficient data lookup.
• Maintains an auto-updated near cache for reference data.
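To make the one-hop lookup claim concrete, the following sketch illustrates the general idea behind bucket-based distributed hashing: a key is hashed to a fixed bucket number, and a routing table maps each bucket to the member that currently owns it, so a get or put needs only a single network hop. This is a conceptual illustration of the technique, not GemFire's actual implementation; the class, bucket count, and member names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Conceptual illustration of bucket-based distributed hashing (hypothetical,
 * not GemFire's internal code): keys map deterministically to buckets, and a
 * small routing table maps buckets to the members that currently own them.
 */
public class OneHopRouting {

    static final int TOTAL_BUCKETS = 113;                 // fixed when the region is created
    static final Map<Integer, String> bucketOwner = new HashMap<>();

    static {
        // In a real data grid this routing table is maintained by the cluster and
        // pushed to clients; here it is hard-coded for illustration (4 servers).
        for (int b = 0; b < TOTAL_BUCKETS; b++) {
            bucketOwner.put(b, "cache-server-" + (b % 4));
        }
    }

    /** Which member should a client contact for this key? Exactly one hop away. */
    static String ownerOf(Object key) {
        int bucket = Math.abs(key.hashCode() % TOTAL_BUCKETS);
        return bucketOwner.get(bucket);
    }

    public static void main(String[] args) {
        System.out.println(ownerOf("trade-42/tier-1"));    // deterministic routing
        System.out.println(ownerOf("discount-curve/EUR")); // no directory lookup needed
    }
}
```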




Sample Workflow

Consider a Monte Carlo simulation running in a GemFire EDF-enabled grid computing environment (a data-flow sketch follows this list):

• A tier-1 grid task begins by retrieving credit derivatives positions from either a GemFire data grid or an underlying end-of-day positions database. This task may do some preliminary calculations and then spawn additional grid tasks to perform sub-calculations. The data from this first task is placed into the GemFire data grid and a pointer (key) is passed along to the child tasks.

• Child grid tasks retrieve the results of the first calculation, perform their wave of calculations, and then place their results back into the data grid. This data is transparently placed into multiple locations within the data grid to minimize the possibility of contention and I/O bottlenecks due to concurrent access. Each child task then spawns yet another wave of child tasks.

• The process continues this way, with each tier of child calculations exponentially increasing the load on both the compute and data grids. A GemFire data grid with many nodes can support a very large number of concurrent Monte Carlo simulations due to the highly distributed, memory-speed nature of the data grid. GemFire's disk-overflow capabilities mean that if memory resources are exhausted, an entire simulation need not fail, as GemFire automatically pages data to and from disk in a highly distributed and parallel fashion.

• Eventually, the exponentially growing resource requirements of each tier in the simulation will reach the capacity of the grid infrastructure. Although more tiers in a Monte Carlo simulation increase the accuracy of the resulting complex instrument pricing or portfolio hedging, the exponential increase in resources required for additional simulation tiers eventually forces a hard stop. Quite simply, with a GemFire data grid more tiers can be run, producing more data, than with any other grid data repository approach. The level of sophistication a bank can achieve in its simulations goes directly to the bottom line: if a competitor bank is more accurate in its simulations, that opens up unpleasant arbitrage exposure, significant portfolio hedging inaccuracies, and inefficient use of capital.
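The sketch below illustrates the data flow described above: a parent task writes its intermediate results into a shared region and hands only the keys to its child tasks, which read the data with a single hop, compute, and write their own results back for the next tier. The region name, key scheme, and the compute step are hypothetical; how tasks are actually scheduled and spawned is the compute grid vendor's concern and is outside this sketch, which again uses Apache Geode API names for illustration.

```java
import java.util.UUID;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

/** Runs inside a compute grid node; the grid scheduler decides where and when. */
public class MonteCarloTask {

    private final Region<String, double[]> interimResults;

    MonteCarloTask(ClientCache cache) {
        // Server-side region holding transitory results shared between tiers.
        this.interimResults = cache
                .<String, double[]>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("InterimResults");
    }

    /** Parent tier: compute, publish results, and pass only the key to children. */
    String runParentTier(double[] positions) {
        double[] tierOneResult = simulatePaths(positions);   // hypothetical model step
        String resultKey = "tier1/" + UUID.randomUUID();
        interimResults.put(resultKey, tierOneResult);        // one hop to the owning server
        return resultKey;                                    // handed to each child task
    }

    /** Child tier: fetch the parent's result by key, refine it, publish a new key. */
    String runChildTier(String parentKey) {
        double[] base = interimResults.get(parentKey);       // one hop, no database involved
        double[] refined = simulatePaths(base);
        String childKey = "tier2/" + UUID.randomUUID();
        interimResults.put(childKey, refined);
        return childKey;
    }

    private double[] simulatePaths(double[] input) {
        // Placeholder for the actual Monte Carlo path generation.
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] * (1.0 + 0.01 * Math.random());
        }
        return out;
    }

    public static void main(String[] args) {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator-host", 10334)       // assumed locator address
                .create();
        MonteCarloTask task = new MonteCarloTask(cache);
        String parentKey = task.runParentTier(new double[] { 100.0, 101.5, 99.2 });
        String childKey = task.runChildTier(parentKey);      // in reality, runs on another node
        System.out.println("Published " + childKey);
        cache.close();
    }
}
```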

Who the Solution Will Benefit

GemFire's data grid solution is critical to scaling any data-intensive grid computing farm. Investment banks have been at the vanguard of implementing grids to manage exposure and pricing requirements, and are thus perfectly positioned to reap large benefits. Equity derivatives and credit derivatives pricing and exposure management groups have clearly been the first to bump into the compute grid scalability limitations solved by GemFire. Requirements for more exotic instruments and research-based investment strategies will soon follow. Early adopters of this technology will see increased profits on the business side; on the technology operations side, reduced operating costs and improved quality of service will deliver further gains.

Summary

The GIFS solution provides a truly transformative approach to solving the challenges inherent in building high-throughput, ultra-low-latency, resilient, and manageable data grids. GemFire EDF's caching, distribution, replication, serialization, and complex event processing capabilities, coupled with the Intel® Xeon® platform's virtualization and I/O acceleration, provide the best possible technology infrastructure platform for data management in grid computing environments.




Information Links and Resources

• www.gemstone.com/products/gemfire/
• www.gemstone.com/download
• www.intel.com/go/fsi


GemStone Corporate Information

Corporate Headquarters: 1260 NW Waterhouse Ave., Suite 200, Beaverton, OR 97006 | Phone: 503.533.3000 | Fax: 503.629.8556 | [email protected] | www.gemstone.com

Regional Sales Offices:
New York: 90 Park Avenue, 17th Floor, New York, NY 10016 | Phone: 212.786.7328
Washington D.C.: 3 Bethesda Metro Center, Suite 778, Bethesda, MD 20814 | Phone: 301.664.8494
Santa Clara: 2880 Lakeside Drive, Suite 331, Santa Clara, CA 95054 | Phone: 408.496.0242

Copyright © 2006 by GemStone Systems, Inc. All rights reserved. GemStone®, GemFire™, and the GemStone logo are trademarks or registered trademarks of GemStone Systems, Inc. Information in this document is subject to change without notice.
Copyright © 2006 Intel Corporation. All rights reserved. Intel, the Intel logo, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
*Other names and brands may be claimed as the property of others.

312462-001US
