The Mirage NFS Router

Scott Baker, John H. Hartman
University of Arizona
([email protected], [email protected])

Abstract

NFS has no provision for spreading a single file system over multiple servers. As a result, the server becomes overloaded as more clients are added. The system administrator has few options: upgrade the NFS server, or introduce an additional server and move some files from the old server to the new one. The former solution is expensive and disruptive, while the latter requires reconfiguring the clients to be aware of the new server. Similarly, if an NFS server runs out of storage, the only way to gain additional space is to upgrade the server or add another.

Mirage1 aggregates multiple NFS servers into a single, virtual NFS file server. It is interposed between the NFS clients and servers, making the clients believe that they are communicating with a single, large server. Mirage is an NFS router because it routes each NFS request from a client to the proper NFS server, and routes the reply back to the proper client. Experiments with a Mirage prototype show that Mirage effectively virtualizes an NFS server using unmodified clients and servers, and that it imposes negligible overhead on a realistic NFS workload. On real-world workloads, such as a collection of clients executing compile jobs over NFS, Mirage imposes an overhead of 3% compared to a proxy that simply forwards packets.

1 Introduction

Mirage provides NFS [Sandberg85] clients with the illusion that a collection of NFS servers is actually a single, virtual NFS server. Mirage is not an NFS server -- it is an NFS router that directs client requests to the appropriate NFS servers, and the replies back to the appropriate clients (Figure 1). Mirage exports a set of file systems that is the union of the file systems exported by the NFS servers. Clients mount a file system from the virtual server and access its contents as if Mirage were a real, physical NFS server. Mirage is thus fully transparent to the clients and servers, supporting unmodified NFS clients and servers.

Reconfiguring the clients in response to changes in the server configuration can be painful on a system with many clients. One solution is to push the changes to the clients; on UNIX, for example, this can be achieved by copying a new version of /etc/fstab to each client. This is onerous and has several problems, including missed updates when a client is down. Another solution is to have the clients poll for new versions of the file. This requires a polling mechanism on the clients, and prevents updates from being propagated more quickly than the polling period.

Mirage avoids changing the client configuration by hiding the real NFS servers behind a virtual NFS server. The IP address of the virtual NFS server is the Mirage router’s IP address. Clients perceive a single (virtual) NFS server, unaware that it is actually an aggregation of the real NFS servers. Files and file systems can be moved from one NFS server to another without reconfiguring the clients. Similarly, new NFS servers can be deployed without the clients’ knowledge. All that is required is to reconfigure Mirage to include the new server in its virtualization. By hiding the server configuration details from the clients, Mirage allows NFS to scale to many clients and servers without an overwhelming increase in system administration complexity.

Mirage is transparent to the NFS clients. Client NFS requests are delivered to the Mirage router, which rewrites the requests and forwards them to the appropriate NFS server. The clients are unaware that Mirage is a router, or that it is aggregating multiple NFS servers into a single virtual server. No client modifications are necessary, which is advantageous because the NFS client protocol is usually implemented as an integral part of the host operating system on the client computer. Modifying the NFS implementation would require installing a custom kernel on the client computer, a task that is beyond the abilities of most end users and inconvenient for most system administrators.

Mirage is also transparent to the NFS servers. Server modification is infeasible because many NFS servers are commercial products and contain proprietary code. Commercial servers also provide additional features, such as snapshots, that are not only complicated to implement but also protected by patents. Furthermore, since Mirage does not require server modifications and exists outside of the server, it is always possible to access the NFS servers directly if Mirage suffers a failure or is otherwise unavailable.

Figure 1: Mirage routes requests and replies between the NFS clients and the NFS servers.

1 This work was supported in part by DARPA contract F30602-002-0560 and NSF grant CDA-9500991.

2 NFS

Mirage implements versions 2 and 3 of the NFS protocol. The NFS protocol is based on Sun Remote Procedure Call (RPC) [Sun88], and therefore uses a request/reply paradigm for communication between the clients and the servers. The protocol uses handles to represent files, directories, and other file system objects. An NFS handle is a 32-byte quantity that is generated by the server and opaque to the client. The client receives the handle for the root directory of a file system when it mounts that file system. The mount request includes the name of the file system to be mounted (the mount point). The NFS server checks the mount request for the appropriate security and authentication and then issues a reply containing the handle for the root directory of the file system. A client uses a handle to access the contents of its associated object, as well as to obtain handles for additional objects.

An NFS client obtains a handle for a desired object in an iterative fashion. For example, to get a handle for the file /foo/bar, the client first sends a lookup request to the server containing the root file handle and the string "foo", and receives a handle for that directory. The client then sends a lookup request containing the handle for /foo and the string "bar", and receives the handle for /foo/bar. The client then uses the handle to read and write the file. Figure 2 illustrates the sequence of events required to read the file /home/fred/photo.

Client Request                      Server Reply
Mount("/home")                      handle_home
Lookup(handle_home, "fred")         handle_fred
Lookup(handle_fred, "photo")        handle_photo
Read(handle_photo, 0-1024)          First 1024 bytes
Read(handle_photo, 1024-2048)       Next 1024 bytes

Figure 2: Sequence of NFS client requests and server replies to read the first 2048 bytes of /home/fred/photo.

The use of handles in the NFS protocol poses two difficulties for Mirage. First, the NFS server generates handles however it desires, as long as different objects have different handles. Typically, the NFS server will encode information in the handle that identifies the location of the object in the server's internal file system, improving access performance. The handles are opaque to the client, however, so it is unaware of what information the handle holds. Since Mirage virtualizes multiple NFS servers into a single server, it must ensure that different objects have different handles, even if the objects reside on different NFS servers. Mirage has no control over how the servers create their handles, so there is no way for Mirage to ensure that the handles generated by the servers do not conflict. As a result, Mirage must virtualize the NFS handle space by creating its own (virtual) handles for objects, and mapping between its virtual handles and the physical handles used by the servers. This requires Mirage to maintain state about the mapping, increasing Mirage's complexity. Virtual handles and how they are managed in Mirage are described in more detail in Section 3.3.

The second handle-related issue is that once a client has a handle, it may use the handle indefinitely to access the associated object. The client never needs to refresh the handle, so the server cannot subsequently change the handle contents. If the server crashes and reboots, it may no longer have information about handles issued before the crash, causing it to return "stale handle" errors to the client. This causes the client to re-lookup the handle, if possible. Handle longevity is an issue for Mirage because not only must Mirage maintain the mapping from a virtual handle to a physical handle indefinitely, but it must also be able to reconstruct that mapping after a crash. If Mirage did not reconstruct the mapping, it would have to return a "stale handle" error to the client, and would no longer be transparent to the client.

It can be argued that a Mirage recovery mechanism is unnecessary, since the existing NFS protocol simply issues stale handle errors if a server reboots; Mirage could simply do the same. However, this creates another potential point of failure in the system, increasing the probability of a failure that leaves the client with stale handle errors. Implementing a recovery mechanism inside Mirage eliminates Mirage as a second potential failure point, and actually enhances stability by hiding server failures: Mirage is able to tolerate a failed or restarted server without returning stale handle errors to the clients. Mirage's crash recovery mechanism is described in more detail in Section 3.4.
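The lookup sequence in Figure 2 can be made concrete with a short sketch. The mount, lookup, and read helpers below stand in for the MNT, LOOKUP, and READ RPCs, and they operate on a toy in-memory table rather than a real server; the handle values and the table itself are purely illustrative.

```python
# Sketch of the client-side sequence in Figure 2, against a toy in-memory
# "server" so the control flow can be run as-is. A real client would issue
# these as Sun RPC calls (MNT, LOOKUP, READ) over the network.

FAKE_SERVER = {
    "/home": "handle_home",
    ("handle_home", "fred"): "handle_fred",
    ("handle_fred", "photo"): "handle_photo",
    "handle_photo": b"x" * 4096,           # file contents
}

def mount(path):                            # MNT: mount point -> root handle
    return FAKE_SERVER[path]

def lookup(dir_handle, name):               # LOOKUP: (handle, name) -> handle
    return FAKE_SERVER[(dir_handle, name)]

def read(handle, offset, count):            # READ: (handle, offset, count) -> bytes
    return FAKE_SERVER[handle][offset:offset + count]

h_home  = mount("/home")                    # one path component resolved per LOOKUP,
h_fred  = lookup(h_home, "fred")            # each reply's handle feeding the next request
h_photo = lookup(h_fred, "photo")
data    = read(h_photo, 0, 1024) + read(h_photo, 1024, 1024)
assert len(data) == 2048
```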

3 Mirage

Mirage addresses the NFS scalability problem by virtualizing multiple NFS servers as a single NFS server. This virtualization is entirely transparent to the clients and the servers, so the clients are not aware they are dealing with a collection of servers. There are several issues that Mirage must address, including the abstraction presented to the clients, how virtual handles are maintained and validated, and how Mirage recovers from crashes.

3.1 Mirage as a Router

There are several design alternatives for implementing Mirage: a client-side proxy, a forwarding server, or a programmable router.

A client-side proxy would run on the client as either a kernel module or a user-level process. The disadvantage of a client-side proxy is that updating the Mirage configuration requires updating all of the Mirage proxies. This suffers from the same problems as pushing /etc/fstab updates to the clients, so it does not help with system administration.

A second alternative is to implement Mirage as a forwarding server. Each client sends its requests to a particular Mirage NFS server. If the file is located on that server, the Mirage NFS server processes the request locally; if not, the request is forwarded to another server that processes it. The reply is then either returned directly to the client, or routed back through the original server. Preferred servers are distributed fairly among the clients so that each server receives approximately the same load. This design presents several problems, however. While requests sent to the preferred server are served quickly, requests that must be forwarded from the preferred server to a second server incur additional latency. Furthermore, since each server functions as both a router and a server, it must handle the load of both tasks. In the worst case, the clients always contact the wrong server so that every request is forwarded, leading to both increased latency and increased server load. This design also requires server modification.

The third design alternative is to implement Mirage as a router. The router sits as a transparent component in the network and requires no modifications to the clients or the servers. The routers and the servers can each be optimized to perform their individual tasks; there is no requirement to force a component to do double duty as there is in the forwarding server alternative. Changes to the server topology need only be made to the router; there is no need to push configuration details to the clients. The disadvantage of this approach is that the router represents a single point of failure. In the event of a router failure, the clients could contact the servers directly, but this would require intervention on the client side. A possible solution could use round-robin DNS or a similar technique to cause the clients to automatically switch to a backup Mirage router if the primary Mirage router failed. Since the router only forwards packets and does not do hard work such as reading and writing disk blocks, a single router should be able to handle the traffic of several servers. We chose to implement Mirage as a router because it allows the use of unmodified NFS clients and servers.

3.2 Virtual Server Abstraction

There are at least two variations of the virtual server abstraction presented by Mirage. The simplest is that the set of mount points exported by the virtual server is the union of the sets of mount points exported by the underlying NFS servers. When the Mirage router receives a mount request, it forwards the request to all of the NFS servers that it virtualizes. The server that exports the desired mount point returns its root handle, while the other servers return errors. Mirage generates a virtual handle, associates it with the root handle provided by the server, and returns it to the client. In this way the client can access all of the mount points and believes it is communicating with one larger server. The advantage of this approach is that it requires a relatively small amount of state and processing on Mirage to implement.

The disadvantage of this approach is that the sets of mount points exported by the servers must be disjoint. Suppose two servers export mount points with the same name: what should the virtual server export? An alternative virtual server abstraction that addresses this issue is to export the union of the file system namespaces. For mount points that overlap, the virtual server exports a single mount point whose name space merges the name spaces of the individual mount points. When the client mounts an overlapping mount point, subsequent accesses to that file system must be directed to the proper NFS server on a case-by-case basis. This approach allows more flexibility in the organization of the underlying servers, but increases Mirage's complexity because it must map virtual handles to NFS servers individually.

The current Mirage prototype uses the former approach of exporting the union of the mount points exported by the NFS servers. We are currently experimenting with the "union of file namespaces" approach and have a partial implementation in Mirage; its viability is an area of future work.
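As an illustration of the simpler "union of mount points" abstraction, the following sketch fans a mount request out to every real server and keeps the single successful reply. The toy export tables, the flat handle_table dictionary, and the random virtual handle are assumptions of this example; Mirage's actual handle table and handle format are described in Section 3.3.

```python
# Sketch of the "union of mount points" abstraction: the MOUNT request is
# fanned out to every real server and the single successful reply wins.
# The fake export tables and handle_table layout are illustrative only.

import os

handle_table = {}      # virtual root handle -> (server, physical root handle)

def virtual_mount(servers, path):
    """Forward a mount request to all servers; exactly one should export `path`."""
    replies = [(name, exports[path]) for name, exports in servers.items()
               if path in exports]                    # servers without `path` "return errors"
    if len(replies) != 1:
        raise LookupError("mount point must be exported by exactly one server")

    server, physical_handle = replies[0]
    virtual_handle = os.urandom(32)        # stand-in; Mirage encodes structure (Figure 4)
    handle_table[virtual_handle] = (server, physical_handle)
    return virtual_handle                  # handed back to the client as its root handle

# Toy "servers": each maps exported mount points to physical root handles.
servers = {
    "server1": {"/home": b"pfh-home-root"},
    "server2": {"/scratch": b"pfh-scratch-root"},
}
root = virtual_mount(servers, "/home")
assert handle_table[root] == ("server1", b"pfh-home-root")
```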

3.3 Virtual Handle Mapping

One of the core functions of the Mirage router is to map between the file handles produced by the Mirage router and the file handles produced by the NFS servers (Figure 3). The clients cannot be given the handles from the NFS servers directly, because the servers may generate the same handle for different objects. For this reason, Mirage generates its own file handles that uniquely identify objects. We refer to the file handles generated by Mirage as virtual file handles (VFH), and those generated by the NFS servers as physical file handles (PFH). Mirage stores the mapping between a VFH and a PFH in a memory-resident handle table. When Mirage receives an NFS request from a client, it looks up the VFH in the handle table to determine the proper server and PFH for the request. Mirage then rewrites the NFS request using the PFH and forwards it to the server.

Figure 3: The Mirage router maps between the virtual file handles used by the clients and the physical file handles used by the NFS servers.

Mirage must perform a reverse mapping on NFS replies. Each PFH in a reply must be mapped to the appropriate VFH before the reply is forwarded to the client. The most common reply to contain a PFH is the reply to the Lookup request that is used to resolve a file name into a file handle. Mirage looks up the PFH contained in the reply in the handle table and rewrites the NFS reply with the correct VFH before forwarding the reply to the client. If this is the first time the PFH has been used, Mirage generates a new VFH and stores it in the handle table.

The handle table is a concern because it represents state on the Mirage router that consumes memory and must be recovered after a crash. It also requires processing to look up handles in the table. Mirage minimizes the state and processing resources of the table by encoding information in the VFH (Figure 4). The VFH is a 32-byte quantity that is opaque to the client, making all 32 bytes available for Mirage's use. Mirage encodes the following information in the VFH:

Virtual Inode Number (VIN): Every object has a unique VIN.

Server ID (SID): The SID identifies the NFS server that stores the object associated with the VFH.

Handle Version (HVER): The handle version allows Mirage to force a handle to become outdated by issuing handles with a newer version, thus causing the clients to eventually forget the older handles.

Physical Inode Number (PIN) and Physical File System ID (PFS): The PIN is the inode number of the object as assigned by the NFS server, and the PFS is the file system ID on the server. The PIN and PFS are obtained from the attributes that are returned by the NFS server in most replies. These two fields uniquely identify an object within a file server and are used during recovery (Section 3.4). The triple (SID, PIN, PFS) uniquely identifies an object within all of Mirage.

Mount Checksum (MCH): The MCH is a 32-bit checksum of the name of the mount point associated with the VFH. The MCH speeds up Mirage recovery.

Handle Validation Checksum (HVC): The HVC is a cryptographically secure checksum that prevents clients from forging VFHs.

The HVER and HVC are used to enhance Mirage security by preventing denial-of-service attacks, and are described in detail in a separate technical report [Baker02].

VIN (offset 0) | HVER (4) | SID (6) | PIN (8) | PFS (12) | MCH (16) | HVC (20) | total 32 bytes

Figure 4: Format of a virtual file handle. Offsets are in bytes.
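A minimal sketch of how the Figure 4 layout might be packed and unpacked, using the byte offsets shown above (VIN at 0, HVER at 4, SID at 6, PIN at 8, PFS at 12, MCH at 16, HVC at 20, for 32 bytes total). The HVC computation follows the description in Section 3.4 (MD5 over the other fields concatenated with a Mirage secret), but the little-endian packing and the truncation of the digest to the 12 bytes available for the HVC are assumptions of this sketch.

```python
# Sketch of the virtual file handle layout in Figure 4.
# Field widths are inferred from the byte offsets in the figure; truncating
# the 16-byte MD5 digest to the 12 bytes left for the HVC is an assumption.

import hashlib
import struct

FMT = "<IHHIII12s"                      # VIN, HVER, SID, PIN, PFS, MCH, HVC
assert struct.calcsize(FMT) == 32       # 4+2+2+4+4+4+12 bytes

SECRET = b"configured-by-the-administrator"   # the Mirage secret (Section 3.4)

def hvc(vin, hver, sid, pin, pfs, mch):
    fields = struct.pack("<IHHIII", vin, hver, sid, pin, pfs, mch)
    return hashlib.md5(fields + SECRET).digest()[:12]

def pack_vfh(vin, hver, sid, pin, pfs, mch):
    return struct.pack(FMT, vin, hver, sid, pin, pfs, mch,
                       hvc(vin, hver, sid, pin, pfs, mch))

def unpack_vfh(vfh):
    vin, hver, sid, pin, pfs, mch, check = struct.unpack(FMT, vfh)
    if check != hvc(vin, hver, sid, pin, pfs, mch):
        raise ValueError("forged or corrupted handle")   # fails the HVC test
    return vin, hver, sid, pin, pfs, mch

vfh = pack_vfh(vin=7, hver=1, sid=2, pin=123456, pfs=42, mch=0xDEADBEEF)
assert unpack_vfh(vfh) == (7, 1, 2, 123456, 42, 0xDEADBEEF)
```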

3.4 Mirage Crash Recovery

One of our design goals for Mirage was that Mirage crashes must be transparent, i.e., Mirage must recover from a crash and allow the clients and servers to continue without requiring them to be rebooted or modified. The biggest issue in recovering from a Mirage crash is recovering the contents of the handle table. Mirage simplifies this process by ensuring that the contents of the handle table are soft state. Soft state can be discarded or lost because it can subsequently be regenerated. Hard state, on the other hand, must be protected at all costs because it cannot be regenerated if lost, and its loss will affect the correct functioning of the system. This usually means maintaining multiple copies of the hard state, either on redundant computers or on backup media. Mirage avoids the problems of hard state by ensuring that the contents of the handle table can be regenerated by existing mechanisms in the NFS protocol. This allows Mirage to lose the handle table during a crash, or discard part of the table if it gets too large, without affecting correctness.

Mirage uses a uniform mechanism for dealing with lost handle table state, covering both handle table content lost in a crash and handle table entries discarded because the table grew too large. If a client presents a VFH that is not found in the handle table, Mirage initiates handle recovery. The first step is to ensure that the VFH has not been forged by checking the HVC checksum. The HVC uses the cryptographically secure MD5 checksum algorithm [Rivest92] to create a checksum of the fields in the VFH concatenated with a Mirage secret. Because the client does not know the Mirage secret, it cannot create its own HVC, and therefore cannot create arbitrary handles. The Mirage secret is configured by the system administrator when Mirage is initialized, much like a UNIX root password. Once the VFH has been authenticated, the fields it contains are used to recover the proper server and PFH to which it should be mapped.

The Server ID (SID) allows Mirage to determine which server the file came from, and the Physical File System ID (PFS) and Physical Inode Number (PIN) uniquely identify the file within that server. Ideally, the Mirage router would present this information to the NFS server and obtain a handle for the object. Unfortunately, the NFS protocol does not include such a function. Therefore, Mirage is forced to search the server until it finds an object with the correct PFS and PIN. This process is similar to that used in Base [Rodrigues01]. Recovery can be a lengthy process, and we propose several techniques to speed up the search: caching information from one recovery operation to the next, encoding directory information in the NFS handles so that portions of the tree can be pruned during the depth-first recovery search, and using checksums to prune mount points where we know the file is not located. These techniques are an area of future work.

4 Implementation

Mirage currently exists as a user-mode proxy. We have also implemented Mirage as an extensible router [Gottlieb02] using the Scout operating system [Peterson01]. We are currently working on implementing versions of Mirage in the Linux kernel, and are planning to also port Mirage to the Intel IXP [Spalink02] architecture.

The user-mode process uses sockets bound to local port numbers to read and write packets. One implication of this is that the operating system enforces limits on the types of packets that can be sent to and from the user-mode router. The user-mode router is unable to receive packets that are not destined for a local IP address on the router machine, and it is unable to rewrite outgoing packets to contain arbitrary source IP addresses. These limitations require that Mirage be implemented as a proxy rather than as a pure router. The proxy implementation assigns the Mirage router its own IP address and rewrites the source and destination addresses as necessary to cause packets to be routed through Mirage.

The per-transaction state is divided into two tables in Mirage: the Proxy Mapping Table (PMT) and the Transaction Table (TT). The PMT contains the additional information necessary to implement the proxy functionality, and the TT contains information that is common to both the proxy and router implementations. When an incoming request arrives, Mirage creates an entry in the PMT. The first step is to assign a unique transaction ID (XID) for the RPC. Every RPC request and reply contains an XID field that is used by the client to match replies with requests, and by the server to detect duplicate requests from the same IP address. Since all forwarded requests come from Mirage's IP address, Mirage must ensure that all XIDs are unique. The PMT entry includes the new XID as well as the old XID and the client's source address and port. The XID in the request packet is rewritten with the new XID. Thus, when the reply eventually arrives, Mirage is able to look up the modified XID in the PMT and reconstitute the client address.

Mirage also creates an entry in the Transaction Table when an RPC request arrives. The TT is keyed by the XID and the client's IP address, and includes information about the RPC, such as the RPC procedure number, that will be necessary in handling the reply packet when it is received. Once the PMT and TT entries are created, the packet is rewritten. The rewrite process varies by RPC, but generally involves replacing each VFH in the request with the appropriate PFH. The packet is then sent to the server.

When a reply arrives, the operations are generally the reverse of the request operations. First, the PMT is indexed by the XID of the reply, yielding the PMT entry that contains the original XID as well as the client's IP address and port. The client's address and XID are then used to retrieve an entry from the TT. The transaction entry contains useful information about the original request, such as the RPC procedure number. The reply is rewritten in a procedure-specific fashion, which generally involves replacing any PFHs in the reply with VFHs. Finally, the XID of the packet is rewritten with the original XID and the packet is forwarded to the client using the send() function on a UDP socket.
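The per-transaction bookkeeping described above can be sketched as two small tables keyed the way the text describes: the PMT by the rewritten XID, and the TT by the original XID plus the client address. The table layouts and helper names below are illustrative only; Mirage's real entries carry more per-RPC detail and the packet rewriting itself is not shown.

```python
# Sketch of the PMT/TT bookkeeping for one RPC round trip. The dictionary
# shapes and helper names are assumptions made for this example.

import itertools

_next_xid = itertools.count(1)

pmt = {}   # new XID -> (old XID, client address/port)
tt  = {}   # (old XID, client address/port) -> RPC procedure number

def on_request(old_xid, client_addr, procedure):
    """Rewrite an incoming request's XID and record how to undo it."""
    new_xid = next(_next_xid)              # unique, since all forwarded requests share Mirage's IP
    pmt[new_xid] = (old_xid, client_addr)
    tt[(old_xid, client_addr)] = procedure
    return new_xid                         # placed in the forwarded packet

def on_reply(new_xid):
    """Recover the original client and XID when the server's reply arrives."""
    old_xid, client_addr = pmt.pop(new_xid)
    procedure = tt.pop((old_xid, client_addr))
    return old_xid, client_addr, procedure # reply is then rewritten per-procedure

# Example: one request forwarded and its reply matched back to the client.
forwarded = on_request(old_xid=17, client_addr=("10.0.0.5", 1023), procedure=4)
assert on_reply(forwarded) == (17, ("10.0.0.5", 1023), 4)
```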

5 Performance

We measured Mirage performance using both micro-benchmarks that show the raw Mirage performance and macro-benchmarks that use an end-to-end test to show the overall performance in a real-world situation. We ran the tests using the user-mode version of Mirage, running Red Hat Linux 8.0 with Linux kernel 2.4.18. The experiments were performed on a cluster of Pentium machines connected via switched gigabit Ethernet. All of the machines are identical; each is configured with a 2.4 GHz Pentium 4 CPU, 1 GB of memory, two 40 GB IDE disks, and a gigabit Ethernet adapter. The machines are connected to a Foundry FastIron 800 switch. The cluster consists of 32 machines; our benchmarks used between 3 and 17 of them.

5.1 Benchmark Configuration

To perform the benchmarks, the cluster was connected via a high-speed Ethernet switch. Each computer used in the benchmarks has one gigabit Ethernet adapter that is connected directly to the switch.

We chose three different models for comparison:

• Null proxy: The null proxy is a user-mode router that receives UDP packets and forwards them to the appropriate machine. It performs no other processing on the packets.

• Mirage: This is the user-mode implementation of Mirage.

• Direct connection: The clients and servers communicate directly.

The cluster was partitioned into a set of client machines, server machines, and a router machine. We conducted experiments using a 16 client / 1 server configuration (16-to-1) and a 12 client / 4 server configuration (12-to-4). The 16-to-1 configuration yields results that show Mirage's performance on a single heavily utilized server. The 12-to-4 test, on the other hand, demonstrates Mirage's ability to handle multiple servers.

5.2 Bandwidth Micro-Benchmarks

The bandwidth benchmark (Figure 5) consists of each client writing to a file on the server for a fixed period of two minutes. At the end of the benchmark, the number of bytes written by each client is summed and used to compute the aggregate throughput of all clients.

Figure 5: Aggregate write bandwidth, 1 server. Mirage and the null proxy are both able to achieve a peak bandwidth of approximately 35 MB/s.

In this test, Mirage and the null proxy both start out at a bandwidth that is approximately 2/3 of the direct connection. The direct connection rises sharply and peaks at approximately 37.5 MB/s, then drops slowly for the remainder of the test.

The measurements show a curious pattern in which performance begins to drop off after the peak is reached (at approximately 5 clients for the direct connection, or 12 clients for Mirage and the null proxy). This is caused by dropped packets. Linux implements Van Jacobson congestion control on top of UDP, and a dropped packet results in not only a 0.7 second timeout, but also a reset of the window of simultaneous RPCs. Thus, overwhelming the server causes an increase in dropped packets, which in turn causes poor performance.

Mirage and null proxy performance is worse than the direct connection when there are few clients because of the additional request latencies. The NFS protocol does not make very good use of parallelism, so the rate at which a client can write is governed mainly by the round-trip time of the packets. Mirage and the null proxy eventually catch up to the direct connection at the 10-client case and level off slightly higher than the direct connection.

The fact that the proxies, with their higher latency, perform better than the direct connection at high numbers of clients is at first very counterintuitive. We performed extensive packet traces while the benchmark ran and determined that the direct model was experiencing double the packet drop rate of the other two configurations. The direct connection, even with its lower latency, suffers enough penalties from the increased number of dropped packets that its performance drops below that of the proxies.

We speculate that the Mirage and null proxy configurations suffer fewer dropped packets than the direct connection because of the additional buffering at the proxy. This buffering enables the system to tolerate more burstiness in the traffic than the direct connection can. Fewer packets are dropped, which in turn improves the proxy performance relative to the direct connection because of the high cost of lost packets.

We also measured bandwidth using a 12 client / 4 server configuration (Figure 6). This test is intended to show Mirage's performance when the servers are relatively unloaded and Mirage is the limiting factor in the system.

Figure 6: Aggregate bandwidth, 3 clients per server, up to 4 servers. Mirage attains a peak of 38 MB/s and the null proxy attains a peak of 40 MB/s.

The 12-to-4 performance is only marginally better than the 16-to-1 performance. This is because Mirage is currently a user-mode application, and suffers a penalty crossing the kernel-to-user boundary. We expect that once Mirage is implemented in the kernel, or on the specialized IXP hardware, the performance of the 12-to-4 test will improve significantly over the 16-to-1 test.

5.3 Compile Macro-Benchmark

The compile macro-benchmark (Figure 7) represents a more realistic use of Mirage, and consists of unpacking and compiling the Linux 2.4 kernel distribution. This test creates over 10,000 individual files and causes the client to issue approximately 472,000 NFS requests. Approximately 2/3 of the resulting NFS requests use small packets (lookup, getattr, etc.) and 1/3 use large packets (reads, writes). There are three distinct phases in the compile macro-benchmark: the un-tar phase, which consists of a rapid succession of large write packets; the make-dep phase, which consists of a rapid succession of intermediate-size packets; and the make phase, which consists of a slightly slower succession of varying-size packets. The variety of packet sizes and loads makes this benchmark a good measure of Mirage's performance on real-world jobs.

Figure 7: Compile benchmark. This benchmark presents the time for a client to unpack and compile a kernel distribution.

With one client, Mirage and the null proxy require approximately 460 seconds to complete the test, while the direct connection requires 320 seconds. The difference of 140 seconds is primarily due to the increased latency imposed by the routers. All three curves are relatively flat until the number of clients reaches 7, at which point the slope increases as the server saturates. At approximately 13 clients, the server is fully saturated and the slope of the graph increases dramatically. The measurements show that Mirage performance lags the null proxy by an average of 3%.

6 Related and Future Work

Base [Rodrigues01] provides Byzantine fault tolerance for the NFS protocol by replicating objects on multiple NFS servers. Base is implemented via a user-level relay process that mediates communication between the NFS client built into the operating system and the NFS servers. Base uses a handle translation mechanism similar to Mirage's, in which the relay process translates client handles to server handles via a table. Mirage's handle recovery mechanism of using an exhaustive search is derived from Base. Unlike Mirage, multiple Base relay processes coordinate their actions, so that a client can access its files through any of them.

Deceit [Siegal90] is a distributed file system that combines multiple servers to provide the illusion of a single large NFS server. Deceit differs from Mirage in that it requires modified servers. Deceit requires each server to function both as a server for its own files and as a router that forwards requests from a client to another server's files. To use the automatic failover and version control features, Deceit requires client modification as well.

Slice [Anderson00] provides request routing in which a micro-proxy functions as a switch that routes client requests to a group of bulk data servers and file managers. Slice uses a proprietary OBSD protocol for communication between the micro-proxy and the bulk data servers. The file managers are distinct from the bulk data servers and manage the directory structure and metadata of the file system. A drawback of Slice is that standard NFS servers cannot be used.

We are currently exploring two techniques to improve Mirage scalability. The first is the use of specialized network hardware to reduce packet processing overhead. The second is to support multiple distributed Mirage routers, which would both improve scalability and remove the single point of failure inherent in a single router.

7 Conclusion

Mirage is an NFS router that provides a virtual NFS server abstraction, aggregating the resources of multiple unmodified NFS servers for use by unmodified clients. Both the clients and the servers are unaware of Mirage's presence, yet Mirage allows files and file systems to be moved between NFS servers, and new NFS servers to be deployed, without reconfiguring the clients. In this way it greatly simplifies NFS system administration.

The performance of Mirage is a constant amount slower than a null router that performs no computation on the packets. The overhead is minimal and has little impact on real-world file system usage, as shown by several large macro-benchmarks.

References

[Anderson00] Darrell C. Anderson, Jeffrey S. Chase, and Amin M. Vahdat. "Interposed Request Routing for Scalable Network Storage," in Proceedings of the Fourth Symposium on Operating System Design and Implementation (OSDI), October 2000.

[Baker02] Scott Baker and John H. Hartman. "The Mirage NFS Router," Technical Report TR02-04, Department of Computer Science, University of Arizona, 2002. ftp://ftp.cs.arizona.edu/reports/2002/TR02-04.pdf

[Gottlieb02] Yitzchak Gottlieb and Larry Peterson. "A Comparative Study of Extensible Routers," in 2002 IEEE Open Architectures and Network Programming Proceedings, pages 51-62, June 2002.

[Peterson01] Andy Bavier, Thiemo Voigt, Mike Wawrzoniak, Larry Peterson, and Per Gunningberg. "SILK: Scout Paths in the Linux Kernel," Technical Report 2002-009, Department of Information Technology, Uppsala University, Uppsala, Sweden, 2002.

[Rivest92] R. Rivest. "RFC 1321: The MD5 Message-Digest Algorithm," April 1992. http://www.alternic.net/rfcs/1300/rfc1321.txt.html

[Rodrigues01] Rodrigo Rodrigues, Miguel Castro, and Barbara Liskov. "BASE: Using Abstraction to Improve Fault Tolerance," in Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), Banff, Canada, October 2001.

[Sandberg85] Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, and Bob Lyon. "Design and Implementation of the Sun Network Filesystem," in Proceedings of the Summer 1985 USENIX Conference, pages 119-130, June 1985.

[Siegal90] K. Birman and A. Siegal. "Deceit: A Flexible Distributed File System," in Proceedings of the Summer 1990 USENIX Conference, June 1990.

[Spalink02] Tammo Spalink, Scott Karlin, Larry Peterson, and Yitzchak Gottlieb. "Building a Robust Software-Based Router Using Network Processors," in Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), pages 216-229, October 2001.

[Sun88] Sun Microsystems. "RPC: Remote Procedure Call Protocol Specification, Version 2," RFC 1057, June 1988.