Wide area file sharing across the WAN

Distributed enterprises now span the globe. Remote offices are everywhere, and remote office workers far outnumber those who work out of central office locations. Companies must therefore manage development efforts across multiple remote locations, which means they must also enable remote office workers and team members worldwide to collaborate on the same shared files and data at the same time. Add the fact that file sizes and data storage requirements increase year after year, and efficient file sharing across distributed enterprises over the wide area network (WAN) has become a Herculean task.

File Sharing Over the WAN

The problem is that although gigabytes of data can easily be shared over a local area network (LAN) using standard file server technology, they cannot be shared so easily among remote offices connected over the WAN. Standard file server protocols deliver unacceptably slow response times when opening and writing files over the WAN, and this forces remote office IT managers to make some unappealing choices. IT managers and network users must either live with reduced productivity due to poor network performance at remote offices, or use replication schemes that waste storage and inhibit global collaboration.

Recently, however, a new class of product known as wide-area file services (WAFS) has shown remarkable results in solving the problem of remote office collaboration for distributed organizations. WAFS allows companies with remote offices to use the WAN to share files as if it were a virtual LAN, enabling real-time read/write access to shared files while also guaranteeing the consistency and coherency of all file data.

The most successful WAFS systems address inherent WAN file sharing issues through a multi-layered technology approach. In designing this technology, WAFS vendors begin by carefully looking at the file sharing protocols used in today's enterprise infrastructures. This is a key starting point, which bears a closer look.

Problems With File Sharing Over the WAN

All major file sharing protocols, including NFS (Network File System, for Unix/Linux environments), CIFS (Common Internet File System, for Windows environments), and NCP (NetWare Core Protocol, traditionally carried over IPX/SPX in Novell environments), were designed for LAN environments where clients and servers were located in the same building or campus.

The assumption that the client and the server would be in close proximity led to a number of design decisions that do not scale across WANs. For example, these file-sharing protocols tend to be rather "chatty", which means that they send many remote procedure calls (RPCs) across the network to perform operations.

Let's take a closer look at the NFS protocol for an example of this type of "chatty" behavior. For certain operations on a filesystem using NFS (such as a synchronization of a source code tree), almost 80% of the RPCs sent across the network can be access RPCs, while the actual read and write RPCs typically comprise only 8-10% of the RPCs. Thus, almost 80% of the protocol's round trips are spent simply determining whether the NFS client has the proper permissions to access a particular file on the NFS server, rather than actually moving data.

In a LAN environment, these RPCs do not impact performance significantly, but when combined with the high latency typical of WANs, they can be deadly to performance because each RPC must wait for a full network round trip. Worse, remote clients often end up timing out and retransmitting the RPCs, compounding the inefficiency. Furthermore, because data movement RPCs make up such a small percentage of the communication, increasing network bandwidth makes little difference to the aggravated end user. Like NFS, CIFS and Novell's NCP (carried over IPX/SPX) suffer from "chattiness" that negatively impacts performance over the WAN.
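
To make the latency argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes every RPC is issued strictly one after another and waits one full round trip. The RPC count, round-trip times, payload size, and link speeds are all made-up illustrative values, not measurements from any real NFS deployment.

```python
# Illustrative model only: every number below is an assumption, not a measurement.

def sequential_rpc_time(num_rpcs: int, rtt_ms: float,
                        payload_bytes: int, bandwidth_mbps: float) -> float:
    """Seconds needed for num_rpcs strictly sequential RPCs plus the data transfer."""
    latency_s = num_rpcs * rtt_ms / 1000.0                        # one round trip per RPC
    transfer_s = payload_bytes * 8 / (bandwidth_mbps * 1_000_000)  # time on the wire
    return latency_s + transfer_s

NUM_RPCS = 10_000            # hypothetical RPC count for one source tree sync
PAYLOAD = 50 * 1024 * 1024   # hypothetical 50 MiB of file data actually moved

# Hypothetical links: a gigabit LAN, a 10 Mbps WAN, and the same WAN upgraded to 100 Mbps.
lan = sequential_rpc_time(NUM_RPCS, rtt_ms=0.5, payload_bytes=PAYLOAD, bandwidth_mbps=1000)
wan = sequential_rpc_time(NUM_RPCS, rtt_ms=80, payload_bytes=PAYLOAD, bandwidth_mbps=10)
wan_upgraded = sequential_rpc_time(NUM_RPCS, rtt_ms=80, payload_bytes=PAYLOAD, bandwidth_mbps=100)

print(f"LAN:                 {lan:8.1f} s")
print(f"WAN (10 Mbps):       {wan:8.1f} s")
print(f"WAN (100 Mbps):      {wan_upgraded:8.1f} s")
```

Under these assumed numbers the WAN run is dominated by the 800 seconds spent waiting on round trips, so a tenfold bandwidth increase shaves off only a few percent of the total. Real clients pipeline some requests and cache some results, so treat this as an upper-bound intuition rather than a prediction.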

Workarounds and Attempted Solutions

Various solutions and workarounds have been proposed to the WAN file-sharing problem, including replicating file copies and implementing distributed file systems, but neither approach has provided a complete solution. Enterprise content delivery networks (eCDNs) tried to mitigate this problem by caching copies of files at each remote office. But eCDNs, like web caching infrastructure, only provide a read-only copy of data at the remote office. If remote office users wanted to modify the file, they either had to go across the WAN to access the original copy and incur a major performance penalty, or update the local copy and create multiple, out-of-sync versions of the same file.
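
The out-of-sync failure mode can be sketched with a toy Python model. Everything here is hypothetical: the `FileCopy` class, the office names, and the `edit_locally` helper are invented for illustration and do not correspond to any real eCDN product or API.

```python
from dataclasses import dataclass

@dataclass
class FileCopy:
    content: str
    version: int

# Hypothetical setup: one origin copy at the central office,
# plus cached copies pushed out to two remote offices.
origin = FileCopy("spec v1", 1)
cache = {
    "office_a": FileCopy(origin.content, origin.version),
    "office_b": FileCopy(origin.content, origin.version),
}

def edit_locally(office: str, new_content: str) -> None:
    """Assumed workaround: a user edits the cached copy instead of the origin."""
    copy = cache[office]
    copy.content = new_content
    copy.version += 1

# Two offices avoid the slow WAN round trip and edit their own copies.
edit_locally("office_a", "spec v1 + changes from A")
edit_locally("office_b", "spec v1 + changes from B")

# Both edits claim the same version number, yet all three contents now differ.
distinct_contents = {c.content for c in [origin, *cache.values()]}
print(len(distinct_contents), "distinct contents;",
      "versions:", origin.version, cache["office_a"].version, cache["office_b"].version)
```

Nothing in this model reconciles the copies, which is the point: without a mechanism that coordinates writes, each local edit silently forks the file.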

Filesystems developed over the last 15 to 20 years, such as AFS (Andrew File System), attempted to solve the WAN file-sharing problem using a distributed filesystem architecture that unites disparate file servers at remote offices into a single logical filesystem. The problem with these technologies is that they require substantial changes in IT architecture to work properly, and they also require remote-office applications to use entirely new protocols because they do not export data using industry-standard protocols such as NFS or CIFS. With over 1 billion computers deployed in the world that access data using either CIFS or NFS, and billions of dollars invested in current file server and NAS infrastructure, solutions that require replacing these protocols are clearly untenable.

The bottom line is that for any WAFS solution to gain traction, it must be able to integrate itself with existing infrastructure rather than requiring new infrastructure to be built.
