Wide area file sharing across the WAN

A Whole New Option

Although neither caching technologies such as eCDNs nor distributed filesystems have addressed the central issues in WAN file sharing, both provide important components for solving the problem. New WAFS products combine distributed filesystems with caching technology to allow real-time, read-write access to shared file storage from any location, while also interoperating with standard file sharing protocols such as NFS and CIFS.

WAFS products enable transparent worldwide design collaboration on the same data set, without complicated replication schemes or slow network performance. They cache files in read-write mode at remote locations, which speeds up data access for remote users considerably. In effect, WAFS extends LAN file-access semantics to the entire enterprise.

WAFS systems usually consist of edge file gateway (EFG) appliances, which are placed at remote offices, and one or more central server (CS) appliances that allow storage resources to be accessed by the EFGs (see Figure 1).

Each EFG appears as a local fileserver to remote office users. Together, the EFGs and CS implement a distributed filesystem and communicate using a WAN-optimized protocol. This protocol is translated back and forth to NFS and CIFS at either end, to communicate with centralized storage and remote user applications.

3 Key Design Questions

When building a WAFS system, three key design questions must be addressed:

* What are the features of the optimized protocol run between the EFGs and CSes across the WAN?

* What specific optimizations have to occur in the system design for reading files?

* What is the specific architecture for writing files and moving updates back to central storage resources?

The protocol used between the remote offices and the datacenter should incorporate file-aware differencing, data compression, streaming, and other techniques that improve performance and efficiency in moving data across the WAN. File-aware differencing is especially important because it detects which parts of a file have changed and moves only those parts across the WAN. Furthermore, if pieces of a file have merely been rearranged, only offset information is sent, rather than the data itself. Together, these techniques can reduce WAN bandwidth by an order of magnitude and shorten the time remote users wait for files.
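The differencing idea can be illustrated with a minimal Python sketch. It is not any vendor's protocol: the function names, the fixed block size, and the tuple-based delta format are all assumptions made for illustration. The sender hashes the blocks of the old file, then describes the new file as a list of "copy from old offset" and "literal data" operations, so unchanged or rearranged blocks cost only an offset.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny block size so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_delta(old, new, size=BLOCK_SIZE):
    """Describe `new` in terms of `old`.

    Returns a list of operations, each either
    ("copy", old_offset, length) or ("data", literal_bytes).
    """
    # Index every block of the old file by its hash.
    known = {}
    for i, block in enumerate(split_blocks(old, size)):
        known.setdefault(hashlib.sha256(block).digest(), i * size)

    ops = []
    for block in split_blocks(new, size):
        offset = known.get(hashlib.sha256(block).digest())
        if offset is not None:
            ops.append(("copy", offset, len(block)))  # only an offset is sent
        else:
            ops.append(("data", block))  # changed data is sent in full
    return ops


def apply_delta(old, ops):
    """Rebuild the new file on the receiving side from `old` plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops):
    """Count the bytes of file data that actually have to cross the WAN."""
    return sum(len(op[1]) for op in ops if op[0] == "data")
```

With `old = b"AAAABBBBCCCCDDDD"` and `new = b"CCCCAAAAXXXXDDDD"`, two blocks have moved and one has changed, so the delta carries only the four changed bytes as literal data. This sketch only matches blocks that stay aligned to the block size; production tools such as rsync use a rolling checksum so that matches are found at arbitrary offsets.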

Read performance is governed by the EFG's ability to cache files at the remote office and to serve cached data to users while minimizing the overhead of expensive kernel-user communication and context switches, in effect letting the cache act like a high-performance file server. If the WAFS system is architected correctly, the remote cache mirrors the data center exactly: only a few WAN round trips are needed to check credentials and the availability of file updates, and the read requests themselves are satisfied from the local cache. Thus, regardless of how many NFS/CIFS read RPCs arrive at the EFG, they should translate into hardly any WAN traffic.
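The read path described above can be sketched as a validate-on-open cache. The Python below is a toy model under stated assumptions: the `CentralServer` and `EdgeGateway` classes, the per-file version counter, and the `round_trips` counter are hypothetical names introduced here, and credential checks are left out. It shows only that one validation round trip per open can back any number of local reads.

```python
class CentralServer:
    """Hypothetical stand-in for the CS: stores files with a version number."""

    def __init__(self):
        self.files = {}       # path -> (version, data)
        self.round_trips = 0  # counts simulated WAN round trips

    def write(self, path, data):
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)

    def current_version(self, path):
        self.round_trips += 1  # cheap validation request
        return self.files[path][0]

    def fetch(self, path):
        self.round_trips += 1  # full transfer, only when the cache is stale
        return self.files[path]


class EdgeGateway:
    """Hypothetical stand-in for an EFG: validates on open, reads locally."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (version, data)

    def open(self, path):
        # One round trip to ask whether the cached copy is still current.
        version = self.server.current_version(path)
        cached = self.cache.get(path)
        if cached is None or cached[0] != version:
            self.cache[path] = self.server.fetch(path)

    def read(self, path, offset, length):
        # Served entirely from the local cache: no WAN traffic at all.
        data = self.cache[path][1]
        return data[offset:offset + length]
```

In this model the first open of a file costs two round trips (validate, then fetch), a later open of an unchanged file costs one, and the reads in between cost none, however many there are.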


