Saturday, March 29, 2008

A primer on array-based and network-based replication

Replication helps protect your data and files by producing a duplicate copy at a second site, server, or storage array. I covered host-based replication in a previous post.

In this post, I’ll cover two other types of replication: array-based replication and network-based (or fabric-based) replication.

Array-based replication
Array-based replication requires a central data storage unit (SAN or NAS) and a partner unit to replicate to. With array-based replication, the SAN or NAS itself does the work: it moves the data and runs the commands that process and validate what is being replicated.
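
To make the division of labor concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration: the StorageArray class, its write and receive methods, and the SHA-256 checksum used for validation are hypothetical and do not correspond to any vendor's actual product or API.

```python
import hashlib


class StorageArray:
    """Toy stand-in for a SAN/NAS unit (hypothetical, not a vendor API)."""

    def __init__(self, name, partner=None):
        self.name = name
        self.partner = partner  # the partner unit at the second site, if any
        self.blocks = {}        # block address -> bytes

    def write(self, address, data):
        """Host-facing write: store locally, then replicate from the array."""
        self.blocks[address] = data
        if self.partner is not None:
            checksum = hashlib.sha256(data).hexdigest()
            self.partner.receive(address, data, checksum)

    def receive(self, address, data, checksum):
        """Partner side: validate the replicated block before committing it."""
        if hashlib.sha256(data).hexdigest() != checksum:
            raise ValueError(f"block {address} failed validation on {self.name}")
        self.blocks[address] = data


secondary = StorageArray("site-b")
primary = StorageArray("site-a", partner=secondary)

# The host issues ordinary writes; it does no replication work itself.
primary.write(0, b"order #1001")
primary.write(1, b"order #1002")
```

The host only ever calls write on the primary unit. Copying and validation happen between the two arrays, which is the offloading this approach is known for. A real array would also handle ordering, retries, and link failures, none of which is modeled here.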

Advantages of array-based replication
The work is offloaded from the servers to the storage device.
You can control replication for many servers from a single location.
Hosts (servers) are not required at the second site and do not need to be attached to the second SAN/NAS.
A central SQL server can be set up to replicate with the servers that actually present applications to users, such as order tracking applications.
The right software can quiesce databases to ensure that transactions and the database are in a recoverable state.
Disadvantages of array-based replication
Cost per device can be high, especially when you’re not replicating all of the data on the SAN.
Only data stored on the SAN or NAS can be replicated or controlled.
A second SAN or NAS is required, increasing the cost of the solution.
Replication technology and software may not be compatible across SAN/NAS hardware from different vendors.
Examples of array-based replication software
HP StorageWorks XP
EMC SAN Copy - Supports EMC and some other vendors’ arrays
EMC MirrorView - EMC-only replication
NetApp SnapMirror
Network-based replication
The last type of replication is network-based (or fabric-based) replication. This type of replication works separately from the hosts (servers) and the storage devices. A device on the network intercepts packets being sent between hosts and arrays and copies them. These copies are sent to a second device, which replays the packets at a second location. The devices are, in essence, splitters: the data goes in and is then split out to different destinations.
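
The splitter idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Array, Splitter, and Replayer classes and their methods are hypothetical names, writes are modeled as simple (address, data) pairs rather than real network packets, and no actual fabric product works exactly this way.

```python
from collections import deque


class Array:
    """Toy storage array: a dict of block address -> bytes."""

    def __init__(self):
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data


class Splitter:
    """Sits in the data path between the hosts and the primary array."""

    def __init__(self, primary):
        self.primary = primary
        self.outbound = deque()  # copies waiting to go to the second site

    def write(self, address, data):
        self.primary.write(address, data)      # pass the original write through
        self.outbound.append((address, data))  # keep a copy, in arrival order

    def ship(self, replayer):
        """Send queued copies to the remote device; return how many were sent."""
        sent = 0
        while self.outbound:
            replayer.replay(*self.outbound.popleft())
            sent += 1
        return sent


class Replayer:
    """Device at the second location that replays writes against its array."""

    def __init__(self, target):
        self.target = target

    def replay(self, address, data):
        self.target.write(address, data)


primary, remote = Array(), Array()
splitter = Splitter(primary)
replayer = Replayer(remote)

splitter.write(7, b"v1")
splitter.write(7, b"v2")  # a later write to the same block
shipped = splitter.ship(replayer)
```

Neither the host nor either array knows that replication is happening. The splitter keeps the copies in arrival order so that the replayed writes leave the remote array in the same final state as the primary.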

Advantages of network-based replication
It’s a separate component from the SAN/NAS or the hosts.
Processing is independent of the host and the SAN/NAS.
It allows replication between multi-vendor products.
Disadvantages of network-based replication
The cost of implementing devices to support this kind of replication is high.
The technology is still new to the data center, and standards and processes are still being worked out.
There are a limited number of “players” in this area of replication.
