Here goes my first “opinion” post opposed to one detailing a useful command or script.
I’m in the middle of a greenfield infrastructure redesign at the moment; a topic that has been playing on my mind is SAN vs. DAS. When I say SAN, I’m talking about a storage area network. That’s several trays of disk attached to a SAN head unit, which is then connected to a pair of fibre switches, or to a 10gig switch via Ethernet. Servers are then connected to the switches (via Fibre or Ethernet). When I say DAS, I’m talking about Direct Attached Storage. That’s several trays of disk attached to a DAS head unit, which is then connected to a number of servers via SAS cables, or indeed just a dumb tray or trays of disk connected directly to the server.
So, what are SAN’s traditionally used for? In a basic sense they present a large (or small) amount of scalable storage to a number of servers. Why do these servers need this storage? Either because the server hosting the application needs more disk space or spindles than you can fit into the server chassis, or if you are utilising clustering and need shared storage between two servers.
Clustering is what I think changes things. It has always struck me that you build clusters with multiple nodes, NIC’s, power supplies etc to offer high availability and yet the data is still in one place. Therefore the SAN is effectively a single point of failure.
Although a SAN itself will have no single point of failure (dual controllers, multiple paths etc.) the data is still on a single RAID volume so could potentially be a victim of bitrot or the RAID group having a hole punched in it. There is also the obvious risk that the physical file could become corrupt. Software vendors are obviously thinking the same. Exchange 2003/2007 was made highly available in the traditional cluster sense (multiple nodes with the DB on shared storage).
In Exchange 2010 however you have the concept of DAG’s. With DAG’s the database itself is replicated to nodes rather than being shared. This means a SAN is not required to provide a highly available exchange environment. If you can find a server with enough capacity you can run two (or three, or four) exchange servers in a DAG and have mailbox databases failover between them. This is actually more resilient than a traditional exchange cluster because the databases are being replicated rather than shared, which means you have protection against a corrupted database, as well as hardware failure.
Exchange 2010 DAG.
The upcoming SQL 2012 “always on” feature works in a very similar way to DAG’s. The selected databases are replicated between cluster nodes. This means you can now have two core business systems (Exchange and SQL) made highly available without needing any kind of shared storage.
Failover clustering in itself is also moving forward with “shared storage-less clusters”. You can create a cluster and use a file share as a witness, which means that’s another requirement for shared storage out of the window!
If you have services you would like to make highly available and they don’t require a common area to write to, you can easily make them highly available in failover clustering by using a file share witness. If a service does require a common area to read or write data to, then you could always create the directory locally on each server and use DFS replication to keep them in sync.
This brings me on nicely to DAS. With applications moving to a model where shared storage isn’t required, the only real reason you would need a SAN is to present more storage or spindles to a server. Because there isn’t the need for multiple servers to all access a common bit of storage, DAS comes into play. You can buy a dumb tray of 12 disks that can have additional trays daisy chained off of it to provide around 120ish disks for about £7k per tray (the Dell Powervault MD1200 for example) these can be dual connected to a single host. Or if you want to connect more hosts to the DAS solution, you can get an “intelligent” DAS head unit that can then have multiple “dumb” trays connected to it to provide 192ish disks. These can usually support four dual connected hosts and can be picked up for about £12k. (the Dell Powervault MD3200 for example)
There are still applications that require shared storage, such as Hyper-V or VMware for example. In this scenario the MD3200 (intelligent) with a few MD1200’s (Dumb) connected to it would be ideal. You can have four nodes in the cluster sharing the storage.
The initial reaction I get to this suggestion is that of shock, as it’s not very scalable like a SAN. I understand the argument, but on the flip side, do you really want 10 – 20 hosts sharing the backplane of your SAN (6 – 12gig) with the DAS solution those four hosts are sharing the dual 6gig backplane. If you need more servers then you’ll probably need more storage, so buy another head unit instead of a dumb tray. This method leaves you with two clusters of four nodes each with their own 12gig backplane (2x 6gig) opposed to potentially eight nodes sharing the SAN’s backplane.
I’m a big fan of DAS over a SAN for several reasons:
· The physical trays are cheaper than trays for a SAN
· There is no requirement for fibre switches which are eye wateringly expensive, not only for the tin but also the port licencing
· DAS is really simple as the cable goes from the head unit to the server. Simple is fast and also easy to support and fix when it goes wrong.
· DAS removes a single point of failure. It’s affordable to build two SQL clusters attached to 2 DAS arrays. Unless you’re a fortune 500 company you wouldn’t be able to do this with a SAN.
I can also see the downsides of DAS vs.SAN
· Physical limit of SAS cables mean your servers need to be near the DAS head unit.
· The administrative overhead of many storage arrays vs one SAN.
· DAS lacks some of the mirroring features that SAN’s do.
Based on the above though, I think the cost savings by going DAS in both financial terms and for simplicity, outweighs the disadvantages.
I’m open to constructive feedback on this; I still have an open mind on the subject. However at the moment I think SAN’s are a thing of the past in 90% of situations.