Monday Morning Conversation: D2D or VTL for Backups?
So I thought I'd start something new here and begin posting short thoughts and a question and get a feel for where everyone else is on the topics. So today we'll start with Disk to Disk (D2D) backups or VTLs?
Over the past few months I've had the chance to work with both of these types of products. I like Data Domain's D2D solution with data dedupe capabilities for the small to medium size businesses and I like the HDS/Diligent solution for the large businesses.
My preference would be to use a plain D2D solution (CIFS/NFS share) with data dedupe capabilities because it's simple. And when I say simple, I mean that you can have two Data Domain boxes set up within 20 minutes, ready to accept backups and replicating between each other. When you start throwing VTL in the mix, things start to get complicated. Without HDS behind Diligent, I don't know that I would use that product because of the erector set it can become. With HDS though, you get a single package and it works well.
So what do you guys think? What have you seen out there that works? Which would you prefer?
November 19th, 2007 at 4:36 pm
Most VTLs work, and work well, as the products are well along the maturity curve; the majority of them (certainly from a market share perspective) are based on the Falconstor code base.
As I work for a vendor I will refrain from knocking too many of my competitors, to avoid being construed as marketing. However, the difference comes down to how well they are integrated as a solution and how well they scale in their implementation.
Some, for example, don’t offer hardware-based compression on the VTL engines, which ultimately hurts throughput, whilst others don’t offer active/active failover of the VTL engines combined with NPIV (N-port ID virtualisation), which helps in failure scenarios (after all, one of the mantras for B2D is to improve the reliability of backups, so it’s not much use if there are SPOFs).
Any VTL should be coupled with ‘cheap’ (how I hate that word) storage to maximise the economics, so look for something that offers an integrated solution with at least 750GB, but preferably 1TB, SATA-II disk technology. After all, if the solution requires only Tier 1, enterprise-class storage, then you might as well just use storage array based copies and forget about the VTL.
Next up is how the solution integrates with the backup application for migration of backup sets to/from real tape and for replication to remote site(s) for DR. Can you tolerate the pain of having to use a separate set of VTL commands to achieve these, or can you control it all from the backup application? (Think of the pain of separately having to re-catalogue offsite copies back into the backup application in order to do a DR restore.)
Finally, an important consideration is how the VTL integrates with any de-dup solution and whether it is scalable once all plugged together. For example, in-line de-dup requires compute cycles (often quite a lot), which can/will slow down the backup and restore speeds.
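To make the compute cost of in-line de-dup concrete, here is a minimal Python sketch. It is not how any particular VTL or D2D product works: the fixed 4 KiB chunk size, the `DedupeStore` class and the choice of SHA-256 are all assumptions made for illustration. The point is only that every incoming chunk has to be hashed before it can be written, which is where the extra cycles go.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; purely illustrative


class DedupeStore:
    """Toy in-line de-dupe store (hypothetical, not any vendor's design)."""

    def __init__(self):
        self.chunks = {}     # digest -> chunk bytes, each unique chunk stored once
        self.hash_calls = 0  # counts the per-chunk hashing done in the write path

    def write(self, data: bytes) -> list:
        """Store a backup stream; return its recipe (ordered list of digests)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()  # compute cost paid in-line
            self.hash_calls += 1
            self.chunks.setdefault(digest, chunk)       # keep only the first copy
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild a backup stream from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)


store = DedupeStore()
monday = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
tuesday = b"A" * CHUNK_SIZE * 3 + b"C" * CHUNK_SIZE
monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)
print(len(store.chunks), "unique chunks stored,", store.hash_calls, "chunks hashed")
```

Eight chunks are hashed but only three are kept, so the space saving is paid for with a hash computation on every chunk in the backup path.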
Some of these factors may or may not apply depending on whether you are a small/medium or enterprise shop.
November 20th, 2007 at 7:28 am
My personal preference would be D2D with off-site replication. Back up disk-to-disk, then replicate the backups off-site using a staged replication, so that the integrity of the existing backups isn’t threatened by the replication of the new backups.
Normally I would replicate flat backup files; however, Veritas, my preferred backup vendor (whose principal qualification is "It’s not Legato"), doesn’t allow recovery from disk-based backups in the event of a catalog failure. (Gee, I think I’ve been there.)
VTL would probably be the better option, but I have to honestly admit that I’ve not had a lot of experience with VTL. Anyone got any suggestions? Can you replicate VTL files off-site for retention?
November 20th, 2007 at 10:23 am
Yes, you can replicate VTL ‘files’ off-site for retention. VTL files are just a disk-based image of a tape file(s), so they normally retain the same format as if you had backed up to real tape. You are really replicating a tape image here. Certainly all Falconstor based implementations provide the capability to replicate a tape image from one VTL to another, normally creating a clone of a tape (for example, you back up to virtual tape VT0001 in VTL A and then create a cloned tape VT0002 on VTL B). The replication is normally done over IP and often includes compression etc. The drawback of this approach, certainly in Falconstor based systems, is that the backup application doesn’t know this has happened, as it is done via the VTL. The one great feature (not the only one, I might add) that helps in this scenario is where the VTL runs an embedded version of the NetBackup Media Server (or Networker equivalent), which then controls the replication by producing a duplicate tape, much in the way the backup application would create two real tapes with the same content; that way the backup application always has a catalogue of both images.
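The difference between the two replication paths can be shown with a toy Python model. Everything in it is hypothetical: the class and function names, the image name and the extra barcode VT0003 are invented for illustration and do not correspond to any Falconstor, NetBackup or Networker interface. It only shows why a clone made behind the backup application's back never appears in the catalogue.

```python
class BackupCatalogue:
    """Toy stand-in for the backup application's catalogue."""

    def __init__(self):
        self.copies = {}  # image name -> set of (library name, tape barcode)

    def record(self, image, library, barcode):
        self.copies.setdefault(image, set()).add((library, barcode))

    def locations(self, image):
        return sorted(self.copies.get(image, set()))


class VirtualTapeLibrary:
    def __init__(self, name):
        self.name = name
        self.tapes = {}  # barcode -> tape image contents


def backup(catalogue, vtl, image, barcode, data):
    """Backup application writes a virtual tape and catalogues it."""
    vtl.tapes[barcode] = data
    catalogue.record(image, vtl.name, barcode)


def vtl_replicate(src, src_barcode, dst, dst_barcode):
    """VTL-level clone: the tape image is copied, the catalogue is never told."""
    dst.tapes[dst_barcode] = src.tapes[src_barcode]


def app_duplicate(catalogue, image, src, src_barcode, dst, dst_barcode):
    """Application-driven duplicate: the copy is made and catalogued."""
    dst.tapes[dst_barcode] = src.tapes[src_barcode]
    catalogue.record(image, dst.name, dst_barcode)


catalogue = BackupCatalogue()
vtl_a, vtl_b = VirtualTapeLibrary("VTL-A"), VirtualTapeLibrary("VTL-B")

backup(catalogue, vtl_a, "exchange_full", "VT0001", b"tape image bytes")
vtl_replicate(vtl_a, "VT0001", vtl_b, "VT0002")
after_vtl_clone = catalogue.locations("exchange_full")

app_duplicate(catalogue, "exchange_full", vtl_a, "VT0001", vtl_b, "VT0003")
after_app_duplicate = catalogue.locations("exchange_full")

print(after_vtl_clone)      # only the original copy is known
print(after_app_duplicate)  # the application-made duplicate is known too
```

In this model VT0002 exists on VTL B but would have to be re-catalogued before a DR restore, whereas VT0003 is restorable straight away.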
You can often also create an offsite copy by duplicating a virtual tape to a real tape by using FC extension to an offsite tape library.
November 20th, 2007 at 7:27 pm
When it comes right down to it, VTL is looking back and D2D is looking forward.
I would use VTL if my priority was working with an existing infrastructure. It gets me to disk quickly and with less process change.
But I would choose D2D if at all possible. In the long run, I will get better results using disk as disk, not pretending it’s tape. D2D systems will innovate faster and more productively because they can take more advantage of the random capabilities of disk. VTL will be limited by its need to treat disk as tape.
I work for Seagate, so we participate equally in both categories of solutions.
November 20th, 2007 at 8:34 pm
I would have to go with D2D. I agree with Pete that VTL is looking back and D2D is looking forward. I imagine all backup applications and de-dupe solutions that expect to survive into the future will have slick D2D interfaces and features.
The way I see it, there are several criteria to satisfy when choosing a backup solution –
1. These days I want de-dupe. No de-dupe, no PO
2. Integrity of backed up (de-duped) data. I have to be 100% satisfied that it's not based solely on some hash function that ‘could theoretically’ mistake new data for data already seen.
3. Off-site replication. With the death of tape, you lose the ability to box up your media and have it stored off-site in a vault. So modern disk based backup solutions, D2D and VTL, need good offsite replication. This won't be too much of a burden with good de-dupe.
4. Performance. I see it suggested that SATA is the best option. For me SATA is only an option when used in RAID6. Especially with de-dupe! Imagine losing a RAID group with de-dupe! And RAID6 does not provide good performance on SATA. I have seen heavy random workloads even on VTLs, so I would have to think twice before running it on SATA. I know, I know, it's only backup, but that’s the performance focussed storage person in me coming out!
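On point 2, the ‘could theoretically’ risk can be put into numbers with the standard birthday-bound approximation. The Python sketch below assumes an ideal, uniformly distributed hash; the 160-bit digest and the 2^40 chunk count (roughly 8 PiB of unique data at 8 KiB per chunk) are example figures chosen for illustration, not the parameters of any product.

```python
import math


def collision_probability(num_chunks: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one collision among num_chunks
    distinct chunks, assuming an ideal hash_bits-bit hash function."""
    pairs = num_chunks * (num_chunks - 1) / 2
    # 1 - exp(-x), computed with expm1 so tiny probabilities don't round to 0
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# Assumed example: 2**40 unique chunks and a 160-bit digest.
p = collision_probability(2 ** 40, 160)
print(f"estimated collision probability: {p:.3e}")
```

Under these assumptions the estimate is vanishingly small but not zero. A design that also byte-compares the stored chunk whenever a digest matches removes the residual risk, at the cost of an extra read.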
November 21st, 2007 at 12:02 am
Interesting conversation. Anyone care to name some highly recommended D2D products, and why?
Stephen
November 21st, 2007 at 10:30 pm
Why are all of these discussions assuming the death of tape? (Perhaps a little EMC brainwashing, because that’s what they try to tell me too.) Tape still works as the absolute cheapest form of backup – and storing 7 years of backups on random access media, even with dedupe and hardware compression, is just insane!
Regardless of my D2D/VTL choices the end of the story is still D2D2T for all of my backups. The D in the middle has replaced the process of replicating my tapes to send one copy offsite for archival.
November 22nd, 2007 at 1:00 pm
As some of you may have noticed (certainly Stephen has), I actually work for EMC. And I would like to put a few things straight. EMC does not tell customers that tape is dead. By all means go D2D2T; in fact a great many of the EMC VTL customers do exactly that. Oh, and by the way, we also sell tape libraries. All the other VTL vendors do exactly the same; after all, >75% of all VTL solutions sold are based on the FalconStor code base, which supports a second-level tape library to destage to. We advocate the use of disk as the first port of call for backup because it is inherently reliable, unlike tape. How many of the posters on this forum can honestly say they have never had a tape failure?
One thing puzzles me though: why would anyone want to keep 7 years of backups? That is archival. A backup that is more than a few days, or sometimes weeks, old is of no use to any business. Could you really restore back 7 years and survive the loss of 6 years and 364 days of business transactions?
November 22nd, 2007 at 10:01 pm
Alas, government agencies and probably business groups like banks probably have to keep records a lot longer than 7 years. I am by no means an expert on this and prefer it that way but no one really seems to understand archiving. I have tried to bring it up a number of times but it gets too confusing for most managers.
Symantec (Veritas) have been pestering our Exchange people about archiving and it seems like a good product to me. We have 1 GB mailboxes and that will probably keep growing, as someone decided not to allow PSTs anymore. It seems stupid to keep 5 year old emails on our Tier 1 storage solution.
I think the issue is that data management (backup groups) get stagnated by what they do. It costs a lot of money to implement changes in backup and it would certainly cost a lot for archiving. Mainstream storage like our USPs is always easier to get upgraded than our backup solution. That is because most people think our DR solution works…… I would be most happy to have archiving duplicated to both our main sites and, considering the cost of disk these days, a third site is probably feasible just to be on the safe side. We have to have one of them for cluster quorums anyway.
I can say I never had a tape failure but I have had tape device failures. BTW, EMC are in my good books. I have a new pet hate with storage that will never raise themselves out of the gutter in my view.
Stephen
November 23rd, 2007 at 7:24 pm
Stephen
What you say is absolutely true: often archival just gets implemented as ‘keep the backup tapes (full, of course) longer’. This is often an easy option, but it is not easy when it comes to retrieval, particularly for legal requests for regulatory compliance. There are many recent cases, in the US mainly, where firms have been fined serious money for not being able to produce evidence in a reasonable time or not being able to prove that the data is ‘genuine’. Unfortunately, tape in the US is not deemed to be unalterable (unlike WORM disk or CD).
What we really need is a transportable standard, similar to XML, that defines the semantics of archived data; and yes, just such a standard is being drafted, so we will have to wait till it gets ratified and then for vendors like my company to get products out the door that achieve it.
In my experience, the only time most companies separate archive from backup is when either their Tier 1/2 storage is growing so fast that the cost case for archiving to cheaper storage makes sense, or they are hit with a regulatory request and/or fine (nothing like a personal fine to get a CIO to take notice).
November 24th, 2007 at 2:39 am
I have been re-orged. I think it starts on Monday. So I am now in charge of the finer points of backup architecture as well as managing hundreds of TBs of multi-tiered storage, a quite sizeable FC switch infrastructure and SRM software by the bucket load.
I can’t say I am happy about that. I really think that "disk" providers and "backup" providers should not belong to the same team, at least in large organisations. I have seen way too much finger pointing happen when something goes wrong with the storage and data is lost, only to find that someone has not been backing up the data properly.
At least I get to look at archiving now as it was thrown my way during the re-org discussion. I get to deal with our architectural people and our Exchange and DB group to provide better DR solutions. Joy Joy!!!! Where is that online job search engine?
December 9th, 2007 at 8:24 pm
I think the answer depends on what your backup app is, what the load is and what kind of infrastructure you have. A more expanded view: http://recoverymonkey.net/wordpress/?p=6 (not a shameless plug, I just think I addressed the major questions). D
January 10th, 2008 at 8:22 pm
Revinetix makes a great D2D2D product. I have used it with several of my customers. It can back up Exchange and restore to the individual object level. It can replicate to a remote site. The final D is a hot-swap drive that you can take off site. Works great.