
Archive for August, 2007

Lessons learned from Subway

Friday, August 31st, 2007

I happened to be in the Cannon Street Subway (sandwich restaurant, if you can call it a restaurant) the other day and, because I was absolutely ravenous, I ordered myself a sandwich with so much filling that it would give most people lockjaw just from trying to fit it in their mouth – foot-long roast beef on hearty Italian with extra cheese and bacon, toasted and with all the fillings, topped off with south west sauce!  As you can imagine, I was really looking forward to it!

I’ve always been impressed with how efficient they are in preparing and delivering their sandwiches – one guy for the bread, another for the meat and cheese, another for the oven, another for the salad, another for the sauce and another to wrap it.  And it was all going so well until the wrapping part.  Now to be fair, I think the girl wrapping it was probably new, but either way she made an absolute hash of it.  As she attempted to fold and cut it, the majority of the filling spilled all over the counter.  She then made the most pathetic attempt to wrap it.  Well that was it, sandwich ruined!  It ended up being one of the messiest lunches I’ve eaten, requiring the use of enough tissues to account for a small rainforest.  And were I not such a die-hard fan of Subway I probably wouldn’t have gone back the next day (which of course I did).

At the same time I happened to be working at a site dealing with a mess caused by human error – like my sandwich.  The kit itself was great, the solution was well designed... but because of some sloppy workmanship (actually mistakes that any of us could have made) the customer was in a real pickle.  The customer’s sandwich was effectively all over the counter and not looking too appetising.

Fortunately, with the help of some very dedicated people, including the guy who made the initial mistakes, we were able to sort out the mess and leave the customer relatively happy.

The thing is, like me with my sandwich, if the customer had been a new customer with no previous good experiences with the kit, they may well have started taking their lunch elsewhere.

So what I learned from my Subway experience is that no matter how good your kit is, you still need the right people – pre-sales, architects, professional services, support, engineers, account managers...  It’s one thing having great kit, but if you can’t assemble it properly so that it looks like it does on the posters, keep it operating smoothly, and fix it when it goes wrong, then a lot of people will shop elsewhere.

In fact, I happened to be in the same Subway restaurant a week or two later dealing with an issue late on a Friday evening.  I made it to the restaurant a minute or two before closing time and was obviously going to be their last customer of the day.  But that didn’t stop them opening a brand new pack of lettuce just for me.  This left a lasting impression on me!

With this in mind, I often hear account managers and the like saying that Customer X think they are a lot bigger than they are and expect to get the same treatment as Customer Y who are huge in comparison.  Well I was only buying for myself that Friday night at Subway, certainly not worth opening a new bag of lettuce for.  But because they were so willing to open it for me I will certainly be back for more, and recommending them to people working in the area.

Nigel

Re: Vendor Specific Requirements

Tuesday, August 21st, 2007

I decided to write this in a post rather than a comment because it turned out to be a little long for a comment in my opinion.

So, to paraphrase, vendors are shoving things into users' environments and then the users get caught "in the middle of 'I don't care' from the vendors".

The users:

So the users have a bunch of different types of equipment in their storage environments.  They have kept things heterogeneous so as to get good pricing from the storage vendors and, in turn, get a good return on their investment.  The users expect the vendors to remain as engaged as they were during the purchase cycle, but are realizing that this just isn't occurring and are disappointed.  Everything runs fine for a while (most of the time), and then it comes time for the users to do an upgrade of some sort to equipment that is attached to the storage environment.  (Yes storage vendors, users have to maintain their environment.  All the other vendors drop support for things the exact same way you do.)  This is when the user starts to have troubles.  They've upgraded a tape driver (or some other storage device driver) and now they can't see their disk drives.  No one wants to help them troubleshoot their problem because "it's not in the support matrix".  And this is where the users get stuck "in the middle of 'I don't care' from the vendors".

The vendors:

So the vendors have developed hardware and software for the users to utilize in their storage environments.  They have spent gobs of dollars developing, testing, and selling the goods to the users and now want to reap the rewards.  They have a vested interest in getting the product out the door and onto the user's data center floor so that they can meet "The Street's" numbers and make their investors happy.  So they meet with said users and sell them some goods based on initial requirements, given to them by the user, and get the goods implemented on the data center floor.  The users are happy and the investors are happy.  Now the sales team moves on to the next user to sell them some goods to keep the investors happy.  The sales team doesn't have "cycles" to spend with a customer that has already bought something and the users start to feel neglected.  Now the users have to maintain their environment (weird huh?) and this creates a problem in the user's environment.  So the user calls the support line and after following the sun around the world they finally get someone on the phone who understands what a storage environment is (and speaks the native language, for that matter).  They go through a few things and it turns out that the vendor hasn't tested the new driver and it's not in their support matrix.  Now the vendor tells the user that "it's not my problem, call that other vendor".  And again the user is caught "in the middle of 'I don't cares'".

The solution:

So what is a user to do?  Well, there are a couple things I would do (and have done).

  1. Create a master support document that documents all levels of firmware, microcode, driver, etc. for all hardware and software in your storage environment and MAKE all your vendors support it.  Put it in the contract or master purchase agreement that they will support your current levels of code and any future levels of code.  If they won't, then you don't buy from them.
  2. Create a quarterly review process to review all upgrades as they pertain to your storage environment and ensure that all your vendors will support the proposed code levels before upgrading.
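The two steps above boil down to keeping one list of proposed code levels and checking it against what each vendor has agreed to support before anything gets upgraded.  A minimal Python sketch of that check follows.  Everything in it is made up for illustration: the vendor names, component names and code levels are hypothetical placeholders, not anything from a real support matrix.

```python
from dataclasses import dataclass, field


@dataclass
class Vendor:
    name: str
    # component name -> code levels this vendor has agreed to support
    supported: dict[str, set[str]] = field(default_factory=dict)


def unsupported_items(proposed: dict[str, str], vendors: list[Vendor]) -> list[tuple[str, str, str]]:
    """Return (vendor, component, level) for each proposed level a vendor will not support."""
    gaps = []
    for vendor in vendors:
        for component, level in proposed.items():
            levels = vendor.supported.get(component)
            # A vendor that lists nothing for a component makes no claim about it.
            if levels is not None and level not in levels:
                gaps.append((vendor.name, component, level))
    return gaps


# Hypothetical master support document: the levels we want to run next quarter.
proposed = {"hba_driver": "2.0", "tape_driver": "5.1"}

# Hypothetical vendor commitments, as they might be written into the contract.
vendors = [
    Vendor("disk_vendor", {"hba_driver": {"1.9", "2.0"}, "tape_driver": {"5.0"}}),
    Vendor("tape_vendor", {"tape_driver": {"5.0", "5.1"}}),
]

for vendor_name, component, level in unsupported_items(proposed, vendors):
    print(f"{vendor_name} will not support {component} at level {level}")
```

In this made-up data the disk vendor has not signed up to the new tape driver, which is exactly the situation you want to find in the quarterly review rather than after the upgrade.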

Now, not all shops are big enough to do those couple of things, or they are probably way understaffed and don't have the time to do it.  So this is where the storage partner comes in.  A company that doesn't manufacture anything or develop any software.  A company that can take first call support for problems and that will be on the hook for the problems from start to finish.  A company like this would have to maintain good customer relations or they would be out of business.  If the customer isn't happy, they won't return and buy more stuff and the company would go out of business, simple.  I would highly recommend that all users find a company to partner with that specializes in storage and can help relieve some of the problems mentioned above.  If there isn't one in your country, think about starting one.  The world's storage isn't getting any smaller.

300 GB 15K Drives for the USP/USP V and NSC

Tuesday, August 21st, 2007

I got notification today about the new disks available for the USP et al.

I thought it interesting that there were a couple of things that might need comment.

Even though it was not really suggested, the notification said:

<quote>

…. due to the large capacity and the proportionally longer rebuild time for these drives, a RAID-6 configuration provides additional protection against possible data loss in the event of a second drive failure during the longer rebuild time of the first failure. 

</quote>

I like the idea that someone actually says it is a good idea as so far, no one is really pushing that.  Also considering our double disk failure in one parity group, it is something to think about. 

Now comes the hard bit.  As the drives get bigger, the IOPS stay much the same.  I would expect that the 15k 300 GB drives would have similar IOPS to the 15k 146 GB drives and you would have to consider that if you wanted to go that path.  It also leads me to something I read that states:

Each Array Control Pair (ACP) on the Universal Storage Platform can support 15,500 IOPS.  That's about one third of the IOPS the actual disks can produce.

I was lucky to have done a lot of maths with my Computer Science degree and if we are to use all those recommendations about such and such a configuration, I might actually be able to use what I learnt.  I will have to dust off my calculator and probably replace its batteries.  I can hardly wait until we can only get minimum 500 GB drives that spit out 100 IOPS max.  Then the ACPs will have enough back end disks to just cope.
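For anyone who would rather not dust off the calculator, here is a rough Python sketch of the maths.  The 15,500 IOPS per ACP and the 100 IOPS for a 500 GB drive are the figures quoted above.  The 180 IOPS per 15k drive is purely an assumed placeholder (it is not an HDS number), so substitute whatever your own measurements say.

```python
import math

ACP_IOPS_LIMIT = 15_500  # per Array Control Pair, the figure quoted above


def drives_to_saturate(acp_iops: float, iops_per_drive: float) -> float:
    """Number of drives whose combined IOPS equals the ACP limit."""
    return acp_iops / iops_per_drive


def iops_per_gb(iops_per_drive: float, capacity_gb: float) -> float:
    """Access density: how many IOPS each gigabyte of capacity gets."""
    return iops_per_drive / capacity_gb


# ASSUMPTION: 180 IOPS per 15k drive is a placeholder, not a vendor figure.
ASSUMED_15K_IOPS = 180

drive_types = [
    ("146 GB 15k", 146, ASSUMED_15K_IOPS),
    ("300 GB 15k", 300, ASSUMED_15K_IOPS),
    ("500 GB at 100 IOPS", 500, 100),
]

for label, capacity_gb, drive_iops in drive_types:
    density = iops_per_gb(drive_iops, capacity_gb)
    drives = math.floor(drives_to_saturate(ACP_IOPS_LIMIT, drive_iops))
    print(f"{label}: {density:.2f} IOPS/GB, {drives} drives per ACP before the limit")
```

The point it illustrates is the one in the text: with the same IOPS per spindle, doubling the capacity roughly halves the IOPS available per GB, while slower drives let more spindles sit behind an ACP before its limit is reached.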

Actually I am happy to use the 300 GB 15k drives if they are priced well and HDS drop the licensing costs per TB.  I wanted HDS to provide 500 GB SATA disks for the USP so I could provide cheaper backup devices to our servers.  EMC can now do it.  It's all about getting the right solution for the right price.

Stephen

Vendors specific requirements

Sunday, August 19th, 2007

I have come to the conclusion that vendors don't care all that much.  Shove the system in the door and hope the customer has too much on their collective plates to worry about anything ever again.

That's a rather broad statement isn't it?  But perhaps I am right.  I am putting up with so much angst from our users/customers that I have had to do a lot of research, which has produced some surprising results.

This has all resulted from my problems with the HP P class enclosures.  I think the HP NVRAM patch may have fixed this but it is still early days.  I would desperately love to know what rx sensitivity in the NVRAM means.  It has been changed from 0 to 2.  I asked Qlogic and they said it is an HP OEM setting.

We run Hitachi Storage Clusters (HSC) so we can have maximum uptime for our mission critical systems.  It's a good solution for the end user.  However, when something goes wrong, it is a mess.  Microsoft don't want to know a bar about the system (or so I have been told) and it is now an HDS problem.  Umm.. since when did HDS go into the business of Operating Systems and server hardware offerings?  Obviously the day they introduced HSC.

So when we started getting the CRC errors and massive timeouts, HDS were the scapegoat.  I won't go any further into that for obvious reasons but this is where it gets interesting.

When we failed over with HSC and performed a reboot to clean some things up (which BTW was a Windows registry setting mixup), one server refused to start and it looked like an HDLM issue.  I finally (after three days) figured out it was a driver issue for the HBA.  HDS will only support a certain driver for STOR miniport.  Funny that it is not on the Qlogic web site and we had to go to a later version which is not officially supported.  I think we are currently using 9.1.2.16 because I can't find 9.1.2.14.

Then if you look at the HP website, the supported version for the Mezzanine cards in the HP P class blades is 9.1.3.16.  So, that means HDS is way behind right?  Wait, it seems that HP support 9.1.0.13 for their storage solutions.  That makes HDS look positively modern.

Where does this leave me?  Caught in the middle of don't cares.  Qlogic have released 9.1.4.15 which I use with my IBM storage because I doubt they care either. 
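To see at a glance who is ahead and who is behind, the driver levels mentioned above can be sorted numerically.  A small Python sketch follows.  The version strings are the ones quoted in this post.  Treating 9.1.2.14 as the HDS supported level is an inference from the paragraph above rather than something stated outright, and the labels are just descriptions for illustration.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '9.1.2.16' into (9, 1, 2, 16) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Driver levels quoted in the post; the HDS entry is inferred, not confirmed.
levels = {
    "HP storage solutions": "9.1.0.13",
    "HDS supported (inferred)": "9.1.2.14",
    "currently installed": "9.1.2.16",
    "HP P class mezzanine cards": "9.1.3.16",
    "Qlogic latest": "9.1.4.15",
}

# Sort by the parsed tuple, not the raw string, so that 9.1.10.x would
# correctly sort after 9.1.9.x.
ordered = sorted(levels.items(), key=lambda item: parse_version(item[1]))

for label, version in ordered:
    print(f"{version:>10}  {label}")
```

Comparing parsed tuples rather than raw strings matters once any field reaches two digits, because a plain string sort would put "9.1.10" before "9.1.9".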

I also find that posting a web page on the vendor's site, suggesting fixes to something that is bound to cause major disruption, is a very lame excuse for not looking after the customer.  If a customer has thousands of servers, surely the account manager would think, hey, this customer is important so why not actually tell them there is a problem.  I searched many times on the HP site and still could not find the link to tell me how to fix this problem.  I changed my search criteria and finally found the page.  Funny how syntax can make a huge difference in a search.

For all you HP P class users out there, check this one out.

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00874134&jumpid=reg_R1002_USEN

What really annoys me about all this is that I am somehow the person responsible for all this.  Why, I don't have a clue.  I am not a Windows server admin.  I just manage the SAN.  It just happens that CRC errors cause issues with the SAN so now I have a new life as a vendor troubleshooter.

Perhaps if SNIA (Sneer) needs a credible excuse for doing what it does, then they should get all their Storage partners together and start formulating some standards that make sense.  Heck, SMI-S is a poor excuse for anything.

Stephen (soon to be sitting on a beach in Phuket and not caring for a few weeks)

P class not B class enclosures.

Monday, August 13th, 2007

I need to make a correction regarding my previous post about the HP blade enclosures.  Apparently they are P class enclosures not B.  I get confused with acronyms which I expect is someone’s job and the more confused the end user is, the higher the pay.  It sort of reminds me of the old days when B was a language.  Then came C.  I wonder where D is.  I know about D Trace for Solaris which looks a bit like C.  Did anyone ever see Blue Thunder and wonder what JAFO stood for?  It was revealed as the movie went on.

Anyway, it’s funny that a number of other bloggers including Tony Pearson at IBM:

http://www-03.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=welcome

also referred to the B class from my posting and said just how good the IBM blade servers are with SAN connectivity.  I wish I was a person working for a vendor.  I could write wonderful blogs that push my products (blogerteering?).  I really like reading the blogs from some vendors (not you Tony) who profess the virtues of their products but, when you read their LinkedIn resume, have probably never spent any real time managing storage at the coalface.  One person that springs to mind is Barry Burke.  I won’t hold a grudge against Barry just because he was involved with Applix.  I do admire that after nearly thirty years, he is still in the IT industry.  Have a look at:

http://www.linkedin.com/in/barryburke?trk=btn_typepad

I personally enjoy reading the blogs of people like myself.  We put up with some of the stupidest things that could happen.  You don’t see that sort of down to earth event in blog sites that are run by vendors.  I would love to respond to some of the IBM postings but I refuse to accept the conditions posted at the bottom of the page.  My thoughts are my own and IBM can’t own them.

I would be happy to have a marketing person sit in with me when something goes horribly wrong that has something associated with SAN or might even be in the same state as our SAN.  For example, when our Exchange servers were playing up due to massive CRC errors and people were yelling at me to fix the storage when there was nothing wrong with it.  Some management definitely display their lack of IT knowledge when something goes wrong.  I doubt even the Messiah could have worked out that problem in about 5 minutes.

Another day, we lost our ONS links which provide our inter datacentre ISLs for about 30 seconds so our sync TrueCopy had a dummy spit.  This caused some timeouts for our HSC systems and again it was my fault.  Umm.. sure it was because if I did not use TrueCopy, we would not have any problems.  Then if our datacentre burns down or gets blown away, it would be my fault for not having TrueCopy.

Perhaps Barry and other vendors’ representatives could write about some of the exciting things that happen to them on a daily basis.  Like getting a size ten boot up the behind for not making sales numbers last quarter?  Or perhaps what happened at that last conference need not get out…  What happens in Vegas stays in Vegas.

So, what did HP say about this issue with the SAN and the P class enclosures?  Update the firmware.  Terrific, we are already using the latest version.

Less sales waffle and more real stories would make me happy.  And Barry, don't take this personally.. or else I won't even consider buying a DMX 4.

Stephen

More tape tales

Friday, August 10th, 2007

There is definitely something about tapes, they just seem to attract trouble!

We’ve talked in the past about tapes that have been inserted upside down and back-to-front, tapes that have had barcodes maliciously swapped and tapes that have come back from the vault with grass all over them.  But this one is a new one for me…..

Basically, I was recently involved in a Commvault implementation when we came across a tape that we thought was faulty.  The library had tried several times to mount the tape with no luck.  So we had it removed from the library and shipped to site so that we could package it up and return it to the supplier as DOA.

However, this is what we saw when it came back to us –

Seriously! It had been put into the library with the instruction sheet explaining where to put the barcode label stuck to it!

Notice the circular markings in the middle of the paper that were made by the reel motor as it repeatedly tried to engage the reel drum.  Makes me wonder whether, had we tried to mount the tape enough times, the reel motor would eventually have worn a hole through the paper and mounted and used the tape with the paper still attached??  Guess I’ll never know!

The thing is, this was done by somebody I have a fair bit of respect for! 

Inserting tapes upside down and the like is usually down to stupidity or not knowing what you’re doing.  This on the other hand could have been done by any of us.  Like I said, this was a new installation and involved loading a lot of new tapes into a new library – quite a monotonous task – and I’m sure had it been me loading the tapes I may well have missed it.  Well……… ;-)

A case for VTL??

Recruiting – you have to love it.

Wednesday, August 8th, 2007

I see it is relatively quiet and no one has mentioned the new Cisco MDS products.  Cisco now has DDM which looks very similar to Brocade's DMM but we expected that and they also have some new switches.

We are in a recruitment drive for Windows people but the ad talks about our environment with the Mainframe and the SAN.  One person was casually asking about the infrastructure and he somehow got onto the SAN.  He asked what it was: tape or disks?  Tape??  I thought tape could mean backups using fibre channel but jokingly said wireless iSCSI and he said oh yes, I know all about that.

I am so looking forward to his resume.

There has been a Storage Admin job advertised for many months but no one wants it.  Why?  It's offering VERY good money but word has got around that the company has poor management practices and most people don't last for more than 6 months.  So, with the shortage of good SAN people, good money and conditions, what can that company do to recruit someone?  I think I might send that wireless iSCSI guru to them.

This leads me to wonder: has anyone ever thought about the standards that storage administrators should follow?  Can an employer look up a web site to find questions to ask prospective employees?  More often than not, they are recruiting because the previous one left, so how can companies know what they are getting?  Most employment agencies would not have a clue what happens in a SAN and think you are talking Martian when you discuss disk alignment or encoding errors outside frames.

So, does SNIA (or as I say SNEER) offer any guidance on this?  Has the storage industry got something to help employers?

Stephen

Barry Whyte is Blogging

Monday, August 6th, 2007

There is a new blog out there as of Friday from IBMer Barry Whyte.  He's a "Master Inventor" so he must have quite a bit of knowledge on storage.  Also, he has pledged to keep the "blogketing" to a minimum.

Take a look here: Clicky
