I have come to the conclusion that vendors don't care all that much. Shove the system in the door and hope the customer has too much on their collective plates to worry about anything ever again.
That's a rather broad statement, isn't it? But perhaps I am right. I am putting up with so much angst from our users/customers that I have had to do a lot of research, which has turned up some surprising results.
This has all resulted from my problems with the HP P class enclosures. I think the HP NVRAM patch may have fixed this but it is still early days. I would desperately love to know what rx sensitivity in the NVRAM means. It has been changed from 0 to 2. I asked Qlogic and they said it is an HP OEM setting.
We run Hitachi Storage Clusters (HSC) so we can have maximum uptime for our mission critical systems. It's a good solution for the end user. However, when something goes wrong, it is a mess. Microsoft don't want to know a bar about the system (or so I have been told) and it is now an HDS problem. Umm.. since when did HDS go into the business of operating systems and server hardware offerings? Obviously the day they introduced HSC.
So when we started getting the CRC errors and massive timeouts, HDS were the scapegoat. I won't go any further into that for obvious reasons but this is where it gets interesting.
When we failed over with HSC and performed a reboot to clean some things up (which BTW turned out to be a Windows registry setting mixup), one server refused to start and it looked like an HDLM issue. I finally (after three days) figured out it was a driver issue with the HBA. HDS will only support a certain driver for STOR miniport. Funny that it is not on the Qlogic Web site, so we had to go to a later version which is not officially supported. I think we are currently using 18.104.22.168 because I can't find 22.214.171.124.
Then if you look at the HP website, the supported version for the Mezzanine cards in the HP P class blades is 126.96.36.199. So, that means HDS is way behind, right? Wait, it seems that HP support 188.8.131.52 for their storage solutions. That makes HDS look positively modern.
Where does this leave me? Caught in the middle of don't cares. Qlogic have released 184.108.40.206 which I use with my IBM storage because I doubt they care either.
I also think that posting a web page on the vendor's site with suggested fixes for something that is bound to cause major disruption is a very lame substitute for looking after the customer. If a customer has thousands of servers, surely the account manager would think, hey, this customer is important, so why not actually tell them there is a problem? I searched many times on the HP site and still could not find the page that tells me how to fix this problem. I changed my search criteria and finally found it. Funny how syntax can make a huge difference in a search.
For all you HP P class users out there, check this one out.
What really annoys me about all this is that I am somehow the person responsible for it. Why, I don't have a clue. I am not a Windows server admin. I just manage the SAN. It just happens that CRC errors cause issues with the SAN, so now I have a new life as a vendor troubleshooter.
Perhaps if SNIA (Sneer) needs a credible excuse for doing what it does, then they should get all their storage partners together and start formulating some standards that make sense. Heck, SMI-S is a poor excuse for anything.
Stephen (soon to be sitting on a beach in Phuket and not caring for a few weeks)