

P class not B class enclosures.

I need to make a correction regarding my previous post about the HP blade enclosures.  Apparently they are P class enclosures, not B class.  I get confused by acronyms, which I expect is someone’s job: the more confused the end user is, the higher the pay.  It sort of reminds me of the old days when B was a language.  Then came C.  I wonder where D is.  I know about DTrace for Solaris, whose scripting language looks a bit like C.  Did anyone ever see Blue Thunder and wonder what JAFO stood for?  It was revealed as the movie went on.

 

Anyway, it’s funny that a number of other bloggers including Tony Pearson at IBM:

 

http://www-03.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=welcome

 

also referred to the B class from my posting and said just how good the IBM blade servers are with SAN connectivity.  I wish I was a person working for a vendor.  I could write wonderful blogs that push my products (blogerteering?).  I really like reading the blogs from some vendors (not you, Tony) who preach the virtues of their products but who, going by their LinkedIn resumes, have probably never spent any real time managing storage at the coal face.  One person that springs to mind is Barry Burke.  I won’t hold a grudge against Barry just because he was involved with Applix.  I do admire that after nearly thirty years, he is still in the IT industry.  Have a look at:

 

http://www.linkedin.com/in/barryburke?trk=btn_typepad

 

I personally enjoy reading the blogs of people like myself.  We put up with some of the stupidest things that could happen.  You don’t see that sort of down-to-earth event on blog sites that are run by vendors.  I would love to respond to some of the IBM postings but I refuse to accept the conditions posted at the bottom of the page.  My thoughts are my own and IBM can’t own them.

 

I would be happy to have a marketing person sit in with me when something goes horribly wrong that has something to do with the SAN, or might merely be in the same state as our SAN.  For example, when our Exchange servers were playing up due to massive CRC errors and people were yelling at me to fix the storage when there was nothing wrong with it.  Some management definitely display their lack of IT knowledge when something goes wrong.  I doubt even the Messiah could have worked out that problem in about 5 minutes.

 

Another day, we lost our ONS links, which provide our inter-datacentre ISLs, for about 30 seconds, so our synchronous TrueCopy had a dummy spit.  This caused some timeouts for our HSC systems and again it was my fault.  Umm.. sure it was, because if I did not use TrueCopy, we would not have any problems.  Then if our datacentre burns down or gets blown away, it would be my fault for not having TrueCopy.

 

Perhaps Barry and other vendors’ representatives could write about some of the exciting things that happen to them on a daily basis.  Like getting a size ten boot up the behind for not making sales numbers last quarter?  Or how what happened at that last conference need not get out…  What happens in Vegas stays in Vegas.

 

So, what did HP say about this issue with the SAN and the P class enclosures?  Update the firmware.  Terrific, we are already using the latest version.

Less sales waffle and more real stories would make me happy.  And Barry, don't take this personally.. or else I won't even consider buying a DMX-4. Stephen


Posted in Stephen2615.


8 Responses


  1. the storage anarchist (barry burke) says

    No harm, no foul.
     
    And I’ll admit my day is no longer filled with the joys of managing a data center. Hidden in my 4 years at Adelie, though, is the fact that I indeed managed and operated our data center and IT operations. Small by your scale, to be sure, but we did have a half dozen microVaxen, a pair of Prime 750s, and probably the first IBM 9370 in Cambridge. Not to mention the VM, MVS and CICS timesharing we bought from then-GTEDS in Tampa. Small shop: my staff of 2 and I handled communications, backups, product release production, and the near-constant stream of APAR patches to be applied to the Baby Blue.
     
    I also no longer carry a bag or write marketing collateral in support of new product launches (shields UP!). But I do still work very directly with customers’ IT management and staffs to understand their specific requirements, and with the EMC engineers who are expected to deliver products and solutions that meet them. And I get my fair share of "exposure" to irate customers needing immediate explanations and solutions for the unexpected outage or slowdown.
     
    I’d like to think that my experience and credibility have earned me the respect of both the customers and the engineers I work with; I’m sorry if I haven’t earned yours yet. I’ll keep trying.

  2. stephen2615 says

    Barry,
     
    That’s the sort of information I like to see.  I have very little respect for a lot of marketing and/or sales people who don’t want to know what our problems are but are very willing to fix them.  A few years ago, there was a customer who bought systems by the shipload.  Seriously BIG money.  The service people were fantastic but the branch head and his sidekicks never ever visited this customer.  Sure, the sales droids offered to see them once in a while.  They thought it was a done deal and they would be watching the money roll in forever.  Then one day, they started seeing opposition systems on the data centre floor and still they ignored the customer because "we have such a good relationship with this customer, they will never go anywhere else".  Now they sell a few systems and moan about how the competitor stole their customer.
     
     
    I see this everywhere because people (customers) are taken for granted or treated like fools because they have some bizarre concept of what they want to do that is outside the norm.  One time when I needed something huge, the sales person refused to sell it to me because it was overkill.  The only strange thing about that was he had no idea in the world what I was going to use it for.  As I could not discuss what I was going to do with it, I went to another vendor and it was on the floor in a few weeks.  He still does the same thing today.
     
    BTW, Applixware is still embedded in my previous organisation.  Heaven help them.
     
    Stephen

  3. the storage anarchist says

    I will admit that I’ve been humbled by the customers who wanted to use the products I worked on in ways that seemed outside of the envelope. Thankfully, somewhere along the line I’ve learned to ask "why" instead of explaining "why not."
     
    Not always what the customer expected to hear from an EMC employee, I guess.
     
    I’m off to a round of customer-perspective-gathering Tuesday in NJ (seems that’s where all the data centers that used to be in Manhattan are these days). I get to share what we’re doing next and why, but what I’ll learn from them will be infinitely more valuable.
     
    Hopefully I can translate what I learn into a stronger product roadmap, and an interesting perspective for my (our) readers.
     
    TTFN!

  4. Stephen2615 says

    Another update on this blade issue.  I found that even though the BIOS is exactly the same version (1.48), there is an NVRAM file that has one small change that affects the receiving of data at 2 Gbps.  I don’t know what it does and when I googled it, nothing matched.  Nothing on the QLogic website either.  Considering that HP says this issue does not happen at 1 Gbps, perhaps it is some sort of throttle on the reception of data at higher speeds.
     
    So the plan is to update our two HSC Exchange clusters at one data centre, failover and see what happens.  If nothing happens for a day or so, it may mean a solution to this particularly annoying issue.  If it happens again, Mr HP will have some explaining to do.
     
    Besides, failing over to our DR site is always good practice.  The only downside of this is having to update the NVRAM of hundreds of servers.  I seriously doubt that it can be automated, as the suggestion is to do it at the DOS level, though hopefully I can use SANsurfer.  Whichever way it is done, a reboot is required.
     
    Stephen

  5. JM says

    We’re seeing some issues with HP blade servers as well.  Please do post an update as to whether or not the NVRAM update fixed your problem.

  6. Stephen2615 says

    One of our Exchange servers that had the problem with thousands of CRC errors has been running on the NVRAM-patched server for five days now and nothing has happened.  That could be good, but the same server ran on the same hardware about 2 months ago and had no issues other than just being busy.  I think this could be a long, drawn-out wait to see if the CRC errors have been fixed.  I will definitely post something when I feel as though there has been a resolution.
     
     
    My main complaint was that one brand new server in particular just kept dying with CRC errors under heavy SAN usage and as soon as I moved it to another enclosure, they stopped immediately.  If I had not seen that, I would have been satisfied that the NVRAM change had fixed this.
     
     
    I found out that an HP outsourced project where they are running Linux on the P class blades (mainly BL45p G1s) has been plagued with apparent SAN issues.  If that happens with HP themselves, it’s gotta be worth keeping an eye on…
     
     
    I bet I will find out all about this issue after my couple of weeks sitting on the beach in Phuket.  Pity my phone does not have global roaming set up.
     
     
    Stephen

  7. JM says

    Bump…  I’m still curious about this and see that you’re back and hopefully feeling better by now.  Can you post an update?

  8. Stephen2615 says

    I am still not all that convinced that the NVRAM update fixed this issue.  What I did discover yesterday was a number of servers with very high CRC error counts logged.  We use Brocade switches and I found that one command that I never really knew about before shows something that indicates which server is complaining about CRC errors.
     
    I will have to put in change control to get the NVRAM updated on these servers as they need a reboot and then see what happens.  If this fixes these servers, then I believe it might be the fix I am looking for.
     
    Stephen
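
For anyone tracking the same problem, here is a minimal sketch of how per-port CRC error counts could be compared before and after an NVRAM update.  The switch command is not named above, so nothing here parses real switch output; the port names and counts are hypothetical samples, and `crc_deltas` is an illustrative helper, not part of any Brocade or HP tool.

```python
from typing import Dict, List, Tuple


def crc_deltas(before: Dict[str, int],
               after: Dict[str, int],
               threshold: int = 0) -> List[Tuple[str, int]]:
    """Return (port, increase) pairs whose CRC count grew by more than
    `threshold` between two samples, largest increase first."""
    grown = []
    for port, count in after.items():
        # A port missing from the first sample is treated as starting at 0.
        delta = count - before.get(port, 0)
        if delta > threshold:
            grown.append((port, delta))
    return sorted(grown, key=lambda item: item[1], reverse=True)


# Hypothetical counter samples, typed in by hand from two readings.
before = {"port 3": 1200, "port 7": 15, "port 12": 0}
after = {"port 3": 1200, "port 7": 4815, "port 12": 2}

for port, delta in crc_deltas(before, after, threshold=10):
    print(f"{port}: +{delta} CRC errors since the last sample")
```

Looking at the increase rather than the raw total matters here: a port with a large but static count (like "port 3" in the sample) may just be carrying old errors, while a port that keeps climbing after the update is the one still worth chasing.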