Moving on. Starting a new chapter in my career.

One week down at SimpliVity so far and it’s been great!  As some of you know, I recently left Datalink, where I was a Solutions Architect working with customers to design solutions for their data centers.  Leaving is bittersweet: it was a great company to work for, and I can truly call the people there my friends.  Datalink has a truly wonderful group of engineers who really put the Value in the term VAR.

After working on both the customer side and the VAR side of the IT industry, the opportunity came to work for a startup on the manufacturing side of the industry, and it was honestly too good to pass up.  I see this transition as the next logical phase in my IT career, and it will give me the opportunity to learn both the manufacturing/vendor side and the startup side.

SimpliVity has released what I think is going to be the next big thing in infrastructure.  They have combined both compute and storage in a single box they call the OmniCube.  While that is cool on its own, it only gets cooler as you dig deeper into the technology.  OmniCubes form a global federation of systems that dedupes a VM’s data inline, as the data is being written to disk.  Every OmniCube within a federation knows where the data is stored, no matter where the OmniCube resides around the world.  This allows for a ton of other really cool features: extremely efficient backups and restores, super fast clones of VMs, and replication to geographically dispersed data centers.  On top of all that, everything is managed on a per-VM basis, so you can be as granular as you’d like when creating and selecting a protection policy.

I really can’t say enough about how cool the technology is, but I’m getting way off topic, so I’ll leave that for future blog posts.  For now you can check out some videos on the SimpliVity website <here> and <here>.

I look forward to working with all of the future SimpliVity partners and customers as we roll out this game-changing tech.  I’d like to thank the folks at Datalink for all the opportunities they afforded me during my six years there, and I wish them tons of luck as they continue to grow.

Posted in Snig.


WTF!? – Powering on a virtual machine fails with the error: A general system error occurred

This is by far one of the most frustrating error messages you will run across in a vSphere environment.  I was at a customer site yesterday and the customer ran across it while I was working through some other things.  He asked if I could fix it for him.  :)

I immediately dove into the issue.  The error message was very general in nature: “Powering on a virtual machine fails with the error: A general system error occurred”.  I wasn’t able to change any settings on the VM or power it up, but I was able to remove it from inventory and re-add it with no problem.  So I Googled the error message and ran across this KB article: Link.

The KB article sort of pointed to a corrupt VMX file, so I attempted to create a new VM and point it at the disks from the old VM folder.  The problem was that while I could browse the datastore and see the VMDK files, I wasn’t able to “see” them when creating a VM.  I was able to download files from the datastore but, wait for it, I was not able to upload anything.  I know, it sounds like a read-only datastore, right?  Well it was, but vSphere didn’t think it was.  I wasn’t able to unmount it from the GUI, but I was able to via the command line.  I could even remount it after I had removed it.

So it turns out that the customer had changed the VMkernel IP address that he was using to access the datastores on his VNX 5300.  If you’ve never used a VNX before: when a datastore has been mapped previously, vSphere will still be able to see it even though the host hasn’t been given explicit permission to do so.  It is essentially a read-only file system, even though vSphere thinks it’s not.  As soon as you try to power on your VMs (which writes files to the datastore) you see the general error message from the KB article.  Fun, huh?
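
The symptom that gave it away was that reads worked and writes didn't.  As a rough illustration only, here is a minimal Python sketch of a write probe you could run from any machine that has the share or datastore path mounted.  The function name and the idea of probing with a temp file are mine, not something from the KB article, and the path you pass in is whatever applies to your environment.

```python
import os
import tempfile


def can_write(directory: str) -> bool:
    """Try to create and delete a small probe file in the directory."""
    try:
        fd, probe_path = tempfile.mkstemp(prefix=".write_probe_", dir=directory)
    except OSError:
        # Covers read-only file systems, permission errors, and missing paths.
        return False
    os.close(fd)
    os.remove(probe_path)
    return True
```

If this returns False for a location you can happily browse and download from, you are probably looking at the same "readable but not writable" situation described above.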

Hopefully this helps some of you get to a quicker resolution should you run into this.

Posted in Snig.



The Cloud Orchestration Layer is Changing

With the recent purchases of a couple of key companies within the cloud orchestration space, I felt it was time to write about it.  The companies I am referring to are Cloupia, which was purchased by Cisco in December, and Dynamic Ops, which was purchased by VMware in July.  In my opinion these companies had such a huge head start over their new parents at the orchestration/cloud layer that there really wasn’t a way for the parents to catch up without the purchases.  Thankfully they picked some bright stars to buy.

We know that both Cloupia and Dynamic Ops had great products, but the question now is: what are Cisco and VMware going to do with them now that they have them?  The genius of both products is that they started out as workflow engines, and then the companies built the workflows we see today on top of those engines.  They made it extremely easy for the rest of us to install their product into our environments and begin allocating VMs within hours, not days or weeks.  Translating business workflows and processes into these products is a no-brainer.

Cloupia

Cloupia’s primary product was named Cloupia Unified Infrastructure Controller, or CUIC.  CUIC can orchestrate not only at the virtualization layer across multiple hypervisors but also across the physical layer.  This includes servers from HP, Cisco, and Dell; storage from NetApp, HP, EMC, and Hitachi; and network gear from Cisco, Dell, and HP.  It has chargeback, self-service, analytics, and both northbound and southbound APIs, all built into one simple appliance.  For Cisco the primary challenge will be how they integrate it with CIAC.  While CIAC has been a great product for Cisco, it is extremely difficult to get it set up and doing what you need it to do.  The secondary challenge will be to maintain the relationships with the 3rd party vendors to keep the hardware orchestration rolling.  It is, in the famous words of the “big” IT tool vendors, extensible.  My hope is that the Cloupia team is inserted at a high enough level within the organization that the product doesn’t get killed or maimed to the point of uselessness.

Dynamic Ops

VMware released Dynamic Ops as vCloud Automation Center (vCAC) on December 13th.  vCAC is another great workflow engine that can orchestrate at the virtualization layer across multiple hypervisors as well as at the physical server layer.  Compared to CUIC, vCAC is missing the physical storage and network pieces today.  It has a basic chargeback mechanism, awesome self-service, some performance monitoring, and southbound APIs.  While at GA time it’s lacking some of the features others have, I think that in the long term VMware has a great chance to integrate all of their current management products into vCAC for seamless management across the full IT spectrum.  For VMware the challenge will be to refocus vCloud Director on what it is really good at, development workflows, and not have vCD be the end-all, be-all cloud management platform.  VMware needs to give vCAC the ability to plug into and use vShield, et al. via APIs and not hold vCAC back from what it could be.  Just think about it: most of the management products VMware has been most successful with have been parts of the IT workflow, e.g. SRM, vCOps, etc.  vCAC is simply another workflow engine just waiting for these things to be plugged into it.  Rename vCD back to Lab Manager, turn everything else over to vCAC, and call it a day.  Simple, right?

Summary

Now I’m sure that somebody’s feelings are going to get hurt by this post.  It was not my intention to hurt feelings here; I simply want to see some great products/solutions continue and thrive.  Sometimes it makes the most sense for vendors to fall on their sword, self-cannibalize the things that haven’t worked out, and go with what does.  If these vendors roll these products out the right way, they are both going to be successful.  Now we just have to wait and see what happens.

 

P.S.  Have a happy new year and a great 2013!

Posted in Snig.



vSphere Replication and SRM Implementation Notes

I went through a vSphere Replication and Site Recovery Manager implementation this week and overall it went very well.  There are a few things that aren’t really noted in the documentation that I wanted to get out there, just in case anyone else runs into the speed bumps I did this week.  While I’ve implemented SRM with array-based replication many times, this was my first go with vSphere Replication.

 

First, and most important, is to use the FQDN throughout the entire implementation of SRM, VRMS, and VRS.  While we used the FQDN for the vCenter server when we installed SRM, when you deploy the vSphere Replication Management Server (VRMS) it defaults to the IP address of the vCenter server on the configuration page.  If you don’t change this to the FQDN of the vCenter server, SRM won’t be able to “find” the VRMS servers when you attempt to create the connection between the two.
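
If you keep the addresses you entered in a list or a build sheet, a quick sanity check can catch the spots where an IP slipped in instead of an FQDN.  This is only a minimal Python sketch: the function names and the sample hostnames are made up for illustration, and the "FQDN" test is deliberately rough (not an IP literal and at least two dot-separated labels).

```python
import ipaddress


def is_ip_literal(address: str) -> bool:
    """Return True when the value is a raw IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def looks_like_fqdn(address: str) -> bool:
    """Rough check: not an IP literal and has at least two non-empty labels."""
    if is_ip_literal(address):
        return False
    labels = address.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


# Hypothetical values as they might appear on the VRMS configuration page.
for value in ["192.168.10.20", "vcenter01", "vcenter01.example.com"]:
    print(value, "->", "FQDN" if looks_like_fqdn(value) else "fix me")
```

It doesn't prove the name resolves, it just flags entries that are obviously not an FQDN.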

 

SQL Express port issue.  This customer decided they wanted to use the vCenter installation of SQL Express for the VRMS database.  While this works fine, you need to understand that when vCenter installs SQL Express it chooses a dynamic port by default.  Because of that, you’ll need to manually change the port on which the DB connections are received.  I recommend port 1433, which is the default SQL Server port.  You can follow a good blog post on MSDN here:  Link
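
Once the port has been changed, it's worth confirming that something is actually listening on it before you point VRMS at the database.  Here is a small Python sketch that just attempts a TCP connection; the function name and the sample hostname are hypothetical, and it only tells you the port is reachable, not that SQL logins will succeed.

```python
import socket


def port_is_open(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unresolvable name, and so on.
        return False


# Example (hypothetical server name):
# port_is_open("vcenter01.example.com", 1433)
```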

 

VRMS and VRS SSH root access.  By default root access is not enabled for remote SSH management.  While a good security practice, it makes troubleshooting a PitA.  There is a great KB article you can read here, but I’ve listed the steps below if you would rather stay here.

  1. Open the console for the appliance in vCenter.
  2. Log in as root.  You set this password during deployment of the appliance.
  3. cd to /etc/ssh/
  4. vi sshd_config
  5. Find the line with “PermitRootLogin” and set it to yes.  (Type i to insert and press ESC when done editing; :wq will save the changes and quit.)
  6. Restart the SSH service with service sshd restart
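
If you would rather script the edit than do it in vi, the change in step 5 boils down to a one-line substitution.  The Python sketch below works on the text of sshd_config and is only an illustration: the function name is mine, I'm assuming the directive appears at most once in active or commented-out form near the top, and you would still need to write the file back and restart sshd as in step 6.

```python
import re

# Matches an active or commented-out PermitRootLogin line.
_DIRECTIVE = re.compile(r"^[ \t]*#?[ \t]*PermitRootLogin\b.*$", re.MULTILINE)


def enable_root_login(config_text: str) -> str:
    """Return sshd_config text with PermitRootLogin set to yes."""
    if _DIRECTIVE.search(config_text):
        # Rewrite the first matching line only.
        return _DIRECTIVE.sub("PermitRootLogin yes", config_text, count=1)
    # No directive present at all: append one.
    return config_text.rstrip("\n") + "\nPermitRootLogin yes\n"


sample = "Port 22\nPermitRootLogin no\nUsePAM yes\n"
print(enable_root_login(sample))
```

Remember to turn root SSH access back off when you're done troubleshooting.
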

Logs for VRMS and VRS.  Locations for the logs for each appliance are listed below.  Once SSH is configured you can use an SFTP client to connect and download the logs.  If you need a support bundle, refer to the KB article listed above.

  • VRMS logs: /opt/vmware/hms/logs (you’re looking for the latest hms*.log)
  • VRS logs: /var/log/vmware (you’re looking for the latest hbrsrv*.log)
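
Once you've pulled the log directory down, picking out "the latest" file is easy to automate.  Here's a minimal Python sketch that returns the most recently modified file matching a pattern; the function name is hypothetical, and it assumes the file modification times survived the download.

```python
from pathlib import Path
from typing import Optional


def latest_log(directory: str, pattern: str) -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None."""
    candidates = list(Path(directory).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Examples using the locations listed above:
# latest_log("/opt/vmware/hms/logs", "hms*.log")
# latest_log("/var/log/vmware", "hbrsrv*.log")
```
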

Multiple Datastores per Source VM.  Sometimes you might have multiple datastores presented to a VM for any number of reasons.  If you’re in this boat, here is how you need to set up the target datastores you’re replicating to.  You have two options here:

  1. Create a one-to-one datastore “mirror” so that you can automatically map the datastores to each other in the “Datastore Mappings” tab within vSphere Replication.
  2. Create a single datastore and then create folders on that datastore for the individual hard disks the VM is using, e.g. for server FS1, which has 3 VMDKs, each on a different datastore, you would create folders fs1_hd1, fs1_hd2, and fs1_hd3 on a single datastore.  Then you can simply select those folders when setting up vSphere Replication for that VM.
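
If you have a lot of VMs in this situation, generating the folder names for option 2 keeps the naming consistent.  A tiny Python sketch, following the fs1_hd1 pattern from the example above; the function name is made up, and creating the folders on the datastore is still up to you.

```python
from typing import List


def replica_folder_names(vm_name: str, disk_count: int) -> List[str]:
    """Build one folder name per virtual disk, e.g. fs1_hd1, fs1_hd2, ..."""
    prefix = vm_name.lower()
    return [f"{prefix}_hd{n}" for n in range(1, disk_count + 1)]


print(replica_folder_names("FS1", 3))
```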

 

Well I hope this helps someone as they’re going through this implementation.  It definitely gives me a quick go to if I run into any of these again. ;)

Posted in Snig, vmware.



Kemp LoadMaster: The Best Load Balancer You’ve Never Heard Of

I was contacted recently and asked to check out the latest virtual load balancer by Kemp Technologies and I have to say for the size of the package it sure packs some punch!

The load balancer has all the features and functionality of the large scale “enterprise class” load balancers you’ve heard of but comes in an extremely small (30 MB) package.  It only requires 1GB of RAM so it is extremely efficient on system resources as well.  The architecture is what you would normally see in this type of solution, but for any noobs out there here is a visual.

So let’s get to the install and setup.  The licensing and initial setup were extremely easy and wizard-driven from the console.  Just assign IPs to the eth0 and eth1 interfaces, the default gateway, and DNS as you normally would on any appliance.

Once on the network I was able to log in via web browser (Chrome in my case) and start setting up virtual services and load balancing rules.  All the standard rule types that you would expect from the big boys are there.

From there I was balancing traffic, albeit in my little lab. =)  Based on some of the testing that Kemp is touting on their website, these virtual appliances can handle quite a bit of workload with only 2GB of RAM allocated.

The appliance has built in stats monitoring as well as a pretty simple but informative performance monitoring tool.  Several logging features as well as SMTP and SNMP notifications are there.  Advanced Layer 7 configuration as well as a bunch of other features come with the appliance too.

The load balancers come in either physical or virtual appliances and can run redundantly on both platforms.  The entire list of features can be found on their website here.

All in all, given the price/performance of these boxes/appliances, I think it’s a no-brainer to give them a look.  Kemp has a free trial that you can run with the virtual appliances, so again, it really is a no-brainer to give them a try.

Hopefully this post helps some of you out there save a little money on your overall solutions without giving up any of the features or functionality you need.  Post in the comments below and let us know what you think about the load balancer if you give it a try.

Posted in Snig.



VCP 5 Study Guide Review

Fellow vExpert Brian Atkinson asked me to review his latest book VCP5 VMware Certified Professional on vSphere 5 Study Guide and I’m glad I agreed to do so.  What a great resource for future VCP 5 test takers to have available to them!  I wish I had had it when I was studying for my test a few months ago.  I highly recommend anyone planning to take exam VCP-510 to buy this book.

Link to the book on Amazon is here: link

It already has a couple of 5-star reviews, so I am not the only person impressed.  And don’t forget that the book comes with a custom online practice test engine with over 300 sample questions and flashcards.

Since I’ve already passed the test I’m going to be giving my copy away at the Datalink booth at EMC World tomorrow (5/21).  Just keep an eye on twitter and I’ll announce what you need to do to win.

Here are some of the features of the book:

  • Full coverage of all exam objectives in a systematic approach, so you can be confident you’re getting the instruction you need for the exam
  • Real-world scenarios that put what you’ve learned in the context of actual job roles
  • Challenging review questions in each chapter to prepare you for exam day
  • Exam Essentials, a key feature in each chapter that identifies critical areas you must become proficient in before taking the exam
  • A handy tear card that maps every official exam objective to the corresponding chapter in the book, so you can track your exam prep objective by objective
  • Sybex Exam Prep Tools

Posted in vmware.



vSphere 5 Upgrade: PSOD for vSphere 4.1.0 Hosts

I had a customer get ambushed by this last week and wanted to put this out there so hopefully no one else will run into this huge problem.  There is a VMware KB article out regarding the issue, however none of the release notes or upgrade documentation has been updated to reflect the information in the KB article.  Bottom line: DON’T UPGRADE VCENTER TO V5 IF YOU ARE RUNNING VSPHERE 4.1.0 (NO UPDATES) ON YOUR HOSTS!

The KB article can be found at the link here: clicky

Most of us have read through the upgrade guide for vSphere 5 and we know that the first step in the upgrade process is to upgrade/install vCenter 5.  After doing that, if you are running hosts at 4.1 with no updates, you will begin having problems with hosts disconnecting from vCenter and with re-adding them.  If you turn out to be unlucky and catch this bug, the result can be a PSOD.
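
Before touching vCenter, it's worth listing which hosts are still sitting at 4.1.0 with no updates.  The Python sketch below works on a hand-built inventory, so everything about it is an assumption: the Host fields, the update_level convention (0 meaning no updates applied), and the host names are all hypothetical and not something vCenter hands you in this shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    version: str       # e.g. "4.1.0"
    update_level: int  # hypothetical field: 0 means no updates applied


def hosts_blocking_upgrade(hosts: List[Host]) -> List[str]:
    """Names of hosts at 4.1.0 with no updates, per the warning above."""
    return [h.name for h in hosts if h.version == "4.1.0" and h.update_level == 0]


inventory = [
    Host("esx01", "4.1.0", 0),
    Host("esx02", "4.1.0", 1),
    Host("esx03", "4.0.0", 0),
]
print(hosts_blocking_upgrade(inventory))
```

Anything this flags should be patched before vCenter is upgraded; check the KB article for the exact builds affected.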

For this customer 8 out of 10 hosts in the cluster they were troubleshooting with VMware Support had a PSOD as soon as they re-enabled HA.  Needless to say all the VMs on those hosts went down and had to be brought back up when the hosts came back online.  Luckily there wasn’t any corruption within the VMs and they didn’t have to go through a lengthy restore process.

If you look at the chart below, VMware has 4.1.0 listed as supported for an upgrade.  Hopefully this will change in the future.

Hopefully this article helped some people before they got bit by this bug.

Update:  Chris Wahl has a great article on this over on his blog as well.  clicky

Update 2:  VMware has changed their compatibility matrix to reflect the bug.  Great and very reactive support as always from VMware!

 

Posted in vmware.



vCenter 5 Database Move: Update the tomcat DB pointer

Ran into a weird issue the other day and was totally perplexed.  I moved a vCenter database from one SQL server to another, as I have done numerous times before without issue, but this time something else cropped up.  The old SQL server kept receiving login attempts from the vCenter server.  What?!?

My DSNs for vCenter and Update Manager had been changed to point at the new DB server.  All apps and plugins that query vCenter use these DSNs, so WTF is going on?

The "VMware VirtualCenter Management Webservices" service uses Tomcat to connect to the DB to, I think, run the rollup jobs for DB cleanup and performance data.  vCenter runs fine without this service being able to connect to the DB, which is what made this confusing for a while.  If you dig down into the config file at "C:\ProgramData\VMware\VMware VirtualCenter\vcdb.properties" you will find what you're looking for: a URL pointing to the old DB server, e.g. url=jdbc:sqlserver://<dbserver>;databaseName\=<vcenter database name>
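
If you have several vCenter servers to fix, the edit is a simple substitution on that one line.  Here is a minimal Python sketch that swaps the server portion of the URL; the function name and server names are made up, it only handles the URL form shown above (your file may escape characters differently), and you'd want a backup of vcdb.properties and a restart of the service afterwards.

```python
import re


def repoint_jdbc_url(line: str, new_server: str) -> str:
    """Replace the <dbserver> part of a jdbc:sqlserver:// URL line."""
    return re.sub(
        r"(jdbc:sqlserver://)[^;]*",
        # A function replacement keeps backslashes in new_server literal.
        lambda match: match.group(1) + new_server,
        line,
        count=1,
    )


old_line = r"url=jdbc:sqlserver://oldsql;databaseName\=VCDB"
print(repoint_jdbc_url(old_line, "newsql.example.com"))
```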

The fix was to change the <dbserver> to the new DB server and all failed login attempts went away.  And in the spirit of the holidays "Yipee ki yay mother…"!

Hope this helps those DBAs out there who hate to see failed login attempts filling up their logs!  

Thanks to VMware Support for helping me run this down! Always a great experience when having to work with those guys and gals!  I don't have to call support often, but when I do, I call VMware Support…

Posted in Snig, vmware.



vSphere Customization Specification Manager Bug

I've been working at a client the past few weeks and one of my tasks was to begin the process of upgrading their infrastructure to vSphere 5.  As we all know, the first step is to upgrade vCenter.  So I walked through the upgrade across all three vCenter servers and everything seemed to be ok until we went to deploy a new VM using a customization specification that had been created previously under vCenter 4.1.

During the customization process for the VM we received the infamous "Windows Setup encountered an internal error while loading or searching for an unattended answer file."  I opened a ticket with VMware, but I figured out the problem before they could call me back.  The problem stems from an ampersand ("&") the client was using in the Organization field on the first screen of the Guest Customization wizard.  A simple change of "&" to "and" fixed the issue and everyone is happy.
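
I never got a root-cause statement from VMware, so treat this as my guess: the answer file is XML, and a bare "&" written into XML makes the document ill-formed.  The Python sketch below shows that general XML behavior with the standard library.  The element name and organization string are just for illustration, and nothing here is taken from the actual customization code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


org = "Smith & Jones"  # hypothetical Organization value
raw = f"<RegisteredOrganization>{org}</RegisteredOrganization>"
safe = f"<RegisteredOrganization>{escape(org)}</RegisteredOrganization>"

print(is_well_formed(raw))   # the bare ampersand breaks parsing
print(is_well_formed(safe))  # the escaped form parses
```

Until a fix is out, spelling out "and" in the spec, as we did, sidesteps the problem entirely.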

VMware did call back and I asked them to reproduce and verify that we were not the only customer to run into this.  They said that we were only the second customer to open a ticket and report the issue, but that they would create a KB article so that if someone else ran into the issue the information would be out there.

I hope this helps anyone who has been scratching their head over this small, but potentially significant issue.

Posted in Snig.



NFS Large Disk Support in vSphere

I've run into this issue with a couple of customers over the past couple of weeks, and there isn't anything definitive out there that I could find, so I thought I'd write about it.  These customers need disks in a VM that are larger than the 2TB – 512 bytes that is currently supported.  They have a couple of options to get around this limitation, but these solutions add a bit more complexity to their environments.

One nugget of knowledge that I confirmed with @VMwareStorage last week was that the size of an NFS volume presented to a vSphere host is restricted only by the disk array itself, e.g. if my NetApp running ONTAP 8.x can do a 50TB volume, I can present that volume to vSphere.

 

Problem:

Customer is using an NFS datastore, running Windows VMs on that datastore, and they require a 6TB volume.  They have a requirement for an application (a poorly written one) to be installed that requires local disk.  (This application cannot map to an NFS or CIFS export/share; it must use a disk seen by the OS as local disk.)  Currently with NFS we cannot use RDMs as a solution, so we are restricted to the maximum VM disk size of 2TB – 512 bytes.  (@VMwareStorage has indicated to me that RDMs on NFS are coming in the future.)

 

Solution(s):

NFS = we can create multiple disks (VMDKs) for that VM and present them to the OS as local disks.  Once presented and added we can then use a Dynamic Disk within Windows to concatenate the disks using a GPT partition to enable the one large contiguous volume needed.
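
One thing to watch with the NFS option is the arithmetic: because each VMDK tops out 512 bytes short of 2TB, three of them fall just shy of a full 6TB.  A quick Python sketch of the math, assuming "TB" here means binary terabytes (2^40 bytes); the function name is mine.

```python
TIB = 2**40                     # assuming binary terabytes
MAX_VMDK_BYTES = 2 * TIB - 512  # the 2TB - 512 bytes limit discussed above


def vmdks_needed(required_bytes: int) -> int:
    """Smallest number of maximum-size VMDKs that covers required_bytes."""
    return -(-required_bytes // MAX_VMDK_BYTES)  # integer ceiling division


print(vmdks_needed(6 * TIB))            # disks needed for a full 6TB
print(6 * TIB - 3 * MAX_VMDK_BYTES)     # bytes short with only three disks
```

So if the application truly needs every byte of 6TB, plan on four VMDKs; if "about 6TB" is good enough, three maximum-size disks will do.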

iSCSI = if available, we can create a large LUN and present it to the VM as a virtual RDM.  Once added to the OS we simply format it with a GPT partition and away we go.  You could also use a software iSCSI initiator in the VM itself to get around the hypervisor altogether, but that will have other implications when it comes to backup/recovery.

 

Risks:

There are two obvious risks with these large volumes: backup and recovery.  The primary reason we are using a virtual RDM in the iSCSI solution is so that we can continue to use the vStorage APIs for backup, thus enabling some advanced backup technologies that ensure we can get the job done within the prescribed backup window.  Obviously some sort of backend disk array snapshot and offload to tape would be best for backups here, but it would complicate recovery.  For the NFS solution the vCenter snaps of the VM could take quite a while, so you would have to tune your timeout values accordingly.  Putting an agent inside the VM for this solution may be best, depending on the backup solution you're using.  You have to weigh the good with the bad, as always.

 

I hope this short article has helped a few people out there in the internets.  Let me know what you think in the comments below.  Have I missed anything?

Posted in storage, vmware.
