Friday, March 13, 2009

Problems I have had with GoGrid cloud hosting

There are a lot of great things about GoGrid, and you can read about them all over the net. I am writing this post because what I didn't find all over the net was anyone who had any real problems, or anything critical to say about GoGrid. I, however, have had some very serious problems with GoGrid, and they have exposed - to me - things that really need to be public for others going through the decision-making process.

This is not meant to be a "scathing" review. I am a little sore over some of the support issues, but this post is purely so others can be more informed. GoGrid really is cool, and I recognize that more great features are coming.

For the past several months I have been working with several clients on migrating and creating applications to/on the GoGrid cloud. This post is written by someone with real-world experience on GoGrid with both Linux and Windows servers.

Before my decision to recommend GoGrid, I did as much legwork as I could to really understand how GoGrid worked and why it was a better or worse option than a dedicated host, VPS, or another cloud option (like Mosso or Amazon).

My choice in the end for GoGrid really was based on several key factors: 1) ease of use, 2) recommendations of peers, 3) relative cost and 4) lack of any real complaints out on the net.

The great part of GoGrid is the ease with which you can deploy new servers. It really is as easy as I have seen. A few clicks and a few minutes later you can remote desktop to your Windows server or SSH to a Linux server. Neat!

I want to stop there with the praise though, because as I mentioned above, you can read all about how great and easy GoGrid is elsewhere.

Here come the problems:

1) Your server is not really on a cloud - not really. I created and configured a CentOS 5.1 LAMP image, installed everything I needed, and configured all the sysadmin stuff. The server and the app were running perfectly. The night before the app running on the server was set to premiere, I received an email from my client saying that the site was not working. Over the next 30 hours, I came to understand that a "node" on the GoGrid cloud had gone down, and my server - and my app - along with it.

What GoGrid has done is abstract clients from the cloud by putting them into these "nodes." So your server apparently isn't distributed in the cloud; it is sitting on a node that I assume is distributed on the cloud. But when the node my server was on went down, GoGrid couldn't get my server back. I have been waiting nearly 2 days now and the server is still just lost. It still shows up in my GoGrid control panel, but this server does not respond to any requests of any sort (ping, ssh, etc.). What is somewhat amusing is that I can still restart this server in the control panel, which indicates that it goes offline and then comes back online - but the control panel seems to be the only thing that can communicate with the server.
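For anyone who wants to confirm this kind of "up in the control panel, dead on the network" state from the outside, here is a minimal sketch of a reachability probe. It is only an illustration, not anything GoGrid provides: it tests TCP ports rather than ICMP ping (which needs raw sockets), and the function names and the default port list (SSH, HTTP, Remote Desktop) are my own assumptions.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(host, ports=(22, 80, 3389)):
    """Check several ports (assumed: SSH, HTTP, RDP) and report each result."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `probe("your.server.ip")` returns a dictionary of port numbers to True/False. If every port comes back False while the control panel still shows the server as running, you are likely looking at the same situation described above.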

The rationale for using this node model (as explained to me by support) is that if a client creates several servers, each one is guaranteed to be on its own node, and then you can use the free load balancers to distribute the applications on the servers. The immediate glaring problem is that you can't clone a server (although I am being repeatedly told that this feature is coming), so you have to manually create the servers and set them all up on your own - or use some third party tool to help you.  

If a hosting environment takes a catastrophic hit, they can almost always bring everything back the way it was from a backup (at least in my experience). This has not been my experience with GoGrid; my server has not been recoverable for almost 2 days now. On a true grid, backups would not be necessary from the host since the server would be on the grid and a failure of some part of the grid wouldn't result in the loss of your server. Of course a user would create their own backups, which they could use to repair any damage they did themselves.
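Since the host can't be counted on for recovery, the "create your own backups" part is worth scripting. The following is a rough sketch under my own assumptions, not a GoGrid feature: `backup_directory`, the archive naming scheme, and the `keep` count are all hypothetical, and the backup directory is assumed to be somewhere that survives the server (for example, mounted cloud storage).

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source_dir, backup_dir, keep=7):
    """Write a timestamped .tar.gz of source_dir into backup_dir.

    Keeps only the newest `keep` archives for this source (hypothetical policy).
    Returns the path of the archive that was just written.
    """
    source = Path(source_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the source folder name, not the full path.
        tar.add(source, arcname=source.name)

    # Timestamped names sort chronologically, so the oldest come first.
    archives = sorted(dest.glob(f"{source.name}-*.tar.gz"))
    if keep > 0:
        for old in archives[:-keep]:
            old.unlink()
    return archive
```

Run from cron, something like this would at least have left a copy of the web root to rebuild from. A database would need its own dump step before archiving.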

GoGrid, as it exists right now, should probably stop offering LAMP images or Windows images with included SQL Server 2005 Express; this leads potential customers like me to believe that these are good, easy choices on GoGrid. They are absolutely not good choices, until GoGrid makes creating custom images a reality, starts truly distributing the servers on the grid, or until GoGrid gets some sort of useful backup routine in place.

2) Your server is not scalable. The server you deploy cannot dynamically scale. I won't go into detail on this because you can find this documented on GoGrid's site and on other reviews of GoGrid. There is a scalable storage option, but this doesn't help if you need more RAM on a server.

3) There are no tools to manage or even view DNS zone settings for the ServePath name servers. If you want to use the nameservers from GoGrid, you can, but you have to create a help ticket and hope whoever creates the entries does it right (they didn't get it right the first time at least once for me). Once they are set, you can request that support email you your settings, but until DNS propagation is complete you have no other way to verify your name server settings - nothing real time, and nothing easy. I was coming from Rackspace, who had a control panel built into their interface for viewing and editing zone settings. Again, in GoGrid you can create a nameserver of your own and administer it - but that is completely up to you - nothing easy.
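One possible workaround, which I am offering as an untested-against-GoGrid sketch rather than a documented procedure, is to send a DNS query straight to the name server's IP address instead of going through your usual resolver. This assumes the name server answers direct queries from outside. The function names are hypothetical, and the code only handles plain A records over UDP.

```python
import socket
import struct


def build_a_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A records of hostname."""
    # Header: id, flags (0 = standard query), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def _skip_name(packet, offset):
    """Return the offset just past the (possibly compressed) name at offset."""
    while True:
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # two-byte compression pointer ends the name
            return offset + 2
        if length == 0:            # zero-length label ends the name
            return offset + 1
        offset += 1 + length


def parse_a_records(packet):
    """Extract IPv4 addresses from the answer section of a DNS response."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_nameserver(hostname, nameserver_ip, timeout=3.0):
    """Ask one specific name server for hostname's A records over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(hostname), (nameserver_ip, 53))
        response, _ = sock.recvfrom(512)
    return parse_a_records(response)
```

With a call such as `query_nameserver("www.example.com", "192.0.2.53")` (both values are placeholders), you could compare the returned addresses against what you asked support to enter, without waiting on propagation. It does not replace a real zone editor, and it shows only the records you think to ask for.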

4) This brings me to my last big issue with GoGrid: support and support updates.

I was coming from a Rackspace dedicated hosting environment. My client was paying about 30% more for that Rackspace solution than they are currently paying for a comparable GoGrid solution. My experience with Rackspace has been absolutely fantastic. Rackspace support is really second to none. Rackspace understands that since they are in the business of leasing servers, it makes sense to provide as much information about how to best manage those servers to their clients as they can.

GoGrid support is not Rackspace support. If you need OS-specific help working something out, you and Google are on your own - or you have to shell out more $$ for paid support from GoGrid. Hear me out; how many times have you set something up that you have set up a hundred times before, but there is one little problem and for the life of you, you can't see where the issue is coming from? You just need a second set of eyes, and 2 minutes of someone's time to help you see past the problem. 

With Rackspace, that sort of help was always there. With GoGrid it falls away very quickly. I had a couple of these little issues, and eventually worked them out and slapped myself on the forehead. But in the moment, when I could have used a quick answer from the host that my client was shelling out good money for - there was only the suggestion that I look into their paid support.

I will say that you can end up on chat or on the phone with some very helpful people at GoGrid. But you can also get someone who is far less helpful. It seems clear to me that there are support staff who are stakeholders in GoGrid's success, but there are other (probably contracted) support staff who are not.

When my server went down, I opened a support case, jumped on chat and started trying to get any information about what was going on. It took several chat sessions, a handful of emails, several phone calls, and about 4 hours to finally get someone to tell me that a node had gone down on GoGrid, which is why my server stopped responding. This still raises my blood pressure, because when I finally got an answer about what had happened, the person who told me said that there had been an entry made (into whatever system they use to track problems) around the time I first logged my case about the server problem. 

I have things go wrong with clients sometimes. Sometimes it is the result of something that I did or didn't do. It happens. But I don't try to hide it, and I don't run them around if they ask me what happened. This has been my biggest disappointment to date with GoGrid; I feel as though I was given the run-around by support staff until I talked to the right person. 

One interesting aside is that I was using Twitter to document some of my experience and several GoGrid folks started following me. Someone from GoGrid support called and left a message last night at midnight to see if there was anything they could do, mentioning that one of their superiors had seen my tweets. That felt kind of cool, but I still have not had anyone update me on the open ticket on my server status since yesterday when I called to check. 

In the saga of the lost server, we ended up recreating the application on a shared host and changing our domain's name server settings to point to the shared host. The app is not expected to need too much bandwidth as it is only the preview version. We will probably still use GoGrid for the final app; we will just set everything up differently.

CONCLUSIONS:
  • GoGrid is really only easy for the first 20 minutes of a server's lifespan. After that you may as well be on a VPS for a lot less money (unless you want a distributed app).
  • I am going to continue to recommend GoGrid for some clients. 
  • I am going to recommend against GoGrid for far more clients.
  • I know now that having two 512MB (of RAM) servers, a separate database server with backups scheduled onto the cloud storage, and a free load balancer to distribute incoming requests to the 2 servers is better than a single 1GB LAMP server.
  • I have found that there are some great support people with GoGrid. It is just a shame that there are also some less-great support people.
  • Don't use GoGrid if you need a single server. Seriously don't.
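
To illustrate why two small servers behind a load balancer beat one big server when a node dies, here is a toy round-robin sketch. It is not GoGrid's load balancer, and the class and backend names are made up for the example. It only shows the idea that requests keep flowing to the surviving server.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._index = 0

    def mark_down(self, backend):
        """Record that a backend (e.g. a server on a failed node) is offline."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Return a known backend to the rotation."""
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        """Pick the next healthy backend in rotation, or fail if none remain."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._index % len(self.backends)]
            self._index += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")
```

With two backends, marking one down leaves every request going to the other. With a single backend, marking it down leaves nothing to serve requests, which is exactly what happened to my one LAMP server.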

MY SUGGESTIONS TO GOGRID:
  • Stop offering LAMP images and Windows Server with SQL 2005 Express images until you can offer truly distributed server instances, or work out real backups.
  • Provide at least read access to the zone settings for domains within users' accounts. Better yet let them manage the zone settings for their domain, on your name servers, on their own.
  • Make cloning servers, or creating custom, bootable images, easy (I am being told this is coming).
  • Allow scaling of servers (this is supposed to be coming as well)
  • Get more "stakeholder" support staff, and when something really bad happens be proactive about keeping clients updated. (I've waited long enough for my update, I'm deleting my servers in the control panel now.)
