Wednesday, October 27, 2010
Demo
The idea behind this little project is to predict what traffic on a given road segment is going to look like.
I'm scraping the data from WSDOT's RSS feed into a MongoDB collection, then aggregating the data to make things faster.
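Roughly, the scrape step looks something like the sketch below. To be clear, the feed URL, database name, and field layout here are placeholders I made up for illustration, not WSDOT's actual schema:

require 'rss'
require 'open-uri'
require 'mongo'  # gem install mongo

# Hypothetical feed URL and collection layout -- placeholders, not the real WSDOT schema.
FEED_URL = 'https://example.com/wsdot-traffic.rss'

client = Mongo::Client.new(['127.0.0.1:27017'], database: 'traffic')
raw = client[:raw_reports]

feed = RSS::Parser.parse(URI.open(FEED_URL).read, false)
feed.items.each do |item|
  # One document per report; the aggregation pass runs separately.
  raw.insert_one(
    title:       item.title,
    description: item.description,
    reported_at: item.pubDate
  )
end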
The problem I'm having now is determining the color for any given segment; more on that later.
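One direction I'm considering in the meantime (purely a sketch, with made-up thresholds) is coloring a segment by its current speed as a fraction of its historical average:

# Map current speed / historical average to a display color.
# Thresholds are guesses and would need tuning against real data.
def segment_color(current_speed, average_speed)
  return :gray if average_speed.nil? || average_speed.zero?  # no history yet

  case current_speed / average_speed.to_f
  when 0.8..Float::INFINITY then :green   # at or near free flow
  when 0.5...0.8            then :yellow  # slowing down
  else                           :red     # congested
  end
end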
Lots of work left to do on this one, but I'm really enjoying myself; it's quite a fun little puzzle.
Thursday, August 5, 2010
Why I love Zenoss
In the sysadmin's kitchen you'll generally see some mix of tools like Nagios, Cacti, collectd, and maybe some custom stuff as well. In most cases people are looking to fulfill two core requirements: service monitoring and trend analytics.
In the past I've found that Nagios + Cacti was a fantastic mix for satisfying these requirements; however, I've recently found Zenoss to be a much more satisfying tool.
The problem with Nagios is that you end up with lots of configuration files. Granted, most good admins will have these organized in a way that makes sense, which ultimately makes them easy to manage and maintain. With Zenoss, however, you have no configuration files (at least not in the sense of hosts/services/etc.). This is nice since I don't have to restart the service when I add a host, nor do I end up editing anything on the server itself. Tracking changes via svn/git is nice and all, but having all of the change-log information right in the interface is even better.
As for Cacti, I've found it rather prickly to get set up, and it doesn't seem to work all that well in large-scale environments.
Zenoss combines both of these tools into one and adds some very nice polish to the entire process. For example, if I want to ensure that Zenoss is monitoring any service on any host matching /^thin.*[0-9]{4}$/ (a thin server on port 6900, say), I can add a service rule. That rule watches the process table on each host and will 'catch' any process matching my regex.
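Conceptually, the rule behaves like this little Ruby model (just an illustration of the matching behavior, not Zenoss code):

# Rough model of what a Zenoss service rule does: scan the process
# table and 'catch' anything matching the pattern. Not Zenoss code.
SERVICE_PATTERN = /^thin.*[0-9]{4}$/

processes = `ps -eo args=`.lines.map(&:strip)
processes.grep(SERVICE_PATTERN).each do |name|
  puts "would monitor: #{name}"
end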
This has several benefits:
* Monitoring of the process is picked up automatically; if it crashes, an alert is sent out.
* Along with the state monitoring, Zenoss will also start profiling the process's memory and CPU usage.
* I created one object, and that object was automatically propagated to all hosts.
That last point is the most important. Even with a mix of hosts and services, I still get the trending and monitoring regardless of any given host's role.
For example, let's say you have a mix of memcache instances, some running on m1.small instances in AWS and some on bare metal in a datacenter. As long as Zenoss can reach the SNMP port on all of them, it can watch for any service matching something like /^memcache/. Regardless of the role of any given memcache instance, it'll be picked up and monitored automatically without you having to configure anything beyond the initial service.
Once I have this item configured and running, I can 'lock' the service, and any changes made from that point on are tracked by the system. So if someone (perhaps a Jr. admin) goes and fat-fingers my regex, I'll know who did it and when. This is slightly more convenient than having to dig through commit logs.
Nginx + Sinatra + MongoDB
I created a document to help explain the how and why of my web app setup.
Document
I use this in production, so I guess you could say that I eat my own dog food on this one.
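For flavor, a stripped-down version of the Sinatra side looks roughly like this (a sketch only: the route and collection names are invented, and nginx sits in front proxying requests through to the Sinatra app):

require 'sinatra'
require 'mongo'
require 'json'

# Hypothetical sketch of the app layer: nginx proxies to this Sinatra
# app, which reads straight out of MongoDB. Names are made up.
configure do
  set :mongo, Mongo::Client.new(['127.0.0.1:27017'], database: 'webapp')
end

get '/items/:id' do
  content_type :json
  doc = settings.mongo[:items].find(_id: params[:id]).first
  halt 404, { error: 'not found' }.to_json unless doc
  doc.to_json
end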
Wednesday, June 30, 2010
Gotta love optParse
I use a very simple deployment system which goes like this:
* export code base at HEAD
* tar export
* scp tar to host server(s)
* create directory for this version of the payload
* unlink 'current' directory ( /var/www/[project]/current )
* unspool tarball into newly created revision directory
* link 'current' to newly created directory
In this way I can roll the entire code base back using a simple unlink/link-to-old-build-id command. This keeps my environments nice and tidy, and safe from evil things like version conflicts.
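Sketched out in Ruby, the whole flow is roughly this (the host list, project name, and repo URL are made up for the example, and the real script also has error handling):

# Illustrative sketch of the deploy flow above. Hosts, project name,
# and repo URL are hypothetical.
require 'time'

project  = 'myproject'
hosts    = %w[web1 web2]
build_id = Time.now.strftime('%Y%m%d%H%M%S')
tarball  = "#{project}-#{build_id}.tar.gz"

system("svn export -q https://svn.example.com/#{project}/trunk #{project}") or abort('export failed')
system("tar czf #{tarball} #{project}") or abort('tar failed')

hosts.each do |host|
  system("scp #{tarball} #{host}:/tmp/")
  system(%(ssh #{host} "
    mkdir -p /var/www/#{project}/#{build_id} &&
    tar xzf /tmp/#{tarball} -C /var/www/#{project}/#{build_id} --strip-components=1 &&
    rm -f /var/www/#{project}/current &&
    ln -s /var/www/#{project}/#{build_id} /var/www/#{project}/current"))
end

# Rolling back is just re-pointing the 'current' symlink at an older build_id.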
To this day I don't know why anyone would actually think that 'svn update' is an OK thing to do in a live production environment. Ehh...guess that's just me being all anal again.
My favorite part about this is the deployment script: dead simple, and I did the optparse thing to remind me of how it all works:
krogebry@krogebry-desktop:~/aws/deployment$ ./deploy.rb -h
Usage: deploy.rb --project [project] --environment [stage|prod] --verbose
-v, --verbose Run verbosely
-p, --project PROJECT Project name
-e, --environment ENVIRONMENT Environment name ( stage | prod )
-h, --help Show this message
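The parsing code behind that help output is about what you'd expect. Here's a reconstruction from the usage text above (not the script verbatim):

#!/usr/bin/env ruby
require 'optparse'

options = {}
OptionParser.new do |opts|
  opts.banner = 'Usage: deploy.rb --project [project] --environment [stage|prod] --verbose'
  opts.on('-v', '--verbose', 'Run verbosely') { options[:verbose] = true }
  opts.on('-p', '--project PROJECT', 'Project name') { |p| options[:project] = p }
  opts.on('-e', '--environment ENVIRONMENT', 'Environment name ( stage | prod )') { |e| options[:environment] = e }
  opts.on('-h', '--help', 'Show this message') { puts opts; exit }
end.parse!

p options  # e.g. ./deploy.rb -p myproject -e stage -v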
EBS logging vs. Scribe...
I was running some numbers trying to figure out the better solution: using EBS volumes on the web front ends for logging, or using something like Scribe. In either case the log data would end up in a database for analytics.
In my mind the choice is pretty dead simple; however, I've recently run into a case where an admin had put together the most bizarre little hack for dealing with logging data. They were writing the log data to an attached volume, then reading it back off the volume with a series of nearly indecipherable spaghetti scripts written entirely in bash. This meant the web front ends were horribly bloated with all of this extra work, not to mention the sheer fragility of it all.
At any rate, I was thinking about how much this is costing them to do as opposed to doing something more elegant like using Scribe...
Assuming 10 web servers taking an average rate of 5 hits per second, and assuming that each hit is logged to an EBS volume, then read from the same volume later on into a database:
5 hps * 10 servers = 50 hps
50 writes and 50 reads = 100 total IO operations per second.
100 ops/sec * 3,600 = 360,000 ops/hr = 8,640,000 ops/day
Amazon charges $0.10 per 1M operations, so that works out to $0.864/day, or about $25.92 per 30-day month.
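A quick sanity check on the arithmetic, in Ruby:

# Back-of-the-envelope EBS I/O cost using the numbers above.
ops_per_sec  = 5 * 10 * 2               # 5 hits/s * 10 servers * (1 write + 1 read)
ops_per_day  = ops_per_sec * 3600 * 24  # => 8_640_000
cost_per_day = ops_per_day / 1_000_000.0 * 0.10
puts format('$%.3f/day, ~$%.2f per 30-day month', cost_per_day, cost_per_day * 30)
# => $0.864/day, ~$25.92 per 30-day month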
So, at the end of the day they end up spending a very small amount for this. Even if you piled on the expense of the web server instances themselves, it probably wouldn't add up to much more than $50/month. That's not bad at all, but it's still $50/month they don't have to be spending when they could be getting a better service for free. Call me overly pedantic, but I just hate spending money I don't have to, especially with a lot riding on a newly funded startup. Every penny counts, right?
Monday, March 15, 2010
New updates.
I just pushed a new build up to production. This build shows off some new features, and begins to show some of the onclick actions for the name plates.
Thursday, March 11, 2010
More tanking effectiveness.