Wednesday, October 27, 2010
Demo
The idea behind this little project is to predict what traffic on a given road segment is going to look like.
I'm scraping the data from WSDOT's RSS feed into a MongoDB collection, then aggregating the data to make things faster.
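Roughly, the scrape step looks something like the sketch below. To be clear, the feed URL, database name, and field layout here are placeholders I made up for illustration, not WSDOT's actual schema:

require 'rss'
require 'open-uri'
require 'mongo'  # gem install mongo

# Hypothetical feed URL and collection layout -- placeholders, not the real WSDOT schema.
FEED_URL = 'https://example.com/wsdot-traffic.rss'

client = Mongo::Client.new(['127.0.0.1:27017'], database: 'traffic')
raw = client[:raw_reports]

feed = RSS::Parser.parse(URI.open(FEED_URL).read, false)
feed.items.each do |item|
  # One document per report; the aggregation pass runs separately.
  raw.insert_one(
    title:       item.title,
    description: item.description,
    reported_at: item.pubDate
  )
end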
The problem I'm having now is determining the color for any given segment; more on that later.
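One direction I'm considering in the meantime (purely a sketch, with made-up thresholds) is coloring a segment by its current speed as a fraction of its historical average:

# Map current speed / historical average to a display color.
# Thresholds are guesses and would need tuning against real data.
def segment_color(current_speed, average_speed)
  return :gray if average_speed.nil? || average_speed.zero?  # no history yet

  case current_speed / average_speed.to_f
  when 0.8..Float::INFINITY then :green   # at or near free flow
  when 0.5...0.8            then :yellow  # slowing down
  else                           :red     # congested
  end
end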
Lots of work left to do on this one, but I'm really enjoying myself; it's quite a fun little puzzle.
Thursday, August 5, 2010
Why I love Zenoss
In the sysadmin's kitchen you'll generally see some mix of tools like Nagios, Cacti, collectd, and maybe some custom stuff as well. In most cases people are looking to fulfill two core requirements: service monitoring and trend analytics.
In the past I've found that Nagios + Cacti was a fantastic mix for satisfying these requirements; however, I've recently found Zenoss to be a much more satisfying tool.
The problem with Nagios is that you end up with lots of configuration files. Granted, most good admins will have these organized in a way that makes sense, which ultimately makes them easy to manage and maintain. With Zenoss, however, you have no configuration files (at least not in the sense of hosts/services/etc.). This is nice since I don't have to restart the service when I add a host, nor do I end up editing anything on the server itself. Tracking changes via svn/git is nice and all, but having all of the change-log information right in the interface is even better.
As for Cacti, I've found it rather prickly to get set up, and it doesn't seem to work all that well in large-scale environments.
Zenoss combines both of these tools into one and adds some very nice polish to the entire process. For example, if I want to ensure that Zenoss is monitoring any service on any host matching /^thin.*[0-9]{4}$/ (a thin server on port 6900, say), I can add a service rule. That rule watches the process table on each host and will 'catch' any process matching my regex.
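Conceptually, the rule behaves like this little Ruby model (just an illustration of the matching behavior, not Zenoss code):

# Rough model of what a Zenoss service rule does: scan the process
# table and 'catch' anything matching the pattern. Not Zenoss code.
SERVICE_PATTERN = /^thin.*[0-9]{4}$/

processes = `ps -eo args=`.lines.map(&:strip)
processes.grep(SERVICE_PATTERN).each do |name|
  puts "would monitor: #{name}"
end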
This has several benefits:
* Monitoring of the process is picked up automatically; if it crashes, an alert is sent out.
* Along with the state monitoring, Zenoss will also start profiling the process's memory and CPU usage.
* I created one object, and that object was automatically propagated to all hosts.
That last point is the most important. Even with a mix of hosts and services, I still get the trending and monitoring regardless of any given host's role.
For example, let's say you have a mix of memcache instances, some running on m1.small instances in AWS and some on bare metal in a datacenter. As long as Zenoss can reach the SNMP port on all of them, it can watch for any service matching something like /^memcache/. Regardless of the role of any given memcache instance, it'll be picked up and monitored automatically without you having to configure anything beyond the initial service.
Once I have this item configured and running, I can 'lock' the service, and any changes made from that point on are tracked by the system. So if someone (perhaps a Jr. admin) goes and fat-fingers my regex, I'll know who did it and when. This is slightly more convenient than having to dig through commit logs.
Nginx + Sinatra + MongoDB
I created a document to help explain the how and why of my web app setup.
Document
I use this in production, so I guess you could say that I eat my own dog food on this one.
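For flavor, a stripped-down version of the Sinatra side looks roughly like this (a sketch only: the route and collection names are invented, and nginx sits in front proxying requests through to the Sinatra app):

require 'sinatra'
require 'mongo'
require 'json'

# Hypothetical sketch of the app layer: nginx proxies to this Sinatra
# app, which reads straight out of MongoDB. Names are made up.
configure do
  set :mongo, Mongo::Client.new(['127.0.0.1:27017'], database: 'webapp')
end

get '/items/:id' do
  content_type :json
  doc = settings.mongo[:items].find(_id: params[:id]).first
  halt 404, { error: 'not found' }.to_json unless doc
  doc.to_json
end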
Wednesday, June 30, 2010
Gotta love optParse
I use a very simple deployment system which goes like this:
* export code base at HEAD
* tar export
* scp tar to host server(s)
* create directory for this version of the payload
* unlink 'current' directory ( /var/www/[project]/current )
* unspool tarball into newly created revision directory
* link 'current' to newly created directory
In this way I can roll the entire code base back using a simple unlink/link-to-old-build-id command. This keeps my environments nice and tidy, and safe from evil things like version conflicts.
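Sketched out in Ruby, the whole flow is roughly this (the host list, project name, and repo URL are made up for the example, and the real script also has error handling):

# Illustrative sketch of the deploy flow above. Hosts, project name,
# and repo URL are hypothetical.
require 'time'

project  = 'myproject'
hosts    = %w[web1 web2]
build_id = Time.now.strftime('%Y%m%d%H%M%S')
tarball  = "#{project}-#{build_id}.tar.gz"

system("svn export -q https://svn.example.com/#{project}/trunk #{project}") or abort('export failed')
system("tar czf #{tarball} #{project}") or abort('tar failed')

hosts.each do |host|
  system("scp #{tarball} #{host}:/tmp/")
  system(%(ssh #{host} "
    mkdir -p /var/www/#{project}/#{build_id} &&
    tar xzf /tmp/#{tarball} -C /var/www/#{project}/#{build_id} --strip-components=1 &&
    rm -f /var/www/#{project}/current &&
    ln -s /var/www/#{project}/#{build_id} /var/www/#{project}/current"))
end

# Rolling back is just re-pointing the 'current' symlink at an older build_id.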
To this day I don't know why anyone would actually think that 'svn update' is an OK thing to do in a live production environment. Ehh...guess that's just me being all anal again.
My favorite part about this is the deployment script: dead simple, and I did the optparse thing to remind me of how it all works:
krogebry@krogebry-desktop:~/aws/deployment$ ./deploy.rb -h
Usage: deploy.rb --project [project] --environment [stage|prod] --verbose
-v, --verbose Run verbosely
-p, --project PROJECT Project name
-e, --environment ENVIRONMENT Environment name ( stage | prod )
-h, --help Show this message
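The parsing code behind that help output is about what you'd expect. Here's a reconstruction from the usage text above (not the script verbatim):

#!/usr/bin/env ruby
require 'optparse'

options = {}
OptionParser.new do |opts|
  opts.banner = 'Usage: deploy.rb --project [project] --environment [stage|prod] --verbose'
  opts.on('-v', '--verbose', 'Run verbosely') { options[:verbose] = true }
  opts.on('-p', '--project PROJECT', 'Project name') { |p| options[:project] = p }
  opts.on('-e', '--environment ENVIRONMENT', 'Environment name ( stage | prod )') { |e| options[:environment] = e }
  opts.on('-h', '--help', 'Show this message') { puts opts; exit }
end.parse!

p options  # e.g. ./deploy.rb -p myproject -e stage -v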
EBS logging vs. Scribe...
I was running some numbers trying to figure out the better solution: using EBS volumes on the web front ends for logging, or using something like Scribe. In either case the log data would end up in a database for analytics.
In my mind the choice is pretty dead simple; however, I've recently run into a case where an admin had put together the most bizarre little hack for dealing with logging data. They were writing the log data to an attached volume, then reading it back off the volume with a series of nearly indecipherable spaghetti scripts written entirely in bash. This meant the web front ends were horribly bloated with all of this extra work, not to mention the sheer fragility of it all.
At any rate, I was thinking about how much this is costing them to do as opposed to doing something more elegant like using Scribe...
Assuming 10 web servers taking an average rate of 5 hits per second, and assuming that each hit is logged to an EBS volume, then read from the same volume later on into a database:
5 hps * 10 servers = 50 hps
50 writes and 50 reads = 100 total IO operations per second.
100 ops/sec * 3,600 = 360,000 ops/hr = 8,640,000 ops/day
Amazon charges $0.10 per 1M operations, so that works out to $0.864/day, or about $25.92 per 30-day month.
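A quick sanity check on the arithmetic, in Ruby:

# Back-of-the-envelope EBS I/O cost using the numbers above.
ops_per_sec  = 5 * 10 * 2               # 5 hits/s * 10 servers * (1 write + 1 read)
ops_per_day  = ops_per_sec * 3600 * 24  # => 8_640_000
cost_per_day = ops_per_day / 1_000_000.0 * 0.10
puts format('$%.3f/day, ~$%.2f per 30-day month', cost_per_day, cost_per_day * 30)
# => $0.864/day, ~$25.92 per 30-day month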
So, at the end of the day they end up spending a very small amount for this. Even if you piled on the expense of the web server instances themselves, it probably wouldn't add up to much more than $50/month. That's not bad at all, but it's still $50/month they don't have to be spending when they could be getting a better service for free. Call me overly pedantic, but I just hate spending money I don't have to, especially with a lot riding on a newly funded startup. Every penny counts, right?
Monday, March 15, 2010
New updates.
I just pushed a new build up to production. This build shows off some new features, and begins to show some of the onclick actions for the name plates.
Thursday, March 11, 2010
More tanking effectiveness.