Saturday, November 9, 2013

Running single-threaded applications like a boss using CloudFormation and Chef.

NodeJS + CF + Chef = Unicorn Awesomeness!!

The problem statement is this: how do I make good use of a single-threaded application in the cloud? This is great practice for technologies like Redis, which focus on accomplishing a very direct, targeted objective in a "weird" kind of way. Redis is also single-threaded, but its focus is on really, really fast indexing of structured data. In the case of Node, it makes sense to run many instances of the runtime on a single compute resource, so that more of the system's resources go toward getting work done.

I'm using a formula of n+1, where n is the number of virtual CPUs (vCPUs) on the instance. An m1.large instance in AWS has 2 vCPUs, so it runs 3 processes. The goal here is to find an automated path to running the cloudrim-api on many ports on a single compute instance, then load balancing the incoming requests across an array of ELBs, with each ELB attached to a separate port but the same group of nodes. All traffic coming into any given ELB always arrives on port 9000.
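
To make the n+1 formula concrete, here is a minimal Ruby sketch of the port layout it implies. The helper name `cloudrim_ports` is made up for illustration, and the base port of 9000 is taken from the description above.

```ruby
require "etc"

# Hypothetical helper: n + 1 processes for n CPUs, one port per process,
# counting up from the base port.
def cloudrim_ports(cpu_count, base_port = 9000)
  (0..cpu_count).map { |offset| base_port + offset }
end

# Etc.nprocessors reports the number of CPUs available to this process.
puts cloudrim_ports(Etc.nprocessors).inspect
```

On a 2-CPU machine such as an m1.large this yields three ports, so three ELBs would be needed to cover them.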

High-level summary

This creates several AWS objects:

  • A security group for the load balancers which says "only allow traffic in on port 9000".
  • 2 Elastic Load Balancer objects, both of which listen on port 9000 but direct traffic to either 9000 or 9001 on the target.
  • A DNS alias record for each ELB.
  • A regional DNS entry, created in such a way that Route53 builds a weighted set with one entry per record in the resource collection.

The magic is in the last step: what we end up with is an endpoint, cloudrim.us-east-1.opsautohtc.net, which points at a weighted set of ELBs. Each ELB points to a specific port on a collection of nodes; the collection of nodes is the same for every ELB, but the port is different.
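
As a rough illustration of what the weighted set does, the sketch below works out the share of DNS answers each ELB should receive. The weights are the ones from the RegionDNSEntry resource in the template in this post; the helper name `traffic_share` is hypothetical. Route53 picks a weighted record in proportion to its weight divided by the sum of all weights in the set.

```ruby
# Weights copied from the RegionDNSEntry record sets in this post's template.
WEIGHTS = { "Array0" => 30, "Array1" => 40, "Array2" => 30 }.freeze

# Hypothetical helper: percentage of DNS answers per record,
# computed as weight / sum of all weights.
def traffic_share(weights)
  total = weights.values.sum
  weights.transform_values { |weight| (weight * 100.0 / total).round(1) }
end

traffic_share(WEIGHTS).each { |id, pct| puts "#{id}: #{pct}%" }
```

Keep in mind that the split applies to DNS answers, not individual requests: with a TTL of 900 seconds, a resolver that has cached one answer keeps using that ELB for up to 15 minutes.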

This allows us to run many instances of a software package in a dynamic way (more vCPUs will grow most of the system automatically). I'm combining this with runit for extra durability: runit ensures that the process is always running, so if the service crashes, runit starts a new instance almost immediately.


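CloudFormation bits (the template references a third load balancer, ElasticLoadBalancer2, and a cloudrim2 record that aren't reproduced here; presumably they follow the same pattern on port 9002):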

    "LoadBalancerSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "VpcId": { "Ref": "VpcId" },
        "GroupDescription": "Enable access on port 9000",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "9000", "ToPort": "9000", "CidrIp": "0.0.0.0/0" }
        ],
        "SecurityGroupEgress": [ ]
      }
    },
    "ElasticLoadBalancer0": {
      "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "Subnets": [{ "Ref": "PrivateSubnetId" }],
        "Scheme": "internal",
        "Listeners": [{ "LoadBalancerPort": "9000", "InstancePort": "9000", "Protocol": "TCP" }],
        "HealthCheck": {
          "Target": "TCP:9000",
          "Timeout": "2",
          "Interval": "20",
          "HealthyThreshold": "3",
          "UnhealthyThreshold": "5"
        },
        "SecurityGroups": [{ "Ref": "LoadBalancerSecurityGroup" }]
      }
    },
    "ElasticLoadBalancer1": {
      "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "Subnets": [{ "Ref": "PrivateSubnetId" }],
        "Scheme": "internal",
        "Listeners": [{ "LoadBalancerPort": "9000", "InstancePort": "9001", "Protocol": "TCP" }],
        "HealthCheck": {
          "Target": "TCP:9001",
          "Timeout": "2",
          "Interval": "20",
          "HealthyThreshold": "3",
          "UnhealthyThreshold": "5"
        },
        "SecurityGroups": [{ "Ref": "LoadBalancerSecurityGroup" }]
      }
    },
    "DNSEntry0": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "DNS alias for cloudrim ELB 0",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim0.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "A",
          "AliasTarget": {
            "DNSName": { "Fn::GetAtt": [ "ElasticLoadBalancer0", "DNSName" ] },
            "HostedZoneId": { "Fn::GetAtt": [ "ElasticLoadBalancer0", "CanonicalHostedZoneNameID" ] }
          }
        }]
      }
    },
    "DNSEntry1": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "DNS alias for cloudrim ELB 1",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim1.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "A",
          "AliasTarget": {
            "DNSName": { "Fn::GetAtt": [ "ElasticLoadBalancer1", "DNSName" ] },
            "HostedZoneId": { "Fn::GetAtt": [ "ElasticLoadBalancer1", "CanonicalHostedZoneNameID" ] }
          }
        }]
      }
    },
    "RegionDNSEntry": {
      "Type": "AWS::Route53::RecordSetGroup",
      "Properties": {
        "HostedZoneName": { "Fn::Join": [ "", [{ "Ref": "DomainName" },"."]]},
        "Comment": "Weighted DNS CNAMEs for the cloudrim ELBs",
        "RecordSets": [{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array0",
          "Weight": "30",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim0.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        },{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array1",
          "Weight": "40",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim1.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        },{
          "Name": { "Fn::Join": [ "", [
            "cloudrim.",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }, "."
          ]]},
          "Type": "CNAME",
          "TTL": "900",
          "SetIdentifier": "Array2",
          "Weight": "30",
          "ResourceRecords": [{ "Fn::Join": [ "", [
            "cloudrim2.",
            { "Ref": "StackName" }, ".",
            { "Ref": "DeploymentName" }, ".",
            { "Ref": "AWS::Region" }, ".",
            { "Ref": "DomainName" }
          ]]}]
        }]
      }
    },
    "ConfigASG": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "Tags": [{ "Key": "DeploymentName", "Value": { "Ref": "DeploymentName" }, "PropagateAtLaunch": "true" }],
        "MinSize": { "Ref": "Min" },
        "MaxSize": { "Ref": "Max" },
        "DesiredCapacity": { "Ref": "DesiredCapacity" },
        "AvailabilityZones": [{ "Ref": "AvailabilityZone" }],
        "VPCZoneIdentifier": [{ "Ref": "PrivateSubnetId" }],
        "LoadBalancerNames": [
          { "Ref": "ElasticLoadBalancer0" },
          { "Ref": "ElasticLoadBalancer1" },
          { "Ref": "ElasticLoadBalancer2" }
        ],
        "LaunchConfigurationName": { "Ref": "ServiceLaunchConfig" }
      }
    }
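
The ELB resources above differ only in their index and instance port, so they lend themselves to generation. The following is a sketch of one way to do that, not how the template in this post was built: `elb_resource` and `elb_resources` are hypothetical helpers, and the property values are copied from ElasticLoadBalancer0.

```ruby
require "json"

# Hypothetical generator: one ELB resource per port, mirroring
# ElasticLoadBalancer0 and ElasticLoadBalancer1 above.
def elb_resource(index, base_port = 9000)
  instance_port = (base_port + index).to_s
  {
    "Type" => "AWS::ElasticLoadBalancing::LoadBalancer",
    "Properties" => {
      "Subnets" => [{ "Ref" => "PrivateSubnetId" }],
      "Scheme" => "internal",
      # Every ELB listens on the base port; only the instance port varies.
      "Listeners" => [{ "LoadBalancerPort" => base_port.to_s,
                        "InstancePort" => instance_port,
                        "Protocol" => "TCP" }],
      "HealthCheck" => {
        "Target" => "TCP:#{instance_port}",
        "Timeout" => "2", "Interval" => "20",
        "HealthyThreshold" => "3", "UnhealthyThreshold" => "5"
      },
      "SecurityGroups" => [{ "Ref" => "LoadBalancerSecurityGroup" }]
    }
  }
end

# Builds a hash of resources keyed ElasticLoadBalancer0..N-1.
def elb_resources(count)
  (0...count).each_with_object({}) do |index, resources|
    resources["ElasticLoadBalancer#{index}"] = elb_resource(index)
  end
end

puts JSON.pretty_generate(elb_resources(3))
```

The printed JSON could be merged into the Resources section of a template; the DNS alias records and the weighted record sets could be generated the same way.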

And now the Chef bits:

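# Services to run on this node; only logged here, defaults to the API.
services = node["htc"]["services"] || [ "api" ]
Chef::Log.info( "SERVICES: %s" % services.inspect )

# n+1 processes, where n is the number of processors the kernel reports.
num_procs = `grep -c ^processor /proc/cpuinfo`.to_i + 1
Chef::Log.info( "NumProcs: %i" % num_procs )

# One runit service per process: cloudrim-api0 on 9000, cloudrim-api1 on 9001, ...
num_procs.times do |index|
  port = 9000 + index
  Chef::Log.info( "Port: %i" % port )
  runit_service "cloudrim-api%i" % index do
    action [ :enable, :start ]
    options({ :port => port })
    # All instances share the cloudrim-api run and log templates.
    template_name "cloudrim-api"
    log_template_name "cloudrim-api"
  end
end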

Template:

cat templates/default/sv-cloudrim-api-run.erb
#!/bin/sh
exec 2>&1
PORT=<%= @options[:port] %> exec chpst -uec2-user node /home/ec2-user/cloudrim/node.js
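
To see what the run script looks like once the port is filled in, the template can be rendered with plain ERB outside of Chef. This is a sketch: the `RunScript` class is a made-up stand-in for the context the runit cookbook provides, and it only supplies the `@options` hash the template reads.

```ruby
require "erb"

# Same content as sv-cloudrim-api-run.erb above.
RUN_TEMPLATE = <<~'TEMPLATE'
  #!/bin/sh
  exec 2>&1
  PORT=<%= @options[:port] %> exec chpst -uec2-user node /home/ec2-user/cloudrim/node.js
TEMPLATE

# Hypothetical stand-in for the template context: it exposes
# an @options hash to the ERB binding.
class RunScript
  def initialize(options)
    @options = options
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

puts RunScript.new(port: 9001).render(RUN_TEMPLATE)
```

Each runit service gets its own rendered copy, so cloudrim-api1 ends up with a run script that starts Node with PORT=9001 in its environment.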
