Working of Elastic Load Balancer

Consider a few instances up and running behind the load balancer. Traffic hits the load balancer, and it sends that traffic to the instances. That's all it does, i.e. it distributes the traffic!
But how does it decide which request to send to which instance? That depends on the rules/requirements we set. If the traffic grows, the load balancer is capable enough to handle it, i.e. it scales itself behind the scenes. It also keeps checking whether the instances are healthy and sends traffic only to healthy instances, meaning the ones that are up and running, so that the user's traffic doesn't go in vain and our job is done right. Distributing traffic among the targets is done based on an algorithm such as Round Robin.
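
To make the round-robin idea concrete, here is a minimal sketch in plain Python. The instance IDs are made up and real load balancers do far more (health checks, connection handling, weighting), but the rotation logic is the core of the algorithm:

```python
from itertools import cycle

# Hypothetical pool of healthy instances registered behind the load balancer.
healthy_instances = ["i-0aaa111", "i-0bbb222", "i-0ccc333"]

# Round Robin: hand out targets one after another, wrapping around at the end.
rotation = cycle(healthy_instances)

def route(request_id: str) -> str:
    """Pick the next instance in rotation for this request."""
    target = next(rotation)
    print(f"{request_id} -> {target}")
    return target

for i in range(5):
    route(f"req-{i}")
# req-0 -> i-0aaa111, req-1 -> i-0bbb222, req-2 -> i-0ccc333, req-3 -> i-0aaa111, ...
```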

Types of Elastic Load Balancer

There are three types of load balancer, each with its own special features but the same primary function, i.e. distributing the traffic!
  1. Classic load balancer
  2. Application load balancer
  3. Network load balancer

Quick note

- Balances traffic across EC2/IP/ECS endpoints.
- Supports load balancing of HTTP, HTTPS and TCP.
- Detects and removes failing instances.
- Grows and shrinks based on the traffic.
- Integrates with Autoscaling.
- Specific to a region and routes traffic across Availability Zones.
- Max 32-character limit for an Application Load Balancer's name.
- Has high availability.

Configuring health check

Behind the scenes, the load balancer keeps sending requests to the instances running behind it and waits for the responses. It simply makes sure all the instances behind it are up and running and responding to it. If an instance fails to respond to the health checks, the load balancer will not send traffic to that instance. If instances keep failing like this, we will lose our clients, which is bad for business!
We use Autoscaling to make sure at least some instances are always up and running, i.e. even if one instance fails, Autoscaling will spawn a similar instance in its place.
Also, if there are more requests than the running instances can handle, it is likely that not all the requests will be processed, leading to a loss of clients again... bad for business! Thus, Autoscaling also helps spawn instances based on the request load and makes sure all the requests are processed. We will come to Autoscaling later.
Let's understand how this health check works to our benefit. What exactly is happening, and on what basis does it determine the health of the instances? It breaks down into two parts...

1. What is the load balancer checking?

Ping protocol: The protocol the load balancer has to use for health checks.

Ping port: The port it has to check, using the protocol defined above.

2. How is it checking?

Response timeout: When the load balancer sends a health check request to an EC2 instance, this parameter tells it how many seconds to wait for the response. The value should be neither too low nor too high... 5 seconds sounds decent enough, because if an instance regularly needs longer than that to respond, something is probably wrong with the instances we launched.

Interval: This parameter tells the load balancer how many seconds to wait before sending the next health ping to the EC2 instance.

Unhealthy threshold: Based on this value, the load balancer decides whether an EC2 instance is unhealthy. It declares an instance unhealthy only if it fails that many health checks consecutively. Let's say the value is set to 2: only if the responses from the EC2 instance fail two times consecutively does the load balancer conclude that the instance is unhealthy.

Healthy threshold: It tells how many health checks an EC2 instance should pass to be considered healthy again. As before, it has to pass the health checks consecutively.
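
All of these knobs map onto target group settings. Below is a rough boto3 (AWS SDK for Python) sketch of how they might be set; the target group name, VPC ID and health check path are made-up placeholders, and the numbers simply mirror the discussion above:

```python
import boto3

elbv2 = boto3.client("elbv2")  # client for Application/Network Load Balancers

# Hypothetical target group; the VpcId and "/health" path are placeholders.
response = elbv2.create_target_group(
    Name="demo-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",
    # 1. What the load balancer is checking
    HealthCheckProtocol="HTTP",       # ping protocol
    HealthCheckPort="traffic-port",   # ping port (same port the traffic uses)
    HealthCheckPath="/health",
    # 2. How it is checking
    HealthCheckTimeoutSeconds=5,      # response timeout
    HealthCheckIntervalSeconds=30,    # interval between health pings
    UnhealthyThresholdCount=2,        # consecutive failures before "unhealthy"
    HealthyThresholdCount=5,          # consecutive passes before "healthy" again
)
print(response["TargetGroups"][0]["TargetGroupArn"])
```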


Stickiness? What's that for?

Let's say a user makes a request and it is processed by some EC2 instance. The user is doing some sort of work and makes further requests, but there is no guarantee that the same user's requests will go to the same EC2 instance. To make sure the requests go to the same EC2 instance, stickiness can be used.

Stickiness is based on timing, forcing a user's requests to a particular EC2 instance for X amount of time, and on a cookie, so that regardless of which EC2 instance processed the first request, the user's cookie is maintained and used to keep routing them to it.
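
On an Application Load Balancer, this is typically switched on as a target group attribute using a load-balancer-generated cookie with a duration. A hedged boto3 sketch, with a placeholder target group ARN:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARN; substitute the ARN of your own target group.
target_group_arn = "arn:aws:elasticloadbalancing:...:targetgroup/demo-targets/..."

# Enable duration-based stickiness: for 300 seconds a user's requests keep
# landing on the same target, tracked via an ELB-generated cookie.
elbv2.modify_target_group_attributes(
    TargetGroupArn=target_group_arn,
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "300"},
    ],
)
```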

SSL Termination? It's kinda debatable.

If you have read the post about how HTTP works, how digital signatures are used to secure a connection and why we need them, you will be able to understand the concept of the SSL termination option at the load balancer.

An HTTPS request comes to the load balancer.
The load balancer sends the request on to the desired EC2 instance.
An HTTPS secure connection is established before any further actions are processed.
Something like this:

HTTPS (req) ----> ELB ----> HTTPS (req) ----> EC2

End-to-end encryption.

Now, while establishing this secure connection, a part of the EC2 instance's memory and CPU is used to perform that encryption. Obviously, it ain't some magic, it does need resources! Encrypting one request isn't a big deal.

Now imagine 1000+ such requests coming in every minute. Carrying out that encryption ends up using a significant amount of the EC2 instance's CPU and memory, which is effectively wasted since the application has other, better jobs to do. The instance also has to spend part of that minute performing this task... after all, it's software-level encryption. So, in the bigger picture, do you think it works in our favor? It's a debatable topic actually!

AWS provides an option to offload SSL to the ELB, which means the encryption ends at the ELB itself. From the ELB, the request is forwarded as plain HTTP to the desired EC2 instance.
Something like this:

HTTPS (req) ----> ELB ----> HTTP (req) ----> EC2

It is not end-to-end encryption, but of course the plain HTTP request travels only inside our VPC... then again, it's a debatable topic whether it is worth doing, or whether our company policy demands complete end-to-end encryption... yadda yadda... :) Let's learn that with experience.
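
For reference, SSL offloading on an Application Load Balancer roughly looks like the boto3 sketch below: an HTTPS listener holds the certificate and terminates TLS, then forwards plain HTTP to the target group inside the VPC. The load balancer, certificate and target group ARNs are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARNs; use your own load balancer, ACM certificate and target group.
lb_arn = "arn:aws:elasticloadbalancing:...:loadbalancer/app/demo-alb/..."
cert_arn = "arn:aws:acm:...:certificate/..."
target_group_arn = "arn:aws:elasticloadbalancing:...:targetgroup/demo-targets/..."

# HTTPS (req) ----> ELB: the listener terminates TLS using the ACM certificate...
elbv2.create_listener(
    LoadBalancerArn=lb_arn,
    Protocol="HTTPS",
    Port=443,
    Certificates=[{"CertificateArn": cert_arn}],
    SslPolicy="ELBSecurityPolicy-2016-08",
    # ...ELB ----> HTTP (req) ----> EC2: traffic is forwarded unencrypted to a
    # target group whose protocol is HTTP, so the instances never do the crypto.
    DefaultActions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
)
```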