Load Balancing

Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.

Modern high‑traffic websites must serve hundreds of thousands, if not millions, of concurrent requests from users or clients and return the correct text, images, video, or application data, all in a fast and reliable manner. To cost‑effectively scale to meet these high volumes, modern computing best practice generally requires adding more servers.

Why We use Application Load Balancer (ALB)?

AWS Application Load Balancer (ALB) operates at Layer 7 of the OSI model. At Layer 7, the ELB has the ability to inspect application-level content, not just IP and port. This lets it route based on more complex rules than with the Classic Load Balancer.

For instance, an ELB at a given IP will receive a request from the client on port 443 (HTTPS). The Application Load Balancer will process the request, not only by receiving port but also by looking at the destination URL.

Multiple services can share a single load balancer using path-based routing.

Application Load Balancer will be aware of each of these URLs based on patterns set up when configuring the load balancer and can route to different clusters of servers depending on application need. Rules can also be added at a later time as you add new functionality to your stack.
The Application Load Balancer also integrates with EC2 Container Service (ECS) using Service Load Balancing. This allows for dynamic mapping of services to ports as specified in the ECS task definition. Multiple containers can be targeted on the same EC2 instance, each running different services on different ports. The ECS task scheduler will automatically add these tasks to the ALB.

The load balancer performs the following functions:

Distributes client requests or network load efficiently across multiple servers
Ensures high availability and reliability by sending requests only to servers that are online
Provides the flexibility to add or subtract servers as demand dictates

Load Balancing Algorithms

Different load balancing algorithms provide different benefits; the choice of load balancing method depends on your needs:

Round Robin – Requests are distributed across the group of servers sequentially.
Least Connections – A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
IP Hash – The IP address of the client is used to determine which server receives the request.

In a nutshell!

The best load balancers can handle session persistence as needed. Another use case for session persistence is when an upstream server stores information requested by a user in its cache to boost performance. Switching servers would cause that information to be fetched for the second time, creating performance inefficiencies.