I was working on a project when an Engineer approached me after going through the AWS environment. He recommended switching from an Application Load Balancer (ALB) to a Network Load Balancer (NLB); his reasoning was that the application might receive high traffic and that an NLB has better performance.
Well, he is not wrong because AWS’s documentation states: “If extreme performance and static IP is needed… we recommend you use a Network Load Balancer.”
However, the statement from AWS concerns only the performance capabilities of the load balancer — it doesn’t mean that your application as a whole would have better performance.
Whatttttt??? I’m talking rubbish, right?
I used to work with a telco to build and maintain HTTP Load Balancers back in 2009. At its peak, we were load balancing around 20 Gbps of HTTP traffic to a web cache farm sitting in the core of the telco's network. Web caches were really important for user experience because most of the web content that Singapore users consumed was hosted overseas.
Serving up 20 Gbps of web traffic was a huge feat at the time: most PCs still had 100 Mbps LAN, and we didn't even have fiber broadband in Singapore yet. We had around 40 web cache servers, each only capable of handling around 500–600 Mbps of load. The bottlenecks on the cache servers were disk I/O and CPU.
The optimizations that HTTP LBs do became very important. Good HTTP LBs advertise all sorts of fancy features for a reason (because people need them), but the most important bit is that they take work away from the backend servers. The LBs we used back then (Citrix NetScaler) would multiplex multiple HTTP requests across a single TCP connection to each backend. This made a HUGE difference to web cache server performance. Without this feature, each web cache could barely handle 100–200 Mbps of load, because under millions of requests TCP connections were constantly being set up and torn down. If you know how traditional HTTP servers work, you'll know that each new TCP connection typically means a new thread (or process), which is an expensive operation.
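To make the difference concrete, here is a minimal Python sketch (the hostname backend.example.internal is made up for illustration) contrasting one TCP connection per request with many requests reused over a single keep-alive connection, which is roughly what the LB's multiplexing looks like from the backend's point of view:

```python
import http.client

# Without reuse: every request pays for TCP setup and teardown, and a
# thread-per-connection backend spawns a fresh thread each time.
for _ in range(3):
    conn = http.client.HTTPConnection("backend.example.internal", 80)
    conn.request("GET", "/item")
    conn.getresponse().read()
    conn.close()

# With reuse: many requests ride over one persistent connection, so the
# backend keeps a single thread and socket busy instead of churning them.
conn = http.client.HTTPConnection("backend.example.internal", 80)
for _ in range(3):
    conn.request("GET", "/item")
    conn.getresponse().read()  # drain the body so the connection can be reused
conn.close()
```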
A few years later, I was once again dealing with LBs, this time for a US tech startup. At their peak, they were getting millions of API requests and their servers were struggling. I replaced traditional NLBs with ALBs and it reduced the load on the backend servers by 20–30%.
In most cases, backend servers are already busy doing what they need to do: business logic, database access, and so on. What you want is for the LB to offload the extra header processing, routing rules, redirection/filtering, SSL termination, etc. so your servers don't have to. Another feature of an ALB is its ability to use more intelligent load distribution algorithms based on application-aware parameters such as HTTP headers, which can be very important for HTTP applications.
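As an illustration of that application-aware routing (the ARNs and the X-Client-Type header below are hypothetical), here is a boto3 sketch of an ALB listener rule that forwards requests to a dedicated target group based on an HTTP header value, something a layer-4 NLB never sees:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Hypothetical ARNs, for illustration only.
LISTENER_ARN = "arn:aws:elasticloadbalancing:region:account:listener/app/my-alb/..."
MOBILE_TG_ARN = "arn:aws:elasticloadbalancing:region:account:targetgroup/mobile-api/..."

# Route requests carrying X-Client-Type: mobile to their own target group,
# so the backend fleet doesn't have to inspect the header and re-route itself.
elbv2.create_rule(
    ListenerArn=LISTENER_ARN,
    Priority=10,
    Conditions=[
        {
            "Field": "http-header",
            "HttpHeaderConfig": {
                "HttpHeaderName": "X-Client-Type",
                "Values": ["mobile"],
            },
        }
    ],
    Actions=[{"Type": "forward", "TargetGroupArn": MOBILE_TG_ARN}],
)
```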
The Engineer assumed that an NLB would yield better performance, but we had no data and no actual performance issue. As Engineers, we need to know how to do work with meaningful impact and outcomes, and avoid prematurely optimizing based on assumptions.