Will this scale? Part 2: Evaluating infrastructure for scalability

In Part 1 of this post, we defined “scalability” and described scalability issues with the software itself. In Part 2, we look at evaluating infrastructure for scalability, focusing on the anti-patterns that commonly limit scale.

Scale is primarily accomplished vertically

Because of the anti-patterns previously discussed, adding new application or database servers may be difficult. If the only answer to scaling is to add CPUs, memory (RAM), network bandwidth, disk space or other resources to existing machines, the application layer itself likely does not scale, and more digging is required to surface the anti-patterns. As an example, if the application servers cannot exist in a cluster, that is an indication of a stateful application server.

Lack of monitoring

If monitoring (application requests, data requests, resource usage, etc.) is not in place, it is challenging to know how the application is performing and whether scaling is needed. The software itself may support scaling, but failing to recognize when it needs to happen defeats the purpose of good architecture. Monitoring systems such as New Relic or Nagios help determine when horizontal (added machines) or vertical (increased resources) adjustments are necessary.

Inability to auto-scale

If bringing up new machine instances is not automated and instead requires manual intervention and a significant amount of time, the ability to scale quickly to keep up with user demand is lost. Cloud-based infrastructure-as-a-service (IaaS) providers generally allow auto-scaling using usage threshold detection (among other things), but the deployment of new systems must be automated to take advantage of it.
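As a rough illustration of usage threshold detection, the sketch below decides how many instances a policy would ask for from recent CPU samples. The `ScalingPolicy` class, its field names and every threshold value are hypothetical and chosen only for illustration; real IaaS providers expose their own policy settings.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical threshold policy; all values are illustrative."""
    scale_out_cpu: float = 0.75   # add an instance above this average CPU
    scale_in_cpu: float = 0.30    # remove an instance below this average CPU
    min_instances: int = 2
    max_instances: int = 10


def desired_instances(current, cpu_samples, policy):
    """Return the instance count the policy asks for, given recent CPU samples."""
    if not cpu_samples:
        return current  # no monitoring data: do not guess
    average = sum(cpu_samples) / len(cpu_samples)
    if average > policy.scale_out_cpu:
        target = current + 1
    elif average < policy.scale_in_cpu:
        target = current - 1
    else:
        target = current
    # Clamp the result to the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, target))
```

A decision like this is only useful if acting on it is automated too: the step that turns "one more instance" into a running, deployed server is where manual processes usually break down.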

Reliance on central services

Relying on central services, such as a third-party person directory, is a fine way to reuse existing components and speed up time to market. However, if scaling that service is not possible (e.g., it is out of the consumer’s control), precautions must be taken to ensure the central service does not prevent the application from scaling appropriately, particularly if many requests are sent to it. Ways to work around this issue include falling back to other third-party services if responses are not timely, returning default values, and not blocking on service calls.
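One of these workarounds, returning a default value when a response is not timely, can be sketched as follows. This is a minimal sketch using Python's standard `concurrent.futures` module; `call_with_fallback`, `slow_directory_lookup` and the timeout values are hypothetical names and numbers, not part of any particular directory service.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Shared worker pool so callers never run the remote call on their own thread.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(service_call, default, timeout_seconds=0.5):
    """Run service_call in a worker thread; return default if it is slow or fails."""
    future = _pool.submit(service_call)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return default   # the central service was too slow
    except Exception:
        return default   # the central service raised an error


def slow_directory_lookup():
    # Hypothetical stand-in for a third-party person directory call.
    time.sleep(0.3)
    return {"name": "Ada Lovelace"}
```

For example, `call_with_fallback(slow_directory_lookup, {"name": "unknown"}, timeout_seconds=0.05)` gives up after 50 ms and returns the default. Note that the worker thread keeps running until the slow call finishes, so a production version would also want a timeout on the underlying request itself.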

Ignoring location

Deploy the service where the users are. For example, if there are a large number of users in Asia, ensure a scaled instance of your application is available in that region.
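A simple way to check for this during an evaluation is to compare where the users are with where the application is deployed. The sketch below assumes user counts per region are available from analytics; the function name, region labels and the `min_users` cutoff are hypothetical.

```python
def underserved_regions(users_by_region, deployed_regions, min_users=1000):
    """Return regions with at least min_users users but no deployed instance."""
    return sorted(
        region
        for region, users in users_by_region.items()
        if users >= min_users and region not in deployed_regions
    )
```

With users in Asia and Europe but deployments only in Europe and North America, this would flag Asia as a region that needs its own scaled instance.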

Using physical machines

Physical machines are used for servers less and less these days in favor of virtualization. Physical machines are much costlier and take much more effort to scale than virtual machines (particularly those in the cloud) and should be avoided. Physical servers also increase the effort during product development (affecting the scale of the development team), as environment provisioning takes longer and is more complex.

Not load testing

If the builders of an application do not know the application's limits, be cautious about its ability to scale. Mature companies load-test their applications to (a) validate that the application can scale, and (b) know at what point (e.g., number of users) the system starts to struggle. This must be done on data that is representative of production, such as a production database scrubbed of personally identifiable information (PII).
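Finding the point where the system starts to struggle can be sketched as a ramp: increase the number of users step by step and stop at the first level where a latency percentile crosses a limit. The code below is a sketch under stated assumptions. `simulated_system` is a made-up stand-in for a real load run, and the 95th percentile and all numbers are illustrative choices rather than a recommendation.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def find_breaking_point(measure_latencies, user_levels, p95_limit_ms):
    """Return the first user level whose p95 latency exceeds the limit, else None."""
    for users in user_levels:
        latencies = measure_latencies(users)
        if percentile(latencies, 95) > p95_limit_ms:
            return users
    return None


def simulated_system(users):
    # Hypothetical stand-in for a real load run: latency grows sharply
    # once more than 400 concurrent users share the system.
    base = 50 + 0.1 * users
    overload = max(0, users - 400) * 2
    return [base + overload + jitter for jitter in range(20)]
```

In a real test, `measure_latencies` would drive traffic against an environment loaded with production-like data and return the observed response times for that user level.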

Many content requests

Content-heavy applications, such as a content management system, can get bogged down if all application servers draw their content from a central source, such as one file server. In the same way that code must scale, content should as well. Using something like a content delivery network (CDN) can help remove the load from the central system and support a greater number of users. A CDN is a form of network caching of content that reduces the workload of the application servers.
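The effect of caching content away from the central source can be illustrated with a tiny time-to-live (TTL) cache. This is only a sketch of the caching idea, not how any real CDN is implemented; the `EdgeCache` class, its parameters and the `origin_hits` counter are hypothetical.

```python
import time


class EdgeCache:
    """Tiny TTL cache standing in for a CDN edge node (illustrative only)."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}      # path -> (expires_at, content)
        self.origin_hits = 0    # how often the central source was contacted

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]               # served from cache, origin untouched
        content = self._fetch(path)       # cache miss or expired entry
        self.origin_hits += 1
        self._entries[path] = (now + self._ttl, content)
        return content
```

Repeated requests for the same path within the TTL reach the central source only once, which is the load reduction the central file server benefits from.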

Together, Part 1 (software architecture) and Part 2 (infrastructure) describe a partial list of items that Crosslake checks when evaluating the scalability of an architecture. Through a combination of architectural analysis and thorough code review, issues that may affect an acquisition of or investment in a technology company can be surfaced early. Scalability is a key area to deep-dive during any technical due diligence activity.