Will this scale? Part 1: Evaluating software architecture for scalability

Perhaps the most common question we get at Crosslake when performing technical due diligence on a company is, “Will this thing scale?” After all, investors want to see a return on their investment, and a common way to achieve that is to grow the number of users on an application or platform. How do they ensure that the technology can support that growth? By evaluating scalability.

Let’s start by defining scalability from the technical perspective.

Wikipedia defines “scalability” as the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. That definition maps well onto common investment objectives.

The question is: what are the key attributes of software that allow it to scale, and what anti-patterns prevent scaling? In other words, what do we look for when determining scalability?

An exhaustive list is beyond the scope of this article, but below are a few key things we look for in the software architecture. Part 1 of this post addresses evaluating software architecture for scalability (architecture patterns / anti-patterns that affect scale), and Part 2 discusses infrastructure issues.

Monolithic architecture

A large system that must be deployed as a single unit is difficult to scale. If the application was designed to be stateless, it can be scaled by adding machines, virtual or physical; however, each instance must run the entire application, so scaling requires powerful machines and is not cost-effective. The risk of regression is also high, because small components cannot be updated on their own. By contrast, a microservices-based architecture using containers (e.g., Docker) allows small pieces to be deployed independently and individual services to be scaled, rather than one big application.

Monolithic applications have other adverse effects, such as limiting developer scale. What is “developer scale?” As more developers are added to the team, development efficiency goes down. For example, loading one large solution into a development environment slows a developer down, and the problem worsens as more developers add components: load times on development machines grow longer, and developers stomp on each other’s changes (or create complicated merges) as they modify the same files. Another developer-scale issue arises around intricate pieces of the architecture or database where a single person is the expert; that person becomes a bottleneck for changes in that part of the system. Finally, the complexity of a monolithic system and its rate of code change make it hard for any developer to maintain an accurate mental model of the whole, so more defects are introduced. A decoupled system built from small components helps prevent these problems.

Database architecture anti-patterns

When validating database design for appropriate scale, there are some key anti-patterns to check. For example:

  • Do synchronous database accesses block other connections to the database while data is being retrieved or written? Such a design can stall queries and hold up the application.
  • Are queries written efficiently? Large data footprints, with significant locking, can quickly slow database performance to a crawl.
  • Is there a substantial reporting function in the application that relies on a single transactional database? Report generation can severely hamper the performance of critical user scenarios. Separating read-only reporting data from read-write transactional data can improve scale considerably.
  • Can the data be partitioned across different databases and / or database servers (sharding)? For example, clients in different geographies may be partitioned onto servers closer to their locations. Splitting the data this way enhances scale, since requests are spread across servers (see the sharding sketch after this list).
  • Does the application rely on the database for full-text search instead of a component designed for it, such as Elasticsearch? Misusing the database engine this way can limit growth.
  • Does the application implement table-level locking? Locking tables, especially large ones, can block other requests. Poor query practices can lock an entire table by default.
  • Does data need to be up to date in real-time, or can it be eventually consistent? Eventually-consistent data allows more options in terms of how the data layer scales.
  • Is an object-relational mapping (ORM) solution being overused? Using an ORM for complex queries can prevent those queries from being optimized, slowing down all requests (see the SQL escape-hatch sketch after this list).
  • Is the right database technology being used for the problem? Storing BLOBs in a relational database has adverse effects; a dedicated blob or object store is usually a better fit. Likewise, forcing less structured data into a relational schema can lead to waste and performance issues, and a NoSQL document store may be more suitable there.
  • Are aggregates or analytics computed in real time against a heavily structured transactional data model? OLAP cubes, fact / dimension tables, and other alternatives exist that are a much better fit for this problem.
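
To make the sharding idea concrete, here is a minimal sketch that routes each client’s queries to a shard chosen by a stable hash of the client ID. The connection strings and hashing scheme are illustrative assumptions, not a prescription; a production system also needs a strategy for rebalancing shards as data grows.

```python
# A minimal sketch of shard routing by client ID. The DSNs are
# hypothetical; real sharding also has to handle rebalancing,
# hot shards, and cross-shard queries.
import hashlib

SHARD_DSNS = [
    "postgresql://db-us-east/app",
    "postgresql://db-eu-west/app",
    "postgresql://db-ap-south/app",
]

def shard_for(client_id: str) -> str:
    """Stable hash, so a given client's data always lands on the same shard."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARD_DSNS)
    return SHARD_DSNS[index]

# Each client's requests go to that client's shard, so load and data
# volume are split across servers instead of piling onto one.
print(shard_for("client-42"))
```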
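
And to illustrate the ORM point: most ORMs provide an escape hatch for hand-written SQL, so hot, complex queries can be tuned by hand while routine CRUD stays on the ORM. Below is a sketch using SQLAlchemy’s text() construct; the table and query are hypothetical, with an in-memory SQLite database standing in for the real one.

```python
# A sketch of selectively bypassing the ORM for a hot query via
# SQLAlchemy's text() escape hatch. Table and column names are made up.
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")  # stand-in for the real database
with engine.begin() as conn:  # demo table so the sketch runs end to end
    conn.execute(text(
        "CREATE TABLE orders (customer_id TEXT, total_cents INTEGER, placed_at TEXT)"))

TOP_CUSTOMERS_SQL = text("""
    SELECT customer_id, SUM(total_cents) AS revenue
    FROM orders
    WHERE placed_at >= :since
    GROUP BY customer_id
    ORDER BY revenue DESC
    LIMIT :limit
""")

def top_customers(since: str, limit: int = 10):
    # Hand-written SQL can use exactly the joins and indexes we intend,
    # instead of whatever the ORM happens to generate.
    with engine.connect() as conn:
        return conn.execute(TOP_CUSTOMERS_SQL, {"since": since, "limit": limit}).all()

print(top_customers("2024-01-01"))  # [] on the empty demo table
```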

Mixed presentation and business logic

A software anti-pattern that can be prevalent in legacy code is failing to separate UI code from the underlying business logic. This practice makes it impossible to scale individual layers of the application and, as a side effect, takes away the ability to easily run A / B tests to validate UI changes. Separating the layers allows just enough hardware to be allocated to each one, minimizing resource usage and overall cost. Keeping business logic out of stored procedures likewise improves the maintainability and scalability of the system.
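
As a minimal sketch of what that separation looks like in code (using Flask purely for illustration; the endpoint and domain names are hypothetical), the handler below only translates HTTP to and from a pure business-logic function, which can then be tested, replaced, and scaled independently of the presentation layer:

```python
# A sketch of presentation / business-logic separation. price_quote is
# a hypothetical domain function with no HTTP or UI concerns.
from dataclasses import dataclass
from flask import Flask, jsonify, request

app = Flask(__name__)

@dataclass
class Quote:
    sku: str
    quantity: int
    total_cents: int

def price_quote(sku: str, quantity: int) -> Quote:
    """Pure business logic: unit-testable, relocatable, independently scalable."""
    unit_price_cents = 1999  # stand-in for a real price lookup
    return Quote(sku=sku, quantity=quantity, total_cents=unit_price_cents * quantity)

@app.route("/quotes", methods=["POST"])
def create_quote():
    """Thin presentation layer: parse input, delegate, format output."""
    body = request.get_json(force=True)
    quote = price_quote(body["sku"], int(body["quantity"]))
    return jsonify(sku=quote.sku, quantity=quote.quantity,
                   total_cents=quote.total_cents), 201
```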

Stateful application servers

Designing an application that stores state on an individual server is problematic for scalability. For example, if business logic runs on one server and stores user session information (or other data) in a cache on that server alone, all of that user’s requests must be routed to the same server instead of to any machine in the cluster. This prevents adding new machine instances that can field whatever request a load balancer passes their way. Caching is an excellent practice for performance, but it must not interfere with horizontal scale.
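
The usual remedy is to externalize session state to a shared store so that any instance can serve any request. Below is a minimal sketch assuming a Redis instance at localhost:6379 and the redis-py client; the key naming and TTL are illustrative.

```python
# A sketch of externalized session state. With sessions in a shared
# store, the load balancer can route a user's requests to any server.
import json
import redis

store = redis.Redis(host="localhost", port=6379, decode_responses=True)
SESSION_TTL_SECONDS = 1800  # 30 minutes; tune to your needs

def save_session(session_id: str, data: dict) -> None:
    # setex writes the value and its expiry in one atomic call
    store.setex(f"session:{session_id}", SESSION_TTL_SECONDS, json.dumps(data))

def load_session(session_id: str) -> dict | None:
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None
```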

Long-running jobs and / or synchronous dependencies

Actions on the system that trigger processing times of minutes or more can affect scalability (e.g., executing a report that requires large amounts of data to generate). Continuing to add machines doesn’t help, because under a steady stream of such requests the system can never keep up, and blocking operations exacerbate the problem. Look for solutions that queue up long-running requests, execute them in the background, and send events when they are complete (asynchronous communication), so that essential application and database servers are not tied up. Communicating with dependent systems synchronously for long-running requests likewise hurts performance, scale, and reliability. Standard solutions for intersystem communication and asynchronous messaging include RabbitMQ and Kafka.
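
As a sketch of the queue-and-process-in-the-background approach, here is a producer and worker using RabbitMQ via the pika client; the queue name and message shape are assumptions for illustration.

```python
# A sketch of offloading long-running report generation to a queue
# (RabbitMQ via pika). Queue name and message shape are illustrative.
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="report-jobs", durable=True)  # survives broker restarts

def enqueue_report(report_id: str, params: dict) -> None:
    """Called from the web tier: returns immediately instead of blocking."""
    channel.basic_publish(
        exchange="",
        routing_key="report-jobs",
        body=json.dumps({"report_id": report_id, "params": params}),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )

def on_job(ch, method, properties, body):
    """Runs in a separate worker process: do the slow work, then emit a
    completion event so the caller learns the result asynchronously."""
    job = json.loads(body)
    # ... generate the report for job["report_id"] ...
    ch.basic_ack(delivery_tag=method.delivery_tag)

# In the worker process:
# channel.basic_consume(queue="report-jobs", on_message_callback=on_job)
# channel.start_consuming()
```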

Distributed transactions

Has the application gone overboard in trying to guarantee consistency across data updates, using techniques like two-phase commit, when eventual consistency would be a better choice and would allow greater scale at both the database and application layers? Understand the use cases before choosing the appropriate model. Too many distributed transactions add complexity and hurt overall performance and scale.
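
One widely used alternative to two-phase commit is the transactional outbox pattern: write the business change and an event record in a single local transaction, then have a background relay publish the event so downstream systems become consistent eventually. A minimal sketch, with SQLite standing in for the real database and hypothetical table and event names:

```python
# A sketch of the transactional outbox pattern. SQLite and the table
# names are stand-ins; the relay's publish function is caller-supplied.
import json
import sqlite3
import uuid

db = sqlite3.connect("app.db")
with db:
    db.execute("CREATE TABLE IF NOT EXISTS orders "
               "(id TEXT PRIMARY KEY, total_cents INTEGER)")
    db.execute("CREATE TABLE IF NOT EXISTS outbox "
               "(id TEXT PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0)")

def place_order(total_cents: int) -> str:
    order_id = str(uuid.uuid4())
    event = {"type": "OrderPlaced", "order_id": order_id, "total_cents": total_cents}
    with db:  # one LOCAL transaction covers both writes; no 2PC required
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total_cents))
        db.execute("INSERT INTO outbox (id, payload) VALUES (?, ?)",
                   (str(uuid.uuid4()), json.dumps(event)))
    return order_id

def relay_outbox(publish) -> None:
    """Background relay: push unsent events to a broker (e.g., Kafka or
    RabbitMQ), then mark them published. Consumers catch up eventually."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        publish(payload)
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
```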

Again, the list above is not exhaustive, but it outlines some key areas that Crosslake focuses on when evaluating an architecture for scalability. Part 2 discusses infrastructure issues that affect scale.