“It sounds great, but will it scale?”
We hear the term ‘scalable’ used in reference to IT applications and infrastructure all of the time. But what exactly does it mean for an application to be scalable? And is that scalability the same for IT infrastructure?
Let’s dive into what scalability means and how it pertains to technology. In addition, we’ll provide a few tips on what to think about as you plan for scalability.
What does scalability mean?
Let’s start by defining the term ‘scalable.’ When something is scalable, it is built or configured so that it keeps functioning when traffic or usage increases, whether in size or volume. In other words, the application or infrastructure is designed to anticipate increased demand and support it easily, rather than leaving IT staff to figure out reactively how to support the increase.
Designing for scalability provides several benefits:
- It enables lower upfront costs because you don’t need infrastructure in place right away to support the highest traffic or volume you might eventually get. (There is a caveat here related to servers, but we’ll talk about that later.)
- It supports a “pay as you grow” approach if you are building a cloud-based solution, meaning your costs increase to support higher volumes only when you have those higher volumes.
- It is easier (and timelier) to upgrade or update when you need to.
What happens if you don’t build your application or infrastructure to scale? Availability could drop, leaving your application(s) unavailable for periods. You could also lose important data. Also, users could experience slow loading times of their application, become frustrated, and potentially leave for a competitor’s solution. Scalability is necessary.
There are different types of scalability, specifically application scalability and infrastructure scalability. Each has its own set of considerations.
Application scalability means you have designed your application to support higher volumes of usage or data storage than what you may initially start with. Application scalability takes into consideration the number of users of the application, the amount of information to store, and the amount and complexity of the processing it performs.
There are many ways you can build your application to be scalable, including:
- Write your code so that when you need to add new features or capabilities (or change existing ones), you don’t need to rely on development staff to modify the code.
- Consider caching when possible. Caching places a copy of the content or data into memory so that the application doesn’t have to re-load the information from the database every time someone requests it. Caching is a good approach for content or information that doesn’t change often.
- Build using APIs (application programming interfaces) and microservices. This will allow you to scale the services with high usage, not the entire application.
- Design for asynchronous processing, which will allow your application to take advantage of the multiprocessing capabilities of the hardware and operating system it’s running on. Asynchronous processing allows your application to start a process and then check the results of that process at a later time. The application doesn’t have to wait for a long-running process, such as a database lookup, an email send, or user input, to complete before it continues processing. Message or service bus products or frameworks are commonly used to support asynchronous processing.
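The "start a process, then check its result later" pattern from the last bullet can be sketched with Python's standard `asyncio` module. This is only an illustration under stated assumptions: `send_email`, `lookup_order`, and `handle_request` are hypothetical names, and the short sleeps stand in for real I/O. It shows the pattern inside a single process; a message or service bus applies the same idea across separate processes or servers.

```python
import asyncio


async def send_email(recipient: str) -> str:
    # Hypothetical stand-in for a slow call to a mail service.
    await asyncio.sleep(0.05)
    return f"sent:{recipient}"


async def lookup_order(order_id: int) -> dict:
    # Hypothetical stand-in for a slow database lookup.
    await asyncio.sleep(0.05)
    return {"order_id": order_id, "status": "shipped"}


async def handle_request(order_id: int, recipient: str) -> dict:
    # Start both long-running operations without waiting for either one.
    email_task = asyncio.create_task(send_email(recipient))
    order_task = asyncio.create_task(lookup_order(order_id))

    # The application is free to do other work here while both tasks run.

    # Check the results later, once they are actually needed.
    order = await order_task
    receipt = await email_task
    return {"order": order, "email": receipt}


if __name__ == "__main__":
    print(asyncio.run(handle_request(42, "user@example.com")))
```

Because both tasks are started before either is awaited, the two waits overlap instead of running back to back.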
Most applications today include a place to store data and information. Relational databases have long been the popular choice for application development. Relational databases store data in a structured format: data is stored in tables made up of rows and columns, and tables are related to each other. The biggest drawback to the relational model is that structural changes can be costly: modifying or adding tables, columns, or indexes may require taking the application offline and refactoring code to support the new columns (data).
Today, non-relational (NoSQL) databases are often used in application development, especially if you are storing a lot of unstructured data. Many NoSQL databases store data as documents (others use key-value, wide-column, or graph models), providing more flexibility. They can often be adapted to store new data without requiring you to take the application offline or modify existing code.
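To make the difference in flexibility concrete, here is a minimal sketch using only Python's standard library. The in-memory SQLite table stands in for a relational database, and a list of JSON strings stands in for a document store; neither is meant to represent any particular product, and the `customers` table and `loyalty_tier` field are made-up examples.

```python
import json
import sqlite3

# Relational style: the table structure is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))

# Storing a new attribute requires an explicit schema change first.
conn.execute("ALTER TABLE customers ADD COLUMN loyalty_tier TEXT")
conn.execute(
    "INSERT INTO customers (name, loyalty_tier) VALUES (?, ?)",
    ("Grace", "gold"),
)
relational_rows = conn.execute(
    "SELECT name, loyalty_tier FROM customers ORDER BY id"
).fetchall()

# Document style: each record carries its own fields, so a new
# attribute can appear on new records without a schema change.
documents = [
    json.dumps({"name": "Ada"}),
    json.dumps({"name": "Grace", "loyalty_tier": "gold"}),
]
document_tiers = [json.loads(doc).get("loyalty_tier") for doc in documents]
```

In both cases the older record simply has no loyalty tier. The difference is where the change happens: in the database schema for the relational table, or only in the code that reads the documents.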
When we talk about infrastructure scalability, we discuss it in terms of on-premises infrastructure and cloud-based infrastructure. The cloud gives you the most flexibility in terms of how to scale. It’s also often the most cost-effective place to scale, but you can scale your on-premises environment as well.
Scaling infrastructure generally involves one of two strategies: “scaling up” or “scaling out.”
Scaling up, or vertical scaling, means increasing the processing power of the server you already have in place, for example by adding more CPUs, storage, or RAM. Vertical scaling is a good short-term approach, but you are limited by the maximum size of the server (that is, it can only scale up so much). You typically also have to bring the server down to install the additional capacity.
If you are scaling up in a cloud environment, it usually means you are moving to a bigger server, but the move is transparent to end-users.
Scaling out, or horizontal scaling, means you are adding more servers, more nodes, or you are leveraging distributed services that let you scale different parts of the architecture independently, based on your requirements.
Horizontal scaling is more complicated because, in addition to the extra servers, you need load-balancing hardware or software to distribute the load between them.
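As a rough illustration of what the load-balancing layer does, the sketch below rotates requests across a pool of servers in round-robin order. The `RoundRobinBalancer` class and the server names are hypothetical, and real load balancers usually add health checks, weighting, and session handling on top of this.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation (illustrative only)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def add_server(self, server):
        # Scaling out: register the new node and restart the rotation
        # so that it includes the new server.
        self._servers.append(server)
        self._rotation = cycle(self._servers)

    def route(self, request_id):
        # request_id is unused here; a real balancer might log or hash it.
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2"])
    print([balancer.route(i) for i in range(4)])
    balancer.add_server("app-3")
    print([balancer.route(i) for i in range(3)])
```

Adding capacity is then a matter of calling `add_server`; callers keep using `route` and never need to know how many servers sit behind it.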
A note on database scaling
Database servers deserve special consideration because how you scale a database is different from scaling a web server or an application server. While you can scale up a database server in the way described above, you may find yourself limited very quickly if your application is database intensive.
If you scale out a database server (this applies to relational databases like SQL Server), you need real-time data replication in place to ensure all database servers have the same data. The alternative to real-time data replication is to partition the data and store each partition on a different database server instance.
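The partitioning alternative can be sketched as a routing function that decides which database instance owns a given record. The example below assumes hash-based partitioning on a string key; the instance names in `SHARDS` and the function `shard_for` are hypothetical, and range-based or directory-based partitioning are equally valid choices.

```python
import hashlib

# Hypothetical database server instances, one per partition.
SHARDS = ["db-0", "db-1", "db-2"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a record key to the instance that stores its partition."""
    # A stable hash is used so the same key always maps to the same
    # instance across processes and restarts.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for customer in ("customer-1", "customer-2", "customer-3"):
        print(customer, "->", shard_for(customer))
```

One design consequence worth noting: with this simple modulo scheme, changing the number of instances remaps most keys, so data has to be moved whenever a partition is added.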
Planning for scalability: key considerations
It’s a given that you want your tech stack to support growth, so it’s vital to consider scalability when planning your infrastructure and each application you build or purchase. Your plan should define what you need in order to support, and ideally stay ahead of, your planned growth.
Planning for infrastructure scalability
To adequately prepare a scalable infrastructure, you first need to understand the application you need to host on it. For example, are you supporting an e-commerce website or an ERP application? The application processing characteristics dictate the types of databases and servers required.
In addition to understanding the application’s technical requirements, you also need to understand its performance needs, including the number of simultaneous users, the processing power required, the availability requirements, the network bandwidth requirements, and so on. You’ll need this information from two perspectives: the current (or regular) state and the potential future (expected growth) state.
To decide how to scale your infrastructure (vertically or horizontally), you’ll need to choose the growth threshold, expressed as a percentage, beyond which scaling horizontally becomes more appropriate than scaling vertically.
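One way to reason about that threshold is to estimate how long steady growth takes to exceed what a single, fully upgraded server can handle. The sketch below assumes compound growth per planning period (for example, 10% per quarter); the function name and the numbers are illustrative only and do not come from any specific sizing method.

```python
def periods_until_ceiling(current_load: float, growth_rate: float,
                          ceiling: float) -> int:
    """Count growth periods until load first exceeds the vertical ceiling.

    current_load: today's peak load (e.g. requests per second)
    growth_rate:  growth per period as a fraction (0.10 means 10%)
    ceiling:      the most load one fully scaled-up server can handle
    """
    if current_load <= 0 or growth_rate <= 0:
        raise ValueError("current_load and growth_rate must be positive")
    periods = 0
    load = current_load
    while load <= ceiling:
        load *= 1 + growth_rate  # apply one period of compound growth
        periods += 1
    return periods


if __name__ == "__main__":
    # 1,000 requests/s today, growing 10% per period, with a single-server
    # ceiling of 2,000 requests/s.
    print(periods_until_ceiling(1000, 0.10, 2000))  # prints 8
```

If the answer is only a few periods away, planning to scale out from the start is probably safer than relying on repeated vertical upgrades.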
You will also need to identify how to scale out if/when the time comes. What types of servers will you need to add? Will you need load balancing software to distribute the load between servers? How will you set up your database for scaling? And so on.
If your infrastructure is on-premises, you will need to plan carefully in advance and ensure there is enough capacity “on the floor” to support additional servers when they are required.
Planning for application scalability
When you design an application that can scale easily, you need to think about:
- the maximum number of transactions the system will process at any given time
- the amount of data or content the application will need to store
- the processing characteristics of the transactions (e.g., read versus update)
- the number of simultaneous users you could expect
When you design your application architecture, you can develop a single monolithic application, or leverage a microservices architecture. Both are scalable but in different increment sizes.
Your choice of application architecture depends on your expectations about usage. If the application follows a traditional CRUD (create, read, update, delete) style of processing with structured data, and it is deployed on a scheduled basis, such as every month or every quarter, then a monolithic architecture may be appropriate.
If you are planning to implement an application using a continuous delivery model (daily deployments) and you know that portions of the application are required to support higher transaction processing volumes than others, then a microservices architecture is likely a better choice.
Planning and designing for scalability is not a one-time activity but something you need to manage perpetually. The architecture style you select for your application will directly impact how you scale up or scale down processing capacity and how frequently you’ll be able to push out application updates. As a result, even when your infrastructure is cloud-based, you will still need to monitor and manage it properly.